<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Bohdan Harabadzhyu</title>
    <description>The latest articles on DEV Community by Bohdan Harabadzhyu (@themysteriousstranger90).</description>
    <link>https://dev.to/themysteriousstranger90</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F317955%2F08a39e42-9508-421b-bf05-7058c444c393.JPG</url>
      <title>DEV Community: Bohdan Harabadzhyu</title>
      <link>https://dev.to/themysteriousstranger90</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/themysteriousstranger90"/>
    <language>en</language>
    <item>
      <title>From XML to Word: simplifying conversion with FileConversionLibrary</title>
      <dc:creator>Bohdan Harabadzhyu</dc:creator>
      <pubDate>Sat, 30 Nov 2024 13:15:00 +0000</pubDate>
      <link>https://dev.to/themysteriousstranger90/from-xml-to-word-simplifying-conversion-with-fileconversionlibrary-4548</link>
      <guid>https://dev.to/themysteriousstranger90/from-xml-to-word-simplifying-conversion-with-fileconversionlibrary-4548</guid>
      <description>&lt;h2&gt;
  
  
  FileConversionLibrary
&lt;/h2&gt;

&lt;p&gt;File conversion can be a tedious task for developers, but &lt;strong&gt;FileConversionLibrary&lt;/strong&gt; offers a basic solution for simple tasks. This library provides essential tools for converting CSV and XML files into formats like PDF, Word, YAML, and JSON. Available on &lt;a href="https://www.nuget.org/packages/FileConversionLibrary" rel="noopener noreferrer"&gt;NuGet&lt;/a&gt; and &lt;a href="https://github.com/TheMysteriousStranger90/FileConversionLibrary" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;, it’s an easy-to-use tool for streamlining data processing workflows. In this article, I will focus specifically on converting XML to Word.&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction to XML and Word Formats
&lt;/h2&gt;

&lt;h3&gt;
  
  
  XML (eXtensible Markup Language)
&lt;/h3&gt;

&lt;p&gt;XML is a highly versatile format designed for storing and exchanging structured data. Developed by the World Wide Web Consortium (W3C) in 1998, it is widely used due to its compatibility and extensibility. Here are some technical details:&lt;/p&gt;

&lt;h4&gt;
  
  
  Key Features of XML:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Structure&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;XML uses a tree-based hierarchy, starting with a single root element.&lt;/li&gt;
&lt;li&gt;Documents are composed of elements enclosed in tags.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Flexibility&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Users can define custom tags, adapting XML for various domains.&lt;/li&gt;
&lt;li&gt;Optional schemas like DTD (Document Type Definition) or XSD (XML Schema Definition) can enforce document structure.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Portability and Compatibility&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;XML is platform-agnostic and supported across programming languages.&lt;/li&gt;
&lt;li&gt;It forms the basis for standards like SOAP (Simple Object Access Protocol) and SVG (Scalable Vector Graphics).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Drawbacks&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;XML can be verbose, leading to large file sizes.&lt;/li&gt;
&lt;li&gt;Parsing XML is typically more computationally expensive than parsing alternatives like JSON.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  Word Documents (.docx)
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;.docx&lt;/code&gt; format is the modern standard for Microsoft Word documents, introduced in Office 2007 as part of the Office Open XML (OOXML) standard. It is designed to offer improved performance, compatibility, and extensibility.&lt;/p&gt;

&lt;h4&gt;
  
  
  Key Features of .docx:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;File Structure&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.docx&lt;/code&gt; files are essentially ZIP archives containing multiple XML files and resources.&lt;/li&gt;
&lt;li&gt;Components include &lt;code&gt;document.xml&lt;/code&gt; (content), &lt;code&gt;styles.xml&lt;/code&gt; (styling), and &lt;code&gt;settings.xml&lt;/code&gt; (document settings).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Content Representation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Text and elements like tables are stored using a structured XML vocabulary.&lt;/li&gt;
&lt;li&gt;Relationships, such as those linking images or styles, are managed via &lt;code&gt;.rels&lt;/code&gt; (relationships) files.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Formatting and Extensibility&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Supports advanced formatting, including styles, fonts, and embedded media.&lt;/li&gt;
&lt;li&gt;Allows custom XML parts for metadata or structured content integration.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Advantages Over .doc&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Smaller file sizes due to ZIP compression.&lt;/li&gt;
&lt;li&gt;Improved interoperability with other software, thanks to its XML foundation.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
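
&lt;p&gt;Because a &lt;code&gt;.docx&lt;/code&gt; file is a ZIP archive, its parts can be listed with the standard &lt;code&gt;System.IO.Compression&lt;/code&gt; API. The following is a minimal sketch and not part of FileConversionLibrary: the class name &lt;code&gt;DocxInspector&lt;/code&gt; is hypothetical, and the path to the document is assumed to be passed as the first command-line argument.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;using System;
using System.IO.Compression;

// Hypothetical helper for illustration only
class DocxInspector
{
    static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: DocxInspector path-to-file.docx");
            return;
        }

        // Open the .docx as a read-only ZIP archive
        using (var archive = ZipFile.OpenRead(args[0]))
        {
            // Print the full name of every part stored in the package
            foreach (var entry in archive.Entries)
            {
                Console.WriteLine(entry.FullName);
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For a document saved by Word, the listing typically includes &lt;code&gt;[Content_Types].xml&lt;/code&gt;, &lt;code&gt;_rels/.rels&lt;/code&gt;, and &lt;code&gt;word/document.xml&lt;/code&gt;.&lt;/p&gt;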




&lt;h2&gt;
  
  
  Why Convert XML to Word?
&lt;/h2&gt;

&lt;p&gt;XML is excellent for structuring data but isn’t user-friendly for non-technical audiences. Converting XML to Word bridges this gap, providing readable and editable documents for various use cases, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Report Generation&lt;/strong&gt;: Transform XML into formatted reports suitable for stakeholders.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation&lt;/strong&gt;: Automatically populate templates for manuals, invoices, or records.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improved Accessibility&lt;/strong&gt;: Enable non-technical users to view and edit data in a familiar format.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By converting XML to Word, you can unlock the potential of structured data while delivering it in a polished, user-friendly format.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example XML
&lt;/h2&gt;

&lt;p&gt;To demonstrate how &lt;strong&gt;FileConversionLibrary&lt;/strong&gt; handles XML to Word conversion, we will use the following XML &lt;a href="https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms762271(v=vs.85)" rel="noopener noreferrer"&gt;books.xml&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?xml version="1.0"?&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;catalog&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;book&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"bk101"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;author&amp;gt;&lt;/span&gt;Gambardella, Matthew&lt;span class="nt"&gt;&amp;lt;/author&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;title&amp;gt;&lt;/span&gt;XML Developer's Guide&lt;span class="nt"&gt;&amp;lt;/title&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;genre&amp;gt;&lt;/span&gt;Computer&lt;span class="nt"&gt;&amp;lt;/genre&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;price&amp;gt;&lt;/span&gt;44.95&lt;span class="nt"&gt;&amp;lt;/price&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;publish_date&amp;gt;&lt;/span&gt;2000-10-01&lt;span class="nt"&gt;&amp;lt;/publish_date&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;description&amp;gt;&lt;/span&gt;An in-depth look at creating applications
            with XML.&lt;span class="nt"&gt;&amp;lt;/description&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/book&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;book&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"bk102"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;author&amp;gt;&lt;/span&gt;Ralls, Kim&lt;span class="nt"&gt;&amp;lt;/author&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;title&amp;gt;&lt;/span&gt;Midnight Rain&lt;span class="nt"&gt;&amp;lt;/title&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;genre&amp;gt;&lt;/span&gt;Fantasy&lt;span class="nt"&gt;&amp;lt;/genre&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;price&amp;gt;&lt;/span&gt;5.95&lt;span class="nt"&gt;&amp;lt;/price&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;publish_date&amp;gt;&lt;/span&gt;2000-12-16&lt;span class="nt"&gt;&amp;lt;/publish_date&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;description&amp;gt;&lt;/span&gt;A former architect battles corporate zombies,
            an evil sorceress, and her own childhood to become queen
            of the world.&lt;span class="nt"&gt;&amp;lt;/description&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/book&amp;gt;&lt;/span&gt;
 ...
&lt;span class="nt"&gt;&amp;lt;/catalog&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After converting the XML, the data will be structured and formatted into a Word document. Each book entry will appear as a table or a section, depending on your template and configuration.&lt;/p&gt;

&lt;p&gt;Here’s an example of how it might look:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyto7ohe58jp7qn18r1em.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyto7ohe58jp7qn18r1em.png" alt="Example" width="800" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  DocumentFormat.OpenXml: A Powerful Library for Office Document Manipulation
&lt;/h2&gt;

&lt;p&gt;In the FileConversionLibrary, we leverage DocumentFormat.OpenXml to convert XML data into structured Word documents. DocumentFormat.OpenXml is an open-source library provided by Microsoft that enables developers to edit and process Office Open XML (OOXML) documents programmatically. This includes file formats such as .docx, .xlsx, and .pptx. The library eliminates the need to rely on Microsoft Office's COM Interop, making it lightweight and efficient for server-side or cross-platform applications.&lt;/p&gt;

&lt;h4&gt;
  
  
  Key Features of DocumentFormat.OpenXml
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Standard Compliance&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;The library adheres to the ISO/IEC 29500 standard, ensuring compatibility with modern Office applications.&lt;/li&gt;
&lt;li&gt;Works with .docx, .xlsx, and .pptx file formats introduced in Office 2007 and later.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform Independence&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Operates without requiring Microsoft Office installed on the host machine.&lt;/li&gt;
&lt;li&gt;Ideal for server-side applications, cloud-based systems, or non-Windows environments.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficient File Manipulation&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Enables the creation, reading, and modification of Office documents with fine-grained control.&lt;/li&gt;
&lt;li&gt;Supports advanced document structures, such as styles, metadata, tables, images, and charts.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;XML-Based Architecture&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;OOXML documents are structured XML files encapsulated in ZIP archives.&lt;/li&gt;
&lt;li&gt;Developers can directly interact with the XML nodes to customize or extract content.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  XmlHelperFile: Simplifying XML Data Extraction
&lt;/h2&gt;

&lt;p&gt;The XmlHelperFile class in FileConversionLibrary is designed to facilitate the extraction of data from XML files. This utility class provides an asynchronous method to read XML files and convert their contents into a structured format, making it easier to process and manipulate the data.&lt;/p&gt;

&lt;h4&gt;
  
  
  Key Features of XmlHelperFile
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Asynchronous Operation.&lt;/strong&gt;&lt;br&gt;
The ReadXmlAsync method reads the file with File.ReadAllTextAsync, so the file I/O does not block the calling thread, which matters for applications that handle large files or multiple concurrent tasks. Parsing and DataSet loading then run synchronously in memory.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;XML Parsing.&lt;/strong&gt;&lt;br&gt;
Utilizes XmlDocument and XmlNodeReader to parse the XML content, ensuring compatibility with standard XML structures.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;DataSet Integration.&lt;/strong&gt;&lt;br&gt;
Converts the XML data into a DataSet, leveraging its powerful data manipulation capabilities to handle complex XML structures.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Structured Output.&lt;/strong&gt;&lt;br&gt;
Extracts headers and rows from the XML, returning them as a tuple containing an array of headers and a list of rows. Each row is represented as an array of strings, corresponding to the columns in the XML.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public static class XmlHelperFile
{
    public static async Task&amp;lt;(string[] Headers, List&amp;lt;string[]&amp;gt; Rows)&amp;gt; ReadXmlAsync(string xmlFilePath)
    {
        // Read the entire content of the XML file asynchronously
        var xmlContent = await File.ReadAllTextAsync(xmlFilePath);

        // Load the XML content into an XmlDocument
        var xmlFile = new XmlDocument();
        xmlFile.LoadXml(xmlContent);

        // Create an XmlNodeReader from the XmlDocument
        var xmlReader = new XmlNodeReader(xmlFile);

        // Create a DataSet and read the XML data into it
        var dataSet = new DataSet();
        dataSet.ReadXml(xmlReader);

        // Check if the DataSet contains any tables
        if (dataSet.Tables.Count == 0)
        {
            throw new Exception("No tables found in the XML file.");
        }

        // Get the first table from the DataSet
        var table = dataSet.Tables[0];

        // Extract the column names (headers) from the table
        var headers = new string[table.Columns.Count];
        for (var i = 0; i &amp;lt; table.Columns.Count; i++)
        {
            headers[i] = table.Columns[i].ColumnName;
        }

        // Extract the rows from the table
        var rows = new List&amp;lt;string[]&amp;gt;();
        foreach (DataRow row in table.Rows)
        {
            var rowData = new string[table.Columns.Count];
            for (var i = 0; i &amp;lt; table.Columns.Count; i++)
            {
                rowData[i] = row[i].ToString();
            }
            rows.Add(rowData);
        }

        // Return the headers and rows as a tuple
        return (headers, rows);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
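
&lt;p&gt;The headers returned by ReadXmlAsync come from the schema that &lt;code&gt;DataSet.ReadXml&lt;/code&gt; infers from the XML. The following self-contained sketch shows that inference on a shortened, inline version of books.xml. The class name &lt;code&gt;DataSetInferenceDemo&lt;/code&gt; and the trimmed XML string are assumptions made for illustration and are not part of FileConversionLibrary.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;using System;
using System.Data;
using System.IO;

// Hypothetical demo class for illustration only
class DataSetInferenceDemo
{
    static void Main()
    {
        // A shortened version of books.xml, kept inline so the sketch is self-contained
        const string xml =
            "&amp;lt;catalog&amp;gt;" +
            "&amp;lt;book id=\"bk101\"&amp;gt;&amp;lt;author&amp;gt;Gambardella, Matthew&amp;lt;/author&amp;gt;&amp;lt;title&amp;gt;XML Developer's Guide&amp;lt;/title&amp;gt;&amp;lt;/book&amp;gt;" +
            "&amp;lt;book id=\"bk102\"&amp;gt;&amp;lt;author&amp;gt;Ralls, Kim&amp;lt;/author&amp;gt;&amp;lt;title&amp;gt;Midnight Rain&amp;lt;/title&amp;gt;&amp;lt;/book&amp;gt;" +
            "&amp;lt;/catalog&amp;gt;";

        // Let the DataSet infer tables and columns from the XML
        var dataSet = new DataSet();
        dataSet.ReadXml(new StringReader(xml));

        // Print every inferred table with its row count and column names
        foreach (DataTable table in dataSet.Tables)
        {
            Console.WriteLine($"{table.TableName}: {table.Rows.Count} row(s)");
            foreach (DataColumn column in table.Columns)
            {
                Console.WriteLine($"  {column.ColumnName}");
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With input shaped like this, the repeating &lt;code&gt;book&lt;/code&gt; element should be inferred as a table, with its child elements and its &lt;code&gt;id&lt;/code&gt; attribute as columns. XmlHelperFile reads only the first table, so nested or multi-table XML would need additional handling.&lt;/p&gt;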



&lt;h2&gt;
  
  
  IXmlConverter and XmlToWordConverter: Converting XML to Word
&lt;/h2&gt;

&lt;p&gt;The IXmlConverter interface and XmlToWordConverter class in FileConversionLibrary provide a structured way to convert XML files into Word documents. This section explains how these components work together to achieve the conversion.&lt;/p&gt;

&lt;h4&gt;
  
  
  IXmlConverter Interface
&lt;/h4&gt;

&lt;p&gt;The IXmlConverter interface defines a contract for converting XML files to other formats. It includes a single method, ConvertAsync, which performs the conversion asynchronously.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public interface IXmlConverter
{
    Task ConvertAsync(string xmlFilePath, string outputFilePath);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  XmlToWordConverter Class
&lt;/h4&gt;

&lt;p&gt;The XmlToWordConverter class implements the IXmlConverter interface to convert XML files into Word documents. It uses the XmlHelperFile class to read and parse the XML data, and the DocumentFormat.OpenXml library to create and manipulate the Word document.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public class XmlToWordConverter : IXmlConverter
{
    public async Task ConvertAsync(string xmlFilePath, string wordOutputPath)
    {
        try
        {
            // Read and parse the XML file using XmlHelperFile
            var (headers, rows) = await XmlHelperFile.ReadXmlAsync(xmlFilePath);

            // Create a new Word document
            using (var wordDocument = WordprocessingDocument.Create(wordOutputPath, WordprocessingDocumentType.Document))
            {
                // Add a main document part to the Word document
                var mainPart = wordDocument.AddMainDocumentPart();
                mainPart.Document = new Document();
                var body = mainPart.Document.AppendChild(new Body());

                // Add headers to the Word document
                var headerParagraph = body.AppendChild(new Paragraph());
                var headerRun = headerParagraph.AppendChild(new Run());
                headerRun.AppendChild(new Text(string.Join(" ", headers)));

                // Add rows to the Word document
                foreach (var row in rows)
                {
                    var paragraph = body.AppendChild(new Paragraph());
                    var run = paragraph.AppendChild(new Run());
                    run.AppendChild(new Text(string.Join(" ", row)));
                }
            }
        }
        catch (FileNotFoundException e)
        {
            // Handle file not found exception
            Console.WriteLine($"File not found: {e.FileName}");
        }
        catch (XmlException e)
        {
            // Handle invalid XML exception
            Console.WriteLine($"Invalid XML: {e.Message}");
        }
        catch (Exception e)
        {
            // Handle any other unexpected exceptions
            Console.WriteLine($"Unexpected error: {e.Message}");
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  How It Works
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reading and Parsing XML&lt;/strong&gt;&lt;br&gt;
The ConvertAsync method starts by calling XmlHelperFile.ReadXmlAsync to read and parse the XML file. This method returns the headers and rows extracted from the XML.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Creating a Word Document&lt;/strong&gt;&lt;br&gt;
The method then creates a new Word document using WordprocessingDocument.Create. This document is created at the specified wordOutputPath.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Adding Main Document Part&lt;/strong&gt;&lt;br&gt;
A main document part is added to the Word document, which contains the body of the document.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Adding Headers&lt;/strong&gt;&lt;br&gt;
A paragraph is created for the headers, and a run is added to this paragraph. The headers are joined into a single string and added as text to the run.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Adding Rows&lt;/strong&gt;&lt;br&gt;
For each row in the extracted data, a new paragraph is created, and a run is added to this paragraph. The row data is joined into a single string and added as text to the run.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Exception Handling&lt;/strong&gt;&lt;br&gt;
The method catches FileNotFoundException, XmlException, and any other exception, and writes an error message to the console. Because the exceptions are not rethrown, the caller is not notified when a conversion fails.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
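
&lt;p&gt;The converter above writes the headers and each row as a space-joined paragraph. If a table layout is preferred, the same data could be written with the &lt;code&gt;Table&lt;/code&gt;, &lt;code&gt;TableRow&lt;/code&gt;, and &lt;code&gt;TableCell&lt;/code&gt; classes from DocumentFormat.OpenXml. This is a minimal sketch under that assumption, not the library's implementation: the class &lt;code&gt;WordTableSketch&lt;/code&gt; and its method names are hypothetical, and the border size is an arbitrary choice.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper for illustration; not part of FileConversionLibrary
public static class WordTableSketch
{
    public static void WriteTable(string wordOutputPath, string[] headers, List&amp;lt;string[]&amp;gt; rows)
    {
        using (var wordDocument = WordprocessingDocument.Create(wordOutputPath, WordprocessingDocumentType.Document))
        {
            var mainPart = wordDocument.AddMainDocumentPart();
            mainPart.Document = new Document();
            var body = mainPart.Document.AppendChild(new Body());

            var table = body.AppendChild(new Table());

            // Thin single-line borders so the cells are visible in Word
            table.AppendChild(new TableProperties(new TableBorders(
                new TopBorder { Val = BorderValues.Single, Size = 4 },
                new BottomBorder { Val = BorderValues.Single, Size = 4 },
                new LeftBorder { Val = BorderValues.Single, Size = 4 },
                new RightBorder { Val = BorderValues.Single, Size = 4 },
                new InsideHorizontalBorder { Val = BorderValues.Single, Size = 4 },
                new InsideVerticalBorder { Val = BorderValues.Single, Size = 4 })));

            // One grid column per header
            var grid = table.AppendChild(new TableGrid());
            foreach (var _ in headers)
            {
                grid.AppendChild(new GridColumn());
            }

            // First row holds the headers, the remaining rows hold the data
            table.AppendChild(BuildRow(headers));
            foreach (var row in rows)
            {
                table.AppendChild(BuildRow(row));
            }
        }
    }

    // Builds one table row with a single-paragraph cell per value
    private static TableRow BuildRow(string[] cells)
    {
        var tableRow = new TableRow();
        foreach (var cell in cells)
        {
            tableRow.AppendChild(new TableCell(new Paragraph(new Run(new Text(cell ?? string.Empty)))));
        }
        return tableRow;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Such a helper would be called as &lt;code&gt;WordTableSketch.WriteTable(wordOutputPath, headers, rows)&lt;/code&gt; with the tuple returned by XmlHelperFile.ReadXmlAsync.&lt;/p&gt;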

&lt;h4&gt;
  
  
  Executing XML to Word Conversion
&lt;/h4&gt;

&lt;p&gt;Here is an example of how to call the ConvertAsync method of the XmlToWordConverter class within the Main method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Program
{
    static async Task Main(string[] args)
    {
        // XML to Word Conversion
        var xmlToWordConverter = new XmlToWordConverter();
        await xmlToWordConverter.ConvertAsync(@"C:\Users\User\Desktop\books.xml", @"C:\Users\User\Desktop\output.docx");
        Console.WriteLine("XML to Word conversion completed.");

        ...
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we explored the process of converting XML files to Word documents using the FileConversionLibrary. The solution shown here is a basic example that can be taken as a starting point and improved upon.&lt;/p&gt;

</description>
      <category>csharp</category>
      <category>dotnet</category>
      <category>xml</category>
      <category>word</category>
    </item>
    <item>
      <title>Using CSCore for VoiceRecorder Application</title>
      <dc:creator>Bohdan Harabadzhyu</dc:creator>
      <pubDate>Sun, 16 Jun 2024 10:12:52 +0000</pubDate>
      <link>https://dev.to/themysteriousstranger90/using-cscore-for-voicerecorder-application-2hp0</link>
      <guid>https://dev.to/themysteriousstranger90/using-cscore-for-voicerecorder-application-2hp0</guid>
      <description>&lt;p&gt;&lt;strong&gt;Using CSCore for VoiceRecorder Application&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CSCore is a powerful .NET library for audio processing and recording, offering a wide range of functionalities to work with sound. It supports multiple audio formats and devices, making it a go-to choice for developers working on audio-related applications. In this article I'll show you how to use CSCore using my simple VoiceRecorder application as an example, available on &lt;a href="https://github.com/TheMysteriousStranger90/VoiceRecorder" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and &lt;a href="https://sourceforge.net/projects/voice-recorder-stranger90/" rel="noopener noreferrer"&gt;SourceForge&lt;/a&gt;. I will focus on the core audio functionalities provided by CSCore and how they are integrated into my application. Note that we will not cover Avalonia UI or the MVVM design pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Introduction to CSCore&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CSCore offers a variety of features that make it an ideal choice for developers working on projects that require audio input and output. Some of the key features of CSCore include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Support for multiple audio formats:&lt;br&gt;
CSCore can handle various audio formats, such as WAV, MP3, and FLAC.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Audio capture and playback:&lt;br&gt;
The library provides robust methods for capturing audio from different devices and playing back audio files.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Audio processing:&lt;br&gt;
CSCore supports audio filtering, effects, and mixing, making it suitable for complex audio processing tasks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Device management:&lt;br&gt;
The library allows easy enumeration and selection of audio devices.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Features of VoiceRecorder&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;VoiceRecorder is an easy-to-use application designed to record audio from selected devices. It’s good for quick recordings and audio testing. The application simplifies the process of selecting an audio device, starting and stopping recordings, and saving the audio to a file. VoiceRecorder allows you to choose from three available filters to enhance your audio or record without any filters if you prefer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Core Classes and Functionality&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;AudioDevice Class&lt;/em&gt;&lt;br&gt;
The AudioDevice class is responsible for managing the audio devices available on the system. It uses the MMDeviceEnumerator from CSCore's CoreAudioAPI to list and select audio capture devices.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public sealed class AudioDevice : IDisposable
    {
        // Enumerates multimedia devices, allowing us to list   
        private MMDeviceEnumerator _mmdeviceEnumerator;
        private bool _disposed = false;

        public AudioDevice()
        {
            _mmdeviceEnumerator = new MMDeviceEnumerator();
        }

        // Retrieves a list of available audio capture devices
        public List&amp;lt;string&amp;gt; GetAvailableDevices()
        {
            return _mmdeviceEnumerator.EnumAudioEndpoints(DataFlow.Capture, DeviceState.Active)
                .Select(device =&amp;gt; device.FriendlyName)  // Gets the friendly name of each device
                .ToList();
        }

        // Selects an audio device by its friendly name
        public MMDevice SelectDevice(string deviceName)
        {
            return _mmdeviceEnumerator.EnumAudioEndpoints(DataFlow.Capture, DeviceState.Active)
                .FirstOrDefault(device =&amp;gt; device.FriendlyName == deviceName);
        }

        // Implements the Dispose pattern to release unmanaged resources
        private void Dispose(bool disposing)
        {
            if (!_disposed)
            {
                if (disposing)
                {
                    if (_mmdeviceEnumerator != null)
                    {
                        _mmdeviceEnumerator.Dispose();
                        _mmdeviceEnumerator = null;
                    }
                }
                _disposed = true;
            }
        }

        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;AudioRecorder Class&lt;/em&gt;&lt;br&gt;
The AudioRecorder class handles the core recording functionality. It uses WasapiCapture for capturing audio and WaveWriter for writing the captured audio to a WAV file. The class also supports applying audio filters during recording.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public sealed class AudioRecorder : IDisposable
    {
        private WasapiCapture _capture; // Captures audio from the selected device
        private WaveWriter _writer; // Writes captured audio to a WAV file
        private bool _disposed = false;
        private SoundInSource _soundInSource; // Source for audio data from the capture device

        public IWaveSource CaptureSource =&amp;gt; _soundInSource; // Exposes the current audio source

        // Starts recording audio from the specified device to a WAV file
        public void StartRecording(string outputFilePath, MMDevice device, IAudioFilter filter)
        {
            try
            {
                // Initialize audio capture with the selected device
                _capture = new WasapiCapture();
                _capture.Device = device;
                _capture.Initialize();

                // Create a SoundInSource to handle audio data
                _soundInSource = new SoundInSource(_capture) { FillWithZeros = false };

                // Apply filter if provided, otherwise use the raw source
                IWaveSource filteredSource;
                if (filter != null)
                {
                    filteredSource = filter.ApplyFilter((IWaveSource)_soundInSource);
                }
                else
                {
                    filteredSource = _soundInSource;
                }

                // Initialize WaveWriter to save the audio to a file
                _writer = new WaveWriter(outputFilePath, filteredSource.WaveFormat);

                // Buffer to hold audio data
                byte[] buffer = new byte[filteredSource.WaveFormat.BytesPerSecond / 2];

                // Event handler for when audio data is available
                _capture.DataAvailable += (s, e) =&amp;gt;
                {
                    int read;
                    while ((read = filteredSource.Read(buffer, 0, buffer.Length)) &amp;gt; 0)
                    {
                        _writer.Write(buffer, 0, read);
                    }
                };

                // Start audio capture
                _capture.Start();
            }
            catch (Exception ex)
            {
                Console.WriteLine($"An error occurred: {ex.Message}");
            }
        }

        // Stops the recording process and releases resources
        public void StopRecording()
        {
            try
            {
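                // Null-conditional calls avoid a NullReferenceException if recording never started
                _capture?.Stop();
                _writer?.Dispose();
                _writer = null; // Prevents a second Dispose call on the writer later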
            }
            catch (Exception ex)
            {
                Console.WriteLine($"An error occurred: {ex.Message}");
            }
        }

        // Updates the audio source, allowing for dynamic changes during recording
        public void UpdateSource(IWaveSource newSource)
        {
            _capture.Stop();

            _soundInSource = newSource as SoundInSource;

            if (_soundInSource != null)
            {
                _capture.Start();
            }
            else
            {
                Console.WriteLine("newSource is not a SoundInSource");
            }
        }

        // Implements the Dispose pattern to release unmanaged resources
        private void Dispose(bool disposing)
        {
            if (!_disposed)
            {
                if (disposing)
                {
                    if (_capture != null)
                    {
                        _capture.Dispose();
                        _capture = null;
                    }

                    if (_writer != null)
                    {
                        _writer.Dispose();
                        _writer = null;
                    }
                }
                _disposed = true;
            }
        }

        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
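
&lt;p&gt;The two classes can be combined in a small console host to try recording without the UI. The following is a minimal sketch that assumes the AudioDevice and AudioRecorder classes above are compiled into the same project. The class name &lt;code&gt;RecorderDemo&lt;/code&gt; and the output file name &lt;code&gt;sample.wav&lt;/code&gt; are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;using System;
using System.Linq;

// Hypothetical console host; relies on the AudioDevice and AudioRecorder classes shown above
class RecorderDemo
{
    static void Main()
    {
        using (var devices = new AudioDevice())
        using (var recorder = new AudioRecorder())
        {
            // Pick the first active capture device, if there is one
            var deviceName = devices.GetAvailableDevices().FirstOrDefault();
            if (deviceName == null)
            {
                Console.WriteLine("No active capture device found.");
                return;
            }

            var device = devices.SelectDevice(deviceName);

            // Passing null as the filter records the raw input
            recorder.StartRecording("sample.wav", device, null);

            Console.WriteLine($"Recording from {deviceName}. Press Enter to stop.");
            Console.ReadLine();

            recorder.StopRecording();
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;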



&lt;p&gt;&lt;strong&gt;Key Moments of MainWindowViewModel&lt;/strong&gt;&lt;br&gt;
The MainWindowViewModel class manages the interaction between the user interface and the main recording functionality. It maintains application state, processes user commands, and coordinates the recording process. However, in this article I want to focus only on the StartRecording method.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public class MainWindowViewModel : ViewModelBase
{
        ...

     public void StartRecording(string deviceName, VoiceFilterViewModel filterViewModel)
  {
    // Generate a unique file path for the recording based on the device name
    string filePath = AudioFilePathHelper.GenerateAudioFilePath(deviceName);

    // Select the audio device that matches the provided device name
    var device = Device.SelectDevice(deviceName);

    // Check if a filter is provided and start recording with or without the filter
    if (filterViewModel != null &amp;amp;&amp;amp; filterViewModel.FilterStrategy != null)
    {
        // Start recording with the provided filter strategy
        Recorder.StartRecording(filePath, device, filterViewModel.FilterStrategy);
    }
    else
    {
        // Start recording without any filter
        Recorder.StartRecording(filePath, device, null);
    }

    // Set the recording state to true to indicate that recording has started
    IsRecording = true;

    // If filters are applied and a filter is selected, apply the filter command
    if (IsFilterApplied &amp;amp;&amp;amp; SelectedFilterViewModel != null)
    {
        ApplyFilterCommand();
    }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The StartRecording method in the MainWindowViewModel class initializes the recording process by selecting the appropriate audio device, determining whether to apply an audio filter, and starting the recording. It also updates the recording state and applies any selected filter if applicable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This article left out several topics, such as working with filters and the View classes. Its purpose was to show a clear example of using the CSCore library to capture and process audio. Building on CSCore, we can create a simple yet capable audio recording application. Concentrating on the core classes and functions keeps the application easy to understand, which makes it a suitable starting point for more complex audio processing projects. Happy coding!&lt;/p&gt;

</description>
      <category>csharp</category>
      <category>dotnet</category>
      <category>programming</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Image Scraping with HtmlAgilityPack: A Practical Guide Using ConsoleWebScraper</title>
      <dc:creator>Bohdan Harabadzhyu</dc:creator>
      <pubDate>Sun, 09 Jun 2024 10:11:00 +0000</pubDate>
      <link>https://dev.to/themysteriousstranger90/image-scraping-with-htmlagilitypack-a-practical-guide-using-consolewebscraper-57km</link>
      <guid>https://dev.to/themysteriousstranger90/image-scraping-with-htmlagilitypack-a-practical-guide-using-consolewebscraper-57km</guid>
      <description>&lt;p&gt;&lt;strong&gt;Image Scraping with HtmlAgilityPack: A Practical Guide Using ConsoleWebScraper&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Web scraping is a valuable tool for automating the collection of information from websites. My simple, open-source ConsoleWebScraper application, available on &lt;a href="https://github.com/TheMysteriousStranger90/ConsoleWebScraper" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and &lt;a href="https://sourceforge.net/projects/consolewebscraper/" rel="noopener noreferrer"&gt;SourceForge&lt;/a&gt;, demonstrates how to use the HtmlAgilityPack library to scrape images from web pages. This guide will focus on the image scraping capabilities of the application and provide an overview of its core functionality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Introduction to HtmlAgilityPack&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;HtmlAgilityPack is a .NET library that simplifies HTML parsing, making it a favorite among developers for web scraping tasks. It provides a robust way to traverse and manipulate HTML documents. With HtmlAgilityPack, extracting elements like images and text from web pages becomes straightforward and efficient.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Scrape Images?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Images are a significant part of web content, used for purposes such as visual data representation, marketing, and documentation. Scraping images can be useful for archiving content or monitoring website changes. The ConsoleWebScraper application serves as a simple example of how you can automate this process using HtmlAgilityPack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Features of ConsoleWebScraper&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;ConsoleWebScraper offers a few essential functions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;URL Input: Prompts the user to enter a URL and retrieves the HTML content.
&lt;/li&gt;
&lt;li&gt;HTML Parsing: Extracts inner URLs and images from the HTML content.
&lt;/li&gt;
&lt;li&gt;File Saving: Saves scraped URLs, images, and HTML content (with tags removed) to separate files for easy access and further analysis.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Additional Functionalities&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before diving into the core image scraping method, it's helpful to understand the broader functionality of the ConsoleWebScraper application, which includes several supporting classes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Client class manages the interaction with the user and controls the application's flow. It listens for user commands and executes the appropriate actions.
&lt;/li&gt;
&lt;li&gt;The Printer class provides simple methods to display the application's start page and main menu to the user.
&lt;/li&gt;
&lt;li&gt;The Controller class orchestrates the scraping process, managing user input, folder creation, and invoking the web scraper service methods.
&lt;/li&gt;
&lt;li&gt;The HtmlTags class provides a method to remove HTML tags from the content, leaving only the text.
&lt;/li&gt;
&lt;li&gt;The IWebScraperService interface defines methods for saving URLs, content, and images, which are implemented in the WebScraperService class.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Core Method: SaveImagesToDoc&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The heart of the image scraping functionality is encapsulated in the SaveImagesToDoc method. Let's dive deeper into this method to understand how it works.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public async Task SaveImagesToDoc(string fileName, string htmlContent, string baseUrl)
{
    // Create directory to save images
    Directory.CreateDirectory(fileName);

    // Load HTML content into HtmlDocument
    var doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(htmlContent);

    // Extract image URLs
    var images = doc.DocumentNode.Descendants("img")
        .Select(e =&amp;gt; e.GetAttributeValue("src", null))
        .Where(src =&amp;gt; !string.IsNullOrEmpty(src))
        .Select(src =&amp;gt; new Uri(new Uri(baseUrl), src).AbsoluteUri)
        .ToList();

    // Initialize HttpClient
    using (HttpClient client = new HttpClient())
    {
        int pictureNumber = 1;
        foreach (var img in images)
        {
            try
            {
                // Download image as byte array
                var imageBytes = await client.GetByteArrayAsync(img);

                // Get image file extension
                var extension = Path.GetExtension(new Uri(img).AbsolutePath);

                // Save image to file
                await File.WriteAllBytesAsync($"{fileName}\\Image{pictureNumber}{extension}", imageBytes);
                pictureNumber++;
            }
            catch (Exception ex)
            {
                // Log any errors
                Console.WriteLine($"Failed to download or save image {img}: {ex.Message}");
            }
        }
    }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step-by-Step Breakdown&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create Directory: The method starts by creating a directory to store the downloaded images. Note that the fileName parameter is used as the name of this directory.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Directory.CreateDirectory(fileName);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Load HTML Content: It then loads the provided HTML content into an HtmlDocument object from the HtmlAgilityPack library.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(htmlContent);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Extract Image URLs: The method identifies all img elements in the HTML and extracts their src attributes, skipping empty values. It then resolves each value against the base URL of the web page, so relative URLs become absolute.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;var images = doc.DocumentNode.Descendants("img")
    .Select(e =&amp;gt; e.GetAttributeValue("src", null))
    .Where(src =&amp;gt; !string.IsNullOrEmpty(src))
    .Select(src =&amp;gt; new Uri(new Uri(baseUrl), src).AbsoluteUri)
    .ToList();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
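
&lt;p&gt;To make the URL resolution step more concrete, here is a minimal standalone sketch that uses only the System namespace. The ImageUrlResolver class, its Resolve method, and the example.com addresses are hypothetical and are not part of ConsoleWebScraper. The sketch uses Uri.TryCreate instead of the Uri constructor so that a malformed src does not throw, and it keeps only http and https results, on the assumption that values such as inline data: URIs should not be passed to HttpClient.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;using System;

static class ImageUrlResolver
{
    // Hypothetical helper: returns an absolute http(s) URL, or null when src cannot be used
    public static string Resolve(string pageUrl, string src)
    {
        // TryCreate returns false instead of throwing when src is malformed
        if (!Uri.TryCreate(new Uri(pageUrl), src, out Uri absolute))
        {
            return null;
        }

        // Assumption: only http and https addresses are worth downloading
        bool isWeb = absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps;
        return isWeb ? absolute.AbsoluteUri : null;
    }

    static void Main()
    {
        // Hypothetical page address, used only for illustration
        string page = "https://example.com/blog/post.html";

        // Relative to the page's folder: https://example.com/blog/images/a.png
        Console.WriteLine(Resolve(page, "images/a.png"));

        // Relative to the site root: https://example.com/static/b.jpg
        Console.WriteLine(Resolve(page, "/static/b.jpg"));

        // Protocol-relative, inherits https from the page: https://cdn.example.com/c.gif
        Console.WriteLine(Resolve(page, "//cdn.example.com/c.gif"));

        // Already absolute, returned unchanged: https://other.example/d.png
        Console.WriteLine(Resolve(page, "https://other.example/d.png"));
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Returning null for unusable values lets the caller filter them out before any download is attempted, rather than relying on the catch block later.&lt;/p&gt;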



&lt;ul&gt;
&lt;li&gt;Download and Save Images: Using HttpClient, the method iterates through the list of image URLs, downloads each image as a byte array, and saves it to the designated directory with an appropriate file extension. Errors during the download or save process are caught and logged.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;using (HttpClient client = new HttpClient())
{
    int pictureNumber = 1;
    foreach (var img in images)
    {
        try
        {
            var imageBytes = await client.GetByteArrayAsync(img);
            var extension = Path.GetExtension(new Uri(img).AbsolutePath);
            await File.WriteAllBytesAsync($"{fileName}\\Image{pictureNumber}{extension}", imageBytes);
            pictureNumber++;
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Failed to download or save image {img}: {ex.Message}");
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
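
&lt;p&gt;The snippet above builds the file name with a hard-coded backslash, which matches Windows paths. One possible refinement, shown here only as a sketch, is to build the path with Path.Combine and to supply a fallback when the image URL has no extension. The ImagePathHelper class, the BuildImagePath method, and the ".bin" fallback are assumptions made for illustration and are not part of ConsoleWebScraper.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;using System;
using System.IO;

static class ImagePathHelper
{
    // Hypothetical helper: builds the target file path for a downloaded image
    public static string BuildImagePath(string folder, int number, string imageUrl)
    {
        // AbsolutePath excludes the query string, so "?w=200" never reaches the file name
        string extension = Path.GetExtension(new Uri(imageUrl).AbsolutePath);

        // Assumption: fall back to ".bin" when the URL path carries no extension
        if (string.IsNullOrEmpty(extension))
        {
            extension = ".bin";
        }

        // Path.Combine inserts the directory separator of the current operating system
        return Path.Combine(folder, $"Image{number}{extension}");
    }

    static void Main()
    {
        // The example.com addresses are placeholders
        // Prints images\Image1.png on Windows and images/Image1.png on Linux or macOS
        Console.WriteLine(BuildImagePath("images", 1, "https://example.com/pics/cat.png?w=200"));

        // No extension in the URL path, so the fallback is used: Image2.bin
        Console.WriteLine(BuildImagePath("images", 2, "https://example.com/avatar"));
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Inside the download loop, the result of such a helper could be passed to File.WriteAllBytesAsync in place of the interpolated string.&lt;/p&gt;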



&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The ConsoleWebScraper application demonstrates the fundamental use of the HtmlAgilityPack library to scrape images from web pages. The tool is deliberately simple, which makes it a reasonable starting point for entry-level scraping tasks. By automating image extraction and storage, you can streamline your data collection efforts. Happy scraping!&lt;/p&gt;

</description>
      <category>csharp</category>
      <category>dotnet</category>
      <category>webscraping</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
