• How to Parse (Read) XML using XML Reader in C#

    Posted on April 5, 2012 in C#, Dotnet

    XmlTextReader provides forward-only, read-only access to a stream of XML data. Use it when you need to read XML as raw data without the overhead of a DOM; because it never builds an in-memory tree, it is a faster and lighter way to read XML. The class derives from XmlReader and conforms to the W3C Extensible Markup Language (XML) 1.0 and Namespaces in XML recommendations.

    The majority of applications use a 3-tier architecture, a service-oriented architecture, or a combination of both. In a service-oriented application, you typically return an XML string to the client. Depending on whether it is a thin client or a thick client, the parsing logic resides in either the UI layer or the business logic layer. How would you approach this scenario? Would you repeat the same parsing logic for each set of elements and attributes, or would you put all of it in one gigantic class? I have seen people do both. In this article, I am going to design the parsing logic in a flexible way, or let’s just say the OBJECT ORIENTED WAY.

    Use case:
    We would like to get the list of products available for sale. The output from the service could be a Products XML or some other XML (e.g. Employee XML, Sales Order XML). We want to parse the XML and build a list of products or a list of employees. Since fetching the data from the database through a service is not the objective of this article, we simply store the responses in XML files and read them from there.

    Design:

    ElementInfo: Holds an element's name, value and attributes.
    XMLParser: The core parsing logic resides in this class.
    ProductXMLParser: Extends XMLParser and handles everything specific to the Products XML.
    ProductHelper: A business layer class that shields the client from all back-end operations. It fetches the response from the service (in our case, from an XML file), uses ProductXMLParser to parse the XML, and returns the product list.
    Program.cs: Console application that starts things up and prints the output to the console.
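
    The listings in this article never show ElementInfo or Product themselves. The following is only a rough sketch of what they might look like, inferred from how the code further down uses them (the name, val and attribs members, AddAttribute, and the four Product properties). AttributeInfo is a hypothetical name; the article only tells us that each attribute exposes a val member. ElementInfoDemo exists only to exercise the sketch.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical attribute holder: the article only ever reads its .val member.
    public class AttributeInfo
    {
        public string name;
        public string val;

        public AttributeInfo(string name, string val)
        {
            this.name = name;
            this.val = val;
        }
    }

    // Assumed shape of ElementInfo, inferred from its usage in the article.
    public class ElementInfo
    {
        public string name;
        public string val;
        public Dictionary<string, AttributeInfo> attribs;

        public ElementInfo(string name, int attributeCount)
        {
            this.name = name;
            // Assumption: attributeCount is only a capacity hint for the dictionary.
            attribs = new Dictionary<string, AttributeInfo>(attributeCount);
        }

        public void AddAttribute(string attributeName, string attributeValue)
        {
            attribs[attributeName] = new AttributeInfo(attributeName, attributeValue);
        }
    }

    // Assumed shape of Product, matching the properties the article assigns.
    public class Product
    {
        public int ProductID { get; set; }
        public string ProductName { get; set; }
        public string ProductNumber { get; set; }
        public double ListPrice { get; set; }
    }

    public static class ElementInfoDemo
    {
        public static void Main()
        {
            ElementInfo e = new ElementInfo("Product", 1);
            e.AddAttribute("ID", "125");
            Console.WriteLine(e.name + " ID=" + e.attribs["ID"].val); // prints "Product ID=125"
        }
    }
    ```

    Keying the attributes by name is what lets the handlers later in the article call ContainsKey("ID") before reading the value.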

    Solution:

    Create an instance of XmlTextReader and give it the XML to read. Here the XML is held in a string, so it is wrapped in a StringReader.

    XmlTextReader xmlTextReader = new XmlTextReader(new StringReader(xmlString));

    After you create the XmlTextReader object, use the Read method to read the XML data. Read moves through the XML sequentially, one node at a time, until it reaches the end of the input, at which point it returns false.

    while (xmlTextReader.Read())
    {
        //do work
    }
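
    To see what the reader actually hands back on each call to Read, the loop can be run on its own. This is a minimal sketch: the inline XML string and the ReadLoopDemo class are made up for illustration and are not part of the article's solution.

    ```csharp
    using System;
    using System.IO;
    using System.Xml;

    public static class ReadLoopDemo
    {
        public static void Main()
        {
            // Illustrative input only; any well-formed XML string works here.
            string xmlString = "<Products><Product ID=\"125\">test</Product></Products>";

            // The using block closes the reader even if Read throws on malformed XML.
            using (XmlTextReader reader = new XmlTextReader(new StringReader(xmlString)))
            {
                while (reader.Read())
                {
                    // One line per node: its type, its name (if any) and its value (if any).
                    Console.WriteLine(reader.NodeType + ": " + reader.Name + " " + reader.Value);
                }
            }
        }
    }
    ```

    Note that the attributes of an element do not show up as nodes of their own in this loop; they have to be visited explicitly with MoveToNextAttribute. For new code, Microsoft's documentation recommends creating readers with XmlReader.Create rather than instantiating XmlTextReader directly; the Read loop itself stays the same.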

    At each node, check the node type. If the node is an element, store its name in an ElementInfo, then iterate through its attributes and add them to the ElementInfo’s attribute list. The MoveToNextAttribute method moves sequentially through each attribute of the element. The element’s text arrives afterwards as a separate Text node and is stored as the ElementInfo’s value. As you will notice, depending on the current node type, we call HandleStartElement or HandleEndElement. XMLParser is an independent class whose sole responsibility is to parse the XML. It doesn’t care whether it is a Products XML or an Employee XML.

    //start reading the XML until it reaches end of file
    while (xmlTextReader.Read())
    {
        //check each node by node type
        switch (xmlTextReader.NodeType)
        {
            case XmlNodeType.Element: 
                //if we come across an element, store that element info
                // and move forward
                elementInfo = new ElementInfo(
                                  xmlTextReader.LocalName, 
                                  xmlTextReader.AttributeCount);
                //if element has any attributes, traverse through them
                while (xmlTextReader.MoveToNextAttribute())
                {
                    elementInfo.AddAttribute(xmlTextReader.LocalName, 
                                             xmlTextReader.Value);
                }
                //Moves to the element that contains the current attribute node.
                xmlTextReader.MoveToElement();
                //the attributes are known now, so report the element
                //(the full listing defers this call until the text is read)
                HandleStartElement(elementInfo);
                break;
            case XmlNodeType.Text:
                elementInfo.val = xmlTextReader.Value;
                break;
            case XmlNodeType.EndElement:            
                HandleEndElement(elementInfo);            
                break;
        }
    }

    It is the responsibility of ProductXMLParser to implement HandleStartElement and HandleEndElement. Again, it only cares about the product XML and not the Product object. ProductXMLParser doesn’t know which object or collection will be populated with this information. So, based on the element name, it raises element-specific events as shown below:

    public delegate void ElementHandler(object sender, ElementInfo e);
    public event ElementHandler StartProduct;
    protected override void HandleStartElement(ElementInfo elementInfo)
    {
        switch (elementInfo.name)
        {
            case "Product":
                StartProduct(this, elementInfo);
                break;
        }
    }
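
    One thing to keep in mind with this pattern: invoking an event that has no subscribers throws a NullReferenceException. The sketch below shows a common way to guard against that. SafeRaiseDemo and the cut-down ElementInfo are stand-ins written for this example, not the article's classes.

    ```csharp
    using System;

    // Minimal stand-in for the article's ElementInfo.
    public class ElementInfo
    {
        public string name;
    }

    public delegate void ElementHandler(object sender, ElementInfo e);

    public class SafeRaiseDemo
    {
        public event ElementHandler StartProduct;

        public void HandleStartElement(ElementInfo elementInfo)
        {
            // Copy to a local so the null check and the call see the same delegate.
            ElementHandler handler = StartProduct;
            if (handler != null)
                handler(this, elementInfo);
        }

        public static void Main()
        {
            SafeRaiseDemo demo = new SafeRaiseDemo();
            ElementInfo info = new ElementInfo();
            info.name = "Product";

            demo.HandleStartElement(info); // no subscribers yet: nothing happens, no exception

            demo.StartProduct += delegate(object sender, ElementInfo e)
            {
                Console.WriteLine("Start of " + e.name);
            };
            demo.HandleStartElement(info); // prints "Start of Product"
        }
    }
    ```

    In the article's solution every event that is raised also has a subscriber, so the unguarded call works; the guard matters once a parser is reused by a client that subscribes to only some of the events.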

    ProductHelper subscribes to the element-specific events raised by ProductXMLParser and populates the Product object accordingly.

    ProductXMLParser parser = new ProductXMLParser(productsXML);
    parser.StartProduct += new ElementHandler(parser_StartProduct);
    private void parser_StartProduct(object sender, ElementInfo e)
    {
        product = new Product();
        if (e.attribs.Count > 0 && e.attribs.ContainsKey("ID"))
            product.ProductID = Convert.ToInt32(e.attribs["ID"].val);
    }

    Extensibility:
    In the future, if we need to parse a new kind of response XML, we would create classes such as EmployeeXMLParser and EmployeeHelper, similar to ProductXMLParser and ProductHelper.
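
    As a rough illustration of that extension point, here is what an employee parser could look like. Everything employee-related is hypothetical: the Employees XML layout, the EmployeeName element and the StartEmployeeName event are invented for this sketch. So that the example runs on its own, it also carries cut-down stand-ins for ElementInfo and XMLParser; the stand-in parser is deliberately simpler than the article's and reports an element only once its text has been read.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Xml;

    // Cut-down stand-ins for the article's classes, so the sketch compiles alone.
    public class ElementInfo
    {
        public string name;
        public string val;
        public ElementInfo(string name) { this.name = name; }
    }

    public delegate void ElementHandler(object sender, ElementInfo e);

    public abstract class XMLParser
    {
        private readonly string xml;
        protected XMLParser(string xmlString) { xml = xmlString; }

        // Simplified loop: an element is reported as soon as its text node is read.
        protected void Parse()
        {
            using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
            {
                ElementInfo current = null;
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element)
                    {
                        current = new ElementInfo(reader.LocalName);
                    }
                    else if (reader.NodeType == XmlNodeType.Text && current != null)
                    {
                        current.val = reader.Value;
                        HandleStartElement(current);
                    }
                }
            }
        }

        protected abstract void HandleStartElement(ElementInfo elementInfo);
    }

    // Hypothetical employee parser: knows element names, nothing about Employee objects.
    public class EmployeeXMLParser : XMLParser
    {
        public event ElementHandler StartEmployeeName;

        public EmployeeXMLParser(string xmlString) : base(xmlString) { }

        public void ParseXML() { Parse(); }

        protected override void HandleStartElement(ElementInfo elementInfo)
        {
            if (elementInfo.name == "EmployeeName" && StartEmployeeName != null)
                StartEmployeeName(this, elementInfo);
        }
    }

    public static class EmployeeDemo
    {
        public static void Main()
        {
            // Hypothetical response; a real EmployeeHelper would read it from the service.
            string xml = "<Employees><Employee><EmployeeName>Ann</EmployeeName></Employee></Employees>";
            List<string> names = new List<string>();

            EmployeeXMLParser parser = new EmployeeXMLParser(xml);
            parser.StartEmployeeName += delegate(object sender, ElementInfo e) { names.Add(e.val); };
            parser.ParseXML();

            Console.WriteLine(string.Join(", ", names)); // prints "Ann"
        }
    }
    ```

    The point of the design is that the base parser is untouched: only the element names and the events change from one kind of XML to the next.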

    Source Code:

    Program.cs

    using System;
    using System.Collections.Generic;

    public class Program
    {
        public static void Main(string[] args)
        {
            ProductHelper productHelper = new ProductHelper();
            List<Product> products = productHelper.RetrieveProducts();
            foreach (Product product in products)
            {            
                Console.WriteLine("/**************************************/");
                Console.WriteLine("Product ID:      " + product.ProductID);
                Console.WriteLine("Product Name:    " + product.ProductName);
                Console.WriteLine("Product Number:  " + product.ProductNumber);
                Console.WriteLine("Product Price:   " + product.ListPrice);
            }
            Console.ReadLine();
        }
    }

    ProductHelper.cs

    using System;
    using System.Collections.Generic;
    using System.IO;

    public class ProductHelper
    {
        ProductXMLParser parser;
        private List<Product> products;
        Product product;
        public ProductHelper()
        {
            products = new List<Product>();
        }
    
        public List<Product> RetrieveProducts()
        {
            string productsXML = ReadXML(@"C:\Temp\Products.xml");
            parser = new ProductXMLParser(productsXML);
            SubscribeEvents();
            parser.ParseXML();
            return products;
        }
    
        public void SubscribeEvents()
        {
            parser.StartProduct += new ElementHandler(parser_StartProduct);
            parser.EndProduct += new ElementHandler(parser_EndProduct);
    
            parser.StartProductName += new 
                                   ElementHandler(parser_StartProductName);
            parser.StartProductNumber += new 
                                   ElementHandler(parser_StartProductNumber);
            parser.StartListPrice += new ElementHandler(parser_StartListPrice);
        }
        private void parser_StartProduct(object sender, ElementInfo e)
        {
            product = new Product();
            if (e.attribs.Count > 0 && e.attribs.ContainsKey("ID"))
                product.ProductID = Convert.ToInt32(e.attribs["ID"].val);
        }
        private void parser_StartProductName(object sender, ElementInfo e)
        {
            product.ProductName = e.val;
        }
        private void parser_StartProductNumber(object sender, ElementInfo e)
        {
            product.ProductNumber = e.val;
        }
        private void parser_StartListPrice(object sender, ElementInfo e)
        {
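            //parse with the invariant culture so "45.50" reads the same on every machine
            product.ListPrice = Convert.ToDouble(e.val, 
                            System.Globalization.CultureInfo.InvariantCulture);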
        }
        private void parser_EndProduct(object sender, ElementInfo e)
        {
            products.Add(product);
        }
    
        public string ReadXML(string filePath)
        {
            using (StreamReader sr = new StreamReader(filePath))
            {
                return sr.ReadToEnd();
            }
        }
    }

    ProductXMLParser.cs

    public class ProductXMLParser : XMLParser
    {
        public event ElementHandler StartProduct;
        public event ElementHandler EndProduct;
        public event ElementHandler StartProductID;
        public event ElementHandler EndProductID;
        public event ElementHandler StartProductName;
        public event ElementHandler EndProductName;
        public event ElementHandler StartProductNumber;
        public event ElementHandler StartListPrice;
    
        public ProductXMLParser(string xmlString) : base(xmlString)
        {      }
        public void ParseXML()
        {
            base.Parse();
        }
        protected override void HandleStartElement(ElementInfo elementInfo)
        {
            switch (elementInfo.name)
            {
                case "Product":
                    StartProduct(this, elementInfo);
                    break;
                case "ProductName":
                    StartProductName(this, elementInfo);
                    break;
                case "ProductNumber":
                    StartProductNumber(this, elementInfo);
                    break;
                case "ListPrice":
                    StartListPrice(this, elementInfo);
                    break;
            }
        }
    
        protected override void HandleEndElement(ElementInfo elementInfo)
        {
            switch (elementInfo.name)
            {
                case "Product":
                    EndProduct(this, elementInfo);
                    break;
            }
        }
    }

    XMLParser.cs

    using System.IO;
    using System.Xml;

    public delegate void ElementHandler(object sender, ElementInfo e);
    public class XMLParser
    {
        XmlTextReader xmlTextReader;
        public XMLParser(string xmlString)
        {
            xmlTextReader = new XmlTextReader(new StringReader(xmlString));
        }
        protected void Parse()
        {
            ElementInfo elementInfo = null;
            bool startHandled = true;
            //start reading the XML until it reaches end of file
            while (xmlTextReader.Read())
            {
                //check each node by node type
                switch (xmlTextReader.NodeType)
                {
                    case XmlNodeType.Element:
                        //if we come across an element, store that element info
                        // and move forward
                        if (startHandled == false)
                        {
                            HandleStartElement(elementInfo);
                            startHandled = true;
                        }
                        elementInfo = new ElementInfo(xmlTextReader.LocalName, 
                                              xmlTextReader.AttributeCount);
                        startHandled = false;
                        //if element has any attributes, traverse through them
                        while (xmlTextReader.MoveToNextAttribute()){                    
                            elementInfo.AddAttribute(
                                xmlTextReader.LocalName, xmlTextReader.Value);
                        }
                        //Moves to the element that contains the current 
                        //attribute node.
                        xmlTextReader.MoveToElement();
                        //an empty element (e.g. <Product />) has no content and 
                        //no end tag, so report its start and end right away
                        if (xmlTextReader.IsEmptyElement)
                        {
                            HandleStartElement(elementInfo);
                            startHandled = true;
                            HandleEndElement(elementInfo);
                        }
                        break;
                    case XmlNodeType.Text:
                        //text that follows a closing tag has no current element
                        if (elementInfo != null)
                            elementInfo.val = xmlTextReader.Value;
                        break;
                    case XmlNodeType.EndElement:
                        if (startHandled == false)
                        {
                            HandleStartElement(elementInfo);
                            startHandled = true;
                        }
                        elementInfo=new ElementInfo(xmlTextReader.LocalName,0);
                        HandleEndElement(elementInfo);
                        elementInfo = null;
                        startHandled = true;
                        break;
                }
            }
            xmlTextReader.Close();
        }
        protected virtual void HandleStartElement(ElementInfo elementInfo){ }
        protected virtual void HandleEndElement(ElementInfo elementInfo){ }
    }

    Products.XML

    <Products>
      <Product ID="125">
          <ProductName>test product 1</ProductName>
          <ProductNumber>number 1</ProductNumber>
          <ListPrice>45</ListPrice>
      </Product>
      <Product ID="126">
          <ProductName>test product 2</ProductName>
          <ProductNumber>number 2</ProductNumber>
          <ListPrice>450</ListPrice>
      </Product>
      <Product ID="127">
          <ProductName>test product 3</ProductName>
          <ProductNumber>number 2</ProductNumber>
          <ListPrice>500</ListPrice>
      </Product>
    </Products>

    Output:
    [Screenshot: console output listing the ID, name, number and price of each of the three products]


    Written by

    Software architect with over 10 years of proven experience in designing & developing n-tier and web based software applications, for Finance, Telecommunication, Manufacturing, Internet and other Commercial industries. He believes that success depends on one's ability to integrate multiple technologies to solve a simple as well as complicated problem.
