Book: Professional Visual Basic 2012 and .NET 4.5

Chapter 8

Using XML with Visual Basic

What's in this chapter?

The rationale behind XML

How to serialize objects to XML (and vice versa)

How to read and write XML

How to use LINQ to XML to read and edit XML

How to use XML literals within your code

The code downloads for this chapter are on the Download Code tab, in the chapter 8 download. The code for this chapter is a single solution with multiple projects. Each project represents a separate example.

This chapter describes how you can generate and manipulate Extensible Markup Language (XML) using Visual Basic 2012. The .NET Framework exposes many XML-specific namespaces that contain over 100 different classes. In addition, dozens of other classes support and implement XML-related technologies, such as those provided in ADO.NET, SQL Server, and BizTalk. Consequently, this chapter focuses on the general concepts and the most important classes.

The chapter is organized from older technologies and lower-level XML manipulation to the latest and greatest functionality. This is done because it is important you understand how XML is actually structured and manipulated in order for you to gain the most from it.

Visual Basic relies on the classes exposed in the following XML-related namespaces to transform, manipulate, and stream XML documents:

System.Xml provides core support for a variety of XML standards, including DTD (Document Type Definition), namespaces, DOM (Document Object Model), XDR (XML Data Reduced — an old version of the XML schema standard), XPath, XSLT (XSL Transformations), and SOAP (formerly Simple Object Access Protocol; now the acronym doesn't stand for anything).

System.Xml.Serialization provides the objects used to transform objects to and from XML documents or streams using serialization.

System.Xml.Schema provides a set of objects that enable schemas to be loaded, created, and streamed. This support is achieved using a suite of objects that support in-memory manipulation of the entities that compose an XML schema.

System.Xml.XPath provides a parser and evaluation engine for the XML Path language (XPath).

System.Xml.Xsl provides the objects necessary when working with Extensible Stylesheet Language (XSL) and XSL Transformations (XSLT).

System.Xml.Linq provides the support for querying XML using LINQ (also covered in chapter 9).

This chapter makes sense of this range of technologies by introducing some basic XML concepts and demonstrating how Visual Basic, in conjunction with the .NET Framework, can make use of XML.

At the end of this chapter, you will be able to generate, manipulate, and transform XML using Visual Basic.

A schema can be associated with an XML document and describes the data it contains (name, type, scale, precision, length, and so on). Either the actual schema or a reference to where the schema is located can be contained in the XML document. In either case, an XML schema is a standard representation that can be used by all applications that consume XML. This means that applications can use the supplied schema to validate the contents of an XML document generated by the Serialize method of the XmlSerializer object.

The code snippet that demonstrated the Serialize method displayed the generated XML to the Console.Out stream. Clearly, you do not expect an application to use Console.Out when it would like to access a FilmOrder object in XML form. The point was to show how serialization can be performed in just two lines of code, one call to a constructor and one call to a method.

The Serialize method is overloaded so that its first parameter can be a Stream, a TextWriter, or an XmlWriter; to serialize to a file, pass a FileStream or a StreamWriter opened on that file. When serializing to a Stream, TextWriter, or XmlWriter, adding a third parameter to the Serialize method is permissible. This third parameter is of type XmlSerializerNamespaces and is used to specify a list of namespaces that qualify the names in the generated XML document.
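The overloads described above can be sketched as follows. This is a minimal illustration, not the chapter's sample code: the FilmOrder class is a stand-in with the three public fields used throughout the chapter, and the output file name FilmOrder.xml is an assumption. Adding a single empty prefix/namespace pair to XmlSerializerNamespaces is a common way to suppress the default xsi and xsd declarations.

```vb
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Stand-in for the chapter's FilmOrder class (assumed shape).
Public Class FilmOrder
    Public Name As String
    Public FilmId As Integer
    Public Quantity As Integer
End Class

Module SerializeOverloadsSketch
    Sub Main()
        Dim order As New FilmOrder() With {
            .Name = "Grease", .FilmId = 101, .Quantity = 10}
        Dim serializer As New XmlSerializer(GetType(FilmOrder))

        ' An empty prefix/namespace pair replaces the default
        ' xmlns:xsi and xmlns:xsd declarations on the root element.
        Dim namespaces As New XmlSerializerNamespaces()
        namespaces.Add(String.Empty, String.Empty)

        ' Serialize to a file through a TextWriter (here a StreamWriter).
        ' The file name is hypothetical.
        Using writer As New StreamWriter("FilmOrder.xml")
            serializer.Serialize(writer, order, namespaces)
        End Using
    End Sub
End Module
```

Wrapping the StreamWriter in a Using block ensures the file is flushed and closed even if serialization throws.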

Deserializing

Since serialization produces an XML document from an object, it stands to reason that deserialization would do the opposite. This is handled by the Deserialize method of XmlSerializer. This method is overloaded and can deserialize XML presented as a Stream, a TextReader, or an XmlReader. The various Deserialize methods return an Object, so you need to cast the result to the correct data type.

The example that demonstrates how to deserialize an object can be found in the FilmOrderList project. This is just an updated version of the previous example. The first step is to look at the new FilmOrderList class. This class contains an array of film orders (actually an array of FilmOrder objects). FilmOrderList is defined as follows (code file: FilmOrderList.vb):

Public Class FilmOrderList
    Public FilmOrders() As FilmOrder
    Public Sub New()
    End Sub
    Public Sub New(ByVal multiFilmOrders() As FilmOrder)
        Me.FilmOrders = multiFilmOrders
    End Sub
End Class

The FilmOrderList class contains a fairly complicated object, an array of FilmOrder objects. The underlying serialization and deserialization of this class is more complicated than that of a single instance of a class that contains several simple types, but the programming effort involved on your part is just as simple as before. This is one of the great ways in which the .NET Framework makes it easy for you to work with XML data, no matter how it is formed.
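To see that the programming effort really is unchanged, the following sketch serializes a list of orders to the console. It is an illustration under stated assumptions: FilmOrder and FilmOrderList are reduced stand-ins for the chapter's classes, and the sample data is arbitrary.

```vb
Imports System
Imports System.Xml.Serialization

' Reduced stand-ins for the chapter's classes (assumed shapes).
Public Class FilmOrder
    Public Name As String
    Public FilmId As Integer
    Public Quantity As Integer
End Class

Public Class FilmOrderList
    Public FilmOrders() As FilmOrder
End Class

Module SerializeListSketch
    Sub Main()
        Dim orders As New FilmOrderList()
        orders.FilmOrders = New FilmOrder() {
            New FilmOrder() With {.Name = "Grease", .FilmId = 101, .Quantity = 10},
            New FilmOrder() With {.Name = "Star Wars", .FilmId = 103, .Quantity = 10}}

        ' The same two steps as for a single object:
        ' construct the serializer, then call Serialize.
        Dim serializer As New XmlSerializer(GetType(FilmOrderList))
        serializer.Serialize(Console.Out, orders)
    End Sub
End Module
```

The array is written as a FilmOrders wrapper element containing one FilmOrder element per item, which is the shape the deserialization example expects.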

To work through an example of the deserialization process, first create a sample order stored as an XML file called Filmorama.xml:

<?xml version="1.0" encoding="utf-8" ?>
<FilmOrderList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <FilmOrders>
    <FilmOrder>
      <Name>Grease</Name>
      <FilmId>101</FilmId>
      <Quantity>10</Quantity>
    </FilmOrder>
    <FilmOrder>
      <Name>Lawrence of Arabia</Name>
      <FilmId>102</FilmId>
      <Quantity>10</Quantity>
    </FilmOrder>
    <FilmOrder>
      <Name>Star Wars</Name>
      <FilmId>103</FilmId>
      <Quantity>10</Quantity>
    </FilmOrder>
  </FilmOrders>
</FilmOrderList>

Note

In order for this to run, you should either have the .xml file in the location of the executable or load the file using the full path of the file within the code example. To have the XML in the same directory as the executable, add the XML file to the project, and set the Copy to Output Directory to “Copy if newer.”

Once the XML file is in place, the next step is to change your console application so it will deserialize the contents of this file. First, ensure that your console application has made the proper namespace references:

Imports System.Xml
Imports System.Xml.Serialization
Imports System.IO

The code that actually performs the deserialization is found in the Sub Main() method (code file: Main.vb):

' Open file Filmorama.xml
Dim dehydrated As FileStream = _
   New FileStream("Filmorama.xml", FileMode.Open)
' Create an XmlSerializer instance to handle deserializing
' FilmOrderList
Dim serialize As XmlSerializer = _
   New XmlSerializer(GetType(FilmOrderList))
' Deserialize the object. Deserialize returns Object,
' so cast the result to FilmOrderList.
Dim myFilmOrder As FilmOrderList = _
   CType(serialize.Deserialize(dehydrated), FilmOrderList)
' Close the file once the object has been read.
dehydrated.Close()

This code demonstrates the deserialization of the Filmorama.xml file into a FilmOrderList instance. This is accomplished, mainly, by the call to the Deserialize method of the XmlSerializer class.

Once deserialized, the array of film orders can be displayed. The following code shows how this is accomplished (code file: Main.vb):

Dim SingleFilmOrder As FilmOrder
For Each SingleFilmOrder In myFilmOrder.FilmOrders
   Console.Out.WriteLine("{0}, {1}, {2}", _
      SingleFilmOrder.Name, _
      SingleFilmOrder.FilmId, _
      SingleFilmOrder.Quantity)
Next
Console.ReadLine()

Running the example will result in the following output:

Grease, 101, 10
Lawrence of Arabia, 102, 10
Star Wars, 103, 10

XmlSerializer also implements a CanDeserialize method. The prototype for this method is as follows:

Public Overridable Function CanDeserialize(ByVal xmlReader As XmlReader) _
   As Boolean

If CanDeserialize returns True, then the XML document specified by the xmlReader parameter can be deserialized. If the return value of this method is False, then the specified XML document cannot be deserialized. Using this method is usually preferable to attempting to deserialize and trapping any exceptions that may occur.
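A minimal sketch of this check-then-deserialize pattern follows. The classes are reduced stand-ins for the chapter's FilmOrder and FilmOrderList, and the XML is supplied inline through a StringReader so the example is self-contained; reading from Filmorama.xml would work the same way. As far as I know, CanDeserialize only inspects the element on which the reader is positioned, so a True result is best treated as a quick check rather than full validation.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Serialization

' Reduced stand-ins for the chapter's classes (assumed shapes).
Public Class FilmOrder
    Public Name As String
    Public FilmId As Integer
    Public Quantity As Integer
End Class

Public Class FilmOrderList
    Public FilmOrders() As FilmOrder
End Class

Module CanDeserializeSketch
    Sub Main()
        ' Inline XML keeps the sketch self-contained.
        Dim xml As String =
            "<FilmOrderList><FilmOrders><FilmOrder>" &
            "<Name>Grease</Name><FilmId>101</FilmId><Quantity>10</Quantity>" &
            "</FilmOrder></FilmOrders></FilmOrderList>"

        Dim serializer As New XmlSerializer(GetType(FilmOrderList))
        Using reader As XmlReader = XmlReader.Create(New StringReader(xml))
            If serializer.CanDeserialize(reader) Then
                ' Same reader, same position: now deserialize and cast.
                Dim orders As FilmOrderList =
                    CType(serializer.Deserialize(reader), FilmOrderList)
                Console.WriteLine("Orders read: {0}", orders.FilmOrders.Length)
            Else
                Console.WriteLine("The XML does not start with a FilmOrderList element.")
            End If
        End Using
    End Sub
End Module
```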

The FromTypes method of XmlSerializer facilitates the creation of arrays that contain XmlSerializer objects. This array of XmlSerializer objects can be used in turn to process arrays of the type to be serialized. The prototype for FromTypes is shown here:

Public Shared Function FromTypes(ByVal types() As Type) As XmlSerializer()
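As a rough illustration of FromTypes, the sketch below builds serializers for two types in a single call. The classes are reduced stand-ins for the chapter's classes, and the assumption that each returned serializer sits at the same index as its type in the input array is mine, based on how the method is normally used.

```vb
Imports System
Imports System.Xml.Serialization

' Reduced stand-ins for the chapter's classes (assumed shapes).
Public Class FilmOrder
    Public Name As String
    Public FilmId As Integer
    Public Quantity As Integer
End Class

Public Class FilmOrderList
    Public FilmOrders() As FilmOrder
End Class

Module FromTypesSketch
    Sub Main()
        Dim types() As Type = {GetType(FilmOrder), GetType(FilmOrderList)}

        ' One call creates a serializer for every type in the array.
        Dim serializers() As XmlSerializer = XmlSerializer.FromTypes(types)

        ' Assumed: serializers(0) corresponds to types(0), that is, FilmOrder.
        Dim order As New FilmOrder() With {
            .Name = "Grease", .FilmId = 101, .Quantity = 10}
        serializers(0).Serialize(Console.Out, order)
    End Sub
End Module
```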

Source Code Style Attributes

Thus far, you have seen attributes applied to a specific portion of an XML document. Visual Basic, as with most other languages, has its own flavor of attributes. These attributes are annotations in the source code that specify information, or metadata, that can be used by other applications without the need for the original source code. This chapter calls such attributes Source Code Style attributes.

In the context of the System.Xml.Serialization namespace, Source Code Style attributes can be used to change the names of the elements generated for the data members of a class or to generate XML attributes instead of XML elements for the data members of a class. To demonstrate this, you will update the FilmOrder class using these attributes to change the outputted XML. This updated version is part of the new example found in the FilmOrderAttributes project.

In the previous section, you saw that serialization used the name of the data member as the name of the element that is automatically generated. To rename this generated element, a Source Code Style attribute will be used. This Source Code Style attribute specifies that when FilmOrder is serialized, the Name data member is represented as an XML element named <Title>. The actual Source Code Style attribute that specifies this is as follows:

<XmlElementAttribute("Title")>   Public Name As String

The updated FilmOrder also contains other Source Code Style attributes (code file: FilmOrder.vb):

Imports System.Xml.Serialization
Public Class FilmOrder
  <XmlElementAttribute("Title")> Public Name As String
  <XmlAttributeAttribute("ID")> Public FilmId As Integer
  <XmlAttributeAttribute("Qty")> Public Quantity As Integer
  Public Sub New()
  End Sub
  Public Sub New(ByVal name As String, _
                 ByVal filmId As Integer, _
                 ByVal quantity As Integer)
      Me.Name = name
      Me.FilmId = filmId
      Me.Quantity = quantity
  End Sub
End Class

The additional attributes that were added to the example class are:

<XmlAttributeAttribute("ID")> specifies that FilmId is to be serialized as an XML attribute named ID.
<XmlAttributeAttribute("Qty")> specifies that Quantity is to be serialized as an XML attribute named Qty.

Note that you needed to include the System.Xml.Serialization namespace to bring in the Source Code Style attributes used.

The following Sub Main() method for this project is no different from the ones previously shown (code file: Main.vb):

Dim serialize As XmlSerializer = _
    New XmlSerializer(GetType(FilmOrder))
Dim MyMovieOrder As FilmOrder = _
    New FilmOrder("Grease", 101, 10)
serialize.Serialize(Console.Out, MyMovieOrder)
Console.ReadLine()

The console output generated by this code reflects the Source Code Style attributes associated with the class:

<?xml version="1.0" encoding="IBM437"?>
<FilmOrder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:xsd="http://www.w3.org/2001/XMLSchema" ID="101" Qty="10">
  <Title>Grease</Title>
</FilmOrder>

Compare this to the earlier version that does not include the attributes.

The example demonstrates only the Source Code Style attributes XmlAttributeAttribute and XmlElementAttribute. The following table shows some additional attributes available.

Additional Source Code Attributes Available

Attribute	Description
XmlArrayAttribute	Allows the name of an array element to be specified.
XmlArrayItemAttribute	Allows the name of an array's child elements to be specified.
XmlRoot	Denotes the root element.
XmlType	Used to specify the data type for an element. This is used in XSD files, which are discussed later.
XmlIgnoreAttribute	Instructs the serializer to not serialize the current element or attribute.
XmlEnumAttribute	Controls how enumeration values are serialized.
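A few of the attributes in the table can be combined as in the following sketch. It is illustrative only: the element names Orders, Films, and Film and the InternalNote field are hypothetical choices, not part of the chapter's projects.

```vb
Imports System
Imports System.Xml.Serialization

Public Class FilmOrder
    <XmlElement("Title")> Public Name As String
    <XmlAttribute("ID")> Public FilmId As Integer
    <XmlAttribute("Qty")> Public Quantity As Integer
End Class

' XmlRoot renames the document's root element (hypothetical name).
<XmlRoot("Orders")>
Public Class FilmOrderList
    ' The wrapper element becomes <Films>; each array item becomes <Film>.
    <XmlArray("Films"), XmlArrayItem("Film")>
    Public FilmOrders() As FilmOrder

    ' Hypothetical field: XmlIgnore keeps it out of the XML entirely.
    <XmlIgnore()>
    Public InternalNote As String
End Class

Module AttributeSketch
    Sub Main()
        Dim orders As New FilmOrderList()
        orders.InternalNote = "not serialized"
        orders.FilmOrders = New FilmOrder() {
            New FilmOrder() With {.Name = "Grease", .FilmId = 101, .Quantity = 10}}

        Dim serializer As New XmlSerializer(GetType(FilmOrderList))
        serializer.Serialize(Console.Out, orders)
    End Sub
End Module
```

The same attributes govern deserialization, so a document written this way can be read back with the same class definitions.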

The following table lists, in document order, the nodes that make up the sample XML document shown after it.

Element	Node
XmlDeclaration	<?xml version="1.0" encoding="utf-8"?>
XmlAttribute	Version
XmlAttribute	Encoding
XmlElement	FilmOrder
XmlAttribute	FilmId
XmlElement	Name
XmlText	Grease
XmlEndElement	Name
XmlElement	Quantity
XmlText	10
XmlEndElement	Quantity
XmlWhitespace	Nothing
XmlEndElement	FilmOrder

<?xml version="1.0" encoding="utf-8"?>
<FilmOrder filmId="101">
  <Name>Grease</Name>
  <Quantity>10</Quantity>
</FilmOrder>

The following classes that access a stream of XML (read XML) and generate a stream of XML (write XML) are contained in the System.Xml namespace:

XmlWriter—This abstract class specifies a noncached, forward-only stream that writes an XML document (data and schema).
XmlReader—This abstract class specifies a noncached, forward-only stream that reads an XML document (data and schema).

The diagram of the classes associated with the XML stream-style parser refers to one other class, XslTransform. This class is found in the System.Xml.Xsl namespace and is not an XML stream-style parser. Rather, it is used in conjunction with XmlWriter and XmlReader. This class is covered in detail later.

The System.Xml namespace exposes a plethora of additional XML manipulation classes in addition to those shown in the architecture diagram. The classes shown in the diagram include the following:

XmlResolver—This abstract class resolves an external XML resource using a Uniform Resource Identifier (URI). XmlUrlResolver is an implementation of an XmlResolver.
XmlNameTable—This abstract class provides a fast means by which an XML parser can access element or attribute names.

Writing an XML Stream

An XML document can be created programmatically in .NET. One way to perform this task is by writing the individual components of an XML document (schema, attributes, elements, and so on) to an XML stream. Using a unidirectional write-stream means that each element and its attributes must be written in order—the idea is that data is always written at the end of the stream. To accomplish this, you use a writable XML stream class (a class derived from XmlWriter). Such a class ensures that the XML document you generate correctly implements the W3C Extensible Markup Language (XML) 1.0 specification and the namespaces in the XML specification.

Why is this necessary when you have XML serialization? You need to be very careful here to separate interface from implementation. XML serialization works for a specific class, such as the FilmOrder class used in the earlier samples. This class is a proprietary implementation and not the format in which data is exchanged. For this one specific case, the XML document generated when FilmOrder is serialized just so happens to be the XML format used when placing an order for some movies. You can use Source Code Style attributes to help it conform to a standard XML representation of a film order summary, but the eventual structure is tied to that class.

In a different application, if the software used to manage an entire movie distribution business wants to generate movie orders, then it must generate a document of the appropriate form. The movie distribution management software achieves this using the XmlWriter object.

Before reviewing the subtleties of XmlWriter, note that this class exposes over 40 methods and properties. The example in this section provides an overview that touches on a subset of these methods and properties. This subset enables the generation of an XML document that corresponds to a movie order.

The example, located in the FilmOrdersWriter project, builds the module that generates the XML document corresponding to a movie order. It uses an instance of XmlWriter, called FilmOrdersWriter, which writes to a file on disk. This means that the XML document generated is streamed to this file directly. Because the FilmOrdersWriter variable represents a file, you have to take a few actions against the file. For instance, you have to ensure the file is:

Created—The instance of XmlWriter, FilmOrdersWriter, is created by using the Create method, as well as by assigning all the properties of this object by using the XmlWriterSettings object.
Opened—The file the XML is streamed to, FilmOrdersProgrammatic.xml, is opened by passing the filename to the Create method of XmlWriter.
Generated—The process of generating the XML document is described in detail at the end of this section.
Closed—The file (the XML stream) is closed using the Close method of XmlWriter or by simply making use of the Using keyword, which ensures that the object is closed at the end of the Using statement.

Before you create the XmlWriter object, you first need to customize how the object operates by using the XmlWriterSettings object. This object, introduced in .NET 2.0, enables you to configure the behavior of the XmlWriter object before you instantiate it, as seen here:

Dim myXmlSettings As New XmlWriterSettings()
myXmlSettings.Indent = True
myXmlSettings.NewLineOnAttributes = True

You can specify a few settings for the XmlWriterSettings object that define how XML creation will be handled by the XmlWriter object.

Once the XmlWriterSettings object has been instantiated and assigned the values you deem necessary, the next steps are to invoke the XmlWriter object and make the association between the XmlWriterSettings object and the XmlWriter object.

The basic infrastructure for managing the file (the XML text stream) and applying the settings class is either

Dim FilmOrdersWriter As XmlWriter = _
   XmlWriter.Create("..\FilmOrdersProgrammatic.xml", myXmlSettings)
FilmOrdersWriter.Close()

or the following, if you are utilizing the Using keyword, which is the recommended approach:

Using FilmOrdersWriter As XmlWriter = _
   XmlWriter.Create("..\FilmOrdersProgrammatic.xml", myXmlSettings)
End Using

With the preliminaries completed, file created, and formatting configured, the process of writing the actual attributes and elements of your XML document can begin. The sequence of steps used to generate your XML document is as follows:

1. Write an XML comment using the WriteComment method. This comment describes where the concept for this XML document originated and generates the following XML:

 <!-- Same as generated by serializing, FilmOrder -->

2. Begin writing the XML element, <FilmOrder>, by calling the WriteStartElement method. You can only begin writing this element, because its attributes and child elements must be written before the element can be ended with a corresponding </FilmOrder>. The XML generated by the WriteStartElement method is as follows:

 <FilmOrder>

3. Write the attributes associated with <FilmOrder> by calling the WriteAttributeString method twice, specifying a different attribute each time. The two calls extend the <FilmOrder> start tag that is currently being written to the following:

 <FilmOrder FilmId="101" Quantity="10">

4. Using the WriteElementString method, write the child XML element <Title>. The XML generated by calling this method is as follows:

 <Title>Grease</Title>

5. Complete writing the <FilmOrder> parent XML element by calling the WriteEndElement method. The XML generated by calling this method is as follows:

 </FilmOrder>

The complete code for accomplishing this is shown here (code file: Main.vb):

Imports System.Xml

Module Main
    Sub Main()
        Dim myXmlSettings As New XmlWriterSettings
        myXmlSettings.Indent = True
        myXmlSettings.NewLineOnAttributes = True
        Using FilmOrdersWriter As XmlWriter =
            XmlWriter.Create("FilmOrdersProgrammatic.xml", myXmlSettings)
            FilmOrdersWriter.WriteComment(" Same as generated " &
               "by serializing, FilmOrder ")
            FilmOrdersWriter.WriteStartElement("FilmOrder")
            FilmOrdersWriter.WriteAttributeString("FilmId", "101")
            FilmOrdersWriter.WriteAttributeString("Quantity", "10")
            FilmOrdersWriter.WriteElementString("Title", "Grease")
            FilmOrdersWriter.WriteEndElement() ' End FilmOrder
        End Using
    End Sub
End Module

Once this is run, you will find the XML file FilmOrdersProgrammatic.xml in the folder from which the application was executed, which is most likely the bin directory. The content of this file is as follows:

<?xml version="1.0" encoding="utf-8"?>
<!-- Same as generated by serializing, FilmOrder -->
<FilmOrder
  FilmId="101"
  Quantity="10">
  <Title>Grease</Title>
</FilmOrder>

At a closer look, you should see that the XML document generated by this code is virtually identical to the one produced by the serialization example. Also, notice that in the previous XML document, the <Title> element is indented two characters and that each attribute is on a different line in the document. This formatting was achieved using the XmlWriterSettings class.

The sample application covers only a small portion of the methods and properties exposed by the XML stream-writing class, XmlWriter. Other methods implemented by this class manipulate the underlying file, such as the Flush method; and some methods allow XML text to be written directly to the stream, such as the WriteRaw method.

The XmlWriter class also exposes a variety of methods that write a specific type of XML data to the stream. These methods include WriteBinHex, WriteCData, WriteString, and WriteWhitespace.
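The difference between two of these methods can be sketched as follows. This is an illustration rather than part of the FilmOrdersWriter project: the Notes element and the sample text are hypothetical, and the output goes to a StringBuilder instead of a file so the sketch is self-contained.

```vb
Imports System
Imports System.Text
Imports System.Xml

Module WriteMethodsSketch
    Sub Main()
        Dim settings As New XmlWriterSettings()
        settings.Indent = True
        settings.OmitXmlDeclaration = True

        Dim output As New StringBuilder()
        Using writer As XmlWriter = XmlWriter.Create(output, settings)
            writer.WriteStartElement("FilmOrder")

            ' WriteString escapes markup characters such as & and <.
            writer.WriteStartElement("Title")
            writer.WriteString("Fast & Furious")
            writer.WriteEndElement()

            ' WriteCData wraps the text in a CDATA section instead of escaping it.
            writer.WriteStartElement("Notes")
            writer.WriteCData("Ship <before> Friday")
            writer.WriteEndElement()

            writer.WriteEndElement() ' End FilmOrder
        End Using

        ' The writer has been flushed by End Using, so the text is complete.
        Console.WriteLine(output.ToString())
    End Sub
End Module
```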

You can now generate the same XML document in two different ways. You have used two different applications that took two different approaches to generating a document that represents a standardized movie order. The XML serialization approach uses the “shape” of the class to generate XML, whereas the XmlWriter allows you more flexibility in the output, at the expense of more effort.

However, there are even more ways to generate XML, depending on the circumstances. Using the previous scenario, you could receive a movie order from a store, and this order would have to be transformed from the XML format used by the supplier to your own order format.

Reading an XML Stream

In .NET, XML documents can be read from a stream as well. Data is traversed in the stream in order (first XML element, second XML element, and so on). This traversal is very quick because the data is processed in one direction, and features such as writing and moving backward in the traversal are not supported. At any given instant, only data at the current position in the stream can be accessed.

Before exploring how an XML stream can be read, you need to understand why it should be read in the first place. Returning to your movie supplier example, imagine that the application managing the movie orders can generate a variety of XML documents corresponding to current orders, preorders, and returns. All the documents (current orders, preorders, and returns) can be extracted in stream form and processed by a report-generating application. This application prints the orders for a given day, the preorders that are going to be due, and the returns that are coming back to the supplier. The report-generating application processes the data by reading in and parsing a stream of XML.

One class that can be used to read and parse such an XML stream is XmlReader. The .NET Framework includes more specific XML readers, such as XmlTextReader, that are derived from the XmlReader class. XmlTextReader provides the functionality to read XML from a file, a stream, or another XmlReader. This example, found in the FilmOrdersReader project, uses an XmlReader to read an XML document contained in a file. Reading XML from a file and writing it to a file is not the norm when it comes to XML processing, but a file is the simplest way to access XML data. This simplified access enables you to focus on XML-specific issues.

The first step in accessing a stream of XML data is to create an instance of the object that will open the stream. This is accomplished with the following code (code file: Main.vb):

Dim myXmlSettings As New XmlReaderSettings()
Using readMovieInfo As XmlReader = XmlReader.Create(fileName, myXmlSettings)

This code creates a new XmlReader, called readMovieInfo, using the specified filename and XmlReaderSettings instance. As with the XmlWriter, the XmlReader also has a settings class. You will use this class a little later.

The basic mechanism for traversing each stream is to move from node to node using the Read method. Node types in XML include element and white space. Numerous other node types are defined, but this example focuses on traversing XML elements and the white space that is used to make the elements more readable (carriage returns, line feeds, and indentation spaces). Once the stream is positioned at a node, the MoveToNextAttribute method can be called to read each attribute contained in an element. The MoveToNextAttribute method only traverses attributes for nodes that contain attributes (nodes of type element). You accomplish this basic node and attribute traversal using the following code (code file: Main.vb):

   While readMovieInfo.Read()
      ' Process node here.
      While readMovieInfo.MoveToNextAttribute()
         ' Process attribute here.
      End While
   End While

This code, which reads the contents of the XML stream, does not utilize any knowledge of the stream's contents. However, a great many applications know exactly how the stream they are going to traverse is structured. Such applications can use XmlReader in a more deliberate manner and not simply traverse the stream without foreknowledge. This would mean you could use the GetAttribute method as well as the various ReadContentAs and ReadElementContentAs methods to retrieve the contents by name, rather than just walking through the XML.
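A more deliberate traversal might look like the following sketch. It is an illustration under stated assumptions, not code from the FilmOrdersReader project: the XML is a cut-down, inline version of the movie order format, and the code assumes that name precedes quantity inside every FilmOrder element.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module KnownStructureSketch
    Sub Main()
        ' Cut-down sample data, supplied inline to keep the sketch self-contained.
        Dim xml As String =
            "<FilmOrderList>" &
            "<FilmOrder filmId=""101""><name>Grease</name><quantity>10</quantity></FilmOrder>" &
            "<FilmOrder filmId=""103""><name>Star Wars</name><quantity>2</quantity></FilmOrder>" &
            "</FilmOrderList>"

        Using reader As XmlReader = XmlReader.Create(New StringReader(xml))
            ' Jump from one <FilmOrder> start tag to the next.
            While reader.ReadToFollowing("FilmOrder")
                ' Read the attribute by name instead of walking all attributes.
                Dim filmId As String = reader.GetAttribute("filmId")

                ' Assumes <name> comes before <quantity> in each order.
                reader.ReadToDescendant("name")
                Dim name As String = reader.ReadElementContentAsString()

                ' Skip any white space that precedes <quantity>.
                reader.MoveToContent()
                Dim quantity As Integer = reader.ReadElementContentAsInt()

                Console.WriteLine("{0}, {1}, {2}", name, filmId, quantity)
            End While
        End Using
    End Sub
End Module
```

The MoveToContent call matters because ReadElementContentAsString leaves the reader on whatever follows the end tag, which is a Whitespace node in an indented document and the next element in a compact one.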

Once the example stream has been read, it can be cleaned up using the End Using call:

End Using

The complete code for the method that reads the data is shown here (code file: Main.vb):

Private Sub ReadMovieXml(ByVal fileName As String)
   Dim myXmlSettings As New XmlReaderSettings()
   Using readMovieInfo As XmlReader = XmlReader.Create(fileName, _
      myXmlSettings)
      While readMovieInfo.Read()
         ' Process node here.
         ShowXmlNode(readMovieInfo)
         While readMovieInfo.MoveToNextAttribute()
            ' Process attribute here.
            ShowXmlNode(readMovieInfo)
         End While
      End While
   End Using
End Sub

The ReadMovieXml method takes a string parameter that specifies the name of the XML file to be read. For each node encountered after a call to the Read method, ReadMovieXml calls the ShowXmlNode subroutine. Similarly, for each attribute traversed, the ShowXmlNode subroutine is called. The code for the ShowXmlNode method follows (code file: Main.vb):

Private Sub ShowXmlNode(ByVal reader As XmlReader)
   If reader.Depth > 0 Then
      For depthCount As Integer = 1 To reader.Depth
         Console.Write(" ")
      Next
   End If
   If reader.NodeType = XmlNodeType.Whitespace Then
      Console.Out.WriteLine("Type: {0} ", reader.NodeType)
   ElseIf reader.NodeType = XmlNodeType.Text Then
      Console.Out.WriteLine("Type: {0}, Value: {1} ", _
                           reader.NodeType, _
                           reader.Value)
   Else
      Console.Out.WriteLine("Name: {0}, Type: {1}, " & _
                           "AttributeCount: {2}, Value: {3} ", _
                           reader.Name, _
                           reader.NodeType, _
                           reader.AttributeCount, _
                           reader.Value)
   End If
End Sub

This subroutine breaks down each node into its subentities:

Depth—This property of XmlReader determines the level at which a node resides in the XML document tree. To understand depth, consider the following XML document composed solely of elements:

<A>
    <B></B>
    <C>
        <D></D>
    </C>
</A>

Element <A> is the root element, and when parsed would return a depth of 0. Elements <B> and <C> are contained in <A> and hence reflect a depth value of 1. Element <D> is contained in <C>. The Depth property value associated with <D> (depth of 2) should, therefore, be one more than the Depth property associated with <C> (depth of 1).

Type—The type of each node is determined using the NodeType property of XmlReader. The value returned is of the enumeration type XmlNodeType. Permissible node types include Attribute, Element, and Whitespace. (Numerous other node types can also be returned, including CDATA, Comment, Document, Entity, and DocumentType.)
Name—The name of each node is retrieved using the Name property of XmlReader. The name of the node could be an element name, such as <FilmOrder>, or an attribute name, such as FilmId.
Attribute Count—The number of attributes associated with a node is retrieved using the AttributeCount property of XmlReader.
Value—The value of a node is retrieved using the Value property of XmlReader. For example, the Text node inside the <name> element has a value of Grease, while the element node itself has an empty value.

The subroutine ShowXmlNode is implemented as follows. Within the ShowXmlNode subroutine, each level of node depth adds one space to the output generated:

If reader.Depth > 0 Then
  For depthCount As Integer = 1 To reader.Depth
    Console.Write(" ")
  Next
End If

You add these spaces in order to create human-readable output (so you can easily determine the depth of each node displayed). For each type of node, ShowXmlNode displays the value of the NodeType property. The ShowXmlNode subroutine makes a distinction between nodes of type Whitespace and other types of nodes. The reason for this is simple: a node of type Whitespace does not contain a name or an attribute count. The value of such a node is any combination of white-space characters (space, tab, carriage return, and so on). Therefore, it doesn't make sense to display those properties if the NodeType is XmlNodeType.Whitespace. Nodes of type Text have no name associated with them, so for this type, subroutine ShowXmlNode displays only the properties NodeType and Value. For all other node types (including elements and attributes), the Name, AttributeCount, Value, and NodeType properties are displayed.

To finalize this module, add a Sub Main as follows:

Sub Main(ByVal args() As String)
   ReadMovieXml("MovieManage.xml")
End Sub

The MovieManage.xml file, used as input for the example, looks like this:

<?xml version="1.0" encoding="utf-8" ?>
<MovieOrderDump>
  <FilmOrderList>
    <multiFilmOrders>
      <FilmOrder filmId="101">
        <name>Grease</name>
        <quantity>10</quantity>
      </FilmOrder>
      <FilmOrder filmId="102">
        <name>Lawrence of Arabia</name>
        <quantity>10</quantity>
      </FilmOrder>
      <FilmOrder filmId="103">
        <name>Star Wars</name>
        <quantity>10</quantity>
      </FilmOrder>
    </multiFilmOrders>
  </FilmOrderList>
  <PreOrder>
    <FilmOrder filmId="104">
      <name>Shrek III - Shrek Becomes a Programmer</name>
      <quantity>10</quantity>
    </FilmOrder>
  </PreOrder>
  <Returns>
    <FilmOrder filmId="103">
      <name>Star Wars</name>
      <quantity>2</quantity>
    </FilmOrder>
  </Returns>
</MovieOrderDump>

Running this module produces the following output (a partial display, as it would be rather lengthy):

Name: xml, Type: XmlDeclaration, AttributeCount: 2, Value: version="1.0" encoding="utf-8"
 Name: version, Type: Attribute, AttributeCount: 2, Value: 1.0
 Name: encoding, Type: Attribute, AttributeCount: 2, Value: utf-8
Type: Whitespace
Name: MovieOrderDump, Type: Element, AttributeCount: 0, Value:
Type: Whitespace
 Name: FilmOrderList, Type: Element, AttributeCount: 0, Value:
 Type: Whitespace
  Name: multiFilmOrders, Type: Element, AttributeCount: 0, Value:
  Type: Whitespace
   Name: FilmOrder, Type: Element, AttributeCount: 1, Value:
    Name: filmId, Type: Attribute, AttributeCount: 1, Value: 101
    Type: Whitespace
    Name: name, Type: Element, AttributeCount: 0, Value:
     Type: Text, Value: Grease
    Name: name, Type: EndElement, AttributeCount: 0, Value:
    Type: Whitespace
    Name: quantity, Type: Element, AttributeCount: 0, Value:
     Type: Text, Value: 10
    Name: quantity, Type: EndElement, AttributeCount: 0, Value:
    Type: Whitespace
   Name: FilmOrder, Type: EndElement, AttributeCount: 0, Value:
   Type: Whitespace

This example managed to use three methods and five properties of XmlReader. The output generated was informative but far from practical. XmlReader exposes over 50 methods and properties, which means that you have only scratched the surface of this highly versatile class. The remainder of this section looks at the XmlReaderSettings class, introduces a more realistic use of XmlReader, and demonstrates how the classes of System.Xml handle errors.

The XmlReaderSettings Class

As with the XmlWriter.Create method, the XmlReader.Create method lets you specify settings that are applied when the object is instantiated. These settings control how the XmlReader object behaves as it reads XML, including how it deals with white space, schemas, and other common options. An example of using this settings class to modify the behavior of the XmlReader class is as follows:

Dim myXmlSettings As New XmlReaderSettings()
myXmlSettings.IgnoreWhitespace = True
myXmlSettings.IgnoreComments = True
Using readMovieInfo As XmlReader = XmlReader.Create(fileName, myXmlSettings)
    ' Use XmlReader object here.
End Using

In this case, the XmlReader object that is created ignores the white space that it encounters, as well as any of the XML comments. These settings, once established with the XmlReaderSettings object, are then associated with the XmlReader object through its Create method.

Traversing XML Using XmlReader

In cases where the format of the XML is known, the XmlReader can be used to parse the document in a more deliberate manner rather than hitting every node. In the previous section, you implemented a class that serialized arrays of movie orders. The next example, found in the FilmOrdersReader2 project, takes an XML document containing multiple XML documents of that type and traverses them. Each movie order is forwarded to the movie supplier via fax. The general process for traversing this document is outlined by the following pseudo code:

Read root element: <MovieOrderDump>
    Process each <FilmOrderList> element
        Read <multiFilmOrders> element
            Process each <FilmOrder>
                Send fax for each movie order here

The basic outline for the program's implementation is to open a file containing the XML document, parse, then traverse it from element to element as follows (code file: Main.vb):

Dim myXmlSettings As New XmlReaderSettings()
Using readMovieInfo As XmlReader = XmlReader.Create(fileName, myXmlSettings)
    readMovieInfo.Read()
    readMovieInfo.ReadStartElement("MovieOrderDump")
    Do While (True)
        '****************************************************
        '* Process FilmOrder elements here                  *
        '****************************************************
    Loop
    readMovieInfo.ReadEndElement() '  </MovieOrderDump>
End Using

The preceding code opens the file using the XmlReader.Create method (XmlReader is an abstract class, so it is not instantiated through a constructor), and the End Using statement takes care of shutting everything down for you. The code also introduces two methods of the XmlReader class:

1. ReadStartElement(String): This verifies that the current content node in the stream is an element and that the element's name matches the string passed to ReadStartElement. If the verification is successful, then the stream is advanced to the next node.

2. ReadEndElement(): This verifies that the current content node is an end tag; and if the verification is successful, then the stream is advanced to the next node.

The application knows that an element, <MovieOrderDump>, will be found at a specific point in the document. The ReadStartElement method verifies this foreknowledge of the document format. After all the elements contained in element <MovieOrderDump> have been traversed, the stream should point to the end tag </MovieOrderDump>. The ReadEndElement method verifies this.

The code that traverses each element of type <FilmOrder> similarly uses the ReadStartElement and ReadEndElement methods to indicate the start and end of the <FilmOrder> and <multiFilmOrders> elements. The code that ultimately parses the list of movie orders and then faxes the movie supplier (using the FranticallyFaxTheMovieSupplier subroutine) is as follows (code file: Main.vb):

Private Sub ReadMovieXml(ByVal fileName As String)
    Dim myXmlSettings As New XmlReaderSettings()
    Dim movieName As String
    Dim movieId As String
    Dim quantity As String

    Using readMovieInfo As XmlReader =
        XmlReader.Create(fileName, myXmlSettings)
        ' Position to first element
        readMovieInfo.Read()
        readMovieInfo.ReadStartElement("MovieOrderDump")
        Do While (True)
            readMovieInfo.ReadStartElement("FilmOrderList")
            readMovieInfo.ReadStartElement("multiFilmOrders")

            ' For each order
            Do While (True)
                readMovieInfo.MoveToContent()
                movieId = readMovieInfo.GetAttribute("filmId")
                readMovieInfo.ReadStartElement("FilmOrder")

                movieName = readMovieInfo.ReadElementString()
                quantity = readMovieInfo.ReadElementString()
                readMovieInfo.ReadEndElement() ' clear </FilmOrder>

                FranticallyFaxTheMovieSupplier(movieName, movieId, quantity)

                ' Move past any white space to the next FilmOrder node;
                ' quit the loop if something else follows
                readMovieInfo.MoveToContent()
                If ("FilmOrder" <> readMovieInfo.Name) Then
                    Exit Do
                End If
            Loop
            readMovieInfo.ReadEndElement() ' clear </multiFilmOrders>
            readMovieInfo.ReadEndElement() ' clear </FilmOrderList>

            ' Move to the next FilmOrderList node; quit if something else follows
            readMovieInfo.MoveToContent()
            If ("FilmOrderList" <> readMovieInfo.Name) Then
                Exit Do
            End If
        Loop

        ' Skip any remaining sections (such as <PreOrder> and <Returns>)
        Do While readMovieInfo.MoveToContent() = XmlNodeType.Element
            readMovieInfo.Skip()
        Loop
        readMovieInfo.ReadEndElement() '  </MovieOrderDump>
    End Using
End Sub

The values are read from the XML file using the ReadElementString and GetAttribute methods. Notice that the call to GetAttribute is done before reading the FilmOrder element. This is because the ReadStartElement method advances the location for the next read to the next element in the XML file. The MoveToContent call before the call to GetAttribute ensures that the current read location is on the element, and not on white space.

While parsing the stream, it was known that an element named name existed and that this element contained the name of the movie. Rather than parse the start tag, get the value, and parse the end tag, it was easier to get the data using the ReadElementString method.

The intended output of this example is a fax, which is not implemented in order to focus on XML. The format of the document is still verified by XmlReader as it is parsed.

The XmlReader class also exposes properties that provide more insight into the data contained in the XML document and the state of parsing: IsEmptyElement, EOF, HasAttributes, and IsStartElement.
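The following is a minimal sketch of how these state properties might be used together. The module name, the CountEmptyElements helper, and the sample XML are illustrative assumptions and are not part of the chapter's projects.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module ReaderStateSketch
    ' Hypothetical helper: counts empty elements (such as <quantity />) in an XML string.
    Function CountEmptyElements(ByVal xml As String) As Integer
        Dim count As Integer = 0
        Using reader As XmlReader = XmlReader.Create(New StringReader(xml))
            Do While reader.Read()
                ' IsStartElement is True only when the current content node is a start tag
                If reader.IsStartElement() Then
                    If reader.HasAttributes Then
                        Console.WriteLine("{0} has {1} attribute(s)",
                                          reader.Name, reader.AttributeCount)
                    End If
                    If reader.IsEmptyElement Then count += 1
                End If
            Loop
            ' Once Read returns False, the reader reports that it is at the end of the input
            Console.WriteLine("Reached end of input: {0}", reader.EOF)
        End Using
        Return count
    End Function

    Sub Main()
        Dim sample As String =
            "<FilmOrder filmId=""101""><name>Grease</name><quantity /></FilmOrder>"
        Console.WriteLine(CountEmptyElements(sample))
    End Sub
End Module
```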

.NET CLR–compliant types are not 100 percent interchangeable with XML types. The .NET Framework includes methods in the XmlReader class to make the process of casting from one of these XML types to .NET types easier.

Using the ReadElementContentAs method, you can easily perform the necessary casting required, as seen here:

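' ReadElementContentAs returns Object, so convert the result to the target type
Dim username As String =
    CType(myXmlReader.ReadElementContentAs(GetType(String), Nothing), String)
Dim myDate As DateTime =
    CType(myXmlReader.ReadElementContentAs(GetType(DateTime), Nothing), DateTime)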

In addition to the general ReadElementContentAs method, there are specific ReadElementContentAsX methods for each of the common data types; and in addition to these methods, the raw XML associated with the document can also be retrieved, using ReadInnerXml and ReadOuterXml. Again, this only scratches the surface of the XmlReader class, a class quite rich in functionality.
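As a rough illustration of the typed methods and of ReadInnerXml, the sketch below reads a single order from a string. The module, the helper names (ReadQuantity, InnerXmlOf), and the sample XML are assumptions made for this example. The sample contains no white space between elements, because the ReadElementContentAsX methods expect the reader to be positioned on an element.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module TypedReadSketch
    ' Hypothetical helper: returns the <quantity> of one <FilmOrder> as an Integer.
    Function ReadQuantity(ByVal xml As String) As Integer
        Using reader As XmlReader = XmlReader.Create(New StringReader(xml))
            reader.ReadStartElement("FilmOrder")                      ' now on <name>
            Dim movieName As String = reader.ReadElementContentAsString()
            Dim quantity As Integer = reader.ReadElementContentAsInt() ' converts the text to an Integer
            Console.WriteLine("{0}: {1}", movieName, quantity)
            Return quantity
        End Using
    End Function

    ' Hypothetical helper: returns the raw markup inside the root element.
    Function InnerXmlOf(ByVal xml As String) As String
        Using reader As XmlReader = XmlReader.Create(New StringReader(xml))
            reader.MoveToContent()
            Return reader.ReadInnerXml()
        End Using
    End Function

    Sub Main()
        Dim sample As String =
            "<FilmOrder><name>Grease</name><quantity>10</quantity></FilmOrder>"
        Console.WriteLine(ReadQuantity(sample))
        Console.WriteLine(InnerXmlOf(sample))
    End Sub
End Module
```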

Handling Exceptions

XML is text and could easily be read using mundane methods such as Read and ReadLine. A key feature of each class that reads and traverses XML is inherent support for error detection and handling. To demonstrate this, consider the following malformed XML document found in the file named Malformed.xml, also included in the FilmOrdersReader2 project:

<?xml version="1.0" encoding="IBM437" ?>
<FilmOrder FilmId="101", Qty="10">
   <Name>Grease</Name>
<FilmOrder>

This document may not immediately appear to be malformed. By wrapping a call to the method you developed (ReadMovieXml), you can see what type of exception is raised when XmlReader detects the malformed XML within this document as shown in Sub Main(). Comment out the line calling the MovieManage.xml file, and uncomment the line to try to open the malformed.xml file:

Try
    'ReadMovieXml("MovieManage.xml")
    ReadMovieXml("Malformed.xml")
Catch xmlEx As XmlException
    Console.Error.WriteLine("XML Error: " & xmlEx.ToString())
Catch ex As Exception
    Console.Error.WriteLine("Some other error: " & ex.ToString())
End Try

The methods and properties exposed by the XmlReader class report malformed XML by raising exceptions of type System.Xml.XmlException. The other classes in the System.Xml namespace report XML errors in the same way, so although this is a discussion of errors using an instance of type XmlReader, the concepts reviewed apply to XML errors generated by the classes found in the System.Xml namespace. XmlException extends the basic Exception class to include more information about the location of the error within the XML file.

The error displayed when subroutine ReadMovieXML processes Malformed.xml is as follows:

XML Error: System.Xml.XmlException: The ',' character, hexadecimal value 0x2C, cannot begin a name. Line 2, position 49.

The preceding snippet indicates that a comma separates the attributes in element <FilmOrder FilmId="101", Qty="10">. This comma is invalid. Removing it and running the code again results in the following output:

XML Error: System.Xml.XmlException: This is an unexpected token. Expected 'EndElement'. Line 5, position 27.

Again, you can recognize the precise error. In this case, you do not have an end element, </FilmOrder>, but you do have an opening element, <FilmOrder>.

The properties provided by the XmlException class (such as LineNumber, LinePosition, and Message) provide a useful level of precision when tracking down errors. A reader can report the same level of precision while it parses: readers that implement the IXmlLineInfo interface (as the text readers returned by XmlReader.Create typically do) expose LineNumber and LinePosition properties for the current node.
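A minimal sketch of using these XmlException properties follows. The module name, the FirstErrorLine helper, and the sample text are assumptions for illustration only.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module XmlErrorSketch
    ' Hypothetical helper: returns the line number of the first well-formedness
    ' error in the text, or 0 if the text parses cleanly.
    Function FirstErrorLine(ByVal xml As String) As Integer
        Try
            Using reader As XmlReader = XmlReader.Create(New StringReader(xml))
                Do While reader.Read()
                    ' Nothing to do; reading every node is enough to trigger any error
                Loop
            End Using
            Return 0
        Catch ex As XmlException
            Console.Error.WriteLine("Line {0}, position {1}: {2}",
                                    ex.LineNumber, ex.LinePosition, ex.Message)
            Return ex.LineNumber
        End Try
    End Function

    Sub Main()
        Dim bad As String =
            "<FilmOrder FilmId=""101"", Qty=""10"">" & Environment.NewLine &
            "  <Name>Grease</Name>" & Environment.NewLine &
            "</FilmOrder>"
        Console.WriteLine(FirstErrorLine(bad))
    End Sub
End Module
```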

Document Object Model (DOM)

The Document Object Model (DOM) is a logical view of an XML file. Within the DOM, an XML document is contained in a class named XmlDocument. Each node within this document is accessible and managed using XmlNode. Nodes can also be accessed and managed using a class specifically designed to process a specific node's type (XmlElement, XmlAttribute, and so on). XML documents are extracted from XmlDocument using a variety of mechanisms exposed through such classes as XmlWriter, TextWriter, Stream, and a file (specified by a filename of type String). XML documents are consumed by an XmlDocument using a variety of load mechanisms exposed through the same classes.

A DOM-style parser differs from a stream-style parser with respect to movement. Using the DOM, the nodes can be traversed forward and backward; and nodes can be added to the document, removed from the document, and updated. However, this flexibility comes at a performance cost, since the entire document is read into memory. It is faster to read or write XML using a stream-style parser.

The DOM-specific classes exposed by System.Xml include the following:

XmlDocument: Corresponds to an entire XML document. A document is loaded using the Load or LoadXml method. The Load method loads the XML from a file (the filename specified as type String), a Stream, a TextReader, or an XmlReader. The LoadXml method loads a document from a string containing the XML. The Save method is used to save XML documents. The methods exposed by XmlDocument reflect the intricate manipulation of an XML document. For example, the following creation methods are implemented by this class: CreateAttribute, CreateCDataSection, CreateComment, CreateDocumentFragment, CreateDocumentType, CreateElement, CreateEntityReference, CreateNavigator, CreateNode, CreateProcessingInstruction, CreateSignificantWhitespace, CreateTextNode, CreateWhitespace, and CreateXmlDeclaration. The elements contained in the document can be retrieved (for example, with GetElementsByTagName). Other methods support the retrieving, importing, cloning, loading, and writing of nodes.
XmlNode: Corresponds to a node within the DOM tree. This is the base class for the other node type classes. A robust set of methods and properties is provided to create, delete, and replace nodes. The contents of a node can similarly be traversed in a variety of ways: FirstChild, LastChild, NextSibling, ParentNode, and PreviousSibling.
XmlElement: Corresponds to an element within the DOM tree. The functionality exposed by this class contains a variety of methods used to manipulate an element's attributes.
XmlAttribute: Corresponds to an attribute of an element (XmlElement) within the DOM tree. An attribute holds a name and a value rather than a tree of child elements, so it is a less complicated object than an XmlNode or an XmlElement. An XmlAttribute can retrieve its owner document (property, OwnerDocument), retrieve its owner element (property, OwnerElement), retrieve its parent node (property, ParentNode), and retrieve its name (property, Name). The value of an XmlAttribute is available via a read-write property named Value.

Given the diverse number of methods and properties exposed by XmlDocument, XmlNode, XmlElement, and XmlAttribute (and there are many more than those listed here), it's clear that any XML 1.0-compliant document can be generated and manipulated using these classes. In comparison to their XML stream counterparts, these classes offer more flexible movement within the XML document and support editing of XML documents.

A similar comparison could be made between DOM and data serialized and deserialized using XML. Using serialization, the type of node (for example, attribute or element) and the node name are specified at compile time. There is no on-the-fly modification of the XML generated by the serialization process.
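The navigation properties listed for XmlNode (FirstChild, LastChild, NextSibling, ParentNode, and PreviousSibling) can be sketched as follows. The module name, the ChildNames helper, and the sample XML are illustrative assumptions.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module DomNavigationSketch
    ' Hypothetical helper: lists the names of a node's children by following sibling links.
    Function ChildNames(ByVal parent As XmlNode) As List(Of String)
        Dim names As New List(Of String)
        Dim child As XmlNode = parent.FirstChild
        Do While child IsNot Nothing
            names.Add(child.Name)
            child = child.NextSibling
        Loop
        Return names
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        doc.LoadXml("<FilmOrder><name>Grease</name><quantity>10</quantity></FilmOrder>")
        Dim root As XmlElement = doc.DocumentElement

        Console.WriteLine(String.Join(", ", ChildNames(root)))   ' name, quantity
        ' Unlike a stream-style reader, the DOM can also move backward and upward
        Console.WriteLine(root.LastChild.PreviousSibling.Name)   ' name
        Console.WriteLine(root.FirstChild.ParentNode.Name)       ' FilmOrder
    End Sub
End Module
```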

DOM Traversing XML

The first DOM example, located in the DomReading project, loads an XML document into an XmlDocument object using a string that contains the actual XML document. The example over the next few pages simply traverses each XML element (XmlNode) in the document (XmlDocument) and displays the data to the console. The data associated with this example is contained in a variable, rawData, which is initialized as follows:

Dim rawData =
    <multiFilmOrders>
        <FilmOrder>
            <name>Grease</name>
            <filmId>101</filmId>
            <quantity>10</quantity>
        </FilmOrder>
        <FilmOrder>
            <name>Lawrence of Arabia</name>
            <filmId>102</filmId>
            <quantity>10</quantity>
        </FilmOrder>
    </multiFilmOrders>

The XML document in rawData is a portion of the XML hierarchy associated with a movie order. Notice the lack of quotation marks around the XML: this is an XML literal. XML literals allow you to insert a block of XML directly into your VB source code, and are covered a little later in this chapter. They can be written over a number of lines, and can be used wherever you might normally load an XML file.

The basic idea in processing this data is to traverse each <FilmOrder> element in order to display the data it contains. Each node corresponding to a <FilmOrder> element can be retrieved from your XmlDocument using the GetElementsByTagName method (specifying a tag name of FilmOrder). The GetElementsByTagName method returns a list of XmlNode objects in the form of a collection of type XmlNodeList. Using the For Each statement to iterate over this list, the XmlNodeList (movieOrderNodes) can be traversed as individual XmlNode elements (movieOrderNode). The general code for handling this is as follows:

Dim xmlDoc As New XmlDocument
Dim movieOrderNodes As XmlNodeList
Dim movieOrderNode As XmlNode
xmlDoc.LoadXml(rawData.ToString())
' Traverse each <FilmOrder>
movieOrderNodes = xmlDoc.GetElementsByTagName("FilmOrder")
For Each movieOrderNode In movieOrderNodes
    '**********************************************************
    ' Process <name>, <filmId> and <quantity> here
    '**********************************************************
Next

Each XmlNode can then have its contents displayed by traversing the children of this node using the ChildNodes property. This property returns an XmlNodeList (baseDataNodes) that can be traversed one XmlNode at a time, as shown here (code file: Main.vb):

Dim baseDataNodes As XmlNodeList
Dim bFirstInRow As Boolean
baseDataNodes = movieOrderNode.ChildNodes
bFirstInRow = True
For Each baseDataNode As XmlNode In baseDataNodes
    If (bFirstInRow) Then
        bFirstInRow = False
    Else
        Console.Write(", ")
    End If
    Console.Write(baseDataNode.Name & ": " & baseDataNode.InnerText)
Next
Console.WriteLine()

The bulk of the preceding code retrieves each node's name using the Name property and its content using the InnerText property. The InnerText property of each XmlNode retrieved contains the data associated with the XML elements (nodes) <name>, <filmId>, and <quantity>. The example displays the contents of the XML elements using Console.Write. The XML document is displayed to the console as follows:

name: Grease, filmId: 101, quantity: 10
name: Lawrence of Arabia, filmId: 102, quantity: 10

Other, more practical, methods for using this data could have been implemented, including the following:

The contents could have been directed to an ASP.NET Response object, and the data retrieved could have been used to create an HTML table (<table> table, <tr> row, and <td> data) that would be written to the Response object.
The data traversed could have been directed to a ListBox or ComboBox Windows Forms control. This would enable the data returned to be selected as part of a GUI application.
The data could have been edited as part of your application's business rules. For example, you could have used the traversal to verify that the <filmId> matched the <name>. Something like this could be done if you wanted to validate the data entered into the XML document in any manner.

Writing XML with the DOM

You can also use the DOM to create or edit XML documents. Creating new XML items is a two-step process, however. First, you use the containing document to create the new element, attribute, or comment (or other node type), and then you add that at the appropriate location in the document.

Just as there are a number of methods in the DOM for reading the XML, there are also methods for creating new nodes. The XmlDocument class has the basic CreateNode method, as well as specific methods for creating the different node types, such as CreateElement, CreateAttribute, CreateComment, and others. Once the node is created, you add it in place using the AppendChild method of XmlNode (or of one of the classes derived from XmlNode).

The example for this section is in the DomWriting project and will be used to demonstrate writing XML with the DOM. Most of the work in this sample will be done in two functions, so the Main method can remain simple, as shown here (code file: Main.vb):

Sub Main()
    Dim data As String
    Dim fileName As String = "filmorama.xml"
    data = GenerateXml(fileName)

    Console.WriteLine(data)
    Console.WriteLine("Press ENTER to continue")
    Console.ReadLine()
End Sub

The GenerateXml function creates the initial XmlDocument, and calls the CreateFilmOrder function multiple times to add a number of items to the structure. This creates a hierarchical XML document that can then be used elsewhere in your application. Typically, you would use the Save method to write the XML to a stream or document, but in this case it just retrieves the OuterXml (that is, the full XML document) to display (code file: Main.vb):

Private Function GenerateXml(ByVal fileName As String) As String
    Dim result As String
    Dim doc As New XmlDocument
    Dim elem As XmlElement

    ' Create root node
    Dim root As XmlElement = doc.CreateElement("FilmOrderList")
    doc.AppendChild(root)
    ' This data would likely come from elsewhere
    For i As Integer = 1 To 5
        elem = CreateFilmOrder(doc, i)
        root.AppendChild(elem)
    Next
    result = doc.OuterXml
    Return result
End Function

The most common error made when writing an XML document using the DOM is to create the elements but forget to append them into the document. This step is done here with the AppendChild method, but other methods can be used to place a node, in particular InsertBefore, InsertAfter, and PrependChild.
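As a small sketch of these placement methods (the module name, the BuildList helper, and the element names are hypothetical and chosen only to make the resulting order obvious):

```vb
Imports System
Imports System.Xml

Module DomInsertSketch
    ' Hypothetical helper: builds <FilmOrderList> with three children,
    ' each placed by a different method.
    Function BuildList() As XmlDocument
        Dim doc As New XmlDocument()
        Dim root As XmlElement = doc.CreateElement("FilmOrderList")
        doc.AppendChild(root)

        Dim second As XmlElement = doc.CreateElement("second")
        root.AppendChild(second)                               ' added at the end
        root.PrependChild(doc.CreateElement("first"))          ' added before all existing children
        root.InsertAfter(doc.CreateElement("third"), second)   ' added right after <second>
        Return doc
    End Function

    Sub Main()
        Console.WriteLine(BuildList().OuterXml)
    End Sub
End Module
```

Each node is still created by the owning document first; only the call that attaches it to the tree differs.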

Creating the individual FilmOrder nodes uses a similar CreateElement/AppendChild strategy. In addition, attributes are created using the Append method of the Attributes collection for each XmlElement. The following shows the CreateFilmOrder method (code file: Main.vb):

Private Function CreateFilmOrder(ByVal parent As XmlDocument,
    ByVal count As Integer) As XmlElement
    Dim result As XmlElement
    Dim id As XmlAttribute
    Dim title As XmlElement
    Dim quantity As XmlElement

    result = parent.CreateElement("FilmOrder")
    id = parent.CreateAttribute("id")
    ' Value is a String, so convert the number explicitly
    id.Value = (100 + count).ToString()

    title = parent.CreateElement("title")
    title.InnerText = "Some title here"

    quantity = parent.CreateElement("quantity")
    quantity.InnerText = "10"

    result.Attributes.Append(id)
    result.AppendChild(title)
    result.AppendChild(quantity)
    Return result
End Function

This generates the following XML, although it will all be on one line in the output:

<FilmOrderList>
  <FilmOrder id="101">
    <title>Some title here</title>
    <quantity>10</quantity>
  </FilmOrder>
  <FilmOrder id="102">
    <title>Some title here</title>
    <quantity>10</quantity>
  </FilmOrder>
  <FilmOrder id="103">
    <title>Some title here</title>
    <quantity>10</quantity>
  </FilmOrder>
  <FilmOrder id="104">
    <title>Some title here</title>
    <quantity>10</quantity>
  </FilmOrder>
  <FilmOrder id="105">
    <title>Some title here</title>
    <quantity>10</quantity>
  </FilmOrder>
</FilmOrderList>

Once you get the hang of creating XML with the DOM (and forget to add the new nodes a few dozen times), it is quite a handy method for writing XML. If the XML you need to create can all be created at once, it is probably better to use the XmlWriter class instead. Writing XML with the DOM is best left for those situations when you need to either edit an existing XML document or move backward through the document as you are writing. In addition, because the DOM is an international standard, it means that code using the DOM is portable to other languages that also provide a DOM.

In addition to the XmlWriter, the XElement shown later in this chapter provides yet another method for reading and writing XML.

. At this link you'll find all of Shakespeare's plays as XML files.

XDocument

The XDocument class is a replacement for the XmlDocument object from the pre-LINQ world. While it does not comply with any international standards, the XDocument object is easier to work with when dealing with XML documents. It works with the other new objects in this space, such as the XNamespace, XComment, XElement, and XAttribute objects.

The LinqRead project provides an example that demonstrates the use of the XDocument class. This project is covered in more detail a little later. One of the more important members of the XDocument object is the Load method:
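A minimal sketch of calling Load might look like the following. The module name, the RootName helper, and the assumption that Hamlet.xml sits in the working directory are illustrative and are not taken from the LinqRead project.

```vb
Imports System
Imports System.IO
Imports System.Xml.Linq

Module XDocumentLoadSketch
    ' Hypothetical helper: loads a document from disk and returns the root element's name.
    Function RootName(ByVal fileName As String) As String
        Dim xdoc As XDocument = XDocument.Load(fileName)
        Return xdoc.Root.Name.LocalName
    End Function

    Sub Main()
        ' Assumption: Hamlet.xml is in the current working directory
        If File.Exists("Hamlet.xml") Then
            Console.WriteLine(RootName("Hamlet.xml"))
        Else
            Console.WriteLine("Hamlet.xml not found")
        End If
    End Sub
End Module
```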

The preceding example loads the Hamlet.xml contents as an in-memory XDocument object. You can also pass a TextReader or an XmlReader object into the Load method. From here, you can programmatically work with the XML (code file: Main.vb):

Another important member to be aware of is the Save method, which enables you to save to a physical disk location or to a TextWriter or an XmlWriter object:
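A short sketch of Save follows; the module name, the SaveToString helper, the output file name, and the sample document are assumptions made for this example.

```vb
Imports System
Imports System.IO
Imports System.Xml.Linq

Module XDocumentSaveSketch
    ' Hypothetical helper: saves a document through a TextWriter and returns the text.
    Function SaveToString(ByVal xdoc As XDocument) As String
        Using writer As New StringWriter()
            xdoc.Save(writer)
            Return writer.ToString()
        End Using
    End Function

    Sub Main()
        Dim xdoc As XDocument = XDocument.Parse("<PLAY><TITLE>Sample</TITLE></PLAY>")
        Console.WriteLine(SaveToString(xdoc))

        ' Saving to disk takes a path instead (placeholder file name in the temp folder)
        Dim target As String = Path.Combine(Path.GetTempPath(), "CopyOfPlay.xml")
        xdoc.Save(target)
    End Sub
End Module
```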

XElement

Another common object that you will work with is the XElement object. With this object, you can easily create even single-element objects that are XML documents themselves, and even fragments of XML. For instance, here is an example of writing an XML element with a corresponding value:
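A sketch of such a single-element example, using the Company name and Wrox value described in the text (the module and function names are assumptions):

```vb
Imports System
Imports System.Xml.Linq

Module XElementSketch
    ' Hypothetical helper: one element with a name and a value.
    Function CompanyElement() As XElement
        Return New XElement("Company", "Wrox")
    End Function

    Sub Main()
        Console.WriteLine(CompanyElement())   ' <Company>Wrox</Company>
    End Sub
End Module
```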

When creating a new XElement object, you can define the name of the element as well as the value used in the element. In this case, the name of the element will be <Company>, while the value of the <Company> element will be Wrox. Running this in a console application, you will get the following result:

The XElementWriting project provides an example that demonstrates how you can also create a more complete XML document using multiple XElement objects, as shown here (code file: Main.vb):
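A sketch of nesting XElement objects to build a fuller document might look like this. The element names follow those mentioned in this chapter; the module name and the address values are placeholders invented for the example.

```vb
Imports System
Imports System.Xml.Linq

Module XElementTreeSketch
    ' Hypothetical helper: child elements are passed as constructor arguments.
    Function BuildCompany() As XElement
        Return New XElement("Company",
            New XElement("CompanyName", "Wrox"),
            New XElement("CompanyAddress",
                New XElement("Street", "1 Example Street"),
                New XElement("City", "Sampleville"),
                New XElement("State", "XX"),
                New XElement("Country", "Exampleland"),
                New XElement("Zip", "00000")))
    End Function

    Sub Main()
        Console.WriteLine(BuildCompany())
    End Sub
End Module
```

Because the nesting of the constructor calls mirrors the nesting of the XML, the shape of the document is visible in the code itself.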

XNamespace

The XNamespace is an object that represents an XML namespace, and it is easily applied to elements within your document. An example of this can be found in the XElementWritingNamespaces project. It is a variation of the previous example with only minor edits to include a namespace for the root element, as seen here:

In this case, an XNamespace object is created by assigning it a value of http://. From there, it is actually used in the root element <Company> with the instantiation of the XElement object:
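A sketch of applying an XNamespace follows. The namespace URI is a placeholder, since the value used by the project is not reproduced here, and the module and function names are assumptions. This sketch qualifies each element name with the namespace explicitly.

```vb
Imports System
Imports System.Xml.Linq

Module XNamespaceSketch
    ' Hypothetical helper: builds a <Company> root in a placeholder namespace.
    Function BuildCompany() As XElement
        ' Placeholder URI chosen for illustration only
        Dim ns As XNamespace = "http://example.com/ns/company"
        ' ns + "Company" combines the namespace and the local name into an XName
        Return New XElement(ns + "Company",
            New XElement(ns + "CompanyName", "Wrox"))
    End Function

    Sub Main()
        Dim root As XElement = BuildCompany()
        Console.WriteLine(root)
        Console.WriteLine(root.Name.NamespaceName)
    End Sub
End Module
```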

Besides dealing with the root element, you can also apply namespaces to all your elements (code file: Main.vb):

Since the namespace was applied to the <CompanyAddress>, all of its child elements (<Street>, <City>, <State>, <Country>, and <Zip>) also have this same namespace, since elements inherit the namespace of their parent.

XAttribute

In addition to elements, another important aspect of XML is attributes, as mentioned earlier in this chapter. Adding and working with attributes is done through the use of the XAttribute object. The following example adds an attribute to the root <Company> node:
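A sketch of adding an attribute, using the MyAttribute name and MyAttributeValue value described in the text (the module name, function name, and child element are assumptions):

```vb
Imports System
Imports System.Xml.Linq

Module XAttributeSketch
    ' Hypothetical helper: attributes are passed to the constructor alongside child elements.
    Function BuildCompany() As XElement
        Return New XElement("Company",
            New XAttribute("MyAttribute", "MyAttributeValue"),
            New XElement("CompanyName", "Wrox"))
    End Function

    Sub Main()
        Dim root As XElement = BuildCompany()
        Console.WriteLine(root)
        Console.WriteLine(root.Attribute("MyAttribute").Value)   ' MyAttributeValue
    End Sub
End Module
```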

Here, the attribute MyAttribute with a value of MyAttributeValue is added to the root element of the XML document, producing the results shown in .

XML Literals

LINQ provides a great feature, called XML literals, that can be used to greatly simplify working with XML. Using XML literals, you can place XML directly in your code for working with the XDocument and XElement objects. This works due to the fact that the literal XML is converted directly to appropriate objects, such as XElement and XAttribute.

Earlier, in the XElementWriting example, the use of the XElement object was presented as follows:

The XmlLiterals project instead uses XML literals to perform the same functionality, seen here (code file: Main.vb):
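A sketch of the same idea with an XML literal follows. The module name and the element values are placeholders, and the document is shortened to keep the example small.

```vb
Imports System
Imports System.Xml.Linq

Module XmlLiteralSketch
    ' Hypothetical helper: the literal below is compiled into XElement objects.
    Function BuildCompany() As XElement
        Return <Company>
                   <CompanyName>Wrox</CompanyName>
                   <CompanyAddress>
                       <City>Sampleville</City>
                   </CompanyAddress>
               </Company>
    End Function

    Sub Main()
        ' The same structure built with constructors, for comparison
        Dim built As New XElement("Company",
            New XElement("CompanyName", "Wrox"),
            New XElement("CompanyAddress",
                New XElement("City", "Sampleville")))

        Console.WriteLine(BuildCompany())
        Console.WriteLine(XNode.DeepEquals(built, BuildCompany()))
    End Sub
End Module
```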

This enables you to place the XML directly in the code. The best part about this is the IDE support for XML literals. Visual Studio 2012 has IntelliSense and excellent color-coding for the XML that you place in your code file. As shows, there is no difference in the output between this example and the previous one, which didn't use XML literals.

You can also use inline variables in the XML document. For instance, if you wanted to declare the value of the <CompanyName> element outside the XML literal, then you could use a construct similar to the following:
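A sketch of an embedded expression inside an XML literal (the module and function names are assumptions; the companyName variable follows the text):

```vb
Imports System
Imports System.Xml.Linq

Module EmbeddedExpressionSketch
    ' Hypothetical helper: the value of companyName is substituted into the literal.
    Function BuildCompany(ByVal companyName As String) As XElement
        Return <Company>
                   <CompanyName><%= companyName %></CompanyName>
               </Company>
    End Function

    Sub Main()
        Dim companyName As String = "Wrox"
        Console.WriteLine(BuildCompany(companyName))
    End Sub
End Module
```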

In this case, the <CompanyName> element is assigned a value of Wrox from the companyName variable, using the syntax <%= companyName %>.

Querying XML Documents

As mentioned in the beginning of this section, and in other chapters, LINQ stands for Language Integrated Query. The primary purpose for its existence is to provide a streamlined approach to querying data.

Now that you can get your XML documents into an XDocument object and work with the various parts of this document, you can also use LINQ to XML to query your XML documents and work with the results.

Static XML Documents

The functionality provided by LINQ makes querying a static XML document take almost no work at all. The following example, from the LinqRead project, makes use of the hamlet.xml file. The example demonstrates querying for all the players (actors) who appear in a play. Each of these players is defined in the XML document with the <PERSONA> element (code file: Main.vb):

In this case, an XDocument object loads a physical XML file (hamlet.xml) and then performs a LINQ query over the contents of the document:
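A sketch of such a query follows. To stay self-contained it parses a small in-memory stand-in rather than loading hamlet.xml; the module name, the PlayerNames helper, and the sample player names are assumptions.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module LinqReadSketch
    ' Hypothetical helper: returns the text of every <PERSONA> element in the document.
    Function PlayerNames(ByVal xdoc As XDocument) As List(Of String)
        Dim query As IEnumerable(Of String) =
            From people In xdoc.Descendants("PERSONA")
            Select people.Value
        Return query.ToList()
    End Function

    Sub Main()
        ' Stand-in for hamlet.xml that uses the same element names
        Dim xdoc As XDocument = XDocument.Parse(
            "<PLAY><PERSONAE>" &
            "<PERSONA>First Player</PERSONA>" &
            "<PERSONA>Second Player</PERSONA>" &
            "</PERSONAE></PLAY>")

        Dim names As List(Of String) = PlayerNames(xdoc)
        Console.WriteLine("{0} Players Found", names.Count)
        For Each playerName As String In names
            Console.WriteLine(playerName)
        Next
    End Sub
End Module
```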

The people object is a representation of all the <PERSONA> elements found in the document. Then the Select statement gets at the values of these elements. From there, a Console.WriteLine method is used to write out a count of all the players found, using query.Count. Next, each of the items is written to the screen in a For Each loop. The results you should see are presented here:

Dynamic XML Documents

Numerous dynamic XML documents can be found on the Internet these days. Blog feeds, podcast feeds, and more provide XML documents by sending a request to a specific URL endpoint. These feeds can be viewed either in the browser, through an RSS aggregator, or as pure XML. The LinqReadDynamic project includes an example to demonstrate reading and querying an RSS feed. The code to do this is (code file: Main.vb):
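A reduced sketch of querying a feed follows. XDocument.Load accepts a URL string, but to run offline this example parses a tiny inline RSS document instead; the feed content, the module name, and the ItemTitles helper are assumptions, and the sketch returns only the item titles.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module LinqReadDynamicSketch
    ' Hypothetical helper: returns the <title> of every <item> in an RSS document.
    Function ItemTitles(ByVal feed As XDocument) As List(Of String)
        Dim query As IEnumerable(Of String) =
            From item In feed.Descendants("item")
            Select item.Element("title").Value
        Return query.ToList()
    End Function

    Sub Main()
        ' Inline stand-in for a feed that would normally be loaded from a URL
        Dim feed As XDocument = XDocument.Parse(
            "<rss version=""2.0""><channel><title>Sample Blog</title>" &
            "<item><title>First post</title></item>" &
            "<item><title>Second post</title></item>" &
            "</channel></rss>")

        Console.WriteLine(feed.Root.Element("channel").Element("title").Value)
        For Each postTitle As String In ItemTitles(feed)
            Console.WriteLine(postTitle)
        Next
    End Sub
End Module
```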

Here, the Load method of the XDocument object points to a URL where the XML is retrieved. The first query pulls out all the main subelements of the <channel> element in the feed and creates new objects called Title, Description, and Link to get at the values of these subelements.

From there, a For Each statement is run to iterate through all the items found in this query. The second query works through all the <item> elements and the various subelements it contains (these are all the blog entries found in the blog). Though a lot of the items found are rolled up into properties, in the For Each loop, only the Title property is used. You will see results similar to that shown in .

Reading and Writing XML Documents

If you have been working with the XML document hamlet.xml, you probably noticed that it is quite large. You've seen how you can query into the XML document in a couple of ways, and now this section takes a look at reading and writing to the XML document.

Reading from an XML Document

Earlier you saw just how easy it is to query into an XML document using the LINQ query statements, as shown here:

This query returns all the players found in the document. Using the Element method of the XDocument object, you can also get at specific values of the XML document you are working with. For instance, continuing to work with the hamlet.xml document, the following XML fragment shows you how the title is represented:

As you can see, the <TITLE> element is a nested element of the <PLAY> element. You can easily get at the title by using the following bit of code:

This bit of code writes out the title “The Tragedy of Hamlet, Prince of Denmark” to the console screen. In the code, you were able to work down the hierarchy of the XML document by using two Element method calls—first calling the <PLAY> element, and then the <TITLE> element found nested within the <PLAY> element.

Continuing with the hamlet.xml document, you can view a long list of players who are defined with the use of the <PERSONA> element:

This piece of code starts at <PLAY>, works down to the <PERSONAE> element, and then makes use of the <PERSONA> element. However, using this you will get the following result:

Although there is a collection of <PERSONA> elements, you are dealing only with the first one that is encountered using the Element().Value call.
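The difference between Element and Elements can be sketched as follows. The inline document is a cut-down, assumed stand-in for hamlet.xml, and the module and variable names are illustrative.

```vb
Imports System
Imports System.Xml.Linq

Module ReadElementsSketch
    Sub Main()
        ' Hypothetical stand-in for hamlet.xml.
        Dim xdoc As XDocument = XDocument.Parse(
            "<PLAY><TITLE>The Tragedy of Hamlet, Prince of Denmark</TITLE>" &
            "<PERSONAE>" &
            "<PERSONA>CLAUDIUS, king of Denmark.</PERSONA>" &
            "<PERSONA>HAMLET, son to the late, and nephew to the present king.</PERSONA>" &
            "</PERSONAE></PLAY>")

        ' Two Element calls walk down the hierarchy: <PLAY>, then <TITLE>.
        Console.WriteLine(xdoc.Element("PLAY").Element("TITLE").Value)

        Dim personae As XElement = xdoc.Element("PLAY").Element("PERSONAE")

        ' Element returns only the first matching child element.
        Console.WriteLine(personae.Element("PERSONA").Value)

        ' Elements returns every matching child element.
        For Each persona As XElement In personae.Elements("PERSONA")
            Console.WriteLine(persona.Value)
        Next
    End Sub
End Module
```

Note that Element returns Nothing when no matching child exists, so a chain of Element calls against an unexpected document shape fails with a NullReferenceException.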

Writing to an XML Document

In addition to reading from an XML document, you can also write to the document just as easily. The LinqWrite project demonstrates this by providing an example that allows you to change the name of the first player of the hamlet file. The code to accomplish this is (code file: Main.vb):

In this case, the first instance of the <PERSONA> element is overwritten with the value of Foo deBar, King of Denmark using the SetValue method of the Element object. After the SetValue is called and the value is applied to the XML document, the value is then retrieved using the same approach as before. Running this bit of code, you can indeed see that the value of the first <PERSONA> element has been changed.
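A minimal sketch of this SetValue approach, again using an assumed inline fragment rather than the real hamlet.xml file:

```vb
Imports System
Imports System.Xml.Linq

Module WriteValueSketch
    Sub Main()
        ' Hypothetical stand-in for hamlet.xml.
        Dim xdoc As XDocument = XDocument.Parse(
            "<PLAY><PERSONAE>" &
            "<PERSONA>CLAUDIUS, king of Denmark.</PERSONA>" &
            "<PERSONA>HAMLET, son to the late, and nephew to the present king.</PERSONA>" &
            "</PERSONAE></PLAY>")

        ' Element returns the first <PERSONA>; SetValue overwrites its content.
        Dim firstPlayer As XElement =
            xdoc.Element("PLAY").Element("PERSONAE").Element("PERSONA")
        firstPlayer.SetValue("Foo deBar, King of Denmark")

        ' Read the value back using the same approach as before.
        Console.WriteLine(
            xdoc.Element("PLAY").Element("PERSONAE").Element("PERSONA").Value)

        ' The change exists only in memory until the document is saved,
        ' for example with xdoc.Save and a file name of your choosing.
    End Sub
End Module
```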

Another way to change the document is shown in the LinqAddElement project. This example creates the elements you want as XElement objects and then adds them to the document, shown here (code file: Main.vb):

In this case, an XElement object called xe is created. The construction of xe gives you the following XML output:

Then, using the Element().Add method from the XDocument object, you are able to add the created element:

Next, querying all the players, you will now find that instead of 26, as before, you now have 27, with the new one at the bottom of the list. Besides Add, you can also use AddFirst, which does just that—adds the player to the beginning of the list instead of the end, which is the default.
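The following sketch shows Add and AddFirst side by side. The inline document has only two players rather than the 26 in hamlet.xml, and the names of the added players are invented for illustration.

```vb
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module AddElementSketch
    Sub Main()
        ' Hypothetical stand-in for hamlet.xml with just two players.
        Dim xdoc As XDocument = XDocument.Parse(
            "<PLAY><PERSONAE>" &
            "<PERSONA>CLAUDIUS, king of Denmark.</PERSONA>" &
            "<PERSONA>HAMLET, son to the late, and nephew to the present king.</PERSONA>" &
            "</PERSONAE></PLAY>")

        Dim personae As XElement = xdoc.Element("PLAY").Element("PERSONAE")

        ' Add appends the new element after the existing children.
        Dim xe As New XElement("PERSONA", "Foo deBar, King of Denmark")
        personae.Add(xe)

        ' AddFirst inserts the new element before the existing children.
        personae.AddFirst(New XElement("PERSONA", "Baz deQux, a courtier"))

        Console.WriteLine("{0} Players Found", personae.Elements("PERSONA").Count())
        For Each persona As XElement In personae.Elements("PERSONA")
            Console.WriteLine(persona.Value)
        Next
    End Sub
End Module
```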

Output for the Transformation example

Note

Don't confuse displaying this HTML file with ASP.NET. Displaying an HTML file in this manner takes place on a single machine without the involvement of a Web server.

As demonstrated, the backbone of the System.Xml.Xsl namespace is the XslCompiledTransform class. This class uses XSLT files to transform XML documents. XslCompiledTransform exposes the following methods and properties:

XmlResolver—This get/set property is used to specify a class (abstract base class, XmlResolver) that is used to handle external references (import and include elements within the style sheet). These external references are encountered when a document is transformed (the method, Transform, is executed). The System.Xml namespace contains a class, XmlUrlResolver, which is derived from XmlResolver. The XmlUrlResolver class resolves the external resource based on a URI.
Load—This overloaded method loads an XSLT style sheet to be used in transforming XML documents. It is permissible to specify the XSLT style sheet as a parameter of type XPathNavigator, filename of an XSLT file (specified as parameter type String), XmlReader, or IXPathNavigable. For each type of XSLT supported, an overloaded member is provided that enables an XmlResolver to also be specified. For example, it is possible to call Load(String, XsltSettings, XmlResolver), where String corresponds to a filename, XsltSettings is an object that contains settings to affect the transformation, and XmlResolver is an object that handles references in the style sheet of type xsl:import and xsl:include. It would also be permissible to pass in a value of Nothing for the third parameter of the Load method (so that no XmlResolver would be specified).
Transform—This overloaded method transforms a specified XML document using the previously specified XSLT style sheet. The location where the transformed XML is to be output is specified as a parameter to this method. The first parameter of each overloaded method is the XML document to be transformed. The most straightforward variant of the Transform method is Transform(String, String). In this case, a file containing an XML document is specified as the first parameter, and a filename that receives the transformed XML document is specified as the second. This is exactly how the first XSLT example utilized the Transform method:

 myXslTransform.Transform("FilmOrders.xml", destFileName)

The first parameter to the Transform method can also be specified as IXPathNavigable or XmlReader. The XML output can be sent to an object of type Stream, TextWriter, or XmlWriter. In addition, a parameter containing an object of type XsltArgumentList can be specified. An XsltArgumentList object contains a list of arguments that are used as input to the transform. These may be used within the XSLT file to affect the output.
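To illustrate the overloads described above, here is a minimal sketch that loads a style sheet from an XmlReader, passes a value through an XsltArgumentList, and sends the output to a TextWriter. The style sheet, the greeting parameter, and the one-element order document are assumptions made for this example and are not taken from the chapter's projects.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module TransformArgumentsSketch
    Sub Main()
        ' Hypothetical style sheet that declares one parameter, "greeting".
        Dim styleSheet As String =
            "<xsl:stylesheet version=""1.0""" &
            " xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" &
            "<xsl:output method=""text""/>" &
            "<xsl:param name=""greeting""/>" &
            "<xsl:template match=""/"">" &
            "<xsl:value-of select=""$greeting""/>" &
            "<xsl:text>: </xsl:text>" &
            "<xsl:value-of select=""/Order/Title""/>" &
            "</xsl:template></xsl:stylesheet>"

        ' Hypothetical source document.
        Dim sourceXml As String = "<Order><Title>Grease</Title></Order>"

        Dim xslTransform As New XslCompiledTransform()
        Using styleReader As XmlReader = XmlReader.Create(New StringReader(styleSheet))
            ' Nothing is passed for the XmlResolver, so no resolver is
            ' specified for xsl:import and xsl:include references.
            xslTransform.Load(styleReader, XsltSettings.Default, Nothing)
        End Using

        ' Supply a value for the style sheet's "greeting" parameter.
        Dim args As New XsltArgumentList()
        args.AddParam("greeting", "", "Order for")

        Using sourceReader As XmlReader = XmlReader.Create(New StringReader(sourceXml)),
              writer As New StringWriter()
            xslTransform.Transform(sourceReader, args, writer)
            Console.WriteLine(writer.ToString())
        End Using
    End Sub
End Module
```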

XSLT Transforming between XML Standards

The first example used four XSLT elements to transform an XML file into an HTML file. Such an example has merit, but it doesn't demonstrate an important use of XSLT: transforming XML from one standard into another standard. This may involve renaming elements/attributes, excluding elements/attributes, changing data types, altering the node hierarchy, and representing elements as attributes, and vice versa.

Returning to the example, a case of differing XML standards could easily affect your software that automates movie orders coming into a supplier. Imagine that the software, including its XML representation of a movie order, is so successful that you sell 100,000 copies. However, just as you are celebrating, a consortium of the largest movie supplier chains announces that they are no longer accepting faxed orders and that they are introducing their own standard for the exchange of movie orders between movie sellers and buyers.

Rather than panic, you simply ship an upgrade that includes an XSLT file. This upgrade (a bit of extra code plus the XSLT file) transforms your XML representation of a movie order into the XML representation dictated by the consortium of movie suppliers. Using an XSLT file enables you to ship the upgrade immediately. If the consortium of movie suppliers revises their XML representation, then you are not obliged to change your source code. Instead, you can simply ship the upgraded XSLT file that ensures each movie order document is compliant.

This new example can be found in the Transformation2 project. This project includes the MovieOrdersOriginal.xml file, which is no different than the Filmorama.xml file used in the previous example. This document represents the original source file.

The project also includes the ConvertLegacyToNewStandard.xslt file. This file is the XSLT transform that is responsible for transforming the source file into the new format as follows (code file: ConvertLegacyToNewStandard.xslt):

In the previous snippet of XSLT, the following XSLT elements are used to facilitate the transformation:

Several new XSLT terms have crept into your vocabulary: element, attribute, and for-each. Using the element node in an XSLT places an element in the destination XML document, while an attribute node places an attribute in the destination XML document. The for-each element iterates over all of the specified elements in the document.
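The three XSLT elements can be seen working together in the sketch below. It does not reproduce ConvertLegacyToNewStandard.xslt; the element and attribute names in both the source document and the style sheet are invented for illustration.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module ElementAttributeSketch
    Sub Main()
        ' Hypothetical style sheet: for-each visits every <Order>, element
        ' creates <Movie>, and attribute turns child elements into attributes.
        Dim styleSheet As String =
            "<xsl:stylesheet version=""1.0""" &
            " xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" &
            "<xsl:template match=""/"">" &
            "<xsl:element name=""MovieOrders"">" &
            "<xsl:for-each select=""Orders/Order"">" &
            "<xsl:element name=""Movie"">" &
            "<xsl:attribute name=""name"">" &
            "<xsl:value-of select=""Title""/></xsl:attribute>" &
            "<xsl:attribute name=""quantity"">" &
            "<xsl:value-of select=""Quantity""/></xsl:attribute>" &
            "</xsl:element>" &
            "</xsl:for-each>" &
            "</xsl:element>" &
            "</xsl:template></xsl:stylesheet>"

        ' Hypothetical legacy document that stores values as child elements.
        Dim sourceXml As String =
            "<Orders>" &
            "<Order><Title>Grease</Title><Quantity>10</Quantity></Order>" &
            "<Order><Title>Lawrence of Arabia</Title><Quantity>5</Quantity></Order>" &
            "</Orders>"

        Dim xslTransform As New XslCompiledTransform()
        Using styleReader As XmlReader = XmlReader.Create(New StringReader(styleSheet))
            xslTransform.Load(styleReader)
        End Using

        Using sourceReader As XmlReader = XmlReader.Create(New StringReader(sourceXml)),
              writer As New StringWriter()
            ' No XsltArgumentList is needed, so Nothing is passed.
            xslTransform.Transform(sourceReader, Nothing, writer)
            Console.WriteLine(writer.ToString())
        End Using
    End Sub
End Module
```

In the output, each <Order> with <Title> and <Quantity> child elements becomes a <Movie> element carrying name and quantity attributes, which is the "representing elements as attributes" case mentioned earlier.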

Now that you have and understand the XSLT, here is the code to perform the actual transform (code file: Main.vb):

Recall that the input XML document (MovieOrdersOriginal.xml) does not match the format required by your consortium of movie supplier chains. The content of this source XML file is as follows:

The format exhibited in the preceding XML document does not match the format of the consortium of movie supplier chains. To be accepted by the collective of suppliers, you must transform the document as follows:

Many of the steps performed by the transform could have been achieved using an alternative technology. For example, you could have used Source Code Style attributes with your serialization to generate the correct XML attribute and XML element name. Had you known in advance that a consortium of suppliers was going to develop a standard, you could have written your classes to be serialized based on the standard. The point is that you did not know, and now one standard (your legacy standard) has to be converted into a newly adopted standard of the movie suppliers' consortium. The worst thing you could do would be to change your working code and then force all users working with the application to upgrade. It is vastly simpler to add an extra transformation step to address the new standard.

The file produced by this example looks like this (code file: MovieOrdersModified.xml):

The preceding example spans several pages but contains just a few lines of code. This demonstrates that there is more to XML than learning how to use it in Visual Basic and the .NET Framework. Among other things, you also need a good understanding of XSLT, XPath, and XQuery. For more details on these standards, see Professional XML from Wrox.

Other Classes and Interfaces in System.Xml.Xsl

You just took a good look at XSLT and the System.Xml.Xsl namespace, but there is a lot more to it than that. Other classes and interfaces exposed by the System.Xml.Xsl namespace include the following:

Output for the XmlWeb example

Besides working from static XML files such as the Painters.xml file, the XmlDataSource control can work from dynamic, URL-accessible XML files. One XML format pervasive on the Internet today is the feed published by blogs, or weblogs. Blogs can be viewed in the browser (see the accompanying figure), through an RSS aggregator, or as pure XML.

Example output from asp.net weblog

Now that you know the location of the XML from the blog, you can use this XML with the XmlDataSource control and display some of the results in a DataList control. The code for this example, from the ViewRss project, is shown here (code file: Default.aspx):

<%@ Page Language="vb" AutoEventWireup="false"
    CodeBehind="Default.aspx.vb" Inherits="ViewingRss._Default" %>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>Viewing RSS</title>
</head>
<body>
    <form id="form1" runat="server">
    <div>
        <asp:DataList ID="RssList" runat="server"
            DataSourceID="RssData">
            <HeaderTemplate>
                <table border="1" cellpadding="3">
            </HeaderTemplate>
            <ItemTemplate>
                <tr>
                    <td>
                        <b>
                            <%# XPath("title") %></b><br />
                        <i>
                            <%# "published on " + XPath("pubDate") %></i><br />
                        <%# XPath("description").ToString().Substring(0,100) %>
                    </td>
                </tr>
            </ItemTemplate>
            <AlternatingItemTemplate>
                <tr style="background-color: #e0e0e0;">
                    <td>
                        <b>
                            <%# XPath("title") %></b><br />
                        <i>
                            <%# "published on " + XPath("pubDate") %></i><br />
                        <%# XPath("description").ToString().Substring(0,100) %>
                    </td>
                </tr>
            </AlternatingItemTemplate>
            <FooterTemplate>
                </table>
            </FooterTemplate>
        </asp:DataList>
        <asp:XmlDataSource ID="RssData" runat="server"
            DataFile="http://weblogs.asp.net/mainfeed.aspx"
            XPath="rss/channel/item" />
    </div>
    </form>
</body>
</html>

This example shows that the DataFile points to a URL where the XML is retrieved. The XPath property filters out all the <item> elements from the RSS feed. The DataList control creates an HTML table and pulls out specific data elements from the RSS feed, such as the <title>, <pubDate>, and <description> elements. To make things a little more visible, only the first 100 characters of each post are displayed.

Running this page in the browser results in something similar to what is shown in the accompanying figure.

Output for the Transformation example

This approach also works with XML Web Services, even those for which you can pass in parameters using HTTP-GET. You just set up the DataFile value in the following manner:

DataFile="http://www.someserver.com/GetWeather.asmx/ZipWeather?zipcode=63301"

The XmlDataSource Control's Namespace Problem

One big issue with using the XmlDataSource control is that when using the XPath capabilities of the control, it is unable to understand namespace-qualified XML. The XmlDataSource control chokes on any XML data that contains namespaces, so it is important to yank out any prefixes and namespaces contained in the XML.

To make this a bit easier, the XmlDataSource control includes the TransformFile attribute. This attribute takes your XSLT transform file, which can be applied to the XML pulled from the XmlDataSource control. That means you can use an XSLT file, which will transform your XML in such a way that the prefixes and namespaces are completely removed from the overall XML document. An example of this XSLT document is illustrated here:

Now, with this XSLT document in place within your application, you can use the XmlDataSource control to pull XML data and strip that data of any prefixes and namespaces:
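To show what such a namespace-stripping transform does, the sketch below applies one directly with XslCompiledTransform rather than through the XmlDataSource control. The style sheet is a common pattern for this task and is not necessarily the one shipped with the chapter; the sample feed and its dc prefix are assumptions made for illustration.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module StripNamespacesSketch
    Sub Main()
        ' Rebuild every element and attribute under its local name only,
        ' which drops the prefix and the namespace. Text nodes are copied
        ' by XSLT's built-in template rule.
        Dim styleSheet As String =
            "<xsl:stylesheet version=""1.0""" &
            " xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" &
            "<xsl:template match=""*"">" &
            "<xsl:element name=""{local-name()}"">" &
            "<xsl:apply-templates select=""@*|node()""/>" &
            "</xsl:element>" &
            "</xsl:template>" &
            "<xsl:template match=""@*"">" &
            "<xsl:attribute name=""{local-name()}"">" &
            "<xsl:value-of select="".""/></xsl:attribute>" &
            "</xsl:template>" &
            "</xsl:stylesheet>"

        ' Hypothetical namespace-qualified input.
        Dim sourceXml As String =
            "<rss xmlns:dc=""http://purl.org/dc/elements/1.1/""><channel>" &
            "<item><title>First post</title><dc:creator>Someone</dc:creator></item>" &
            "</channel></rss>"

        Dim xslTransform As New XslCompiledTransform()
        Using styleReader As XmlReader = XmlReader.Create(New StringReader(styleSheet))
            xslTransform.Load(styleReader)
        End Using

        Using sourceReader As XmlReader = XmlReader.Create(New StringReader(sourceXml)),
              writer As New StringWriter()
            xslTransform.Transform(sourceReader, Nothing, writer)
            ' <dc:creator> is written as <creator>, with no namespace declaration.
            Console.WriteLine(writer.ToString())
        End Using
    End Sub
End Module
```

One caveat with this approach: two elements that differ only by namespace end up with the same name after the transform, so it suits feeds where local names are already distinct.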

The Xml Server Control

ASP.NET has included a server control called the Xml server control since its first release. This control performs the simple operation of applying an XSLT transformation to an XML document. The control is easy to use: you point to the XML file you wish to transform using the DocumentSource attribute, and to the XSLT transform file using the TransformSource attribute. The XmlControl project contains an example to demonstrate this.

To see this in action, use the Painters.xml file shown earlier. Create your XSLT transform file, as shown in the following example (code file: painters.xslt):

With the XML document and the XSLT document in place, the final step is to combine the two using the Xml server control provided by ASP.NET (code file: Default.aspx):


. At this link you'll find all of Shakespeare's plays as XML files.


Summary