System.Xml.XmlReader provides forward-only, read-only access to a stream of XML data. The System.Xml.XmlReader class conforms to the W3C Extensible Markup Language (XML) 1.0 and the Namespaces in XML recommendations.
The current node refers to the node on which the reader is positioned. The reader is advanced using any of the read methods, and properties reflect the value of the current node.
Although the .NET Framework includes concrete implementations of the System.Xml.XmlReader class, such as the System.Xml.XmlTextReader, System.Xml.XmlNodeReader, and System.Xml.XmlValidatingReader classes, the recommended practice as of the 2.0 release is to create System.Xml.XmlReader instances using the System.Xml.XmlReader.Create method. For more information, see Creating XML Readers.
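As a minimal sketch of this pattern, the following creates a reader with XmlReader.Create and advances it with Read, recording the type and name of the current node at each step. The names ReaderWalk and DescribeNodes and the choice to ignore whitespace are illustrative assumptions, not part of the API.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderWalk
{
    // Returns one "NodeType:Name" entry for each node the reader visits.
    public static List<string> DescribeNodes(string xml)
    {
        var nodes = new List<string>();
        var settings = new XmlReaderSettings { IgnoreWhitespace = true };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Each call to Read moves the reader to the next node;
            // NodeType and Name then describe that current node.
            while (reader.Read())
            {
                nodes.Add(reader.NodeType + ":" + reader.Name);
            }
        }

        return nodes;
    }
}
```

Calling ReaderWalk.DescribeNodes("<a><b>x</b></a>") visits five nodes: two start elements, one text node (whose Name is empty), and two end elements.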
System.Xml.XmlReader throws a System.Xml.XmlException on XML parse errors. After an exception is thrown, the state of the reader is not predictable. For example, the reported node type may be different from the actual node type of the current node. Use the XmlReader.ReadState property to check whether the reader is in an error state.
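The following sketch shows one way to apply that check: it reads a document to the end, catches any System.Xml.XmlException, and returns the reader's ReadState so the caller can tell a clean finish from a parse failure. The names ParseErrorCheck and ReadToEnd are hypothetical, and swallowing the exception is only for brevity.

```csharp
using System.IO;
using System.Xml;

public static class ParseErrorCheck
{
    // Reads the whole document and reports the reader's final state.
    public static ReadState ReadToEnd(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            try
            {
                while (reader.Read())
                {
                    // Node processing would go here.
                }
            }
            catch (XmlException)
            {
                // A real application would log or rethrow; after this point
                // the reader's node properties should not be trusted.
            }

            return reader.ReadState;
        }
    }
}
```

With well-formed input the returned state is ReadState.EndOfFile; with input such as "<a><b></a>" it is ReadState.Error.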
For further discussion on the System.Xml.XmlReader class, see Reading XML with the XmlReader.
The following methods can be used with asynchronous method calls:
Some synchronous methods have asynchronous counterparts that include "Async" at the end of the method name. For example, the asynchronous equivalents for the ReadContentAsXxx and ReadElementContentAsXxx methods are:
XmlReader.ReadContentAsObjectAsync and XmlReader.ReadElementContentAsObjectAsync
XmlReader.ReadContentAsStringAsync and XmlReader.ReadElementContentAsStringAsync
XmlReader.ReadContentAsAsync(Type, IXmlNamespaceResolver) and XmlReader.ReadElementContentAsAsync(Type, IXmlNamespaceResolver)
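A minimal sketch of calling one of these asynchronous counterparts follows. It assumes the reader is created with XmlReaderSettings.Async set to true, which the asynchronous methods require; the names AsyncContentRead and ReadRootTextAsync are illustrative only.

```csharp
using System.IO;
using System.Threading.Tasks;
using System.Xml;

public static class AsyncContentRead
{
    // Returns the text content of the root element, read asynchronously.
    public static async Task<string> ReadRootTextAsync(string xml)
    {
        // Asynchronous methods throw unless Async is enabled on the settings.
        var settings = new XmlReaderSettings { Async = true };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Skip any prolog and position the reader on the root element.
            await reader.MoveToContentAsync();
            return await reader.ReadElementContentAsStringAsync();
        }
    }
}
```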
The following sections describe asynchronous usage for methods that don't have asynchronous counterparts.
Example
public static async Task ReadStartElementAsync(this XmlReader reader, string localname, string ns)
{
    if (await reader.MoveToContentAsync() != XmlNodeType.Element)
    {
        throw new InvalidOperationException(
            reader.NodeType.ToString() + " is an invalid XmlNodeType");
    }
    if (reader.LocalName == localname && reader.NamespaceURI == ns)
    {
        await reader.ReadAsync();
    }
    else
    {
        throw new InvalidOperationException("localName or namespace doesn’t match");
    }
}
Extension function:
Example
public static async Task ReadEndElementAsync(this XmlReader reader)
{
    if (await reader.MoveToContentAsync() != XmlNodeType.EndElement)
    {
        throw new InvalidOperationException();
    }
    await reader.ReadAsync();
}
Example
public static async Task<bool> ReadToNextSiblingAsync(this XmlReader reader, string localName, string namespaceURI)
{
    if (localName == null || localName.Length == 0)
    {
        throw new ArgumentException("localName is empty or null");
    }
    if (namespaceURI == null)
    {
        throw new ArgumentNullException("namespaceURI");
    }

    // atomize local name and namespace
    localName = reader.NameTable.Add(localName);
    namespaceURI = reader.NameTable.Add(namespaceURI);

    // find the next sibling
    XmlNodeType nt;
    do
    {
        await reader.SkipAsync();
        if (reader.ReadState != ReadState.Interactive)
            break;
        nt = reader.NodeType;
        if (nt == XmlNodeType.Element &&
            ((object)localName == (object)reader.LocalName) &&
            ((object)namespaceURI == (object)reader.NamespaceURI))
        {
            return true;
        }
    } while (nt != XmlNodeType.EndElement && !reader.EOF);

    return false;
}
Example
public static async Task<bool> ReadToFollowingAsync(this XmlReader reader, string localName, string namespaceURI)
{
    if (localName == null || localName.Length == 0)
    {
        throw new ArgumentException("localName is empty or null");
    }
    if (namespaceURI == null)
    {
        throw new ArgumentNullException("namespaceURI");
    }

    // atomize local name and namespace
    localName = reader.NameTable.Add(localName);
    namespaceURI = reader.NameTable.Add(namespaceURI);

    // find element with that name
    while (await reader.ReadAsync())
    {
        if (reader.NodeType == XmlNodeType.Element &&
            ((object)localName == (object)reader.LocalName) &&
            ((object)namespaceURI == (object)reader.NamespaceURI))
        {
            return true;
        }
    }

    return false;
}
Example
public static async Task<bool> ReadToDescendantAsync(this XmlReader reader, string localName, string namespaceURI)
{
    if (localName == null || localName.Length == 0)
    {
        throw new ArgumentException("localName is empty or null");
    }
    if (namespaceURI == null)
    {
        throw new ArgumentNullException("namespaceURI");
    }

    // save the element or root depth
    int parentDepth = reader.Depth;
    if (reader.NodeType != XmlNodeType.Element)
    {
        // adjust the depth if we are on root node
        if (reader.ReadState == ReadState.Initial)
        {
            parentDepth--;
        }
        else
        {
            return false;
        }
    }
    else if (reader.IsEmptyElement)
    {
        return false;
    }

    // atomize local name and namespace
    localName = reader.NameTable.Add(localName);
    namespaceURI = reader.NameTable.Add(namespaceURI);

    // find the descendant
    while (await reader.ReadAsync() && reader.Depth > parentDepth)
    {
        if (reader.NodeType == XmlNodeType.Element &&
            ((object)localName == (object)reader.LocalName) &&
            ((object)namespaceURI == (object)reader.NamespaceURI))
        {
            return true;
        }
    }

    return false;
}
The following items are things to consider when working with the System.Xml.XmlReader class.
Exceptions thrown from the System.Xml.XmlReader can disclose path information that you do not want bubbled up to the application. Your applications must catch exceptions and process them appropriately.
Do not enable DTD processing if you are concerned about denial of service issues or if you are dealing with untrusted sources. DTD processing is disabled by default for System.Xml.XmlReader objects created by the System.Xml.XmlReader.Create method.
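The sketch below makes that default explicit by setting XmlReaderSettings.DtdProcessing to DtdProcessing.Prohibit, so a document that carries a DOCTYPE declaration is rejected with a System.Xml.XmlException. The names DtdPolicy and IsRejected are hypothetical helpers introduced for illustration.

```csharp
using System.IO;
using System.Xml;

public static class DtdPolicy
{
    // Returns true when the reader refuses the document, false when it reads cleanly.
    public static bool IsRejected(string xml)
    {
        // Prohibit is already the default for XmlReader.Create; stating it
        // explicitly documents the intent for untrusted input.
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Prohibit };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            try
            {
                while (reader.Read())
                {
                }
                return false;
            }
            catch (XmlException)
            {
                return true;
            }
        }
    }
}
```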
If you have DTD processing enabled, you can use the System.Xml.XmlSecureResolver to restrict the resources that the System.Xml.XmlReader can access. You can also design your application so that the XML processing is memory and time constrained. For example, configure time-out limits in your ASP.NET application.
XML data can include references to external resources such as a schema file. By default, external resources are resolved using a System.Xml.XmlUrlResolver object with no user credentials. You can secure this further by doing one of the following:
The System.Xml.Schema.XmlSchemaValidationFlags.ProcessInlineSchema and System.Xml.Schema.XmlSchemaValidationFlags.ProcessSchemaLocation validation flags of a System.Xml.XmlReaderSettings object are not set by default. This helps to protect the System.Xml.XmlReader against schema-based attacks when it is processing XML data from an untrusted source. When these flags are set, the XmlReaderSettings.XmlResolver of the System.Xml.XmlReaderSettings object is used to resolve schema locations encountered in the instance document in the System.Xml.XmlReader. If the XmlReaderSettings.XmlResolver property is set to null, schema locations are not resolved even if the System.Xml.Schema.XmlSchemaValidationFlags.ProcessInlineSchema and System.Xml.Schema.XmlSchemaValidationFlags.ProcessSchemaLocation validation flags are set.
Schemas added during validation add new types and can change the validation outcome of the document being validated. As a result, external schemas should only be resolved from trusted sources.
We recommend disabling the System.Xml.Schema.XmlSchemaValidationFlags.ProcessIdentityConstraints flag (enabled by default) when validating untrusted, large XML documents in high-availability scenarios against a schema with identity constraints over a large part of the document.
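One way these schema-related recommendations might be combined is sketched below: the settings enable schema validation, clear the identity-constraint flag, keep the inline-schema and schema-location flags cleared, and set the resolver to null. The names UntrustedValidationSettings and Create are assumptions made for this example; a caller would still need to add its own trusted schemas to the Schemas collection.

```csharp
using System.Xml;
using System.Xml.Schema;

public static class UntrustedValidationSettings
{
    // Builds validation settings intended for untrusted instance documents.
    public static XmlReaderSettings Create()
    {
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };

        // Identity constraints are processed by default; turn them off.
        settings.ValidationFlags &= ~XmlSchemaValidationFlags.ProcessIdentityConstraints;

        // These two flags are not set by default; clearing them again guards
        // against settings objects that were modified elsewhere.
        settings.ValidationFlags &= ~(XmlSchemaValidationFlags.ProcessInlineSchema
                                      | XmlSchemaValidationFlags.ProcessSchemaLocation);

        // With a null resolver, schema locations in the document are not resolved.
        settings.XmlResolver = null;

        return settings;
    }
}
```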
XML data can contain a large number of attributes, namespace declarations, nested elements and so on that require a substantial amount of time to process. To limit the size of the input that is sent to the System.Xml.XmlReader, you can:
The XmlReader.ReadValueChunk(Char[], int, int) method can be used to handle large streams of data. This method reads a small number of characters at a time instead of allocating a single string for the whole value.
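A minimal sketch of chunked reading follows. It copies the first text node of a document to a TextWriter through a small fixed-size buffer, so the full value is never materialized as one string by the reader. The names ChunkedValueRead and CopyFirstText and the chunkSize parameter are illustrative assumptions.

```csharp
using System.IO;
using System.Xml;

public static class ChunkedValueRead
{
    // Copies the first text node to 'output' in chunks; returns the character count.
    public static int CopyFirstText(string xml, TextWriter output, int chunkSize)
    {
        int total = 0;

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                // Not every reader implementation supports ReadValueChunk.
                if (reader.NodeType == XmlNodeType.Text && reader.CanReadValueChunk)
                {
                    char[] buffer = new char[chunkSize];
                    int read;
                    // ReadValueChunk returns 0 once the value is exhausted.
                    while ((read = reader.ReadValueChunk(buffer, 0, buffer.Length)) > 0)
                    {
                        output.Write(buffer, 0, read);
                        total += read;
                    }
                    break;
                }
            }
        }

        return total;
    }
}
```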
When reading an XML document with a large number of unique local names, namespaces, or prefixes, a problem can occur. If you are using a class that derives from System.Xml.XmlReader, and you call the XmlReader.LocalName, XmlReader.Prefix, or XmlReader.NamespaceURI property for each item, the returned string is added to a System.Xml.NameTable. The collection held by the System.Xml.NameTable never decreases in size, creating a virtual "memory leak" of the string handles. One mitigation for this is to derive from the System.Xml.NameTable class and enforce a maximum size quota. (There is no way to prevent the use of a System.Xml.NameTable, or to switch the System.Xml.NameTable when it is full.) Another mitigation is to avoid using the properties mentioned and instead use the System.Xml.XmlReader.MoveToAttribute method with the System.Xml.XmlReader.IsStartElement method where possible; those methods do not return strings and thus avoid the problem of overfilling the System.Xml.NameTable collection.
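The first mitigation could be sketched as follows: a class derived from System.Xml.NameTable that counts newly added names and throws once a quota is reached. The class name BoundedNameTable, the Count property, and the choice of InvalidOperationException are all assumptions made for illustration; the framework does not prescribe them.

```csharp
using System;
using System.Xml;

public sealed class BoundedNameTable : NameTable
{
    private readonly int _maxEntries;
    private int _count;

    public BoundedNameTable(int maxEntries)
    {
        _maxEntries = maxEntries;
    }

    // Number of distinct names added through this table so far.
    public int Count
    {
        get { return _count; }
    }

    public override string Add(string key)
    {
        // Names that are already atomized do not consume quota.
        string existing = Get(key);
        if (existing != null)
        {
            return existing;
        }
        Reserve();
        return base.Add(key);
    }

    public override string Add(char[] key, int start, int len)
    {
        string existing = Get(key, start, len);
        if (existing != null)
        {
            return existing;
        }
        Reserve();
        return base.Add(key, start, len);
    }

    private void Reserve()
    {
        if (_count >= _maxEntries)
        {
            throw new InvalidOperationException("Name table quota exceeded.");
        }
        _count++;
    }
}
```

Such a table can be supplied through the XmlReaderSettings.NameTable property before calling XmlReader.Create. The reader may add a few names of its own, so the quota should leave some headroom.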
System.Xml.XmlReaderSettings objects can contain sensitive information such as user credentials. An untrusted component could use the System.Xml.XmlReaderSettings object and its user credentials to create System.Xml.XmlReader objects to read data. You should be careful when caching System.Xml.XmlReaderSettings objects, or when passing the System.Xml.XmlReaderSettings object from one component to another.
Do not accept supporting components, such as System.Xml.NameTable, System.Xml.XmlNamespaceManager, and System.Xml.XmlResolver objects, from an untrusted source.