* XML Classes

** Abstract

The XML library is used by several areas of Mono, such as ADO.NET and XML Digital Signature (xmldsig). Here I write about System.Xml.dll and related tools. This page does not cover classes that live in other assemblies, such as XmlDataDocument.

Note that the current corlib has its own XML parser class (Mono.Xml.MiniParser).

System.XML.dll is basically feature-complete, so I write this document mainly to collect bugs and improvement hints.

** System.Xml namespace

*** Document Object Model (Core)

The DOM implementation is finished, and it scores better than MS.NET on the NIST DOM tests (they were ported by Mainsoft hackers and are part of our unit tests).

*** Xml Writer

Here XmlWriter is almost equivalent to XmlTextWriter. If you want to see other implementations, check XmlNodeWriter.cs and DTMXPathDocumentWriter.cs in the System.XML sources.

XmlTextWriter is complete, though it looks a bit slower than MS.NET (I tried 1.1).

*** XmlResolver

Currently XmlTextReader uses the XmlResolver that was specified. If none was supplied, it uses XmlUrlResolver. XmlResolver is used when parsing external DTDs, importing XSL stylesheets and schemas, etc.

XmlSecureResolver, which was introduced in MS .NET Framework 1.1, is basically implemented, but it requires the CAS (code access security) feature. We need to fix up this class once the ongoing CAS effort is working.

You might also be interested in an improved XmlCachingResolver by Ben Maurer. If even a one-time download is not acceptable, you can use this one.

*** XmlNameTable

NameTable itself is implemented. It should actually be used in several classes. It pays off when both of the names being compared are in the table: they can then be compared simply with ReferenceEquals(). We have done this where it seems possible, e.g. in XmlNamespaceManager (in the .NET 2.0 methods; if the build is not NET_2_0, they are used internally).

NameTable also needs performance improvement. Optimization hacks are welcome.
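As a minimal sketch of the ReferenceEquals() idea described above (not code taken from the Mono sources): once two names have been atomized through the same NameTable, they are the same string object and can be compared by reference. The class name NameTableSketch and the helper SameName are hypothetical and exist only for this illustration; NameTable.Add() and NameTable.Get() are the standard System.Xml calls.

```csharp
using System;
using System.Xml;

public static class NameTableSketch
{
    // Hypothetical helper: atomizes both names in 'table' and then
    // compares them by reference instead of character by character.
    public static bool SameName(XmlNameTable table, string x, string y)
    {
        return Object.ReferenceEquals(table.Add(x), table.Add(y));
    }

    public static void Main()
    {
        NameTable table = new NameTable();

        // Built at run time, so it is a distinct object from the literal "item".
        string fromXml = new string(new char[] { 'i', 't', 'e', 'm' });

        // Add() returns the atomized instance stored in the table.
        string atom = table.Add("item");

        Console.WriteLine(Object.ReferenceEquals(atom, fromXml));            // False
        Console.WriteLine(Object.ReferenceEquals(atom, table.Add(fromXml))); // True

        // Get() returns null for a name that was never added.
        Console.WriteLine(table.Get("missing") == null);                     // True
    }
}
```

The reference comparison is only valid when both strings came out of the same table, which is why the text above restricts it to the case where both names are already in the table.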
*** Xml Stream Reader

When we are using an ASCII document, we don't care which encoding is in use. However, XmlTextReader must be aware of the encoding specified in the XML declaration. So we have an internal XmlStreamReader class (and, currently, an XmlInputStream class; the latter may disappear, since XmlStreamReader is enough to handle this problem).

There used to be some problems in these classes when reading from a network stream (especially on Linux). These might already be fixed by some network stream bugfixes.

*** XML Reader

XmlTextReader, XmlNodeReader and XmlValidatingReader are almost finished.

XmlTextReader and XmlValidatingReader should be faster than they are now. Currently XmlTextReader looks nearly twice as slow as MS.NET, and XmlValidatingReader (which uses this slow XmlTextReader) looks nearly three times slower. (Note that XmlValidatingReader is not slow in itself. It uses the schema validating reader and the DTD validating reader.)

**** Some Advantages

The design of Mono's XmlValidatingReader is radically different from that of Microsoft's implementation. Under MS.NET, the DTD content validation engine in fact simply reuses the XML Schema validation engine. Mono's DTD validation is designed to be fully separate and validates the way a normal XML parser does. For example, Mono allows non-deterministic DTDs.

Another advantage of this XmlValidatingReader is its support for *any* XmlReader. Microsoft supports only XmlTextReader (this bug will be fixed in VS 2005, in the shape of XmlFactory).

I added an extra support interface named "IHasXmlParserContext", which is consulted in XmlValidatingReader.ResolveEntity(). It is now an internal interface. Microsoft's XmlReader design does not allow an XmlReader to be subtree-pluggable (i.e. to wrap another XmlReader), because an XmlParserContext should be supplied for DTD information support (without it, e.g. entity references cannot be expanded) and for the namespace manager.
(In .NET 2.0, Microsoft also added something similar to IHasXmlParserContext, named IXmlNamespaceResolver, but it still does not provide DTD information.)

We also have a RELAX NG validating reader. See mcs/class/Commons.Xml.Relaxng.

** System.Xml.Schema

*** Summary

Basically it is completed. We can compile complex and simple types, refer to external schemas, extend or restrict other types, and use substitution groups.

You can test how complete (or incomplete) the current schema validation engine is by using the standalone test module (see mcs/class/System.XML/Test/System.Xml.Schema/standalone_tests). At least on my box, msxsdtest fails only 30 cases with the bug-fixed catalog; this score is better than that of the Microsoft implementation.

*** Schema Object Model

Completed, except for some things to be fixed.

*** Validating Reader

The XML Schema validation feature is (currently) implemented in Mono.Xml.Schema.XsdValidatingReader, which is used internally by XmlValidatingReader.

Basically this is implemented and its feature set is almost complete, but I have only tested the validation feature itself. So we have to write more tests for properties, methods, and events (validation errors).

** System.Xml.Serialization

Lluis rules ;-)

Well, in fact XmlSerializer is almost finished and is in the bugfix phase.

However, we appreciate more tests. Please try it, and if you find any problems, please file them in Bugzilla.

Lluis also built an interesting standalone test system, placed under mcs/class/System.Web.Services/Test/standalone.

You might also be interested in genxs, which enables you to create a custom XML serializer. This is not included in Microsoft .NET. See the manpages for details. The code files are in mcs/tools/genxs.

** System.Xml.XPath and System.Xml.Xsl

There are two XSLT implementations. One, the historical implementation, is based on libxslt (aka Unmanaged XSLT). Now we use the fully implemented managed XSLT by default.
To use Unmanaged XSLT, set the MONO_UNMANAGED_XSLT environment variable (any value is acceptable).

As for Managed XSLT, we support msxsl:script.

It would be nice if we could support EXSLT. Microsoft has tried to implement some of it, but the code is not good, since it depends on internal concrete derivatives of the XPathNodeIterator classes. In general, .NET's "extension objects" (including msxsl:script) are not useful for returning node-sets (the MS XSLT implementation rejects a merely overridden XPathNodeIterator and accepts only its own hidden classes; the same holds in Mono, though the classes are different), so if we support EXSLT, it has to be done inside our System.XML.dll. Volunteers are welcome.

Our managed XSLT implementation is slower than MS XSLT for some kinds of stylesheets, and faster for others.

** System.Xml and ADO.NET v2.0

Microsoft released the second beta version of .NET Framework 2.0 with the Visual Studio 2005 alpha version. They are only available as MSDN _subscriber_ downloads (i.e. they are not publicly downloadable yet). The release contains several new classes.

There are two assemblies related to System.Xml v2.0: System.Xml.dll and System.Data.SqlXml.dll (here I treat sqlxml.dll as part of System.Xml v2.0, but note that it is also one of the ADO.NET 2.0 features). There are several namespaces, such as MS.Internal.Xml and System.Xml. Note that this .NET Framework is a pre-release version, so they are subject to change.

System.Xml 2.0 contains several new features; some of them are described in the sections below.

Tim Coleman started the ADO.NET 2.0 related work. Currently I have no plan to implement the System.Xml v2.0 classes and won't touch them immediately, but I will start in some months. If any of you want to try this frontier, we welcome your effort.

*** New XPathNavigator

The System.Xml v2.0 implementation will start from the new XPathDocument and XPathNavigator implementations (they are called XPathDocument2 and XPathNavigator2, and they are very different from the existing ones).
First, the document structure and the basic navigation features will be implemented. Next, the XPath2 engine should be implemented (XPathNavigator2 looks very different from XPathNavigator).

There are some trivial tasks, such as schema validation (we have XPathDocumentReader, which just wraps XPathNavigator, and our XmlValidatingReader can accept any XmlReader).

*** XML Query

XML Query is a newcomer among XML data manipulation languages (well, at least a newcomer in the .NET world). It is similar to SQL, but intended to manipulate and support XML. It is similar to XSLT, but extended to support new features such as XML Schema based datatypes.

The XML Query implementation can be found mainly in the System.Xml.Query and MS.Internal.Xml.Query namespaces. Note that they are in System.Data.SqlXml.dll.

The MSDN documentation says that there are two kinds of API for XML Query: the High Level API and the Low Level API. At the time of this beta version, the Low Level API is described as not yet released (though it may be the MS.Internal.Xml.* classes). However, the High Level API will have to be implemented on top of the Low Level API. The MS.Internal.Xml related code seems to have interesting class structures, so it would be nice to start learning about them (and I will).

There also seem to be IL generator classes, but it might be difficult to start from them.

*** System.Data.Mapping

System.Data.Mapping and System.Data.Mapping.RelationalSchema are the namespaces for mapping support between databases and XML. This is at the stubbing phase (incomplete as yet).

*** XmlAdapter

XmlAdapter is used to support XML based query and update using the (new) XPathDocument and XPathNavigator. This class is designed to bring ADO.NET and System.Xml together. It connects to databases and queries data in XML shape into an XPathDocument, using the mapping schema above. This can only be done after several classes, such as XPathDocument and MappingSchema, are in place.

** Miscellaneous Class Libraries

*** RELAX NG

I implemented an experimental RelaxngValidatingReader.
It is still not complete; for example, some of the simplification steps (see RELAX NG spec chapter 4; especially 4.17-19) and some constraints (especially 7.3) are missing. See mcs/class/Commons.Xml.Relaxng/README for details.

It supports custom datatype handling. Right now, you can use XML Schema datatypes ( http://www.w3.org/2001/XMLSchema-datatypes ) as well as the RELAX NG default datatypes (as used in relaxng.rng).

Commons.Xml.Relaxng.dll also contains RELAX NG Compact Syntax support. See the Commons.Xml.Relaxng.Rnc.RncParser class.

I am planning improvements (friendlier error messages, and even object mapping), but they won't come true until the Mono 1.0 release.

** Tools

*** xsd.exe

See the ADO.NET page.

Microsoft has another class that infers an XmlSchemaCollection from an XmlReader (Microsoft.XsdInference). It may be useful, but it won't be so easy.

** Miscellaneous

*** Mutual assembly dependency

Sometimes I hear complaints about the mutual dependency between System.dll and System.Xml.dll: System.dll references System.Xml.dll (e.g. System.Configuration.ConfigXmlDocument extends XmlDocument), while System.Xml.dll references System.dll (e.g. XmlUrlResolver.ResolveUri takes a System.Uri). Since these types appear in public method signatures, we cannot get rid of these mutual references.

Nowadays System.Xml.dll is built against an incomplete System.dll (lacking System.Xml dependent classes such as ConfigXmlDocument). The full System.dll is built after System.Xml.dll is done.

Note that you still need System.dll to run mcs.