** Abstract
- XML library is used by several field of Mono such as ADO.NET and XML
- Digital Signature (xmldsig). Here I write about System.Xml.dll and
- related tools. This page won't include any classes which are in other
+ The XML library is used by several areas of Mono, such as ADO.NET and XML
+ Digital Signature (xmldsig). Here I write about System.Xml.dll and
+ related tools. This page does not cover classes that live in other
assemblies such as XmlDataDocument.
- Note that current corlib has its own XML parser class named Mono.Xml.MiniParser.
+ Note that the current corlib has its own XML parser class (Mono.Xml.MiniParser).
- Basically System.XML.dll feature has finished, or almost finished, so
- I write this page mainly for bugs and improvement hints.
+ The System.XML.dll features are almost finished, so this document is
+ mainly about bugs and improvement hints.
** System.Xml namespace
*** Document Object Model (Core)
- DOM feature has already implemented. There is still missing feature.
-
- <ul>
- * ID constraint support is problematic because W3C DOM does not specify
- handling of ID attributes into non-adapted element. (MS.NET also
- looks incomplete in this area).
-
- * I think, event feature is not fully tested. There are no concrete
- desctiption on which events are risen, so we have to do some
- experiment on MS.NET.
- </ul>
+ The DOM implementation is finished, and it scores better than MS.NET on
+ the NIST DOM tests (which were ported by Mainsoft hackers and are included
+ in our unit tests).
*** Xml Writer
Here XmlWriter almost equals to XmlTextWriter. If you want to see
- another implementation, check XmlNodeWriter.cs used in monodoc.
+ another implementation, check XmlNodeWriter.cs and DTMXPathDocumentWriter.cs
+ in the System.XML sources.
- XmlTextWriter is completed. However, it looks nearly twice as slow as
- MS.NET (I tried 1.1)
+ XmlTextWriter is complete, though it looks a bit slower than MS.NET (I
+ tried 1.1).
*** XmlResolver
then it uses XmlUrlResolver. XmlResolver is used to parse external DTD,
importing XSL stylesheets and schemas etc.
- However, XmlUrlResolver is still buggy (mainly because System.Uri is also
- incomplete yet) and this results in several loading error.
-
XmlSecureResolver, which was introduced in MS .NET Framework 1.1, is basically
implemented, but it requires the CAS (code access security) feature. We need to
fix up this class once the ongoing CAS effort is done.
+ You might also be interested in an improved <a href="http://codeblogs.ximian.com/blogs/benm/archives/000039.html">XmlCachingResolver</a> by Ben Maurer.
+ If even a one-time download is not acceptable, you can use <a href="http://primates.ximian.com/~atsushi/XmlStoredResolver.cs">this one</a>.
*** XmlNameTable
- XmlNameTable itself is implemented. However, it should be actually used in
- several classes. Currently it makes sense if compared names are both in
- the table, but if it is obvious that compared names are both in this table,
- it should be simply compared using ReferenceEquals() (if these names are
- different, the comparison is still inefficient yet).
+ NameTable itself is implemented, but it should actually be used in several
+ more classes. When both compared names are known to be in the table,
+ they should simply be compared using ReferenceEquals(). We have done this
+ where it seems possible, e.g. in XmlNamespaceManager (in .NET 2.0 methods;
+ if the build is not NET_2_0, it is used internally).
+ NameTable also needs performance improvements. Optimization hacks are
+ welcome.
*** Xml Stream Reader
XmlInputStream class. This may disappear since XmlStreamReader is enough to
handle this problem).
- However, there are some problems lies in these classes on reading network
- stream (especially on Linux). This should be fixed soon.
-
+ However, there used to be some problems in these classes when reading
+ network streams (especially on Linux). These may already be fixed by
+ some network stream bugfixes.
*** XML Reader
XmlTextReader, XmlNodeReader and XmlValidatingReader are almost finished.
- - Most of the OASIS conformance test passes as Microsoft does, but
- about W3C tests, it is not perfect.
-
- - I won't add any XDR support on XmlValidatingReader. (I haven't
- ever seen XDR used other than Microsoft's BizTalk Server 2000,
- and Now they have 2003 with XML Schema support)
+ <ul>
+ * All OASIS conformance tests pass, as they do with Microsoft's
+ implementation. Some W3C tests fail, but it looks better.
+ * Entity expansion and its well-formedness check are incomplete.
+ It incorrectly allows divided content models. It also handles
+ the Base URI incorrectly, so some DTDs fail.
+ * I won't add any XDR support to XmlValidatingReader. (I haven't
+ ever seen XDR used other than in Microsoft's BizTalk Server 2000,
+ and now they have 2002 with XML Schema support.)
+ </ul>
XmlTextReader and XmlValidatingReader should be faster than now. Currently
XmlTextReader looks nearly twice as slow as MS.NET, and XmlValidatingReader
as normal XML parser does. For example, Mono allows non-deterministic DTD.
Another advantage of this XmlValidatingReader is support for *any* XmlReader.
- Microsoft supports only XmlTextReader.
-
- I added extra support interface named "IHasXmlParserContext", which is
- considered in XmlValidatingReader.ResolveEntity(). Microsoft failed to
- design XmlReader to support pluggable use of XmlReader (i.e. wrapping use
- of other XmlReader) since XmlParserContext is required to support both
- entity resolution and namespace manager. (In .NET 1.2, Microsoft also
- supported similar to IHasXmlParserContext, named IXmlNamespaceResolver,
- but it still does not provide any DTD information.)
+ Microsoft supports only XmlTextReader (this bug will be fixed in VS 2005,
+ taking the shape of XmlFactory).
+
+ <del>I added extra support interface named "IHasXmlParserContext", which is
+ considered in XmlValidatingReader.ResolveEntity(). </del><ins>This is now
+ an internal interface.</ins> Microsoft failed to design XmlReader to be
+ subtree-pluggable (i.e. usable as a wrapper around another XmlReader),
+ since XmlParserContext must be supplied for DTD information (without it,
+ e.g., entity references cannot be expanded) and for the namespace manager.
+ (In .NET 2.0, Microsoft added something similar to IHasXmlParserContext,
+ named IXmlNamespaceResolver, but it still does not provide DTD information.)
We also have RELAX NG validating reader. See mcs/class/Commons.Xml.Relaxng.
** System.Xml.Schema
-*** Schema Object Model
+*** Summary
- Basically it is implemented. Some features still needs to fix:
+ Basically it is complete. We can compile complex and simple types, refer to
+ external schemas, extend or restrict other types, or use substitution groups.
+ You can check how complete (or incomplete) the current schema validation
+ engine is by using the standalone test module
+ (see mcs/class/System.XML/Test/System.Xml.Schema/standalone_tests).
+ At least on my box, msxsdtest fails only 30 cases with the bugfixed catalog;
+ this score is better than that of the Microsoft implementation.
- - Complete facet support. Currently some of them is missing. Recently
- David Sheldon is doing several fixes on them.
+*** Schema Object Model
- - Complete derivation by restriction (DBR) support. Especially
- substitution group won't work with it (However, I won't recommend
- both substitution group and DBR, regardless of this incompleteness.)
+ Completed, except for some things to be fixed:
- Some bugs are remaining, but as far as I tried W3C XML Schema test suite
- with bugfixes (of test suite), only 69 out of 7581 has failed. With my test
- suite fix, MS.NET failed 48 cases.
+ <ul>
+ * Complete facet support. Currently some facets are missing.
+ David Sheldon has recently been making several fixes to them.
+ * ContentTypeParticle for pointless xs:choice is incomplete.
+ (Fixing this gave rise to other bugs in compilation.
+ Interestingly, MS.NET also fails around here, so it might
+ be in the nature of the ContentTypeParticle design.)
+ * Some derivation by restriction (DBR) handling is incorrect.
+ </ul>
*** Validating Reader
XML Schema validation feature is (currently) implemented on
Mono.Xml.Schema.XsdValidatingReader, which is internally used in
- XmlValidatingReader.
-
+ XmlValidatingReader.
+
Basically this is implemented and its features are almost complete,
but I have only tested the validation feature itself. So we have to write more
tests on properties, methods, and events (validation errors).
Lluis rules ;-)
Well, in fact XmlSerializer is almost finished and is on bugfix phase.
- However, more tests are required especially schema import and export
- feature. Please try xsd.exe to create classes from schema, or schema
- from class. And if any problems were found, please file it to bugzilla.
+ However, we appreciate more tests. Please try
+
+ <ul>
+ * System.Web.Services to invoke SOAP services.
+ * xsd.exe and wsdl.exe to create classes.
+ </ul>
-** System.Xml.XPath and System.Xml.Xsl
+ If you find any problems, please file them in bugzilla.
+
+ Lluis also built an interesting standalone test system, placed under
+ mcs/class/System.Web.Services/Test/standalone.
+
+ You might also be interested in genxs, which enables you to create a custom
+ XML serializer. This is not included in Microsoft.NET.
+ See <a
+ href="http://primates.ximian.com/~lluis/blog/archives/000120.html">here</a>
+ and manpages for details. Code files are in mcs/tools/genxs.
- There are two implementations for XSLT. One (and historical) implementation
- is based on libxslt. Now we uses fully implemented managed XSLT.
- Putting aside bug fixes, we have to support:
+** System.Xml.XPath and System.Xml.Xsl
+
+ There are two XSLT implementations. The older, historical implementation is
+ based on libxslt (aka Unmanaged XSLT). Now we use a fully implemented,
+ managed XSLT by default. To use Unmanaged XSLT, set the MONO_UNMANAGED_XSLT
+ environment variable (any value is acceptable).
- - embedded script (such as VB, C#, JScript). So some packages like
- latest NAnt (for MS.NET) won't be compiled.
+ As for Managed XSLT, we support msxsl:script.
It would be nice if we can support <a href="http://www.exslt.org/">EXSLT</a>.
- <a href="http://msdn.microsoft.com/WebServices/default.aspx?pull=/library/en-us/dnexxml/html/xml05192003.asp">Microsoft has already done it</a>, but it
- is not good code since it depends on internal concrete derivatives of
- XPathNodeIterator classes. In general, .NET's "extension objects" is not
- usable to return node-sets, so if we support EXSLT, it has to be done
- internally inside our System.XML.dll. Volunteers are welcome.
+ <a href="http://msdn.microsoft.com/WebServices/default.aspx?pull=/library/en-us/dnexxml/html/xml05192003.asp">Microsoft has tried to do some of them</a>,
+ but it is not good code since it depends on internal concrete derivatives of
+ XPathNodeIterator classes.
- Our managed XSLT implementation is still inefficient. XslTransform.Load()
- and .Transform() looks three times slower (However it depends on
- XmlTextReader which is also slow, so we are starting optimization from
- that class, not XSLT itself). These number are only for specific cases,
- and there might be more critical point on XSLT engine (mainly
- XPathNodeIterator).
+ In general, .NET's "extension objects" (including msxsl:script) are not
+ useful for returning node-sets (the MS XSLT implementation rejects a plainly
+ overridden XPathNodeIterator and accepts only its own hidden classes; the
+ same holds in Mono, though the classes are different), so if we support
+ EXSLT, it has to be done inside our System.XML.dll. Volunteers are welcome.
+
+ Our managed XSLT implementation is slower than MS XSLT for some kinds of
+ stylesheets, and faster for others.
+
+
+** System.Xml and ADO.NET v2.0
+
+ Microsoft released the second beta version of .NET Framework 2.0 with
+ an alpha version of Visual Studio 2005. They are only available as an MSDN
+ _subscriber_ download (i.e. they are not publicly downloadable yet). It
+ contains several new classes.
+
+ There are two assemblies related to System.Xml v2.0: System.Xml.dll and
+ System.Data.SqlXml.dll (here I treat sqlxml.dll as part of System.Xml v2.0,
+ but note that it is also one of the ADO.NET 2.0 features). There are several
+ namespaces such as MS.Internal.Xml and System.Xml. Note that this .NET
+ Framework is a pre-release version, so these are subject to change.
+
+ System.Xml 2.0 contains several features such as:
+
+ <ul>
+ * new XPathNavigator and XPathDocument
+ * XML Query
+ * XmlAdapter
+ * XSLT IL generator (similar to Apache XSLTC), for
+ internal use
+ </ul>
+
+ Tim Coleman has started ADO.NET 2.0 related work. I have no plan to
+ implement System.Xml v2.0 classes right away and won't touch them for now,
+ but I will start in some months. If any of you wants to try this frontier,
+ we welcome your effort.
+
+*** New XPathNavigator
+
+ The System.Xml v2.0 implementation will start with the new XPathDocument and
+ XPathNavigator implementations (they are called XPathDocument2 and
+ XPathNavigator2, and they are very different from the existing ones). First,
+ the document structure and basic navigation features will be implemented.
+ Next, the XPath2 engine should be implemented (XPathNavigator2 looks very
+ different from XPathNavigator).
+
+ There are some trivial tasks such as schema validation (we have
+ <a href="http://www24.brinkster.com/ginga/XPathDocumentReader.cs.txt">
+ XPathDocumentReader</a> that just wraps XPathNavigator, and our
+ XmlValidatingReader can accept any XmlReader).
+
+*** XML Query
+
+ XML Query is a new XML data manipulation language (well, at least new
+ to the .NET world). It is similar to SQL, but intended to manipulate and to
+ support XML. It is similar to XSLT, but extended to support new features
+ such as XML Schema based datatypes.
+
+ XML Query implementation can be found mainly in System.Xml.Query and
+ MS.Internal.Xml.Query namespaces. Note that they are in
+ System.Data.SqlXml.dll.
+
+ MSDN documentation says that there are two kinds of API for XML Query: the
+ High Level API and the Low Level API. As of this beta version, the Low Level
+ API is described as not released yet (though it may be the MS.Internal.Xml.*
+ classes). However, the Low Level API will be needed to implement the High
+ Level API. There seem to be interesting class structures in the
+ MS.Internal.Xml related stuff, so it would be nice to start learning about
+ them (and I will).
+
+ They seem to have IL generator classes, but it might be difficult to
+ start from them.
+
+*** System.Data.Mapping
+
+ System.Data.Mapping and System.Data.Mapping.RelationalSchema are the
+ namespaces for mapping support between databases and XML. This is at
+ the stubbing phase (incomplete as yet).
+
+*** XmlAdapter
+
+ XmlAdapter is used to support XML based query and update using the (new)
+ XPathDocument and XPathNavigator. This class is designed to synthesize
+ ADO.NET and System.Xml. It connects to databases and queries data in XML
+ shape into XPathDocument, using the Mapping schema above. This must be
+ done after several classes such as XPathDocument and MappingSchema.
** Miscellaneous Class Libraries
*** RELAX NG
- I implemented an experimental RelaxngValidatingReader. It is far from
- complete, especially simplification stuff (see RELAX NG spec chapter 4),
- some constraints (in chapter 7), and datatype handling.
+ I implemented an experimental RelaxngValidatingReader. It is still not
+ complete: for example, it lacks some simplification steps (see RELAX NG spec
+ chapter 4; especially 4.17-19) and some constraints (especially 7.3).
+ See mcs/class/Commons.Xml.Relaxng/README for details.
+
+ It supports custom datatype handling. Right now, you can use XML schema
+ datatypes ( http://www.w3.org/2001/XMLSchema-datatypes ) as well
+ as RELAX NG default datatypes (as used in relaxng.rng).
- I am planning improvements (starts with renaming classes, giving more
- kind error messages, supporting compact syntax and even object mapping),
- but it is still my wishlist.
+ In Commons.Xml.Relaxng.dll, there is also RELAX NG Compact Syntax support.
+ See Commons.Xml.Relaxng.Rnc.RncParser class.
+
+ I am planning improvements (giving kinder error messages, and even
+ object mapping), but they won't happen until the Mono 1.0 release.
** Tools
*** xsd.exe
- xsd.exe is used to:
+ See <a href="ado-net.html">ADO.NET page</a>.
+
+ Microsoft has another inference class from XmlReader to XmlSchemaCollection
+ (Microsoft.XsdInference). It may be useful, but it won't be so easy.
- 1) generate classes source code from schema
- 2) generate DataSet classes source code from schema
- 3) generate schema documents from assembly (classes)
- 4) infer schema documents from XML instance
- 5) convert XDR into XSD
- As descrived above, I won't work on 5) XDR stuff.
+** Miscellaneous
- Current xsd.exe supports 1) and 3)
+*** Mutual assembly dependency
- As for 2) and 4), Currently there is no works on them. (This inference
- feature is rather DataSet specific than general purpose use.)
+ Sometimes I hear complaints about the mutual dependency between System.dll
+ and System.Xml.dll: System.dll references System.Xml.dll (e.g.
+ System.Configuration.ConfigXmlDocument extends XmlDocument), while
+ System.Xml.dll references System.dll (e.g. XmlUrlResolver.ResolveUri takes
+ System.Uri). Since these appear in public method signatures, we cannot get
+ rid of these mutual references.
- Microsoft has another inference class from XmlReader to XmlSchemaCollection.
- It may be useful, but it won't be so easy.
+ Nowadays System.Xml.dll is built using an incomplete System.dll (lacking
+ System.Xml dependent classes such as ConfigXmlDocument). The full System.dll
+ is built after System.Xml.dll is done.
- any volunteers?
+ Note that you still need System.dll to run mcs.