* XML Classes
** Abstract
The XML library is used by several areas of Mono, such as ADO.NET and XML
Digital Signature (xmldsig). Here I write about System.Xml.dll and
related tools. This page does not cover classes that live in other
assemblies, such as XmlDataDocument.
Note that corlib currently has its own XML parser class, Mono.Xml.MiniParser.
System.Xml.dll is basically feature-complete, or almost so, so
I write this page mainly to track bugs and improvement hints.
** System.Xml namespace
*** Document Object Model (Core)
The DOM feature is already implemented. Some pieces are still missing:
* ID constraint support is problematic because W3C DOM does not
specify how to handle ID attributes on non-adapted elements.
(MS.NET also looks incomplete in this area.)
* I think the event feature is not fully tested. There is no
concrete description of which events are raised, so we have to
do some experiments on MS.NET.
*** Xml Writer
Here XmlWriter is almost equivalent to XmlTextWriter. If you want to see
another implementation, check XmlNodeWriter.cs used in monodoc.
XmlTextWriter is complete. However, it looks slower than MS.NET (I
tried 1.1). Some optimization improved it, but there is probably room
for more.
*** XmlResolver
Currently XmlTextReader uses the specified XmlResolver. If none was supplied,
it uses XmlUrlResolver. XmlResolver is used to load external DTDs,
imported XSL stylesheets, schemas and so on.
However, XmlUrlResolver is still buggy (mainly because System.Uri is also
incomplete), and this results in several loading errors.
XmlSecureResolver, which was introduced in MS .NET Framework 1.1, is basically
implemented, but it requires the CAS (code access security) feature. We need to
fix up this class once the ongoing CAS effort is done.
You might also be interested in an improved XmlCachingResolver by Ben Maurer.
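The fallback described at the top of this section (use the supplied resolver,
otherwise a URL resolver) and the resolution of an external DTD against the
document's base URI can be sketched roughly as below. This is a Python
illustration only, not Mono's code: the UrlResolver and Reader classes and
their method names are hypothetical stand-ins for XmlUrlResolver and
XmlTextReader, and example.org is a placeholder.

```python
from urllib.parse import urljoin


class UrlResolver:
    """Hypothetical stand-in for a URL-based resolver."""

    def resolve_uri(self, base_uri, relative_uri):
        # Standard relative-reference resolution against the base URI.
        return urljoin(base_uri, relative_uri)


class Reader:
    """Hypothetical reader that falls back to UrlResolver when none is given."""

    def __init__(self, base_uri, resolver=None):
        self.base_uri = base_uri
        self.resolver = resolver if resolver is not None else UrlResolver()

    def resolve_external(self, system_id):
        # External DTDs, stylesheets and schemas are located relative
        # to the URI of the document that refers to them.
        return self.resolver.resolve_uri(self.base_uri, system_id)


reader = Reader("http://example.org/docs/book.xml")
dtd_uri = reader.resolve_external("dtd/book.dtd")
```

If the base URI is handled wrongly at this step, every relative reference in
the document resolves to the wrong place, which is why resolver bugs surface
as loading errors.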
*** XmlNameTable
XmlNameTable itself is implemented. However, it still needs to be actually
used in several classes. The point of it is that when both names being
compared are in the table, they can simply be compared using
ReferenceEquals(). We have partially done this in XmlNamespaceManager (in
.NET 1.2 methods; if the build is not NET_1_2, they are for internal use only).
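The idea behind a name table (keep one canonical string object per name, then
compare by reference) can be sketched as below. This is a minimal Python
illustration under my own assumptions, not the XmlNameTable implementation:
the NameTable class and its add/get methods are hypothetical, and Python's
"is" operator plays the role of ReferenceEquals().

```python
class NameTable:
    """Stores one canonical string object per distinct name."""

    def __init__(self):
        self._names = {}

    def add(self, name):
        # Return the stored object if an equal name exists,
        # otherwise store this object and return it.
        return self._names.setdefault(name, name)

    def get(self, name):
        # Return the canonical object, or None if the name was never added.
        return self._names.get(name)


table = NameTable()
# Build equal strings at run time so they start out as distinct objects.
first = table.add("".join(["xs:", "element"]))
second = table.add("".join(["xs:", "element"]))
same_object = first is second  # reference comparison, no character scan
```

The reference comparison is only valid when both names came out of the same
table, which is why every class involved has to use the table consistently.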
*** Xml Stream Reader
When we are using an ASCII document, we don't care which encoding is used.
However, XmlTextReader must be aware of the encoding specified in the XML
declaration. So we have an internal XmlStreamReader class (and currently
an XmlInputStream class, which may disappear since XmlStreamReader is enough
to handle this problem).
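What "being aware of the encoding in the XML declaration" involves can be
sketched as below. This is a simplified Python illustration, not the
XmlStreamReader code: the function names are hypothetical, and it only handles
a byte order mark or an ASCII-compatible declaration (UTF-32 and EBCDIC
detection are deliberately left out).

```python
import codecs
import re

# Matches the encoding pseudo-attribute inside a leading XML declaration.
_ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def detect_xml_encoding(data: bytes) -> str:
    """Guess a codec name for an XML byte stream (simplified)."""
    # A byte order mark, when present, decides the encoding family.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Otherwise look at the XML declaration, which is ASCII in the common case.
    match = _ENCODING_DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # XML defaults to UTF-8 when nothing else is specified.
    return "utf-8"


def decode_xml(data: bytes) -> str:
    """Decode the whole document using the detected encoding."""
    return data.decode(detect_xml_encoding(data))
```

The declaration has to be read as bytes before any decoder is chosen, which is
why this logic sits in a stream-level class below the text reader.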
However, there seem to be some problems in these classes when reading network
streams (especially on Linux). This should be fixed soon, once we find the
actual cause.
*** XML Reader
XmlTextReader, XmlNodeReader and XmlValidatingReader are almost finished.
* All OASIS conformance tests pass, as Microsoft's do. Some
W3C tests fail, but it looks better.
* Entity expansion and its well-formedness check are incomplete.
It incorrectly allows divided content models. It handles
its Base URI incorrectly, so some DTDs fail.
* Unicode surrogate pair characters are not supported yet.
* I won't add any XDR support to XmlValidatingReader. (I have never
seen XDR used anywhere other than Microsoft's BizTalk Server 2000,
and now they have 2002 with XML Schema support.)
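For the surrogate pair item in the list above, the arithmetic a reader needs in
order to turn two UTF-16 code units into one character is small. The sketch
below is a Python illustration with a hypothetical function name, not code
from XmlTextReader.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate: %#x" % high)
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate: %#x" % low)
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# U+1D11E (MUSICAL SYMBOL G CLEF) is encoded as D834 DD1E in UTF-16.
code_point = combine_surrogates(0xD834, 0xDD1E)
```

A parser that checks each UTF-16 code unit on its own against the XML Char
production will reject such characters, because the surrogate range itself is
not allowed there while the combined code point is.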
XmlTextReader and XmlValidatingReader should be faster than they are now.
Currently XmlTextReader looks nearly twice as slow as MS.NET, and
XmlValidatingReader (which uses this slow XmlTextReader) looks nearly three
times slower. (Note that XmlValidatingReader is not slow by itself; it uses
the schema validating reader and the DTD validating reader.)
**** Some Advantages
The design of Mono's XmlValidatingReader is radically different from
that of Microsoft's implementation. Under MS.NET, the DTD content validation
engine is in fact a simple replacement of the XML Schema validation engine.
Mono's DTD validation is designed to be fully separate and validates
as a normal XML parser does. For example, Mono allows non-deterministic DTDs.
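To show what a non-deterministic content model is: in ((a,b)|(a,c)), after
reading an "a" element a validator cannot tell which branch it is in without
looking ahead. A backtracking matcher accepts such models without trouble. The
sketch below is a Python illustration under my own assumptions (it translates
a content model into a regular expression); it is not how Mono's DTD validator
works, it ignores #PCDATA and namespace prefixes, and the function names are
hypothetical.

```python
import re

# Element names, or one of the content model punctuation characters.
_TOKEN = re.compile(r"[A-Za-z_][\w.-]*|[(),|?*+]")


def content_model_to_regex(model):
    """Translate a DTD element content model into a compiled regex."""
    parts = []
    for token in _TOKEN.findall(model):
        if token == "(":
            parts.append("(?:")
        elif token == ",":
            continue  # a sequence is plain concatenation in a regex
        elif token in (")", "|", "?", "*", "+"):
            parts.append(token)
        else:
            # Each element name is followed by one space as a delimiter.
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def children_match(model, children):
    """Check a list of child element names against a content model."""
    text = "".join(name + " " for name in children)
    return content_model_to_regex(model).fullmatch(text) is not None


ok_first_branch = children_match("((a,b)|(a,c))", ["a", "b"])
ok_second_branch = children_match("((a,b)|(a,c))", ["a", "c"])
```

Both calls succeed because the regex engine backtracks from the first branch
to the second when "b" does not follow "a".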
Another advantage of this XmlValidatingReader is its support for *any*
XmlReader. Microsoft supports only XmlTextReader.
I added an extra support interface named "IHasXmlParserContext", which is
taken into account in XmlValidatingReader.ResolveEntity(). Microsoft failed to
design XmlReader to support pluggable use of XmlReader (i.e. wrapping
another XmlReader), since XmlParserContext is required to support both
entity resolution and the namespace manager. (In .NET 1.2, Microsoft also
added something similar to IHasXmlParserContext, named IXmlNamespaceResolver,
but it still does not provide any DTD information.)
We also have a RELAX NG validating reader. See mcs/class/Commons.Xml.Relaxng.
** System.Xml.Schema
*** Summary
Basically it is complete. We can compile complex and simple types, refer to
external schemas, extend or restrict other types, and use substitution groups.
You can test how (in)complete the current schema validation engine is by using
the standalone test module
(see mcs/class/System.XML/Test/System.Xml.Schema/standalone_tests).
At least on my box, msxsdtest fails only 30 cases with the bugfixed catalog.
*** Schema Object Model
Completed, except for some things to be fixed:
* Complete facet support. Currently some facets are missing.
David Sheldon has recently been making several fixes to them.
* ContentTypeParticle for pointless xs:choice is incomplete.
(This is because fixing it gave rise to other bugs in
compilation. Interestingly, MS.NET also fails around here,
so it might be in the nature of the ContentTypeParticle design.)
* Some derivation by restriction (DBR) handling is incorrect.
* Some simple type restriction handling is still incorrect.
*** Validating Reader
The XML Schema validation feature is (currently) implemented in
Mono.Xml.Schema.XsdValidatingReader, which is used internally by
XmlValidatingReader.
Basically this is implemented and its feature set is almost complete,
but I have only tested the validation feature. So we have to write more
tests for properties, methods, and events (validation errors).
** System.Xml.Serialization
Lluis rules ;-)
Well, in fact XmlSerializer is almost finished and is in the bugfix phase.
However, we would appreciate more tests. Please try
* System.Web.Services to invoke SOAP services.
* xsd.exe and wsdl.exe to create classes.
If you find any problems, please file them in bugzilla.
Lluis also built an interesting standalone test system, placed under
mcs/class/System.Web.Services/Test/standalone.
You might also be interested in genxs, which enables you to create custom
XML serializers. This is not included in Microsoft .NET.
See mcs/tools/genxs for the details.
** System.Xml.XPath and System.Xml.Xsl
There are two implementations of XSLT. One (the historical one)
is based on libxslt (aka Unmanaged XSLT). Now we use a fully implemented
managed XSLT. To use Unmanaged XSLT, set the MONO_UNMANAGED_XSLT environment
variable (any value is acceptable).
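Switching to the unmanaged engine for a single run means setting that variable
only in the child process environment. A small Python sketch follows; the
helper name is mine, and the value "1" is arbitrary since any value is
accepted.

```python
import os


def with_unmanaged_xslt(env=None):
    """Return a copy of the environment with MONO_UNMANAGED_XSLT set."""
    new_env = dict(os.environ if env is None else env)
    # The variable only has to be set; "1" is an arbitrary choice.
    new_env["MONO_UNMANAGED_XSLT"] = "1"
    return new_env


# Hypothetical usage, where transform.exe is a placeholder program name:
#   subprocess.run(["mono", "transform.exe"], env=with_unmanaged_xslt())
```

Working on a copy keeps the setting out of the parent process, so managed and
unmanaged runs can be compared side by side.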
As for Managed XSLT, we support msxsl:script.
It would be nice if we could support EXSLT.
Microsoft has already done it, but it
is not good code, since it depends on internal concrete derivatives of
the XPathNodeIterator class. In general, .NET's "extension objects" are not
usable for returning node-sets, so if we support EXSLT, it has to be done
internally inside our System.Xml.dll. Volunteers are welcome.
Our managed XSLT implementation is still inefficient. XslTransform.Load()
and .Transform() look three times slower (however, they depend on
XmlTextReader, which is also slow, so we are starting optimization from
that class, not XSLT itself). These numbers are only for specific cases,
and there might be more critical points in the XSLT engine (mainly
XPathNodeIterator).
** System.Xml and ADO.NET v2.0
Microsoft introduced the first beta version of the .NET Framework 1.2 runtime
and SDK (and Visual Studio Whidbey). They are now available as an MSDN
_subscriber_ download (i.e. not publicly downloadable yet). It
contains several new classes.
There are two assemblies related to System.Xml v2.0: System.Xml.dll and
System.Data.SqlXml.dll (here I treat sqlxml.dll as part of System.Xml v2.0,
but note that it is also one of the ADO.NET 2.0 features). There are several
namespaces, such as MS.Internal.Xml and System.Xml. Note that this .NET
Framework is a pre-release version, and the MS.Internal.Xml namespace
apparently shows that it is not yet in a stable state.
System.Xml 2.0 contains several features such as:
* XPathNavigator2 and XPathDocument2
* XML Query
* XmlAdapter
* XSLT IL generator (similar to Apache XSLTC) - it is
for internal use
Tim Coleman has started ADO.NET 2.0 related work. Currently I have no
immediate plan to implement System.Xml v2.0 classes and won't touch them
right away, but I will start within the next few months. If any of you want
to explore this frontier, we welcome your effort.
*** XPathNavigator2
The System.Xml v2.0 implementation will start with the XPathDocument2 and
XPathNavigator2 implementations. First, the document structure and
basic navigation features will be implemented. Next, the XPath2 engine
should be implemented (XPathNavigator2 looks very different from
XPathNavigator). Another requirement is the schema-based validation feature.
It needs some schema improvements, such as IXmlInfosetReader support.
(IXmlInfosetReader is in MS.Internal.Xml.)
*** XML Query
XML Query is a new XML data manipulation language (well, at least new
to the .NET world). It is similar to SQL, but intended to manipulate and
support XML. It is similar to XSLT, but extended to support new features
such as XML Schema based datatypes.
The XML Query implementation can be found mainly in the System.Xml.Query and
MS.Internal.Xml.Query namespaces. Note that they are in
System.Data.SqlXml.dll.
The MSDN documentation says that there are two kinds of API for XML Query:
the High Level API and the Low Level API. As of this beta version, the Low
Level API is described as not released yet (though it may be the
MS.Internal.Xml.* classes). However, implementing the High Level API will
require the Low Level API. There seem to be interesting class structures in
the MS.Internal.Xml related stuff, so it would be nice to (and I will) start
learning about them. There also seem to be IL generator classes, but it would
be difficult to start from them.
*** System.Data.Mapping
System.Data.Mapping and System.Data.Mapping.RelationalSchema are the
namespaces for mapping support between databases and XML. This is at the
stubbing phase (incomplete as yet).
*** XmlAdapter
XmlAdapter is used to support XML-based query and update using
XPathDocument2 and XPathNavigator2. This class is designed to bring together
ADO.NET and System.Xml. It connects to databases and queries data, but loads
the result in XML shape into an XPathDocument2, using the Mapping schema
above. This must be done after several classes such as XPathDocument2 and
MappingSchema.
** Miscellaneous Class Libraries
*** RELAX NG
I implemented an experimental RelaxngValidatingReader. It is far from
complete, especially the simplification stuff (see RELAX NG spec chapter 4),
some constraints (in chapter 7), and datatype handling.
I am planning improvements (starting with renaming classes, giving
kinder error messages, supporting the compact syntax and even object mapping),
but that is still just my wishlist.
** Tools
*** xsd.exe
See the ADO.NET page.
Microsoft has another class that infers an XmlSchemaCollection from an
XmlReader (Microsoft.XsdInference). It may be useful, but it won't be so easy.
** Miscellaneous
Sometimes I hear complaints about the mutual dependency between System.dll
and System.Xml.dll: System.dll references System.Xml.dll (e.g.
System.Configuration.ConfigXmlDocument extends XmlDocument), while
System.Xml.dll references System.dll (e.g. XmlUrlResolver.ResolveUri takes
System.Uri). Since these appear in public method signatures, we cannot get rid
of these mutual references.
However, for those who really want to build System.Xml.dll without System.dll,
I created dummy classes in System.dll. To build System.Xml.dll in such a way, remove
/r:System.dll
from the Makefile, and add this source to
System.Xml.dll.sources. Note that this is as of the Mono 0.30 release.
Also note that you still need System.dll to run mcs.