SAX versus DOM. Benchmark performed with Xerces and Crimson
<< Understanding the Results
XML Parsing Benchmark
Using Namespaces and Validation >>
Building a large DOM tree is expensive (especially with Crimson), but it gives you access to all of the information in a document. SAX parsing is very fast, but harder to use. Mixing the two APIs is the right compromise in many cases. See the introduction to the SAXDOMIX framework.
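To make the trade-off concrete, here is a minimal sketch that counts elements of a given name with each API, using only standard JAXP calls. The class name `SaxVersusDom`, the method names, and the sample document are assumptions made for illustration; this is not the benchmark code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class, not part of the benchmark or of SAXDOMIX.
class SaxVersusDom {

    // DOM: the whole tree is built in memory first, then queried.
    static int countWithDom(String xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName(name).getLength();
    }

    // SAX: no tree is built; the handler has to keep its own state.
    static int countWithSax(String xml, final String name) throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        // Parsers are not namespace-aware by default,
                        // so qName carries the plain tag name.
                        if (qName.equals(name)) {
                            count[0]++;
                        }
                    }
                });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<list><item>a</item><item>b</item><other/></list>";
        System.out.println(countWithDom(xml, "item"));
        System.out.println(countWithSax(xml, "item"));
    }
}
```

Both calls print `2` for the sample document. The DOM version is shorter to write but holds the full tree in memory, while the SAX version needs a handler and explicit state even for this simple task.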
Unlimited scalability is the main benefit of SAXDOMIX. You can parse arbitrarily large documents because the DOM sub-trees are garbage-collected. The garbage collector's activity makes mixed parsing slower than expected (especially when compared with the fast DOM builder of Xerces), but you can optimize your applications by constructing only the necessary DOM sub-trees and ignoring the irrelevant information.
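The mixed approach can be sketched with plain JAXP, without relying on the SAXDOMIX API itself. In the sketch below, a SAX handler builds a small DOM sub-tree only for elements with a chosen name, hands each finished sub-tree to a callback, and then drops its references so the sub-tree can be garbage-collected. The names `MixedSketch`, `SubtreeConsumer`, `SubtreeHandler`, and `describe` are hypothetical and do not come from SAXDOMIX.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration of mixed SAX-DOM parsing; not the SAXDOMIX API.
class MixedSketch {

    // Hypothetical callback that receives each completed sub-tree.
    interface SubtreeConsumer {
        void consume(Element subtree);
    }

    static class SubtreeHandler extends DefaultHandler {
        private final String target;
        private final SubtreeConsumer consumer;
        private final DocumentBuilder builder;
        private Document doc;   // owner of the sub-tree being built, null outside a target
        private Node current;   // node that receives the next child

        SubtreeHandler(String target, SubtreeConsumer consumer) throws Exception {
            this.target = target;
            this.consumer = consumer;
            this.builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (doc == null) {
                if (!qName.equals(target)) {
                    return;                 // irrelevant element: build nothing
                }
                doc = builder.newDocument();
                current = doc;
            }
            Element element = doc.createElement(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                element.setAttribute(atts.getQName(i), atts.getValue(i));
            }
            current.appendChild(element);
            current = element;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (doc != null) {
                current.appendChild(doc.createTextNode(new String(ch, start, length)));
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (doc == null) {
                return;
            }
            current = current.getParentNode();
            if (current == doc) {           // the sub-tree root has just closed
                consumer.consume(doc.getDocumentElement());
                doc = null;                 // drop references so the sub-tree
                current = null;             // becomes eligible for garbage collection
            }
        }
    }

    // Returns "id=text" for every sub-tree rooted at an element named target.
    static List<String> describe(String xml, String target) throws Exception {
        List<String> result = new ArrayList<>();
        SubtreeHandler handler = new SubtreeHandler(target,
                e -> result.add(e.getAttribute("id") + "=" + e.getTextContent()));
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<list><head>x</head><item id='1'>a</item>"
                + "<item id='2'>b<sub>c</sub></item></list>";
        System.out.println(describe(xml, "item"));
    }
}
```

For the sample document this prints `[1=a, 2=bc]`. The `head` element never becomes a DOM node, and at most one `item` sub-tree is referenced by the handler at a time, which is why memory use does not grow with document size in this sketch.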
Testing the Parsing Methods with JDK 1.3, Crimson 1.1, no validation and no namespaces
DOM parsing is surprisingly slow with Crimson (the JAXP 1.1 reference implementation), but SAX and mixed SAX-DOM parsing are very fast.
Testing the Parsing Methods with JDK 1.3, Xerces 1.4, no validation and no namespaces
DOM scalability is limited by the computer's memory, while SAX and SAXDOMIX can process arbitrarily large documents.