With several JDK versions, Apache Xerces and Sun's XML Parser
<< Designing the Benchmark
XML Parsing Benchmark
Understanding the Results >>
In order to understand the benchmark's results, you need to know how the tests were done. You might also want to repeat the tests on your own configurations.
We ran the benchmark with Sun's JDK 1.1, 1.2 and 1.3 and with several versions of the Apache Xerces and Sun JAXP XML parsers. Sun's JAXP reference implementation is now developed by Apache and is called Crimson. The computer had a Pentium III 667 MHz processor and 256 MB of RAM. The operating system was Windows 2000.
There were 270 different configurations determined by the following parameters:
- JDK version: 1.1, 1.2, 1.3
- XML parser: JAXP 1.0, JAXP 1.1 (Crimson 1.1), Xerces 1.2, Xerces 1.3, Xerces 1.4
- Parsing method: SAX 1/2, DOM Level 1/2, SAXDOMIX with/without TrAX
- Validation: on/off
- Namespaces: on/off
JAXP 1.0 supports only SAX 1 and DOM Level 1, while all other parsers also support SAX 2 and DOM Level 2. Therefore, JAXP 1.0 doesn't really support XML namespaces even though a "namespaceAware" property exists. All five versions of the two parsers support XML validation.
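To make the two flags concrete, here is a minimal sketch of how validation and namespace awareness are switched on a JAXP factory. It is not the benchmark's code: it assumes a current JDK with the built-in JAXP implementation and uses the SAX 2 `DefaultHandler`, which JAXP 1.0 did not have. The class name `ParserFlagsSketch` and the sample document are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class, not part of the benchmark.
public class ParserFlagsSketch {

    /** Parses the document with the given flags and returns the number of elements seen. */
    public static int countElements(String xml, boolean validating, boolean namespaceAware)
            throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validating);         // the "validation on/off" parameter
        factory.setNamespaceAware(namespaceAware); // the "namespaces on/off" parameter
        SAXParser parser = factory.newSAXParser();

        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                count[0]++; // one callback per start tag
            }
        };
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input: one table element holding two records.
        String xml = "<t:table xmlns:t=\"urn:example\"><t:record/><t:record/></t:table>";
        System.out.println(countElements(xml, false, true));
    }
}
```

The factory decides which parser implementation is loaded, so the same code can be pointed at a different parser by changing the classpath, which is presumably why the scripts keep separate parser classpaths.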
Mixed SAX-DOM parsing can be done with or without the TrAX API, which Apache Xalan 2.0 implements. We used Xalan only with Crimson 1.1, Xerces 1.3 and Xerces 1.4 to do mixed parsing with the help of TrAX. We also tested mixed parsing without TrAX using all five versions of the two XML parsers.
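The role TrAX plays here can be sketched as follows: an identity `TransformerHandler` receives SAX events and writes them into a `DOMResult`. This is a simplified illustration, not SAXDOMIX's implementation. It builds a single DOM for the whole document, whereas a mixed parser would typically build small DOM fragments, and it assumes the `javax.xml.transform` classes bundled with a current JDK rather than Xalan 2.0. The class name `TraxSaxToDomSketch` is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical illustration class, not part of SAXDOMIX or the benchmark.
public class TraxSaxToDomSketch {

    /** Feeds SAX events through an identity TrAX handler to build a DOM tree. */
    public static Document parseToDom(String xml) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        if (!tf.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("This TrAX implementation has no SAX support");
        }
        SAXTransformerFactory stf = (SAXTransformerFactory) tf;

        // An identity handler copies incoming SAX events to its result unchanged.
        TransformerHandler handler = stf.newTransformerHandler();
        Document target = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        handler.setResult(new DOMResult(target));

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(handler); // SAX events flow into the DOM builder
        reader.parse(new InputSource(new StringReader(xml)));
        return target;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseToDom("<table><record id=\"1\"/><record id=\"2\"/></table>");
        System.out.println(doc.getElementsByTagName("record").getLength());
    }
}
```

Without TrAX, the same effect requires a hand-written `ContentHandler` that creates DOM nodes itself, which is the kind of work a framework can take over.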
The latest version of a product usually offers more features, fixes old bugs and is supposed to work better. As users or developers, we usually prefer the latest versions of our applications and tools. When doing a benchmark, however, things are different. In this case, we test multiple products that work together, such as JDKs, parsers and parsing methods. First, we want to see how each product evolved: adding new features may slow down a piece of software, and changes can have all kinds of side effects. Second, there are many production environments where old versions are still in use. Third, testing with multiple versions helps us understand the performance problems and find the product that causes them.
There are four groups of XML tables: one for each possible combination of the two flags indicating whether XML validation is performed and whether namespace support is enabled.
Each group has XML tables containing 0, 10000, 20000, 30000, 40000 and 50000 records. A benchmark main class was executed for each of the 270 configurations and each of the six tables of the corresponding group. That makes 270 x 6 = 1620 significant runs for the single-threaded tests alone. (Each group of six runs was preceded by an ignored run that used the empty table. Its role was to give the operating system the chance to cache the JDK and the classes used in memory.)
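The run plan for one configuration can be sketched as below. This is only an in-process illustration of the ordering: in the benchmark itself each run is launched from the batch scripts, and the ignored run exists to warm the operating system's cache. The class `RunPlanSketch` and the `parseTable` callback are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Hypothetical illustration class, not part of the benchmark.
public class RunPlanSketch {

    // Record counts of the six tables in each group, as listed in the text.
    static final int[] RECORD_COUNTS = {0, 10000, 20000, 30000, 40000, 50000};

    /**
     * Performs one ignored warm-up run on the empty table, then one measured
     * run per record count. Returns the record counts of the measured runs.
     */
    public static List<Integer> runGroup(IntConsumer parseTable) {
        parseTable.accept(0); // warm-up run, result ignored
        List<Integer> measured = new ArrayList<>();
        for (int count : RECORD_COUNTS) {
            parseTable.accept(count);
            measured.add(count);
        }
        return measured;
    }

    /** Number of significant (measured) runs for a given number of configurations. */
    public static int significantRuns(int configurations) {
        return configurations * RECORD_COUNTS.length;
    }

    public static void main(String[] args) {
        runGroup(count -> System.out.println("parsing table with " + count + " records"));
        System.out.println(significantRuns(270)); // 270 x 6
    }
}
```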
There are also tables of 500 records for the multithreading tests. In this case, the record count is fixed, but there are again six runs for each configuration. The first run of the group creates no threads and does the parsing within the application's main thread, as in the single-threaded tests. The subsequent runs do the parsing in 20, 40, 60, 80 and 100 concurrent threads.
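A multithreaded run of this kind might look like the sketch below, which parses the same 500-record document either in the calling thread (thread count 0) or in the requested number of concurrent threads. It is an assumption-laden illustration, not the benchmark's Main2: the record layout produced by `buildTable` is invented, and each thread creates its own parser because JAXP parser instances should not be shared between threads.

```java
import java.io.StringReader;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class, not part of the benchmark.
public class ThreadedParseSketch {

    /** Builds a made-up table with the given number of records. */
    public static String buildTable(int records) {
        StringBuilder sb = new StringBuilder("<table>");
        for (int i = 0; i < records; i++) {
            sb.append("<record id=\"").append(i).append("\"/>");
        }
        return sb.append("</table>").toString();
    }

    /** Parses the document once with a parser created for this call. */
    static void parseOnce(String xml) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
    }

    /**
     * Parses in the calling thread when threadCount is 0, otherwise in
     * threadCount concurrent threads. Returns the number of completed parses.
     */
    public static int parseConcurrently(String xml, int threadCount) throws Exception {
        AtomicInteger completed = new AtomicInteger();
        if (threadCount == 0) {
            parseOnce(xml);
            return completed.incrementAndGet();
        }
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    parseOnce(xml);
                    completed.incrementAndGet();
                } catch (Exception e) {
                    // A failed parse is simply not counted in this sketch.
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join(); // wait for every parsing thread to finish
        }
        return completed.get();
    }

    public static void main(String[] args) throws Exception {
        String xml = buildTable(500);
        for (int n : new int[] {0, 20, 40, 60, 80, 100}) {
            System.out.println(n + " threads: " + parseConcurrently(xml, n) + " parses");
        }
    }
}
```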
The scripts we used are packaged together with the SAXDOMIX framework and the benchmark's code.
- initEnv.bat: Initializes the environment variables: JDK home directories, JDK command lines, JDK classpaths, parser classpaths, main classes, record counts and thread counts. You'll probably have to edit this file in order to run the benchmark on your own computer.
- setEnv.bat: Sets the environment variables for a particular configuration (JDK, parser, method, validation, namespaces).
- createDB.bat: Creates the XML tables used as input.
- rcLoop.bat: Runs the benchmark for a given parsing method and each record count.
- thLoop.bat: Runs the benchmark for a given parsing method and each thread count.
- proc.bat: Sets the environment for a given JDK and parser and calls rcLoop.bat or thLoop.bat.
- parser.bat: Calls proc.bat for a given JDK and parser and each parsing method with the features on and off.
- test.bat: Calls parser.bat for a given JDK and each parser.
- all.bat: Creates the XML tables with createDB.bat and then calls test.bat for each JDK.
For each of the 270 configurations, there is an output file containing six rows and four columns. Each row represents a run of Main1 or Main2 and contains the record or thread count, the time spent parsing (in seconds), the memory used during the parsing and the memory still in use after invoking the garbage collector (both in megabytes).
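The four columns could be produced along the lines of the following sketch. How Main1 and Main2 actually sample memory is not described here, so this assumes the common approach of reading `Runtime.totalMemory()` minus `Runtime.freeMemory()` right after parsing and again after `System.gc()`. The class name, the row format and the column separator are hypothetical.

```java
import java.util.Locale;

// Hypothetical illustration class, not part of the benchmark.
public class MeasureSketch {

    private static final double MEGABYTE = 1024.0 * 1024.0;

    /**
     * Runs the given task and returns { seconds elapsed, MB in use right after
     * the task, MB in use after requesting a garbage collection }.
     */
    public static double[] measure(Runnable parsing) {
        Runtime rt = Runtime.getRuntime();
        long start = System.currentTimeMillis();
        parsing.run();
        long elapsedMillis = System.currentTimeMillis() - start;

        long usedAfterParsing = rt.totalMemory() - rt.freeMemory();
        System.gc(); // only a request; the JVM may choose not to collect
        long usedAfterGc = rt.totalMemory() - rt.freeMemory();

        return new double[] {
            elapsedMillis / 1000.0,
            usedAfterParsing / MEGABYTE,
            usedAfterGc / MEGABYTE
        };
    }

    /** Formats one output row: count, seconds, MB used, MB used after GC. */
    public static String formatRow(int count, double[] m) {
        return String.format(Locale.ROOT, "%d %.3f %.2f %.2f", count, m[0], m[1], m[2]);
    }

    public static void main(String[] args) {
        // Stand-in workload in place of a real parsing run.
        double[] m = measure(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100000; i++) {
                sb.append(i);
            }
        });
        System.out.println(formatRow(10000, m));
    }
}
```

Because `System.gc()` is only a hint, the last column is best read as an approximation of the memory retained after parsing.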