All about JAXP, Part 1
By Brett McLaughlin, 2005-07-15
Going bigtime
Without SAX, DOM, or another XML parsing API, you cannot parse XML. I have seen many requests for a comparison of SAX, DOM, JDOM, and dom4j to JAXP, but making such comparisons is impossible because the first four APIs serve a completely different purpose from JAXP. SAX, DOM, JDOM, and dom4j all parse XML. JAXP provides a means of getting to these parsers and the data that they expose, but doesn't offer a new way to parse an XML document. Understanding this distinction is critical if you're going to use JAXP correctly. It will also most likely put you miles ahead of many of your fellow XML developers.
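The layering is easiest to see in the import list of a small program. The following is a minimal sketch, not code from this tutorial: the class name `JaxpLayerSketch` and the method `rootElementName` are made up for illustration. Only the two `javax.xml.parsers` imports are JAXP; the document that comes back is a plain DOM object, built by whatever parser JAXP located.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;        // JAXP
import javax.xml.parsers.DocumentBuilderFactory; // JAXP
import org.w3c.dom.Document;                     // DOM, not JAXP
import org.xml.sax.InputSource;                  // SAX, not JAXP

// Hypothetical demo class: JAXP hands out a parser, the parser does the parsing.
class JaxpLayerSketch {

    // Parses the given XML text and returns the name of its root element.
    static String rootElementName(String xml) throws Exception {
        // JAXP's job: locate a parser implementation and wrap it.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        // The underlying parser's job: read the XML and build a DOM tree.
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // From here on you are working with the DOM API only.
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootElementName("<book><title>JAXP</title></book>")); // prints "book"
    }
}
```

Nothing in `javax.xml.parsers` reads a single character of XML here; it only supplies the `DocumentBuilder` that delegates to a real parser.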
If you're still dubious, make sure you have the JAXP distribution (see Going bigtime). Fire up a Web browser and load the JAXP API docs. Navigate to the parsing portion of the API, located in the javax.xml.parsers package. Surprisingly, you'll find only six classes. How hard can this API be? All of these classes sit on top of an existing parser. And two of them are just for error handling. JAXP is a lot simpler than people think. So why all the confusion?
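To make the class count concrete, here is a hedged sketch of the SAX side of the package. It assumes the six classes are the ones found in the JAXP 1.3 era API docs: `DocumentBuilder`, `DocumentBuilderFactory`, `SAXParser`, `SAXParserFactory`, and the two error types `ParserConfigurationException` and `FactoryConfigurationError`. The class `SaxElementCounter` and its method are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: counts elements using a SAX parser obtained through JAXP.
class SaxElementCounter {

    static int countElements(String xml) throws SAXException, IOException {
        final int[] count = {0};

        // The callbacks belong to the SAX API; JAXP does not define them.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++;
            }
        };

        try {
            // Two JAXP classes do the real work: the factory and the parser wrapper.
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | FactoryConfigurationError e) {
            // The two error-handling classes: both signal a setup problem,
            // not a problem with the XML itself.
            throw new IllegalStateException("No usable SAX parser is configured", e);
        }
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><b/></a>")); // prints 3
    }
}
```

Together with the DOM pair (`DocumentBuilderFactory` and `DocumentBuilder`), this touches every class in the package.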
Sun's JAXP and Sun's parser
A lot of the parser/API confusion results from how Sun packages JAXP and the parser that JAXP uses by default. In earlier versions of JAXP, Sun included the JAXP API (with those six classes I just mentioned and a few more used for transformations) and a parser, called Crimson. Crimson was part of the com.sun.xml package. In newer versions of JAXP -- included in the JDK -- Sun has repackaged the Apache Xerces parser (see Resources). In both cases, though, the parser is part of the JAXP distribution, but not part of the JAXP API.
Think about it this way: JDOM ships with the Apache Xerces parser. That parser isn't part of JDOM, but is used by JDOM, so it's included to ensure that JDOM is usable out of the box. The same principle applies for JAXP, but it isn't as clearly publicized: JAXP comes with a parser so it can be used immediately. However, many people refer to the classes included in Sun's parser as part of the JAXP API itself. For example, a common question on newsgroups used to be, "How can I use the XMLDocument class that comes with JAXP? What is its purpose?" The answer is somewhat complicated.
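You can check which parser your own JAXP distribution supplies by asking the factory and the builder for their concrete class names. This is a small sketch under the assumption that you just want to inspect the default lookup; `WhichParser` and its methods are hypothetical names. The output depends on your JDK and classpath, so none is shown here, but the names you see should belong to a parser's package rather than to `javax.xml.parsers`.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical demo class: reveals the parser classes hiding behind the JAXP API.
class WhichParser {

    // Name of the concrete factory class that JAXP located at runtime.
    static String factoryImplementation() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Name of the concrete builder class produced by that factory.
    static String builderImplementation() throws ParserConfigurationException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.getClass().getName();
    }

    public static void main(String[] args) throws ParserConfigurationException {
        // Both names come from the bundled parser, not from the JAXP API itself.
        System.out.println("Factory: " + factoryImplementation());
        System.out.println("Builder: " + builderImplementation());
    }
}
```

Any class whose name appears in that output is part of the parser and is not something your code should import directly.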
First, the com.sun.xml.tree.XMLDocument class is not part of JAXP. It is part of Sun's Crimson parser, packaged in earlier versions of JAXP. So the question is misleading from the start. Second, a major purpose of JAXP is to provide vendor independence when dealing with parsers. With JAXP, you can use the same code with Sun's XML parser, Apache's Xerces XML parser, and Oracle's XML parser. Using a Sun-specific class, then, defeats the point of using JAXP. Are you starting to see how this subject has gotten muddied? The parser and the API in the JAXP distribution have been lumped together, and some developers mistake classes and features of one for part of the other.
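Vendor independence in practice means your code names only the abstract JAXP types and the standard DOM interfaces. The sketch below is one way to write that, with invented names (`PortableLookup`, `firstText`); it is not code from this tutorial. The method accepts whatever factory the caller supplies, so it never mentions a vendor class.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: depends only on JAXP and DOM, never on a vendor package.
class PortableLookup {

    // Returns the text of the first element with the given tag name, or null if none exists.
    static String firstText(DocumentBuilderFactory factory, String xml, String tag)
            throws Exception {
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList matches = doc.getElementsByTagName(tag);
        return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // newInstance() picks the implementation. It can be overridden without
        // touching this code, for example by setting the system property
        // javax.xml.parsers.DocumentBuilderFactory to another vendor's factory class.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        String xml = "<book><title>JAXP</title></book>";
        System.out.println(firstText(factory, xml, "title")); // prints "JAXP"
    }
}
```

A cast to a vendor class such as `XMLDocument` would compile only while that one parser is on the classpath, which is exactly the coupling JAXP is meant to remove.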
Now that you can see beyond all the confusion, you're ready to move on to some code and concepts.
First published by IBM developerWorks

