XML and How it Will Change the Web
By Doug Tidwell, 2003-09-17
The answer to these questions is that HTML and XML serve different functions: HTML tags describe how to render things on the screen, while XML tags describe what things are. Put another way, HTML tags are designed for the interaction between humans and computers; XML tags are designed for the interaction between two computers.
To see this difference, look at the HTML and XML versions of a short document. Listing 1 shows the HTML version.
Listing 1. The HTML version of an address
When this document is rendered in a browser, it looks something like this:
Mrs. Mary McGoon
1401 Main Street
Anytown, NC 34829
Anyone familiar with postal addresses in the United States will recognize this document as someone's address. Even if you're from another country where postal codes and other conventions are different, you can still surmise that this is someone's address. Imagine writing code to interpret this document, however. To extract the zip code from this address, our algorithm might look like this: Given a paragraph tag that contains two line-break tags, take the text that follows the second line-break tag. In that text, everything up to the comma is the name of the city, the two-character token following the comma is the name of the state, and the final token is the zip code.
While this algorithm would work for our sample HTML document, it's easy to think of a perfectly valid address that breaks it. We've also completely sidestepped the issue of distinguishing a paragraph tag that contains an address from any other paragraph tag. While the address formats beautifully in a browser, our HTML markup isn't nearly as well suited for use by another program.
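The HTML algorithm and both of its weaknesses can be sketched in Python. This is a minimal sketch, not the tutorial's own code: the markup in `HTML_ADDRESS` is an assumed stand-in for Listing 1 (one paragraph whose lines are separated by line-break tags), and `zip_from_html`, `SUITE_ADDRESS`, and `NOTE` are hypothetical names invented for illustration.

```python
import re

# Assumed stand-in for Listing 1: one paragraph, lines separated by <br> tags.
HTML_ADDRESS = (
    "<p><b>Mrs. Mary McGoon</b><br>"
    "1401 Main Street<br>"
    "Anytown, NC 34829</p>"
)

# A valid address with one extra line (hypothetical example).
SUITE_ADDRESS = (
    "<p><b>Mrs. Mary McGoon</b><br>"
    "1401 Main Street<br>"
    "Suite 200<br>"
    "Anytown, NC 34829</p>"
)

# A paragraph that is not an address at all (hypothetical example).
NOTE = "<p>Dear Mary<br>Thanks for the visit<br>Love, XO Gran</p>"


def zip_from_html(html: str) -> str:
    """Apply the prose algorithm literally and return the final token."""
    match = re.search(r"<p>(.*?)</p>", html, flags=re.DOTALL | re.IGNORECASE)
    if match is None:
        raise ValueError("no paragraph found")
    # Two line breaks split the paragraph body into exactly three parts.
    parts = re.split(r"<br\s*/?>", match.group(1), flags=re.IGNORECASE)
    if len(parts) != 3:
        raise ValueError("expected exactly two line breaks")
    # City is everything up to the comma; state and zip code follow it.
    _city, _, rest = parts[2].strip().partition(",")
    tokens = rest.split()
    if len(tokens) != 2 or len(tokens[0]) != 2:
        raise ValueError("expected a two-character state and a zip code")
    return tokens[1]


print(zip_from_html(HTML_ADDRESS))  # 34829
print(zip_from_html(NOTE))          # Gran (a false positive)

try:
    zip_from_html(SUITE_ADDRESS)
except ValueError as error:
    print("valid address rejected:", error)
```

The sample address parses correctly, the address with a suite line is rejected even though it is valid, and the thank-you note yields a "zip code" of `Gran` because nothing in the markup says which paragraphs are addresses.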
Now let's take a look at an XML version of the same document in Listing 2.
Listing 2. The XML version of the same address
As with our HTML document, anyone familiar with U.S. postal addresses will recognize this document as an address. More importantly, a computer can recognize the parts of this address as well. Here's a much more robust algorithm for finding the zip code in our XML document:
The zip code is the text of the zip code tag.
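As a rough illustration, the XML lookup might be written as follows with Python's standard `xml.etree.ElementTree` module. The markup in `XML_ADDRESS` is an assumed stand-in for Listing 2, and every element name in it (including `postal-code`) is hypothetical, as is the helper name `zip_from_xml`.

```python
import xml.etree.ElementTree as ET

# Assumed stand-in for Listing 2; all element names here are hypothetical.
XML_ADDRESS = """\
<address>
  <name>
    <title>Mrs.</title>
    <first-name>Mary</first-name>
    <last-name>McGoon</last-name>
  </name>
  <street>1401 Main Street</street>
  <city>Anytown</city>
  <state>NC</state>
  <postal-code>34829</postal-code>
</address>
"""


def zip_from_xml(xml_text: str) -> str:
    """Return the text of the first postal-code element in the document."""
    root = ET.fromstring(xml_text)
    element = next(root.iter("postal-code"), None)
    if element is None or element.text is None:
        raise ValueError("address has no postal-code element")
    return element.text.strip()


print(zip_from_xml(XML_ADDRESS))  # 34829
```

The function never looks at line order, commas, or token lengths, so adding or rearranging other elements in the address does not affect the result.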
This algorithm is much simpler to code, and it would be difficult, if not impossible, to write a valid address that breaks this algorithm. A computer can understand all of the parts of the address and how they relate to each other, and the computer can decide the best way to render that data. For example, the XML document might be rendered like this:
Mrs. Mary McGoon
1401 Main Street
Anytown, NC 34829
In rendering the XML tags in this style, you could convert them into HTML markup that's virtually identical to the earlier HTML document. If you want to print a mailing label for this address, you might render the document like this:
In this case, you print Mrs. McGoon's zip code as a bar code for the benefit of the scanners at the post office. The most important concept here is that content and presentation are separate. The data and its structure are tagged in a presentation-independent way, and the decision of how to render it is delayed as long as possible.
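The separation of content and presentation can be sketched by rendering one XML document two ways. This is only an illustration under stated assumptions: the element names in `XML_ADDRESS` are hypothetical, `address_fields`, `render_html`, and `render_label` are invented helper names, and the label renderer emits plain upper-case text rather than a real postal bar code.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed address document; all element names here are hypothetical.
XML_ADDRESS = """\
<address>
  <name>
    <title>Mrs.</title>
    <first-name>Mary</first-name>
    <last-name>McGoon</last-name>
  </name>
  <street>1401 Main Street</street>
  <city>Anytown</city>
  <state>NC</state>
  <postal-code>34829</postal-code>
</address>
"""


def address_fields(xml_text: str) -> dict:
    """Extract the address parts once, independent of any presentation."""
    root = ET.fromstring(xml_text)

    def text(tag: str) -> str:
        element = next(root.iter(tag), None)
        if element is None or element.text is None:
            return ""
        return element.text.strip()

    name_parts = [text(tag) for tag in ("title", "first-name", "last-name")]
    return {
        "name": " ".join(part for part in name_parts if part),
        "street": text("street"),
        "city": text("city"),
        "state": text("state"),
        "zip": text("postal-code"),
    }


def render_html(xml_text: str) -> str:
    """Presentation 1: HTML markup for display in a browser."""
    f = {key: escape(value) for key, value in address_fields(xml_text).items()}
    return (f"<p><b>{f['name']}</b><br>{f['street']}<br>"
            f"{f['city']}, {f['state']} {f['zip']}</p>")


def render_label(xml_text: str) -> str:
    """Presentation 2: upper-case plain text for a mailing label."""
    f = address_fields(xml_text)
    lines = [f["name"], f["street"], f"{f['city']} {f['state']} {f['zip']}"]
    return "\n".join(line.upper() for line in lines)


print(render_html(XML_ADDRESS))
print(render_label(XML_ADDRESS))
```

Both renderers share `address_fields`, so the choice of output format is made only at the last step, after the data has been read.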
First published by IBM DeveloperWorks

