XML and How it Will Change the Web
By Doug Tidwell
2003-09-17
Why do we need XML?

When people first hear about XML, they often ask why we need another markup language. Everybody's browser supports HTML today, so why create more tags? Given that lots of HTML tags haven't been implemented the same way by the big browser vendors, why let anybody and everybody create their own tags?

The answer to these questions is that HTML and XML serve different functions: HTML tags describe how to render things on the screen, while XML tags describe what things are. Put another way, HTML tags are designed for the interaction between humans and computers; XML tags are designed for the interaction between two computers.

To see this difference, look at the HTML and XML versions of a short document. Listing 1 shows the HTML version.

Listing 1. The HTML version of an address
When this document is rendered in a browser, it looks something like this:

Mrs. Mary McGoon

Anyone familiar with postal addresses in the United States will recognize this document as someone's address. Even if you're from another country where postal codes and other conventions are different, you can still surmise that this is someone's address.

Imagine writing code to interpret this document, however. To extract the zip code from this address, our algorithm might look like this:

Given a tag that contains two

While this algorithm would work for our sample HTML document, it's easy to think of a perfectly valid address that breaks our algorithm. We've also completely sidestepped the issue of distinguishing a tag that contains an address from any other tag. While the address formats beautifully in a browser, our HTML markup isn't nearly as well suited for use by another program.

Now let's take a look at an XML version of the same document in Listing 2.

Listing 2. The XML version of the same address
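The markup for the two listings does not appear in this copy, so the contrast is easiest to see with stand-in documents. The Python sketch that follows is only an illustration under stated assumptions: the street, city, and zip values are invented, the XML element names (`address`, `street`, `postal-code`, and so on) are hypothetical, and the positional HTML rule is one plausible heuristic rather than the article's exact algorithm.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the two listings. Everything except the name
# "Mrs. Mary McGoon" is invented for illustration.
HTML_ADDRESS = (
    "<p><b>Mrs. Mary McGoon</b><br>"
    "123 Example Street<br>"
    "Sampletown, NC 12345</p>"
)

# A perfectly valid address with one extra line (an apartment number).
HTML_WITH_APARTMENT = (
    "<p><b>Mrs. Mary McGoon</b><br>"
    "123 Example Street<br>"
    "Apartment 4<br>"
    "Sampletown, NC 12345</p>"
)

# Element names here are assumptions, not taken from the original listing.
XML_ADDRESS = """
<address>
  <name>
    <title>Mrs.</title>
    <first-name>Mary</first-name>
    <last-name>McGoon</last-name>
  </name>
  <street>123 Example Street</street>
  <street>Apartment 4</street>
  <city>Sampletown</city>
  <state>NC</state>
  <postal-code>12345</postal-code>
</address>
"""


def zip_from_html(html):
    """Positional heuristic (an assumed rule): if the paragraph has exactly
    two <br> tags, return the second word after the first comma in the text
    that follows the second <br>. Otherwise give up and return None."""
    parts = re.split(r"<br\s*/?>", html)
    if len(parts) != 3:
        return None
    last_line = re.sub(r"<[^>]+>", "", parts[2])  # drop any remaining tags
    _, _, after_comma = last_line.partition(",")
    words = after_comma.split()
    return words[1] if len(words) > 1 else None


def zip_from_xml(xml_text):
    """Element lookup: the zip code is the text of the postal-code element,
    wherever it sits and however many other lines the address has."""
    root = ET.fromstring(xml_text.strip())
    node = root.find("postal-code")
    return node.text if node is not None else None


if __name__ == "__main__":
    print(zip_from_html(HTML_ADDRESS))
    print(zip_from_html(HTML_WITH_APARTMENT))
    print(zip_from_xml(XML_ADDRESS))
```

The HTML heuristic depends on where the zip code happens to appear, so the apartment line is enough to defeat it. The XML lookup depends only on the element's name, which is why the extra `street` element makes no difference.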
As with our HTML document, anyone familiar with U.S. postal addresses will recognize this document as an address. More importantly, a computer can recognize the parts of this address as well. Here's a much more robust algorithm for finding the zip code in our XML document:

The zip code is the text of the

This algorithm is much simpler to code, and it would be difficult, if not impossible, to write a valid address that breaks this algorithm. A computer can understand all of the parts of the address and how they relate to each other, and the computer can decide the best way to render that data. For example, the XML document might be rendered like this:

Mrs. Mary McGoon

In rendering the XML tags in this style, you could convert them into HTML markup that's virtually identical to the earlier HTML document. If you want to print a mailing label for this address, you might render the document like this:

In this case, you print Mrs. McGoon's zip code as a bar code for the benefit of the scanners at the post office.

The most important concept here is that content and presentation are separate. The data and its structure are tagged in a presentation-independent way, and the decision of how to render it is delayed as long as possible.

First published by IBM DeveloperWorks