Parsing XML using PHP5

This tutorial is a follow-up to my earlier tutorial, Parsing XML using PHP4, and highlights the improvements in XML handling with PHP5.

PHP5 brought much-needed OOP features to PHP through a new object model. It also introduced support for MySQL 4.1.x (with the improved mysqli extension) and a whole host of other changes. The most significant change with regard to XML is the introduction of the simplexml extension, which provides a convenient set of tools for parsing XML documents.

I’ll concentrate on the XML-bits (for now). Simplexml is enabled by default with PHP5, so anyone running PHP5 should be ready to go.

Simplexml provides a very easy framework for parsing XML documents. For example, there are functions to load XML documents from a file or from a string. In addition, the document can be queried using XPath. Just to show you how easy it is, here is a quick example that parses the bash.org random quote feed.

Note: This example will only work if URL wrappers are enabled (the allow_url_fopen setting in php.ini).


    $xml = simplexml_load_file("http://www.bash.org/xml/?random&num=1");
    echo $xml->item->description;
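
Because the two-liner depends on a remote fetch, it can be worth guarding against the wrapper being disabled or the feed failing to parse. Here is a minimal sketch of that idea. Note that loadFeed() is a hypothetical helper name of my own, not something simplexml provides; it relies on simplexml_load_file() returning false on failure.

```php
<?php
// Hypothetical helper (not part of simplexml): load an XML document from
// a path or URL and return null on failure instead of emitting warnings.
function loadFeed($source)
{
    // Remote sources need the allow_url_fopen ini setting to be enabled.
    $isRemote = (bool) preg_match('#^https?://#i', $source);
    $wrappersOn = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
    if ($isRemote && !$wrappersOn) {
        return null;
    }

    // Collect libxml errors internally rather than printing warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_file($source);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // simplexml_load_file() returns false when loading or parsing fails.
    return ($xml === false) ? null : $xml;
}

// Usage against the feed from the example (requires network access):
// $xml = loadFeed("http://www.bash.org/xml/?random&num=1");
// if ($xml !== null) {
//     echo $xml->item->description;
// }
```

The strict comparison against false matters here: a SimpleXMLElement for an empty element can evaluate as false in a boolean test, so a loose check could reject a valid document.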

The simplexml_* functions return an object of class SimpleXMLElement. You can then use this object to perform operations on the XML file. The object returned by simplexml_load_file represents the entire structure of the XML file as properties of the object. Each tag is itself turned into a SimpleXMLElement object, and so on down the tree.
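
To make the points about string loading, XPath and nested SimpleXMLElement objects concrete, here is a small sketch that runs without network access. The markup in $data is a made-up stand-in for the feed, not its real structure.

```php
<?php
// Trimmed-down stand-in for the feed; the markup is illustrative only.
$data = '<feed>'
      . '<item><title>Quote #1</title><description>first quote</description></item>'
      . '<item><title>Quote #2</title><description>second quote</description></item>'
      . '</feed>';

// Load from a string instead of a file or URL.
$xml = simplexml_load_string($data);

// Child elements are SimpleXMLElement objects as well.
var_dump($xml->item instanceof SimpleXMLElement);

// Repeated elements can be indexed like an array; cast to string for the text.
$secondTitle = (string) $xml->item[1]->title;

// xpath() returns an array of matching SimpleXMLElement objects.
$descriptions = $xml->xpath('//item/description');
foreach ($descriptions as $description) {
    echo $description, "\n";
}
```

Casting to string is the usual way to get at an element's text when you need a real string, for example in a comparison or as an array key.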

Here is what the feed that you get from bash.org looks like:


    <rdf:RDF>
      <channel rdf:about="http://bash.org/xml/about.html">
        <title>QDB: Quote Database</title>
        <link>http://www.bash.org</link>
        <description>
          Latest, random, top, bottom, and individual quotes from bash.org.
        </description>
        <dc:publisher>Bash.org</dc:publisher>
        <dc:creator>Josh (josh@bash.org)</dc:creator>
        <image rdf:resource="http://www.bash.org/xml/img.php"/>
        <items>
          <rdf:Seq>
            <rdf:li resource="http://bash.org/?5034"/>
          </rdf:Seq>
        </items>
      </channel>
      <image rdf:about="http://www.bash.org/xml/about.html">
        <title>QDB: Quote Database</title>
        <url>http://www.bash.org/xml/img.php</url>
        <link>http://bash.org</link>
      </image>
      <item rdf:about="http://bash.org/?5034">
        <title>QDB: Quote #5034</title>
        <link>http://bash.org/?5034</link>
        <description>
          &lt;Lith&gt; im tellin ya.. im sitting on a land mine<br />&lt;Lith&gt; err<br />&lt;Lith&gt; gold mine
        </description>
      </item>
    </rdf:RDF>
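
The feed mixes unprefixed elements (title, link) with prefixed ones (dc:publisher, dc:creator). On current PHP versions, as far as I am aware, prefixed elements are not reachable through plain property access and have to be fetched via children() with the namespace URI. The sketch below uses a reduced stand-in document; the xmlns:dc declaration is my assumption, since the listing above shows no namespace declarations, and I have used the standard Dublin Core URI.

```php
<?php
// Reduced stand-in for the feed. The xmlns:dc declaration is an assumption:
// the feed listing shows no namespace declarations, so the standard
// Dublin Core URI is used here.
$data = '<feed xmlns:dc="http://purl.org/dc/elements/1.1/">'
      . '<channel>'
      . '<title>QDB: Quote Database</title>'
      . '<dc:publisher>Bash.org</dc:publisher>'
      . '</channel>'
      . '</feed>';

$xml = simplexml_load_string($data);

// Unprefixed elements are available as plain properties.
echo $xml->channel->title, "\n";

// Prefixed elements are fetched through children() with the namespace URI.
$dc = $xml->channel->children('http://purl.org/dc/elements/1.1/');
echo $dc->publisher, "\n";
```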

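The resulting SimpleXMLElement object that you get looks something like this (the feed is random, so this dump presumably comes from a separate request, which is why the quote number differs):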


    object(SimpleXMLElement)#1 (3) {
      ["channel"]=>
      object(SimpleXMLElement)#2 (7) {
        ["title"]=>
        string(19) "QDB: Quote Database"
        ["link"]=>
        string(19) "http://www.bash.org"
        ["description"]=>
        string(65) "Latest, random, top, bottom, and individual quotes from bash.org."
        ["publisher"]=>
        string(8) "Bash.org"
        ["creator"]=>
        string(20) "Josh (josh@bash.org)"
        ["image"]=>
        object(SimpleXMLElement)#4 (0) {
        }
        ["items"]=>
        object(SimpleXMLElement)#7 (1) {
          ["Seq"]=>
          object(SimpleXMLElement)#8 (1) {
            ["li"]=>
            object(SimpleXMLElement)#9 (0) {
            }
          }
        }
      }
      ["image"]=>
      object(SimpleXMLElement)#3 (3) {
        ["title"]=>
        string(19) "QDB: Quote Database"
        ["url"]=>
        string(31) "http://www.bash.org/xml/img.php"
        ["link"]=>
        string(15) "http://bash.org"
      }
      ["item"]=>
      object(SimpleXMLElement)#5 (3) {
        ["title"]=>
        string(16) "QDB: Quote #4696"
        ["link"]=>
        string(21) "http://bash.org/?4696"
        ["description"]=>
        object(SimpleXMLElement)#10 (0) {
        }
      }
    }

You’ll note that the description itself is an object, but its contents are the actual quote. PHP5 provides object iteration, which allows you to walk through the properties of an object much like you would the elements of an array. So, when we iterate over the description object:


    foreach($xml->item->description as $key => $value)
    {
        echo $key." = ".$value."<br />";
    }

We get the following output:


    description = &lt;lith&gt; im tellin ya.. im sitting on a land mine
    &lt;lith&gt; err
    &lt;lith&gt; gold mine

Now it becomes clear. The description object holds the description (the quote) that we are after. Now it's easy to see why my two-liner at the very beginning outputs the correct results.
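
If you want to try the iteration without hitting the network, here is a minimal sketch using inline sample markup. The document in $data is invented for illustration, with two description elements so that the loop has more than one thing to visit.

```php
<?php
// Invented sample markup, not the real feed.
$data = '<feed><item>'
      . '<description>first line</description>'
      . '<description>second line</description>'
      . '</item></feed>';

$xml = simplexml_load_string($data);

$lines = array();
foreach ($xml->item->description as $key => $value) {
    // $key is the element name, $value is a SimpleXMLElement that
    // turns into its text content when used as a string.
    $lines[] = $key . " = " . $value;
}

echo implode("\n", $lines), "\n";
```

Each pass of the loop yields one description element, with the element name as the key.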

This only scratches the surface of the new simplexml extension in PHP5. You can, for example, write your own class that extends SimpleXMLElement, and then have the simplexml_* functions return an object of your class instead of the default SimpleXMLElement class, allowing for increased flexibility and ease when dealing with XML documents.
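
Here is a rough sketch of that subclassing idea. The QuoteFeed class and its firstQuote() method are hypothetical names I made up for the example; the only simplexml feature being relied on is the optional second argument of simplexml_load_string(), which names the class of the returned object.

```php
<?php
// Hypothetical subclass for illustration; the class and method names
// are not part of the simplexml extension.
class QuoteFeed extends SimpleXMLElement
{
    // Return the trimmed text of the first item's description.
    public function firstQuote()
    {
        return trim((string) $this->item->description);
    }
}

// Invented sample markup standing in for the feed.
$data = '<feed><item><description> gold mine </description></item></feed>';

// The second argument names the class the returned object should have.
$feed = simplexml_load_string($data, 'QuoteFeed');

echo get_class($feed), "\n";
echo $feed->firstQuote(), "\n";
```

The same class name argument is accepted by simplexml_load_file(), so the two-liner at the top could return a QuoteFeed in the same way.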

August 16th, 2005 | PHP, XML