PHP Site Search Made Easy

By Akash Mehta
2008-03-30


A brief crash course on search APIs

In this tutorial, we're going to build a site search system for a website using the Yahoo search web services. The web services provided by Yahoo are essentially web-based machine-readable interfaces to Yahoo's various products. There are quite a few web services made available; head over to developer.yahoo.com for a full list (scroll down to "Services" in the sidebar).

Let's get RESTful

The web search API, or application programming interface, falls into the category of RESTful web services - that is, it's an API delivered over the web in the REST architectural style. As far as you're concerned, REST, or REpresentational State Transfer, just involves HTTP and URLs (URIs, actually) - technologies and concepts you will already be familiar with. To demonstrate, here's a sample URL for accessing the search web service:

http://search.yahooapis.com/WebSearchService/V1/webSearch?appid=YahooDemo&query=madonna

A couple of things to note here. First, it's a perfectly normal URL: you can load it up in your web browser and see the XML returned. Second, there's clearly a query parameter in the URL that we can change. Here's a sample of what the results might look like:

<ResultSet>
<Result>
<Title>Madonna</Title>
<Summary>
Official site of pop diva Madonna, with news, music, media, and fan club.
</Summary>
<Url>http://www.madonna.com/</Url>
<DisplayUrl>www.madonna.com/</DisplayUrl>
<ModificationDate>1206428400</ModificationDate>
<MimeType>text/html</MimeType>
<Cache><Size>18519</Size></Cache>
</Result>
</ResultSet>

I've removed a fair bit of information, but this is enough to demonstrate how the information is provided. If you load up a Yahoo search page and search for "madonna", you'll get the same result. Now, this is entirely machine readable - it's XML - but we can go one step further. If we add another parameter to the URL, output, and give it a value of php, we get the following:

a:1:{s:9:"ResultSet";a:6:{s:6:"Result";
a:1:{i:0;a:8:{s:5:"Title";s:7:"Madonna";s:7:"Summary";
s:73:"Official site of pop diva Madonna, with news, music, media, and fan club.";s:3:"Url";s:23:"http://www.madonna.com/";s:10:"DisplayUrl";
s:16:"www.madonna.com/";s:16:"ModificationDate";i:1206428400;
s:8:"MimeType";s:9:"text/html";s:5:"Cache";a:2:{s:3:"Url";
s:316:"http://uk.wrs.yahoo.com/_ylt=A0Je5VfTHupHWj4AgjbdmMwF;
_ylu=X3oDMTBwOHA5a2tvBGNvbG8DdwRwb3MDMQRzZWMDc3IEdnRpZAM-/SIG=15vvp3oak/EXP=
1206612051/**http%3A//66.218.69.11/search/cache%3Fei=UTF-8%26appid=
YahooDemo%26query=madonna%26results=1%26output=php%26u=
www.madonna.com/%26w=madonna%26d=A_opCvH_Qg-3%26icp=1%26.intl=
us";s:4:"Size";s:5:"18519";}}}}}

Doesn't look like much? Run it through the PHP unserialize() function, and then print_r().

[ResultSet] => Array
    (
        [Result] => Array
            (
                [0] => Array
                    (
                        [Title] => Madonna
                        [Summary] => Official site of pop diva Madonna, with news, music, media, and fan club.
                        [Url] => http://www.madonna.com/
                        [DisplayUrl] => www.madonna.com/
                        [ModificationDate] => 1206428400
                        [MimeType] => text/html
                        [Cache] => Array
                            (
                                [Size] => 18519
                            )
                    )
            )
    )

Now that looks easy to work with. But how did we get to this stage? Well, as the APIs are accessed via a simple URL, we can first fetch the data using file_get_contents(). Now, this will give us the mess of characters we saw earlier. We then run it through unserialize() and finally print_r(). Here's the code:

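<?php
// Fetch a single result in serialized PHP format.
$data = file_get_contents('http://search.yahooapis.com/'.
    'WebSearchService/V1/webSearch?'.
    'appid=YahooDemo&query=madonna'.
    '&results=1&output=php');
// Decode the response and dump the resulting array.
echo '<pre>'.print_r(unserialize($data), true).'</pre>';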

Go ahead, run it on your web server. You'll see roughly the sample above, plus a few extra elements.
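The decoding step can also be tried without contacting Yahoo at all. What follows is a minimal sketch rather than part of the original listing: decode_response() is a hypothetical helper name, and the sample array is a cut-down stand-in for a real response. It assumes PHP 7 or later, where unserialize() accepts an allowed_classes option; passing false there stops a remote response from instantiating any named class.

```php
<?php
// Hypothetical helper: decode a serialized PHP response defensively.
// Returns the decoded array, or null if the data is not a serialized array.
function decode_response($data)
{
    if (!is_string($data)) {
        return null; // e.g. file_get_contents() returned false
    }
    // allowed_classes => false (PHP 7+) keeps named classes from being instantiated.
    $decoded = @unserialize($data, ['allowed_classes' => false]);
    return is_array($decoded) ? $decoded : null;
}

// Stand-in for a real response, built locally for illustration only.
$sample = serialize([
    'ResultSet' => [
        'Result' => [
            ['Title' => 'Madonna', 'Url' => 'http://www.madonna.com/'],
        ],
    ],
]);

$results = decode_response($sample);
echo $results['ResultSet']['Result'][0]['Title'], "\n";
```

Returning null for anything that is not a serialized array gives the calling code a single failure case to check before it starts looping over results.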

Building a real site search system

I added results=1 to the previous example to cut down on the data shown, but let's take that out (the number of results will default to 10) and do something real. Ignoring my multi-line file_get_contents() URL, we can build a functional web search in just five lines of code. You can experiment with that array (go ahead, a foreach works fine), but here's how I did it:

<?php
$data = file_get_contents('http://search.yahooapis.com/'.
    'WebSearchService/V1/webSearch?'.
    'appid=YahooDemo&query=madonna'.
    '&output=php');
$results = unserialize($data);
foreach ($results['ResultSet']['Result'] as $result) {
    echo "<h3><a href=\"{$result['Url']}\">{$result['Title']}</a></h3>\n";
    echo "<p>{$result['Summary']}</p>\n";
}

Load it up in your web browser or run it via CLI. Provided PHP can connect to the Yahoo API server, you'll see something like the following:

<h3><a href="http://www.madonna.com/">Madonna</a></h3>
<p>Official site of pop diva Madonna, with news, music, media, and fan club.</p>
<h3><a href="http://madonnalicious.typepad.com/">madonnalicious</a></h3>
<p>Pictures, articles, downloads, concert info, news, and more about Madonna.</p>
<h3><a href="http://www.myspace.com/madonna">MySpace.com - Madonna - Pop / Rock - www.myspace.com/madonna</a></h3>
<p>Madonna MySpace page with news, blog, music downloads, desktops, wallpapers, and more.</p>
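The loop above prints the API's fields straight into the page. The tutorial does not say whether those fields can contain characters such as & or <, so a more defensive sketch escapes each one with htmlspecialchars() first. The helper name render_result() and the sample data are hypothetical, made up for illustration.

```php
<?php
// Hypothetical helper: turn one result into HTML, escaping every field.
function render_result(array $result)
{
    $url     = htmlspecialchars($result['Url'], ENT_QUOTES, 'UTF-8');
    $title   = htmlspecialchars($result['Title'], ENT_QUOTES, 'UTF-8');
    $summary = htmlspecialchars($result['Summary'], ENT_QUOTES, 'UTF-8');
    return "<h3><a href=\"{$url}\">{$title}</a></h3>\n<p>{$summary}</p>\n";
}

// Made-up result for illustration; a real one would come from the API.
$example = [
    'Url'     => 'http://example.com/?a=1&b=2',
    'Title'   => 'Tips & <tricks>',
    'Summary' => 'A made-up summary.',
];
echo render_result($example);
```

Inside the foreach, the two echo lines could then be swapped for a single echo render_result($result);.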

But wait - we're building a specific site search here, and chances are you aren't terribly interested in Madonna. The web service has yet another parameter up its sleeve: site (unsurprisingly). Let's say I was building a site search for engadget.com, and I needed to give users a way to choose what to search for. First, we set the site parameter to engadget.com, and then we set the query to a user-supplied value. We'll use a simple form for the user to enter their search query, and then pass it to the Yahoo APIs from $_GET. Here's what I came up with:

<form action="" method="get">
<input type="text" name="q" /><input type="submit" />
</form>
<?php
if (isset($_GET['q'])) {
    // Encode the query so spaces and characters like & survive in the URL.
    $q = urlencode($_GET['q']);
    $data = file_get_contents('http://search.yahooapis.com/'.
        'WebSearchService/V1/webSearch?'.
        'appid=YahooDemo&query='.$q.
        '&output=php&site=engadget.com');
    $results = unserialize($data);
    foreach ($results['ResultSet']['Result'] as $result) {
        echo "<h3><a href=\"{$result['ClickUrl']}\">{$result['Title']}</a></h3>\n";
        echo "<p>{$result['Summary']}</p>\n";
    }
}

One more thing to note here - ClickUrl. If you looked closely at the full output of the array we unserialized earlier, you would have seen the 'ClickUrl' element. It's rather long, and not terribly interesting, so I've left it out of the demonstrations, but when sending a user to a link you fetch from the Yahoo services, you should use ClickUrl and not just Url. By using ClickUrl, you let the great folks at Yahoo analyse clicks to improve the quality of their search service - which is good for everyone. The ClickUrl is hosted at Yahoo, but it sends the user straight on to the normal Url.
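As a small sketch of that advice, the helper below prefers ClickUrl and falls back to Url when ClickUrl is missing. The name result_link() and the sample values are hypothetical, and the fallback is my own assumption; the tutorial does not say whether ClickUrl can ever be absent.

```php
<?php
// Hypothetical helper: prefer ClickUrl, fall back to Url if it is absent.
function result_link(array $result)
{
    if (!empty($result['ClickUrl'])) {
        return $result['ClickUrl'];
    }
    return isset($result['Url']) ? $result['Url'] : '';
}

// Made-up values for illustration; real ClickUrl values are much longer.
$withClick    = ['Url' => 'http://example.com/', 'ClickUrl' => 'http://example.org/click'];
$withoutClick = ['Url' => 'http://example.com/'];

echo result_link($withClick), "\n";
echo result_link($withoutClick), "\n";
```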

Anyway, this is not the most elegant solution, but it is probably one of the simplest. Load it up in your web browser and search for 'iphone'; your form and first result will look something like this:

The Apple iPhone - Engadget

... history -- and that's saying a lot -- the iPhone has been announced today. ... partnership with Yahoo will allow all iPhone customers to hook up with free push ...

Compare that to searching for iphone site:engadget.com on a normal Yahoo search page.

Essentially, Yahoo just gave you the full power of their search system.

A note on application IDs

You might have noticed the appid parameter in our calls to the Yahoo web service. This parameter is the application ID, and it allows Yahoo to distinguish your application from everyone else's. While you are just testing, it's okay to use the 'YahooDemo' application ID, but when you build a real application you should register it with Yahoo.

If something goes wrong with your application, they may need to shut it down entirely and cut off your access to the APIs. By registering with Yahoo, you provide them with some basic contact details and details of your application. If they see something wrong with queries coming from your application, they can then easily work out that you are in charge of the application, and contact you before taking any actions that might break your code.
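Once you have your own ID, it helps to keep it in one place rather than pasting it into every URL. The sketch below is one possible arrangement, not something Yahoo prescribes: the constant SEARCH_APP_ID and the function build_search_url() are hypothetical names, and 'YahooDemo' stands in for the ID you would receive on registering. It uses http_build_query(), which also takes care of URL-encoding the query.

```php
<?php
// Placeholder only: replace with the ID issued when you register.
define('SEARCH_APP_ID', 'YahooDemo');

// Hypothetical helper: assemble the request URL from its parameters.
function build_search_url($appId, $query, $site)
{
    $params = [
        'appid'  => $appId,
        'query'  => $query,
        'site'   => $site,
        'output' => 'php',
    ];
    // http_build_query() URL-encodes each value (a space becomes "+").
    return 'http://search.yahooapis.com/WebSearchService/V1/webSearch?'
        . http_build_query($params, '', '&');
}

echo build_search_url(SEARCH_APP_ID, 'iphone 3g', 'engadget.com'), "\n";
```

The returned string can be passed directly to file_get_contents() in place of the hand-concatenated URL used earlier.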



Copyright ©2007 NetVisits, Inc Network. All Rights Reserved. Privacy Policy.