PHP Site Search Made Easy
by: Akash Mehta

Why site search?

When users want to find information on your website, they first look for a search box. Failing that, they head back to Google - potentially finding a competing site and taking their business elsewhere, or simply becoming frustrated with your service. But implementing effective search doesn't have to be hard. In this tutorial, I'll show you how to build a basic site-specific web search in just five lines of code, using the Yahoo! APIs.

Today's websites have a lot of content. Wikipedia, for example, can
compress absolutely all its content into one 6.4 GB archive. And that's
before it decompresses to up to 20 times that. Needless to say,
effectively searching all that data can be a real challenge, and
chances are at the end of the day you'll still be stuck with a slow and
ineffective search system. Your users will resort to using Google or
Yahoo to search for your content. Unless you have an experienced user base that will use the site: operator, those searches can just as easily lead them to someone else's site.

Well, at least they got their search results quickly, some might think. But it doesn't have to be that way. With many of the popular web search providers offering APIs, you can quickly use the power of their engines and the quality of their indexes to give your users high-quality search any way you want it. Forget Google's "customised site search" box, where you send the user to Google with nothing but a few site parameters. Using APIs, you can do whatever you want. Search only for registered users? Sure. Showing an actual image from the page next to a result? You got it. AJAX result loading? Absolutely. Anything to get a real, effective site search to your users.

A brief crash course on search APIs

In this tutorial, we're going to build a site search system for a website using the Yahoo search web services. The web services provided by Yahoo are essentially web-based, machine-readable interfaces to Yahoo's various products. There are quite a few web services made available; head over to developer.yahoo.com for a full list (scroll down to "Services" in the sidebar).

Let's get RESTful

The web search API, or application programming interface, falls into the category of RESTful web services - that is, it's an API delivered over the web that follows the REST architectural style. As far as you're concerned, REST, or REpresentational State Transfer, just involves HTTP and URLs (URIs, actually) - technologies and concepts you will be familiar with. To demonstrate, here's a sample URL to access the search web service:

http://search.yahooapis.com/WebSearchService/V1/webSearch?appid=YahooDemo&query=madonna

A couple of things to note here. First, it's a perfectly normal URL:
you can load it up in your web browser and see the XML returned.
Second, there's clearly a query parameter in there, carrying our search term. The XML that comes back starts like this:

<ResultSet> [...]

I've removed a fair bit of information, but this is enough to demonstrate how the information is provided. If you load up a Yahoo search page and search for "madonna", you'll get the same result. Now, this is entirely machine readable - it's XML - but we can go one step further. If we add another parameter to the URL, output=php, the service returns the same data as a serialized PHP array.

Doesn't look like much? Run it through the PHP unserialize() function:

[ResultSet] => Array [...]

Now that looks easy to work with. But how did we get to this stage? Well, as the APIs are accessed via a simple URL, we can first fetch the data using file_get_contents() and then unserialize it:

<?php [...]

Go ahead, run it on your web server. You'll see roughly the sample above, plus a few extra elements.

Building a real site search system

Now, I've added a [...]

<?php [...]

Load it up in your web browser or run it via CLI. Provided PHP can connect to the Yahoo API server, you'll see something like the following:

[...]
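As a rough, offline sketch of the unserialize step described above, the following works on a hand-written stand-in for the response rather than real API output. The ResultSet, Url and ClickUrl names come from this article; the Title and Summary field names, the extract_results() helper and the sample values are assumptions made for illustration.

```php
<?php
// Hand-written stand-in for a serialized response (NOT real API output).
// Title and Summary are assumed field names; ResultSet, Url and ClickUrl
// are the names mentioned in the article.
$sample = serialize(array(
    'ResultSet' => array(
        'Result' => array(
            array(
                'Title'    => 'Example page',
                'Summary'  => 'A made-up summary.',
                'Url'      => 'http://www.example.com/',
                'ClickUrl' => 'http://www.example.com/click',
            ),
        ),
    ),
));

/**
 * Hypothetical helper: turn a serialized response body into a list of
 * result arrays. Returns an empty array if the body is not the structure
 * we expect (for example the plain text "Array").
 */
function extract_results($body)
{
    // allowed_classes => false (PHP 7+) stops remote data creating objects.
    $data = @unserialize($body, array('allowed_classes' => false));
    if (!is_array($data)
        || !isset($data['ResultSet']['Result'])
        || !is_array($data['ResultSet']['Result'])) {
        return array();
    }
    return $data['ResultSet']['Result'];
}

$results = extract_results($sample);
foreach ($results as $result) {
    echo $result['Title'], ' - ', $result['Url'], "\n";
}
```

Against the live service, the body would come from file_get_contents() on the request URL instead of the $sample string.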
But wait - we're building a specific site search here, and chances are you aren't terribly interested in Madonna. The web service has yet another parameter up its sleeve: site, which restricts the results to a single domain.
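A minimal sketch of such a site-restricted request, assuming the parameter in question is named site and that output=php selects serialized PHP output. The build_site_search_url() helper is hypothetical; the endpoint and the YahooDemo application ID are taken from the sample URL earlier in the article.

```php
<?php
/**
 * Hypothetical helper: build a web search request URL restricted to one site.
 * 'YahooDemo' is the demo application ID from the article's sample URL;
 * swap in your own registered ID.
 */
function build_site_search_url($query, $site, $appId = 'YahooDemo')
{
    $params = array(
        'appid'  => $appId,
        'query'  => $query,
        'site'   => $site,  // assumed name of the site-restriction parameter
        'output' => 'php',  // assumed value for serialized PHP output
    );
    // http_build_query() URL-encodes each value; '&' is passed explicitly
    // so the separator does not depend on php.ini settings.
    return 'http://search.yahooapis.com/WebSearchService/V1/webSearch?'
        . http_build_query($params, '', '&');
}

echo build_site_search_url('iphone', 'engadget.com'), "\n";
```

Building the query string with http_build_query() rather than by hand means a search term containing spaces or ampersands cannot break the URL.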
<form action="" method="get"> [...]

One more thing to note here - ClickUrl. If you looked at the output of the array we unserialized earlier, you would have seen the 'ClickUrl' parameter. It's rather long, and not terribly interesting, so I've left it out of the demonstrations, but when sending a user to a link you fetch from the Yahoo services, you should be using the ClickUrl parameter and not just Url. By using ClickUrl, the great folks at Yahoo can analyse how to improve the quality of their search service - which is good for everyone. The ClickUrl is hosted at Yahoo, but it sends the user straight on to the normal Url.

Anyway, this is not the most elegant solution, but probably one of the simplest. Load it up in your web browser and search for 'iphone'; your form and first result will look something like this:
Compare that to searching for iphone site:engadget.com in a normal Yahoo search page:
Essentially, Yahoo just gave you the full power of their search system.

A note on application IDs

You might have noticed the appid parameter in the request URL; it identifies your application to Yahoo. If something goes wrong with your application, Yahoo may need to shut it down entirely and cut off your access to the APIs. By registering with Yahoo, you provide them with some basic contact details and details of your application. If they see something wrong with queries coming from your application, they can then easily work out that you are in charge of the application, and contact you before taking any actions that might break your code.

Building a real site search system

So, we've built a simple site search system with a form and a raw call to the web services. However, we have no validation, and our results aren't exactly pretty. We also want the user to be able to search from elsewhere in our site, without having to go to a special search page. First, let's tackle the search system. We need to make a few changes.
The first will be easy - we just add a configuration variable. The second is also fairly simple: where we used [...]

The third, however, is a little tricky - Yahoo's standard error system doesn't really work for PHP output. Instead, it just gives us the text "Array", which not only will fail in unserialize(), but is also rather unhelpful when working out what went wrong. The services do provide HTTP error codes in the headers - such as 403 for forbidden - but working with these is awkward when all we have is file_get_contents(). The simplest way to get around this is to suppress errors, and then check that a single item in the result set actually exists - we can use [...]

Finally, to return more than 10 results, we simply add back the results parameter I mentioned earlier. This simply states how many results to return. The service also has a system for paging through results to find what the user is looking for, but unless you have thousands of pages you probably won't want to use this. Let's stick to 20 results, which should be more than enough for most sites. Incorporating these modifications, here's the code:

<form action="" method="get"> [...]

Save it as search.php and load it up in your web browser. Try searching for nothing; the script will simply show you the search box. Try searching for some random collection of characters - I chose 'dkrkwrialkc' - and it should tell you that no results were found. Now, we just need to make it a little more flexible.

Search from far and wide

When your users want to search, they won't look around your page for the link to a search page - they'll look for a search box outright, and they want to see one. Just about every major online portal has a search box on every page to help the user find their way around the site. We need to do the same.

Luckily, our search system is extensible out of the box. It doesn't check where the search is coming from, it just searches, so we can send the user to the search page from anywhere. And the best way to do that is to simply copy and paste the form. This HTML at the top of the script is the key:

<form action="" method="get"> [...]

The action attribute is empty, so the form submits to whatever page it sits on; when you paste it elsewhere, point it at search.php instead. Once that's taken care of, copy and paste the HTML for your form anywhere into your site. The top of a sidebar is a good choice, but so is somewhere in your header, or in a separate section between your header and main content. As long as it's "above the fold", or visible without scrolling, your users will make liberal use of the feature to navigate your site.
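For reference, the changes to the search script described earlier (a configurable site, an empty-query check, suppressed errors followed by a check that a result exists, and 20 results) might be sketched along these lines. This is not the article's original listing: the constant names, the site_search() and render_result() helpers, the $fetch hook, the site and output=php parameters and the Title field are all assumptions made for illustration.

```php
<?php
// Configuration (illustrative values, not from the article).
define('SEARCH_SITE', 'example.com');  // site to restrict results to
define('SEARCH_APPID', 'YahooDemo');   // replace with your own application ID
define('SEARCH_RESULTS', 20);          // number of results to request

/**
 * Run a site search and return an array of results (possibly empty).
 * $fetch is a hypothetical hook so the function can be tried offline;
 * by default it is PHP's own file_get_contents().
 */
function site_search($query, $fetch = 'file_get_contents')
{
    $query = trim((string) $query);
    if ($query === '') {
        return array();                // nothing to search for
    }
    $url = 'http://search.yahooapis.com/WebSearchService/V1/webSearch?'
        . http_build_query(array(
            'appid'   => SEARCH_APPID,
            'query'   => $query,
            'site'    => SEARCH_SITE,    // assumed parameter name
            'results' => SEARCH_RESULTS,
            'output'  => 'php',          // assumed output selector
        ), '', '&');

    // Suppress warnings: a failed request, or an error body such as the
    // text "Array", simply ends up as "no results".
    $body = @call_user_func($fetch, $url);
    $data = is_string($body)
        ? @unserialize($body, array('allowed_classes' => false))
        : false;

    // Check that at least one result actually exists.
    $results = (is_array($data) && isset($data['ResultSet']['Result']))
        ? $data['ResultSet']['Result']
        : null;
    if (!is_array($results) || !isset($results[0])) {
        return array();
    }
    return $results;
}

/** Render one result as an HTML list item, linking via ClickUrl. */
function render_result(array $result)
{
    return '<li><a href="' . htmlspecialchars($result['ClickUrl']) . '">'
        . htmlspecialchars($result['Title']) . '</a></li>';
}
```

In search.php, something like site_search(isset($_GET['q']) ? $_GET['q'] : '') could feed a loop over render_result(), where q is a hypothetical name for the form's text field; an empty return value covers both the empty search box and the "no results found" case.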
Further reading

Now that you've built your site search system, have a thorough read through the search web service specification and experiment with the data it gives you. Maybe you want to add pagination, or use the number of results returned to see if you need to suggest a different search query to the user. You definitely want to look into caching, to improve the performance of your search system and ease the load on Yahoo's API servers. Head to Yahoo's PHP Developer Center, where Rasmus Lerdorf himself provides some code samples for best practices.

If you're interested in exploring the Yahoo APIs, the Yahoo Developer Network should be your first port of call, where you'll find all the web service APIs thoroughly documented, as well as SDKs to help you learn how to work with the services.

© 2008 NetVisits, Inc. All rights reserved.