PHP Site Search Made Easy

By Akash Mehta | Mar 30, 2008

Why site search?

When users want to find information on your website, they first look for a search box. Failing that, they head back to Google – potentially finding a competing site and taking their business elsewhere, or simply becoming frustrated with your service. But implementing effective search doesn’t have to be hard. In this tutorial, I’ll show you how to build a basic site-specific web search in just five lines of code, using the Yahoo! APIs.

Today’s websites have a lot of content. Wikipedia, for example, can compress absolutely all its content into one 6.4 GB archive. And that’s before it decompresses to up to 20 times that. Needless to say, effectively searching all that data can be a real challenge, and chances are at the end of the day you’ll still be stuck with a slow and ineffective search system. Your users will resort to using Google or Yahoo to search for your content. Unless you have an experienced user base that will use the site: parameter, however, chances are they’ll end up on a site other than your own.

Well, at least they got their search results quickly, some might think. But it doesn’t have to be that way. With many of the popular web search providers offering APIs, you can quickly use the power of their engines and the quality of their indexes to give your users high quality search any way you want it. Forget Google’s “customised site search” box, where you send the user to Google with nothing but a few site parameters. Using APIs, you can do whatever you want. Search only for registered users? Sure. Showing an actual image from the page next to a result? You got it. AJAX result loading? Absolutely. Anything to get a real, effective site search to your users.

A brief crash course on search APIs

In this tutorial, we’re going to build a site search system for a website using the Yahoo search web services. The web services provided by Yahoo are essentially web-based machine-readable interfaces to Yahoo’s various products. There are quite a few web services made available; head over to developer.yahoo.com for a full list (scroll down to “Services” in the sidebar).

Let’s get RESTful

The web search API, or application programming interface, falls into the category of RESTful web services – that is, it’s an API delivered over the web that follows the REST architectural style. As far as you’re concerned, REST, or REpresentational State Transfer, just involves HTTP and URLs (URIs, actually) – technologies and concepts you will be familiar with. To demonstrate, here’s a sample URL to access the search web service:

http://search.yahooapis.com/WebSearchService/V1/webSearch?appid=YahooDemo&query=madonna

A couple of things to note here. First, it’s a perfectly normal URL: you can load it up in your web browser and see the XML returned. Second, there’s clearly a query parameter in the URL that we can change. Here’s a sample of what the results might look like:

<ResultSet>
 <Result>
  <Title>Madonna</Title>
  <Summary>
Official site of pop diva Madonna, with news, music, media, and fan club.
  </Summary>
  <Url>http://www.madonna.com/</Url>
  <DisplayUrl>www.madonna.com/</DisplayUrl>
  <ModificationDate>1206428400</ModificationDate>
  <MimeType>text/html</MimeType>
  <Cache><Size>18519</Size></Cache>
 </Result>
</ResultSet>

I’ve removed a fair bit of information, but this is enough to demonstrate how the information is provided. If you load up a Yahoo search page and search for “madonna”, you’ll get the same result. Now, this is entirely machine readable – it’s XML – but we can go one step further. If we add another parameter to the URL, output, and give it a value of php, we get the following:

a:1:{s:9:"ResultSet";a:6:{s:6:"Result";
a:1:{i:0;a:8:{s:5:"Title";s:7:"Madonna";s:7:"Summary";
s:73:"Official site of pop diva Madonna, with news, music, media, and fan club.";s:3:"Url";s:23:"http://www.madonna.com/";s:10:"DisplayUrl";
s:16:"www.madonna.com/";s:16:"ModificationDate";i:1206428400;
s:8:"MimeType";s:9:"text/html";s:5:"Cache";a:2:{s:3:"Url";
s:316:"http://uk.wrs.yahoo.com/_ylt=A0Je5VfTHupHWj4AgjbdmMwF;
_ylu=X3oDMTBwOHA5a2tvBGNvbG8DdwRwb3MDMQRzZWMDc3IEdnRpZAM-/SIG=15vvp3oak/EXP=
1206612051/**http%3A//66.218.69.11/search/cache%3Fei=UTF-8%26appid=
YahooDemo%26query=madonna%26results=1%26output=php%26u=
www.madonna.com/%26w=madonna%26d=A_opCvH_Qg-3%26icp=1%26.intl=
us";s:4:"Size";s:5:"18519";}}}}}

Doesn’t look like much? Run it through the PHP unserialize() function, and then print_r().

[ResultSet] => Array
 (
  [Result] => Array
   (
    [0] => Array
     (
      [Title] => Madonna
      [Summary] => Official site of pop diva Madonna, with news, music, media,
	  and fan club.
      [Url] => http://www.madonna.com/
      [DisplayUrl] => www.madonna.com/
      [ModificationDate] => 1206428400
      [MimeType] => text/html
      [Cache] => Array
       (
        [Size] => 18519
       )
     )
   )
 )

Now that looks easy to work with. But how did we get to this stage? Well, as the APIs are accessed via a simple URL, we can first fetch the data using file_get_contents(). Now, this will give us the mess of characters we saw earlier. We then run it through unserialize() and finally print_r(). Here’s the code:

<?php
$data = file_get_contents('http://search.yahooapis.com/'.
                          'WebSearchService/V1/webSearch?'.
                          'appid=YahooDemo&query=madonna'.
                          '&results=1&output=php');
echo '<pre>'.print_r(unserialize($data),true).'</pre>';

Go ahead, run it on your web server. You’ll see roughly the sample above, plus a few extra elements.
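
To see what unserialize() hands back without touching the network, here is a minimal sketch that works on a hand-shortened serialized string modelled on the sample above. The string is illustrative, not real API output, and the allowed_classes option (available from PHP 7) is my own precaution rather than something the service requires.

```php
<?php
// A trimmed, hand-written stand-in for the service's output=php response.
$data = 'a:1:{s:9:"ResultSet";a:1:{s:6:"Result";a:1:{i:0;a:2:{'
      . 's:5:"Title";s:7:"Madonna";'
      . 's:3:"Url";s:23:"http://www.madonna.com/";}}}}';

// allowed_classes => false stops unserialize() from instantiating objects,
// a sensible default for any data fetched from a remote server.
$results = unserialize($data, ['allowed_classes' => false]);

// The nested array mirrors the XML structure: ResultSet > Result > item.
$first = $results['ResultSet']['Result'][0];
echo $first['Title'], ' - ', $first['Url'], "\n";
```

Each result is an ordinary associative array, so fields such as Title and Url can be read directly by key.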

Building a real site search system

Now, I’ve added a results=1 to our previous example, to cut down on data here, but let’s take that out (it will default to 10) and do something real. Ignoring my multi-line file_get_contents() URL, we can build a functional web search in just five lines of code. You can experiment with that array (go ahead, a foreach works fine), but here’s how I did it:

<?php
$data = file_get_contents('http://search.yahooapis.com/'.
                          'WebSearchService/V1/webSearch?'.
                          'appid=YahooDemo&query=madonna'.
                          '&output=php');
$results = unserialize($data);
foreach ($results['ResultSet']['Result'] as $result) {
 echo "<h3><a href=\"{$result['Url']}\">{$result['Title']}</a></h3>\n";
 echo "<p>{$result['Summary']}</p>\n";
}

Load it up in your web browser or run it via CLI. Provided PHP can connect to the Yahoo API server, you’ll see something like the following:

<h3><a href="http://www.madonna.com/">Madonna</a></h3>
<p>Official site of pop diva Madonna, with news, music, media, and fan club.</p>
<h3><a href="http://madonnalicious.typepad.com/">madonnalicious</a></h3>
<p>Pictures, articles, downloads, concert info, news, and more about Madonna.</p>
<h3><a href="http://www.myspace.com/madonna">MySpace.com - Madonna - Pop / Rock - www.myspace.com/madonna</a></h3>
<p>Madonna MySpace page with news, blog, music downloads, desktops, wallpapers, and more.</p>

But wait – we’re building a specific site search here, and chances are you aren’t terribly interested in Madonna. The web service has yet another parameter up its sleeve: site (unsurprisingly). Let’s say I was building a site search for engadget.com, and I needed to give users a way to actually choose what to search for. First, we set the site parameter to engadget.com, and then we set the actual query to a user supplied value. We’ll use a simple form for the user to enter their search query, and then pass it to the Yahoo APIs from $_GET. Here’s what I came up with:

<form action="" method="get">
<input type="text" name="q" /><input type="submit" />
</form>
<?php
if (isset($_GET['q'])) {
 $q    = urlencode($_GET['q']); // encode user input for use in a URL
 $data = file_get_contents('http://search.yahooapis.com/'.
                           'WebSearchService/V1/webSearch?'.
                           'appid=YahooDemo&query='.$q.
                           '&output=php&site=engadget.com');
 $results = unserialize($data);
 foreach ($results['ResultSet']['Result'] as $result) {
  echo "<h3><a href=\"{$result['ClickUrl']}\">{$result['Title']}</a></h3>\n";
  echo "<p>{$result['Summary']}</p>\n";
 }
 }
}

One more thing to note here – ClickUrl. If you noticed the output of the array we unserialized earlier, you would have seen the ‘ClickUrl’ parameter. It’s rather long, and not terribly interesting, so I’ve left it out of the demonstrations, but when sending a user to a link you fetch from the Yahoo services, you should be using the ClickUrl parameter and not just Url. By using ClickUrl, the great folks at Yahoo can analyse how to improve their engine to improve the quality of their search service – which is good for everyone. When you send a user to the ClickUrl, it is hosted at Yahoo but it will send the user right back to the normal Url.
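
As a rough sketch of how a result could be rendered with ClickUrl, the helper below prefers ClickUrl, falls back to Url, and escapes every field before it reaches the page. The function name render_result() is hypothetical, and the escaping assumes the fields arrive as plain text rather than pre-escaped HTML.

```php
<?php
// Hypothetical helper: turn one result array into an HTML snippet.
function render_result(array $result): string
{
    // Prefer ClickUrl, falling back to Url if it is missing.
    $href = $result['ClickUrl'] ?? $result['Url'] ?? '#';

    // Escape a value for use in HTML text and attribute values.
    $esc = function ($value) {
        return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
    };

    return '<h3><a href="' . $esc($href) . '">'
         . $esc($result['Title'] ?? '') . "</a></h3>\n"
         . '<p>' . $esc($result['Summary'] ?? '') . "</p>\n";
}

// Made-up sample data, chosen to include characters that need escaping.
echo render_result([
    'Title'    => 'Tom & Jerry',
    'Summary'  => 'Cat <chases> mouse.',
    'Url'      => 'http://example.com/',
    'ClickUrl' => 'http://example.com/click?a=1&b=2',
]);
```

Escaping matters here because titles and summaries come from pages you don't control.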

Anyway, this is not the most elegant solution, but probably one of the simplest. Load it up in your web browser and search for ‘iphone’; your form and first result will look something like this:

The Apple iPhone – Engadget

… history — and that’s saying a lot — the iPhone has been announced today. … partnership with Yahoo will allow all iPhone customers to hook up with free push …

Compare that to searching for iphone site:engadget.com on a normal Yahoo search page, and you should see the same result.

Essentially, Yahoo just gave you the full power of their search system.

A note on application IDs

You might have noticed the appid parameter in our calls to the Yahoo web service. This parameter represents the application ID, and allows Yahoo to identify your application from everyone else’s. While just testing, it’s okay to use the ‘YahooDemo’ application ID, but when you go to build a real application you should register it with Yahoo.

If something goes wrong with your application, they may need to shut it down entirely and cut off your access to the APIs. By registering with Yahoo, you provide them with some basic contact details and details of your application. If they see something wrong with queries coming from your application, they can then easily work out that you are in charge of the application, and contact you before taking any actions that might break your code.
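
One way to make the switch from ‘YahooDemo’ to your own ID painless is to keep the application ID in a single place and build the request URL from it. The sketch below is an assumption about how you might organise this; the constant SEARCH_APP_ID and the function build_search_url() are hypothetical names, and only the parameters already shown in this tutorial are used.

```php
<?php
// Swap in your own registered application ID; YahooDemo is for testing only.
define('SEARCH_APP_ID', 'YahooDemo');

// Hypothetical helper: assemble the request URL from its parts.
function build_search_url(string $query, string $site, int $results = 10): string
{
    $params = [
        'appid'   => SEARCH_APP_ID,
        'query'   => $query,
        'site'    => $site,
        'results' => $results,
        'output'  => 'php',
    ];
    // http_build_query() URL-encodes each value, so spaces and ampersands
    // in the user's query cannot break the URL.
    return 'http://search.yahooapis.com/WebSearchService/V1/webSearch?'
         . http_build_query($params, '', '&');
}

echo build_search_url('iphone 3g & apps', 'engadget.com'), "\n";
```

With this in place, changing the application ID is a one-line edit.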

Building a real site search system

So, we’ve built a simple site search system with a form and a raw call to the web services. However, we have no validation, and our results aren’t exactly pretty. We also want the user to be able to search from elsewhere in our site, without having to go to a special search page.

First, let’s tackle the search system. We need to:

  • Make it easy to configure which site we want to search (i.e. ours)
  • Validate that the user actually entered something
  • Make sure no errors were encountered in the web search call
  • Return more than 10 results

The first will be easy – we just add a configuration variable.

The second is also fairly simple: where we used isset($_GET['q']), we can add AND !empty($_GET['q']), which will make sure that not only did the user submit the form, they also entered something into the search box.

The third, however, is a little tricky – Yahoo’s standard error system doesn’t really work for PHP output. Instead, it just gives us the text “Array”, which not only will fail in unserialize(), but is also rather unhelpful when working out what went wrong. The services do provide HTTP error codes in the headers – such as 403 for forbidden – but reading these with a bare file_get_contents() call is awkward. The simplest way around this is to suppress errors and then check that the first item in the result set actually exists – we can use isset() on item 0.
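
The check described above can be sketched as a small helper that returns an empty array whenever the response is unusable. The name extract_results() is hypothetical, and the allowed_classes option is an extra precaution of mine, not part of the original approach.

```php
<?php
// Hypothetical helper: return the list of results, or an empty array on failure.
function extract_results($data): array
{
    // file_get_contents() returns false when the request fails.
    if (!is_string($data) || $data === '') {
        return [];
    }

    // @ suppresses the error unserialize() raises on malformed input
    // such as the literal text "Array".
    $results = @unserialize($data, ['allowed_classes' => false]);

    // A usable response is a list with at least one item at index 0.
    $list = $results['ResultSet']['Result'] ?? null;
    if (!is_array($list) || !isset($list[0])) {
        return [];
    }
    return $list;
}

// Stand-in inputs: an error response, a failed request, and one real result.
$ok = serialize(['ResultSet' => ['Result' => [['Title' => 'Example']]]]);
echo count(extract_results('Array')), "\n";
echo count(extract_results(false)), "\n";
echo count(extract_results($ok)), "\n";
```

The calling code can then treat an empty array as “no results” without caring why the request failed.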

Finally, to return more than 10 results, we simply add back the results parameter I mentioned earlier. This simply states how many results to return. The service also has a system for paging through results to find what the user is looking for, but unless you have thousands of pages you probably won’t want to use this. Let’s stick to 20 results, which should be more than enough for most sites.

Incorporating these modifications, here’s the code:

<form action="" method="get">
<input type="text" name="q" /><input type="submit" />
</form>
<?php
$site = 'engadget.com'; // the site to search
if (isset($_GET['q']) AND !empty($_GET['q'])) {
 $q    = urlencode($_GET['q']); // encode user input for use in a URL
 $data = file_get_contents('http://search.yahooapis.com/'.
                           'WebSearchService/V1/webSearch'.
                           '?appid=YahooDemo&query='.$q.
                           '&output=php&site='.$site.
                           '&results=20');
 $results = @unserialize($data); // @ = suppress errors
 if (isset($results['ResultSet']['Result'][0])) {
  foreach ($results['ResultSet']['Result'] as $result) {
   echo "<h3><a href=\"{$result['ClickUrl']}\">{$result['Title']}</a></h3>\n";
   echo "<p>{$result['Summary']}</p>\n";
  }
 } else {
  echo "<p>No results were found.</p>";
 }
}

Save it as search.php and load it up in your web browser. Try searching for nothing; the script will simply show you the search box. Try searching for some random collection of characters – I chose ‘dkrkwrialkc’ – it should tell you that no results were found. Now, we just need to make it a little more flexible.

Search from far and wide

When your users want to search, they won’t look around your page for the link to a search page – they’ll look for a search box outright, and they want to see one. Just about every major online portal has a search box on every page to help the user find their way around the site. We need to do the same.

Luckily, our search system is extendible out of the box. It doesn’t check where the search is coming from, it just searches, so we can send the user to the search page from anywhere. And the best way to do that is to simply copy and paste the form. This HTML at the top of the script is the key:

<form action="" method="get">
<input type="text" name="q" /><input type="submit" />
</form>

The action attribute is the vital element. Wherever we put the script, we simply make sure the action attribute points to the location of search.php. For example, if I put search.php in the root of my website – i.e. accessible at http://mydomain.com/search.php – I can set action to /search.php and copy the form to any page under mydomain.com. If that isn’t an option, consider using mod_rewrite to make search.php in your root point to the right location, as this is the easiest option by far for the action attribute.
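
If your pages are already PHP, an alternative to pasting the HTML by hand is to generate the form from one small function. This is only a sketch; search_form() is a hypothetical name, and /search.php is the example location used above.

```php
<?php
// Hypothetical helper: emit the search form, pointing at wherever search.php lives.
function search_form(string $action = '/search.php'): string
{
    // Escape the path so it is safe inside the attribute value.
    $action = htmlspecialchars($action, ENT_QUOTES, 'UTF-8');

    return '<form action="' . $action . '" method="get">' . "\n"
         . '<input type="text" name="q" /><input type="submit" />' . "\n"
         . '</form>' . "\n";
}

// Drop this call into a shared header or sidebar template.
echo search_form();
```

If search.php ever moves, only the default argument needs to change.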

Once that’s taken care of, copy and paste the HTML for your form anywhere into your site. The top of a sidebar is a good choice, but so is somewhere in your header, or in a separate section between your header and main content. As long as it’s “above the fold”, or visible without scrolling, your users will make liberal use of the feature to navigate your site.

Further reading

Now that you’ve built your site search system, have a thorough read through the search web service specification and experiment with the data it gives you. Maybe you want to add pagination, or use the number of results returned to see if you need to suggest a different search query to the user.

You definitely want to look into caching, to improve the performance of your search system and ease the load on Yahoo’s API servers. Head to Yahoo’s PHP Developer Center, where Rasmus Lerdorf himself provides some code samples for best practices.
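
To give a feel for what caching might look like, here is a minimal file-based sketch. It is not taken from Yahoo’s samples; cached_fetch(), the cache file naming, and the use of the system temp directory are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: return a cached response for $url if it is fresh enough,
// otherwise call $fetch($url), store the result, and return it.
function cached_fetch(string $url, int $ttl, callable $fetch, ?string $dir = null)
{
    $dir  = $dir ?? sys_get_temp_dir();
    $file = $dir . DIRECTORY_SEPARATOR . 'search_' . md5($url) . '.cache';

    clearstatcache(true, $file); // avoid stale file information

    // Serve from the cache while the file is younger than $ttl seconds.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);
    }

    $data = $fetch($url);
    if (is_string($data) && $data !== '') {
        file_put_contents($file, $data, LOCK_EX); // only cache real responses
    }
    return $data;
}

// Demonstration with a stand-in fetcher that counts how often it is called.
$calls = 0;
$fake  = function ($url) use (&$calls) { $calls++; return 'response for ' . $url; };
$url   = 'http://example.com/?query=test-' . uniqid();

cached_fetch($url, 60, $fake);
cached_fetch($url, 60, $fake); // answered from the cache, so $fake is not called again
echo $calls, "\n";
```

In search.php you would pass 'file_get_contents' as the fetcher and choose a lifetime that suits how often your content changes.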

If you’re interested in exploring the Yahoo APIs, the Yahoo Developer Network should be your first port of call, where you’ll find all the web service APIs thoroughly documented, as well as SDKs to help you learn how to work with the services.


Author Description

Akash Mehta is a web solutions consultant and application developer. He regularly advises website owners and small businesses on their online challenges, while researching and writing on innovative uses of web-related technologies for the developer community. In his copious free time, he enjoys cycling and investigating creative accounting methods.

    © 2003 - 2013 DeveloperTutorials.com. All Rights Reserved. Privacy Policy.