Google Suggest With PHP

By Robert Plank | May 25, 2005

Google Suggest is an experiment that puts an "autocomplete" into search queries.  This isn’t brand new; I’ve seen this sort of thing done on sites like PHP.net on and off for about a year.  Basically, you type in the first few letters of a function name and it suggests a bunch of matches to you.

Gavi Narra (http://www.objectgraph.com/dictionary) came up with a cool way of showing how to do this on your own: by making a dictionary suggestion tool.  I liked his demo so much I decided to show how it could be done in PHP.

The first step is getting a copy of the Webster 1913 dictionary.  It’s a good one to use because it’s out of copyright.  There are other public domain dictionaries you can use, like WordNet, but that one might be too hard for you to import into a database because it’s made up of thousands of little individual files.

Webster 1913 is available on Project Gutenberg but I don’t like to use that one since they put in line breaks which are a pain to take out.  I prefer this version: http://www.jumpx.com/tutorials/googlesuggest/Webster-1913.gz (right click and save)

If you don’t want to wait to download it to your computer, and then upload it to your server, and you have shell access you can just do:

wget http://www.jumpx.com/tutorials/googlesuggest/Webster-1913.gz

If you download this with a "regular" browser it will probably decompress the file for you.  You download about 5 megs but the result should be around 30.  If the file still looks small, you will have to gunzip it yourself.

So, on your server do this: gunzip *.gz

Now the gzip file should be gone, and you will have a new file in there called Webster-1913.

I tried reading this into a variable using the file() function, which puts each line of a file into its own array element, but it didn’t work.  That’s because this file was saved on a Mac, so the end of each line is marked by a carriage return (\r) instead of a newline character (\n).

This is still easy to get around.  First, read the whole thing into one huge string…

$file = "Webster-1913";
$fp = fopen($file, "r");
$contents = fread($fp, filesize($file));
fclose($fp);

And then explode by the "\r" character.

$contents = explode("\r", $contents);

We don’t have to trim the whole array now, like we would’ve with file().
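
If you aren’t sure which line ending a file uses, one option is to split on all three common endings at once.  This is only a sketch: split_lines() is a made-up helper name, not part of the original script.

```php
<?php
// Hypothetical helper: split text on Windows (\r\n), classic Mac (\r),
// or Unix (\n) line endings. \r\n is listed first so it counts as one break.
function split_lines($text) {
    return preg_split('/\r\n|\r|\n/', $text);
}

// A Mac-style sample with carriage returns only
$sample = "first\rsecond\rthird";
print_r(split_lines($sample));
```

For the Webster file a plain explode("\r", ...) is enough; the regular expression only helps when the line endings are mixed or unknown.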

Loop through each of those lines…

foreach ($contents as $line) {
}

And inside that loop, use a regular expression to get the parts of the text we want:

preg_match_all("/<hw>(.*?)<\/hw>.*?<pos>(.*?)<\/pos>.*?<def>(.*?)<\/def>/", $line, $results, PREG_SET_ORDER);

This looks complicated, but it isn’t.  If you look at the dictionary file a sample line looks like this:

<p><hw>Ge*ne"va</hw> (?), <pos><i>n.</i></pos> <def>The chief city of Switzerland.</def></p>

The stuff between the <hw> and </hw> tags is the actual word.  The text inside the <pos> tags tells us if the word is a noun, preposition, etc.  Then, the text inside the <def> </def> tags contains the actual definition.

All that went into the regular expression, with the .*? in between them to show we need to skip over that text.  The parentheses around the wildcard mean we want to save that text and put it into our result array.  The slash in each closing tag is written as \/ because the forward slash is also the pattern’s delimiter, so it has to be escaped.

I put .*? in between each tag because there might be other stuff in between, like spaces or maybe alternate definitions which we don’t care about.

Now, we’re reading from the variable $line and putting the matches into the array called $results.  The PREG_SET_ORDER part at the end structures the array so the first set of matches goes into $results[0], the second set into $results[1], etc.  The "SET" in "PREG_SET_ORDER" isn’t the verb "set," it’s the noun "set," meaning we want to group our matches by each set.
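
To see the set ordering on its own, here is a small sketch that runs the same kind of pattern against the sample dictionary line.  The variable names $pattern and $sets are mine, and I’ve used # as the delimiter instead of / purely so the closing tags need no escaping.

```php
<?php
// Sample line taken from the dictionary file
$line = '<p><hw>Ge*ne"va</hw> (?), <pos><i>n.</i></pos> '
      . '<def>The chief city of Switzerland.</def></p>';

// Using # as the delimiter means the slashes in the closing tags
// do not need to be escaped.
$pattern = '#<hw>(.*?)</hw>.*?<pos>(.*?)</pos>.*?<def>(.*?)</def>#';

preg_match_all($pattern, $line, $sets, PREG_SET_ORDER);

// Each element of $sets is one complete match: [0] is the full text,
// [1] the headword, [2] the part of speech, [3] the definition.
foreach ($sets as $set) {
    echo $set[1] . " | " . $set[2] . " | " . $set[3] . "\n";
}
```

A line with no dictionary entry on it leaves $sets as an empty array, so the loop simply doesn’t run.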

So, if $results holds at least one set, there are matches.  This would be when:

count($results) > 0

So just to see what comes up, let’s print_r() that $results array when we find that first match, and then die() so that only the first match is shown.  Here’s your whole script now:

<?php

set_time_limit(0);

// Read file to a variable
$file = "Webster-1913";
$fp = fopen($file, "r");
$contents = fread($fp, filesize($file));
fclose($fp);

// Lines in this file end with a carriage return
$contents = explode("\r", $contents);

foreach ($contents as $line) {
   preg_match_all("/<hw>(.*?)<\/hw>.*?<pos>(.*?)<\/pos>.*?<def>(.*?)<\/def>/", $line, $results, PREG_SET_ORDER);
   if (count($results) > 0) {
      // This is a dictionary line
      print_r($results); die();
   }
}

?>

Save that as read.php.

This is better if you can run it from the shell.  If that’s the case, telnet or ssh into your host, browse to the folder and type "php read.php".  If you’re loading this from your browser, view the source after it loads instead of the rendered HTML output.

When I ran this my output was:

Array
(
   [0] => Array
       (
           [0] => <hw>A</hw> (&adot;), <pos><i>prep.</i></pos> [Abbreviated form of <i>an</i> (AS. <i>on</i>). See <u>On</u>.] <sn><b>1.</b></sn> <def>In; on; at; by.</def>
           [1] => A
           [2] => <i>prep.</i>
           [3] => In; on; at; by.
       )

)

Since we really only care about the first match, we can show only the contents of $results[0], or…

$info = $results[0];
print_r($info); die();

Array
(
   [0] => <hw>A</hw> (&adot;), <pos><i>prep.</i></pos> [Abbreviated form of <i>an</i> (AS. <i>on</i>). See <u>On</u>.] <sn><b>1.</b></sn> <def>In; on; at; by.</def>
   [1] => A
   [2] => <i>prep.</i>
   [3] => In; on; at; by.
)

Even better: we don’t need that first element, which is just the whole match.

$info = $results[0];
array_shift($info);
print_r($info); die();

Array
(
   [0] => A
   [1] => <i>prep.</i>
   [2] => In; on; at; by.
)

Now go into phpMyAdmin and make a new MySQL table.  Don’t forget to add an index for the "word" field.  Run this query and you’ll have an exact copy of my table:

CREATE TABLE `dict_list` (
 `id` int(11) NOT NULL auto_increment,
 `word` varchar(255) NOT NULL default '',
 `type` varchar(50) NOT NULL default '',
 `definition` text NOT NULL,
 KEY `id` (`id`),
 KEY `word` (`word`)
) ENGINE=MyISAM AUTO_INCREMENT=1 ;

There are still a few tiny things to be done, but I’ll just show you the changes I’ve made:

<?php

mysql_connect("localhost", "your_mysql_user", "your_mysql_password");
mysql_select_db("your_mysql_database");

set_time_limit(0);

// Read file to a variable
$file = "Webster-1913";
$fp = fopen($file, "r");
$contents = fread($fp, filesize($file));
fclose($fp);

// Lines in this file end with a carriage return
$contents = explode("\r", $contents);

$i = 0;

// Start from an empty table so re-running the script adds no duplicates
mysql_query("TRUNCATE dict_list") or die(mysql_error());

foreach ($contents as $line) {
   preg_match_all("/<hw>(.*?)<\/hw>.*?<pos>(.*?)<\/pos>.*?<def>(.*?)<\/def>/", $line, $results, PREG_SET_ORDER);
   if (count($results) > 0) {
      // This is a dictionary line
      $info = $results[0];
      array_shift($info);

      list($word, $type, $definition) = $info;

      // Strip the pronunciation marks from the word
      $word = preg_replace("/[*\"'|`-]/", "", $word);

      $word = addslashes($word);
      $type = addslashes($type);
      $definition = addslashes($definition);

      mysql_query("INSERT INTO dict_list SET word = '$word', type = '$type', definition = '$definition'") or die(mysql_error());

      // Progress indicator
      echo $i++ . " ";
   }
}

?>

First of all, I truncated the table before the build was done, which clears out the whole table.  That way if you run the read.php script more than once, you won’t get any duplicates.

Then, this line:

list($word, $type, $definition) = $info;

Puts the 0th element of $info into $word, the 1st element into $type, and the 2nd into $definition.  This just puts these items into their own variables so what we do is a bit more readable.

$word = preg_replace("/[*\"'|`-]/", "", $word);

Next, we remove all the pronunciation characters you would have seen in the words if you had taken a peek at the file.  Stuff like an asterisk, double quote, single quote, pipe, backquote, and so on, all removed so we’re left with the regular word.
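
Here is that cleanup step by itself, as a sketch you can run without the database.  The function name clean_headword() is made up for this example; the character class is the same one used in the import script.

```php
<?php
// Strip the pronunciation marks (asterisk, double quote, single quote,
// pipe, backquote, hyphen) from a headword.
function clean_headword($word) {
    return preg_replace('/[*"\'|`-]/', '', $word);
}

// The headword from the sample dictionary line
echo clean_headword('Ge*ne"va') . "\n";
```

Keep in mind that the hyphen is in that list too, so a hyphenated headword would be stored without its hyphen.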

$word = addslashes($word);
$type = addslashes($type);
$definition = addslashes($definition);

Slashes are added to each of those variables so that a quote inside the data can’t end the SQL string early, which is the opening SQL injection attacks rely on.  If the word we were importing into the database was "it's", the first part of the query would look like:

INSERT INTO dict_list SET word = 'it's'

This is confusing, because MySQL doesn’t know where we want to end the string.  Adding the slashes makes it look like this:

INSERT INTO dict_list SET word = 'it\'s'

And it says okay, I see \' instead of just ' so that means you want a single quote, you don’t want to mark the end of that string.
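
You can watch addslashes() do this without touching the database.  A minimal sketch, where $escaped and $sql are just names picked for the example:

```php
<?php
$word = "it's";
$escaped = addslashes($word);

// The backslash tells MySQL the quote is part of the data,
// not the end of the string.
$sql = "INSERT INTO dict_list SET word = '$escaped'";
echo $sql . "\n";
```

The echoed query has a backslash in front of the apostrophe, which is the form MySQL can parse.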

Finally do the query that adds this line into the database table dict_list:

mysql_query("INSERT INTO dict_list SET word = '$word', type = '$type', definition = '$definition'") or die(mysql_error());

And finally that echo statement in there is just for me to tell when each entry is being added, as sort of a progress indicator so I don’t get bored waiting.

So, run that script, preferably in the shell, but in the browser it should work okay.  It takes a couple of minutes but will import 110,000 dictionary entries into your database.  The "word" column is indexed so that matches can be retrieved quickly, and this is important.

Next you have to put together a very simple script that will search the table based on a query.  This is really easy, done in about 20 lines of code here:

<?php

mysql_connect("localhost", "your_mysql_user", "your_mysql_password");
mysql_select_db("your_mysql_database");

// The script can be loaded with no query at all, so check that q exists
$q = isset($_GET["q"]) ? addslashes($_GET["q"]) : "";
$limit = 10;

if ($q) {
   $query = mysql_query("SELECT * FROM dict_list WHERE word LIKE '$q%' ORDER BY word LIMIT $limit") or die(mysql_error());

   $results = array();

   while ($row = mysql_fetch_assoc($query)) {
      $word = $row["word"];
      $definition = $row["definition"];
      $type = $row["type"];

      $results[] = "<b>$word</b>: $type $definition";
   }

   // Line breaks go between the results, not after the last one
   echo implode("<br>\n", $results);
}

?>

First, we connect to the database.  Then add those slashes to the query (the script is called in the form of "find.php?q=apples", for example).

In MySQL you can use LIKE as a simple search.  Use the percent sign as a wildcard.  Putting the percent sign at the end means we’ll use that query as the START of the word we’re looking for.  If "nap" is given as a query, a word like "napkin" could be suggested, because it begins with "nap".  But "snap" wouldn’t be suggested: even though "snap" contains "nap", it doesn’t start with "nap"… get it?
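
If you want to convince yourself how the prefix match behaves, here is a rough stand-in that does the same thing on a plain PHP array instead of the table.  The function starts_with_prefix() and the word list are made up for the illustration; the real lookup still happens in MySQL.

```php
<?php
// Hypothetical stand-in for the database: emulate  word LIKE 'prefix%'
// on a plain array. Like MySQL's default collations, the comparison
// ignores case.
function starts_with_prefix(array $words, $prefix) {
    $matches = array();
    foreach ($words as $word) {
        if ($prefix !== '' && strncasecmp($word, $prefix, strlen($prefix)) === 0) {
            $matches[] = $word;
        }
    }
    return $matches;
}

// "snap" contains "nap" but does not start with it, so it is left out
print_r(starts_with_prefix(array("napkin", "snap", "Naples", "nap"), "nap"));
```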

Oh yeah, and it’s sorted alphabetically by the "word" field and limited to 10 rows.  Always put a limit on your queries when you can.

Good.  Then that while loop adds each row onto an array.  I like to put my queries into an array, that way I can do stuff with it later if I want to.  In this case it’s also useful because I want to separate each result with a line break but I don’t want to have a line break at the end.  It just comes out cleaner than adding things on to the end of a string.
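
Here is the difference in miniature, using two made-up rows rather than real query results:

```php
<?php
// Made-up sample rows standing in for formatted query results
$rows = array("<b>first</b>: sample row", "<b>second</b>: sample row");

// Appending to a string leaves a trailing separator to clean up.
$appended = "";
foreach ($rows as $row) {
    $appended .= $row . "<br>\n";
}

// implode() puts the separator only between the items.
$joined = implode("<br>\n", $rows);

echo $joined;
```

The two strings are identical except that $appended ends with one extra line break.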

And it outputs that text.  Now try this out.  Upload find.php onto your server, edit the settings so it connects to your own MySQL database with your MySQL user, and try a URL like this:

http://www.example.com/yourfolder/find.php?q=goo

It will give you the first 10 words starting with "goo."  Now let’s get to work on the HTML side of things.

I’m going to make a simple HTML file called "index.html" like this:

<div align="center">
<form action="find.php" method="GET" target="searchWindow">
<input type="text" name="q" size="20"><input type="submit" value="Search">
</form>

<iframe name="searchWindow" src="find.php" width="500" height="300"></iframe>
</div>

Very basic, a text box and search button with an inline frame below it.  The form submits into the inline frame, it doesn’t reload the current page.  Whatever you type into the search box is passed to the script as a parameter, but you have to hit the search button… it doesn’t really autocomplete for you.

That’s easy, all you have to do is add an "onkeyup" JavaScript event to that search box to re-submit the form any time the text has changed.  That HTML file becomes:

<div align="center">
<form action="find.php" method="GET" target="searchWindow">
<input type="text" name="q" size="20" onkeyup="this.form.submit()">
</form>

<iframe name="searchWindow" src="find.php" width="500" height="300"></iframe>
</div>

If that’s not cool I don’t know what is.  We can make this even cooler by using XMLHttpRequest, making it look like Google Suggest.  (link: http://www.google.com/webhp?complete=1&hl=en)

Google Suggest uses XMLHttpRequest to load the contents of a URL into a variable, and then writes that to a DIV layer, instead of using an iframe.  It gives everything a more built-in feeling.  Here’s that page made to look more like Google Suggest:

<html>
<head>

<style type="text/css">
body { font-family:Tahoma, Verdana; font-size:11px; }
</style>

<script language="JavaScript">
<!--
var req;

function loadXMLDoc(url) {

   // Internet Explorer
   try { req = new ActiveXObject("Msxml2.XMLHTTP"); }
   catch(e) {
      try { req = new ActiveXObject("Microsoft.XMLHTTP"); }
      catch(oc) { req = null; }
   }

   // Mozilla/Safari
   if (!req && typeof XMLHttpRequest != "undefined") { req = new XMLHttpRequest(); }

   // Call the processChange() function when the page has loaded
   if (req != null) {
      req.onreadystatechange = processChange;
      req.open("GET", url, true);
      req.send(null);
   }
}

function processChange() {
   // The page has loaded and the HTTP status code is 200 OK
   if (req.readyState == 4 && req.status == 200) {

      // Write the contents of this URL to the searchResult layer
      getObject("searchResult").innerHTML = req.responseText;
   }
}

function getObject(name) {
   var ns4 = (document.layers) ? true : false;
   var w3c = (document.getElementById) ? true : false;
   var ie4 = (document.all) ? true : false;

   if (ns4) return eval('document.' + name);
   if (w3c) return document.getElementById(name);
   if (ie4) return eval('document.all.' + name);
   return false;
}

window.onload = function() {
   getObject("q").focus();
}

// -->
</script>
</head>

<body>

<div align="center">

<h1 align="center">Dictionary</h1>

<div align="center">Type in part of a word to have it defined.</div>

<form action="find.php" method="GET" target="searchWindow">
<input type="text" name="q" id="q" size="20" onkeyup="loadXMLDoc('find.php?q='+encodeURIComponent(this.value))" style="width:300px;">
<div align="left" id="searchResult" name="searchResult" style="font-family:Arial; font-size:12px; width:300px; border:#000000 solid 1px; padding:3px; "></div>
</form>

</div>

</body>
</html>

The loadXMLDoc() and processChange() functions are based on code from Apple Developer Connection (http://developer.apple.com/internet/webcontent/xmlhttpreq.html).  I’ve changed them a bit (Apple’s way was less compatible when ActiveX was turned off in IE, Google’s way works better).

The code only looks weird because Internet Explorer and Mozilla/Safari handle this in different ways (go figure).  In my first Simple PHP book I made a little JavaScript-based form mailer that actually called a PHP script using an Image object.

The way we do this is pretty much the same… just give the XMLHttpRequest object info like what URL to connect to, and then say, once it’s loaded, pass it to a function.

req.onreadystatechange = processChange;

This is a lot like the "onload" property of an Image object.  Anyway, all the processChange() function does is, if the page loaded correctly, populates the "searchResult" layer (the div tag we’re putting the dictionary suggestions on).  Just like with the iframe, it’s updated on each keystroke, unless it’s cached of course, but that isn’t the point.

Google Suggest obviously didn’t take Kevin Gibbs a long time to implement, but this technique is an easy way to make a web UI look more like a desktop UI… it falls just outside the event horizon of the "Because I Can" category.

Demo here: http://www.jumpx.com/tutorials/googlesuggest/demo.html

Download: http://www.jumpx.com/tutorials/googlesuggest/googlesuggest.zip
Webster1913 Database: http://www.jumpx.com/tutorials/googlesuggest/Webster-1913.gz (right click and save)

Author Description

Robert Plank is the creator of Lightning Track, Redirect Pro, Rotatorblaze, and others. An easy way to display the content saved by this article's script is explained in chapters 15 and 16 of his book, "Simple PHP": http://www.simplephp.com
