Creating a PostgreSQL and MySQL driverby: Tony MarstonIntroduction The software that I write is not intended for just one customer who is locked into a single RDBMS. Instead it can be used by virtually any customer using the RDBMS of their choice. How is this possible? I deliberately designed and built my infrastructure around the 3 Tier Architecture which breaks down application code into three distinct areas of responsibility:
It is only the Data Access layer which communicates with the database, therefore it should be possible to communicate with another database simply by changing the component which exists within this layer. I have already produced two versions of my Data Access Object for MySQL:
The purpose of this article is to document my experiences while producing a DAO for a totally different RDBMS, in this case for PostgreSQL (version 8.0.3). Administrative programs Although both MySQL and PostgreSQL come with command line interfaces I think that as we are now in the 21st century the vast majority of developers would insist on a set of proper GUI tools. MySQL has the following:
PostgreSQL has the following:
These are all freely available, although there are also commercial products available for both databases. In my personal opinion the tools for PostgreSQL are second rate when compared with those for MySQL. I particularly dislike the feature in phpPgAdmin for executing SQL statements. You type in the statement, hit "Go", and nothing happens. This is because it has opened up a second browser window for the input of SQL commands, and all results appear in the first window. Very user-unfriendly! Connecting to a Database MySQL has tables within databases while PostgreSQL has tables within schemas within databases. Every database has a default "public" schema, while other schemas can be created on demand. It is possible for the same table name to exist within different schemas within the same database. With MySQL once you have established a connection you may access any table within any database. One database is nominated as the "default" or "current" database by using the mysqli_select_db function, in which case any table names within that database do not have to be qualified. Tables in other databases can only be accessed by qualifying them with the database name, as in With PostgreSQL a connection is limited to a single database, and you may only access tables that belong to schemas within that particular database. You can make a particular schema "default" or "current" by issuing an SQL query to set the schema search_path. Tables in other schemas can only be accessed by qualifying them with the schema name, as in I personally like to create a separate database for each of my software subsystems as it makes backups and restores that much easier, so the ability to swap a MySQL "database" for a PostgreSQL "schema" is very convenient. In order to make my code access database tables through PostgreSQL I made the following changes to my code: CREATE TABLE syntax Although both MySQL and PostgreSQL claim to be SQL compliant, there are some differences in the syntax of the CREATE TABLE command:
This means that the following in MySQL: CREATE TABLE IF NOT EXISTS `tablename` ( will have to be changed to work in PostgreSQL, as follows: CREATE TABLE tablename ( SQL queries As well as differences in the syntax for creating tables there is also a difference in the syntax of various SQL queries. CONCATThe following in MySQL: SELECT CONCAT(first_name, ' ', last_name) AS person_name FROM .... has to be converted to the following in PostgreSQL: SELECT first_name || last_name AS person_name FROM .... I can deal with this difference by including a new function in the PostgreSQL class to perform the necessary conversion, thus enabling me to leave the original MySQL syntax intact. JOINThe following query produces ERROR: JOIN/ON clause refers to "a", which is not part of JOIN. SELECT a.a_id, b.b_id, b.b_desc, x.a_id This was cured by changing it to the following: SELECT a.a_id, b.b_id, b.b_desc, x.a_id This error seems rather dubious to me as it may not be possible to have every part of the ON clause limited to the two tables 'b' and 'x'. Provided that at least one JOIN condition specifies those two tables then any other JOIN condition should not be so limited. In the above example I was able to get around the problem by specifying a literal instead of a column name, but this may not be possible in all circumstances. In another script I also found that changing the order of the table names in the FROM clause produced the same error. Thus SELECT a.a_id, b.b_id, b.b_desc, x.a_id, z.z_desc However, this same code DOES work in MySQL, so it's a simple matter to convert all JOIN statements so they work with both databases. GROUP BYThe following query produces ERROR: column "a.a_desc" must appear in the GROUP BY clause or be used in an aggregate function. SELECT a.a_id, a.a_desc, COUNT(b.a_id) AS child_count This is perfectly valid in MySQL as it does not insist that the GROUP BY clause contains the names of ALL selected columns. This gives performance advantages for common queries. This seems reasonable to me as on table 'a' the column 'a_id' is the primary key while 'a_desc' is merely a textual description and therefore irrelevant as far as any grouping is concerned. PostgreSQL is not so forgiving, therefore every column in the SELECT clause must also appear in the GROUP BY clause, as in the following: SELECT a.a_id, a.a_desc, COUNT(b.a_id) AS b_count Just as with the JOIN the solution is to amend the SQL statement until it works with both databases. INSERTIn standard SQL there are two forms of the INSERT command: INSERT INTO tablename VALUES ('value1','value2',...)
What I like about MySQL is that it also allows UPDATE-style syntax, as follows: UPDATE tablename SET col1='value1', col2='value2', ... I prefer this approach as it keeps column names and values together instead of in two separate lists. This seems such a common sense idea that I'm surprised it is not part of the standard already. Why allow 2 formats for INSERT and a completely different 3rd one for UPDATE? Even though PostgreSQL is not so accommodating, the fact in my infrastructure all QUERY statements are constructed and issued within the relevant database driver means that I can customise each driver according to the whims of the individual database. The remainder of my infrastructure remains totally oblivious to the internal workings of any individual driver. AUTO_INCREMENTMySQL makes it very easy to use a technical key in any table. All the DBA need do is include the AUTO_INCREMENT keyword in the table definition, as in the following: CREATE TABLE `test` ( Although the term AUTO_INCREMENT does not exist in PostgreSQL, it has the equivalent in SERIAL, as in the following: CREATE TABLE test ( The only other difference is the method used to obtain the last number issued. In MySQL there is a simple function: $id = mysql_insert_id($link); With PostgreSQL the same can be achieved with: $query = "SELECT currval('" .$tablename .'_' .$pkey ."_seq')"; This is because the SERIAL keyword makes use of a counter with the default name of Conclusion In theory it should be a relatively simple exercise to switch from one SQL database to another. As is usual in this universe there is a big difference between theory and practice. Although all the database vendors 'supposedly' follow the same standards, they all seem to have followed different interpretations of those standards, followed by different implementations, all of which are capped off with different proprietary extensions. This makes the developer's job of switching database vendors more complicated than it need be, but by following some simple steps it is possible to take that mountain and turn it back into a mole hill:
I have included a version of my PostgreSQL driver with my sample application, so you now have the option of running it with either MySQL or PostgreSQL. NOTE: All testing was done against PostgreSQL version 8.0.3 running on Windows XP using PHP versions 4.3.11 and 5.0.4. © 2008 NetVisits, Inc. All rights reserved. |