//, PHP/How to Handle a Many-to-Many Relationship with PHP and MySQL

How to Handle a Many-to-Many Relationship with PHP and MySQL

Intended Audience

This tutorial is intended for those developers who come across the situation where they need to cater for what is known as a ‘many-to-many’ relationship. It will explain how to generate the database tables to deal with such a relationship, and how to design HTML forms to display and update the contents of these tables.

This situation can often catch the inexperienced designer/developer unawares, often leading to clumsy or unworkable solutions. This tutorial will show a technique that has been successful for me for many years.

Introduction

What exactly is a many-to-many relationship? When your user tries to explain to you how a pair of business objects are related he might say “many of these can be related to many of those, and many of those can be related to many of these.” If you attempt to draw such a relationship in an Entity Relationship Diagram (ERD) you will probably end up with something like this:

Those of us who are more experienced in database design will immediately see this as a totally illegal relationship. The only sort of relationship which is possible is a ‘one-to-many’, so the novice might redraw his ERD as follows:

That may look better, but again it is impossible to implement in any RDBMS that I have come across. You can have A-to-B in a ‘one-to-many’ relationship, or B-to-A in a ‘one-to-many’ relationship, but you cannot have both at the same time. So what are we to do?

The solution is to create a new table which will act as the ‘many’ in two ‘one-to-many’ relationships, with each of the original tables acting as the ‘one’. This is shown in the following diagram, with the new table identified as ‘X’.

Table ‘X’ is sometimes referred to as a ‘link’, ‘intersection’ or ‘xref’ (cross-reference) table.

Database schema

Let us assume that the existing tables, ‘A’ and ‘B’, are defined as follows:

CREATE TABLE `a` (

`a_id` varchar(6) NOT NULL default '',
`a_desc` varchar(40) NOT NULL default '',
PRIMARY KEY (`a_id`)
)

CREATE TABLE `b` (
`b_id` varchar(6) NOT NULL default '',
`b_desc` varchar(40) NOT NULL default '',
PRIMARY KEY (`b_id`)
)

This new table must be constructed to allow the following:

* It must have a column which links back to table ‘A’.

* It must have a column which links back to table ‘B’.

* It must allow no more than one row to exist for any combination of rows from table ‘A’ and table ‘B’.

* It must have a primary key.

In order to satisfy all these rules all I need in table ‘X’ are two columns, ‘a_id’ and ‘b_id’. I do not need an additional column for a primary key as I can use the existing two columns. This produces a database schema as follows:

CREATE TABLE `x` (

`a_id` varchar(6) NOT NULL default '',
`b_id` varchar(6) NOT NULL default '',
PRIMARY KEY (`a_id`,`b_id`),
)

Note here that I have created a primary key from two columns, not one. Some naive database designers insist that every database table should have a single-column technical primary key called ‘id’ which obtains its value from an automatically-incrementing internal counter. In MySQL this can be done with the ‘auto-increment’ option. While I agree that there are places where a technical primary key is definitely beneficial, in my years of experience I have also encountered instances where a technical primary key is not only unnecessary, it is actually detrimental. A cross-reference table is one of those places.

Also note that it is not necessary to create a separate index for ‘a_id’ as this is already covered by virtue of the fact that it forms the leading portion of the primary key. A separate index on ‘b_id’ might be of benefit depending on the volume of rows in the table and the frequency of searches on that column.

Form design

Having designed the database we must now turn our attentions to designing the forms that will manipulate the data. For this I will suggest two alternatives although experienced developers may be able to suggest more.

Simple

In this design I have 4 simple forms which are interconnected as shown in the following diagram:

We start off with the form <Browse ‘A’> which enables the user to browse through all occurrences of table ‘A’. During the processing of this form the user will be able to select an occurrence of ‘A’, then press a control (a button or a hyperlink) which will pass the identity of the selected occurrence to a second form which I have identified as <Browse ‘X’ for selected ‘A’>.

This second form will show which occurrence of table ‘A’ has been selected, then list the current contents of table ‘X’ for that occurrence. For each occurrence of table ‘X’ it will include the relevant description from table ‘B’. The MySQL query to achieve this is as follows:

SELECT x.a_id, x.b_id, b.b_desc

FROM x
LEFT JOIN b ON (b.b_id = x.b_id)
WHERE (x.a_id = 'whatever')

A similar result may also be achieved using the following:

SELECT x.a_id, x.b_id, b.b_desc

FROM x, b
WHERE (x.a_id = 'whatever') AND (b.b_id = x.b_id)

The difference between these two SQL queries will only become apparent if an occurrence of table ‘B’ is missing for an occurrence of table ‘X’. With the first query all columns selected from table ‘B’ will be returned as empty. With the second query the effect will be that the occurrence of ‘X’ will be dropped from the selection.

The second form can only show existing occurrences of table ‘X’. In my design this form would have controls to invoke a third from to add new occurrences to ‘X’ or a fourth form to delete occurrences from ‘X’.

Complex

An alternative to this combines the facilities of forms (2), (3) and (4) into a single more complex form, as shown in the following diagram:

This modified form will still show which occurrence of table ‘A’ has been selected, but it will then list all occurrences of table ‘B’. Against each occurrence of table ‘B’ will be a checkbox to indicate if an associated occurrence exists on table ‘X’ or not. The user will be able to toggle the checkbox in order to create or delete the entry on table ‘X’. The MySQL query to select the data is as follows:

SELECT a.a_id, b.b_id, b.b_desc,  

CASE WHEN x.b_id IS NULL THEN 'F' ELSE 'T' END AS selected
FROM a, b
LEFT JOIN x ON (a.a_id = x.a_id AND b.b_id_id = x.b_id)
WHERE (a.a_id = 'whatever')

Notice here that I am selecting from table ‘A’ and ‘B’, then doing a LEFT JOIN on table ‘X’. The CASE..WHEN..THEN..END statement checks the existence of an associated occurrence on ‘X’ and sets a column named ‘selected’ to ‘T’ if one exists or ‘F’ if one does not.

As an added refinement I use a third form to allow the entry of selection criteria so that the contents of the second form can be refined. This is usually more relevant in those cases where there are more columns to be displayed, but it does provide the ability to select only those occurrences of ‘B’ where associated occurrences of ‘X’ either do or do not exist. This is accomplished by altering the WHERE clause to be one of the following:

WHERE (a.a_id = 'whatever') AND x.a_id IS NULL

or

WHERE (a.a_id = 'whatever') AND x.a_id IS NOT NULL

Complex update

If you thought that the code to retrieve the current data from the database was tricky then what will you think about the code needed to update the database after the user has made his changes? For this I have a standard procedure which has as its input two arrays:

1. $fieldarray which has an entry for each row displayed in the form, and for each row it contains the names and values for the primary key required for table ‘X’.

2. $selectarray which has an entry for the checkbox on each of the rows. Note that this will only contain entries where the checkbox is ON. If it has been checked OFF then the $_POST array will not contain an entry for that row.

I shall now describe the code to process these two arrays.

I begin by looping through each row that was displayed in the form and initialise two string variables:

foreach ($fieldarray as $rownum => $rowdata) {

$insert = NULL;
$delete = NULL;

Each row provides me with the names and values for the primary key, so I can move their details into the two string variables.

   foreach ($rowdata as $fieldname => $fieldvalue) {

$insert .= "$fieldname='$fieldvalue',";
$delete .= "$fieldname='$fieldvalue' AND ";
} // foreach

When there are no more fields left I can trim the unwanted ‘,’ and ‘ AND ‘.

   $insert = rtrim($insert, ',');

$delete = rtrim($delete, ' AND ');

Now I examine the contents of the checkbox in $selectarray and construct the SQL query to either create the entry if the checkbox is ON or delete the entry if the checkbox is OFF:

   if (isset($selectarray[$rownum])) {

$query = "INSERT INTO $tablename SET $insert";
} else {
$query = "DELETE FROM $tablename WHERE $delete";
} // if

Finally I execute the query and check for errors. Note that I ignore errors concerning duplicate entries. This is caused by a checkbox being ON originally and not being changed to OFF by the user:

   $result = @mysql_query($query, $dbconnect);

if (mysql_errno() <> 0) {
if (mysql_errno() == 1062) {
// ignore duplicate entry
} else {
trigger_error("SQL", E_USER_ERROR);
} // if
} // if
} // foreach

Summary

I first designed this solution to the many-to-many problem several years ago while developing in a different language. It was an interesting exercise to convert it to PHP and MySQL.

In my long career I have also encountered variations of this problem. In this solution table ‘X’ contains nothing but the foreign key fields which link up to the two related tables, but this can be made more complicated by the following:

* There may be additional fields on table ‘X’, which means that the database update routine will have to deal with more than simply choosing between an insert or a delete.

* A more complicated version is where table ‘X’ can contain multiple occurrences for each combination of ‘a_id’ and ‘b_id’. The example I have encountered several times is where each entry covers a different date range causing only one to be regarded as ‘current’ on any particular date, thus providing a history of values over a period of time.

However, I shall leave the solutions to these additional problems to a later date.

2010-05-26T11:39:11+00:00 April 6th, 2005|MySQL, PHP|0 Comments

About the Author:

I have been a software engineer, both designing and developing, since 1977. I have worked with a variety of 2nd, 3rd and 4th generation languages on a mixture of mainframes, mini- and micro-computers. I have worked with flat files, indexed files, hierarchical databases, network databases and relational databases. The user interfaces have included punched card, paper tape, teletype, block mode, CHUI, GUI and web. I have written code which has been procedural, model-driven, event-driven, component-based and object oriented. I have built software using the 1-tier, 2-tier, 3-tier and Model-View-Controller (MVC) architectures. After working with COBOL for 16 years I switched to UNIFACE in 1993, starting with version 5, then progressing through version 6 to version 7. In the middle of 2002 I decided to teach myself to develop web applications using PHP and MySQL.

Leave A Comment