2 Ways To Implement Session Tracking

by: Kiran Pai

This article explains how to implement session tracking using two of the simplest and oldest methods available to programmers. I feel that to appreciate the beauty of the technologies that exist today, it often helps to understand what was done before they came into being. The techniques presented here do not use the newer session tracking technologies; they are old, tried and tested ways that remain extremely popular even today. After reading this article you should be able to implement session tracking in any language, since you will understand the concepts of session tracking rather than one language-dependent implementation of it.

Various languages provide higher-level APIs for implementing session tracking. Java, for example, has a detailed session tracking API that lets programmers get session tracking working quickly and easily. But that is not what this article is about. It focuses on the basic techniques so that you can use them with any language.

To understand this article you need 3 things -
1. Familiarity with any server side technology such as JSP, ASP, Java servlets, etc.
2. A good knowledge of HTML.
3. Knowledge of how to access the contents of an HTML Form from within a server side technology such as JSP, ASP, etc.


What is session tracking?
Session tracking (for those who haven't heard of it) is a concept which allows you to maintain a relation between 2 successive requests made to a server on the Internet. Whenever a user browses a website, he uses HTTP (the underlying protocol) for all the data transfers taking place. This of course is not important to the user, but it is to you as a programmer. HTTP is a stateless protocol. When a user requests a page, the server returns that web page to the user. When the user then clicks on a new link, the server sends the new page that was requested. The server (because HTTP is the underlying protocol) has no idea that these 2 successive requests have come from the same user. The server is not at all bothered about who is asking for the pages. All it does is return the page that has been requested. This is exactly what stateless means: as far as the protocol is concerned, there is no connection between 2 successive requests.

What does HTTP being stateless have to do with session tracking?
There are many instances where some sort of connection is required between 2 requests made by a user. Since all transfers on the WWW use HTTP, this connection does not exist by default. For example, if you are at a website buying books online, you may add books to your Cart and continue searching for more books. Every time you click on a new page, your previously selected books should not disappear from the Cart. If you rely on the default way the WWW works, then since 2 successive requests (by the same user) have no connection, there would be no books in your Cart every time you click on a new link. Every click would be treated as a separate request having no relation to the previous one. Thus, as you browse, all the information that relates to you should be maintained and carried along. Your previous Shopping Cart contents should be present when you want to add a new book to the Cart. This is what session tracking enables you to do. It lets you maintain an active session for as long as you are browsing, and it gives HTTP a new quality of sorts, with every request having some relation to previous requests within the same session.
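
To make the problem concrete, here is a minimal sketch in Python (chosen purely for illustration, since the idea is the same in any language). The function name handle_request and the request bodies are assumptions made up for this example. It mimics a server side program that knows only what arrived with the current request.

```python
from urllib.parse import parse_qs

def handle_request(form_body):
    """Return the Cart as seen by a server that remembers nothing.

    The only information available is what arrived with this one request.
    """
    params = parse_qs(form_body)
    return params.get("bookID", [])

# Two successive requests from the same (hypothetical) user.
first_cart = handle_request("bookID=100&Submit=Add+to+Cart")
second_cart = handle_request("bookID=150&Submit=Add+to+Cart")

print(first_cart)   # ['100']
print(second_cart)  # ['150']  (book 100 has been forgotten)
```

The second request knows nothing about book 100. Session tracking is whatever you add so that the second Cart contains both books.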

Session tracking is so common that you may not even realise it is present. It is used on almost every site you visit on the net. For example, at Hotmail, once you enter your username and password and reach your inbox, had there been no session tracking you would be asked for your password every time you clicked on a link in your inbox. This would happen because the server would have no way to tell that the person who originally entered the username and password is the same person now asking for more pages. Session tracking allows the fact that you have successfully logged in to be stored, and this information is checked every time you do anything within your inbox. Thus you are not asked to enter your password with every click. I could give you thousands of examples where session tracking is used, but I guess you have got the point.

Now let's begin with the actual ways to implement session tracking. I shall explain 2 of them -

1. Hidden Fields In Forms
2. URL Rewriting

I also conclude the article with a few lines on cookies, which are also used for session tracking.


Hidden Fields In Forms
This is the simplest and easiest way to implement session tracking. I find this method extremely useful for getting the work done quickly. I can explain it with the help of the example I was speaking about - a Cart to hold your books.

Suppose you visit a site and are presented with a list of books with a checkbox next to each of them. You could select books and click on an Add to Cart Submit button. Sample code for such a page is shown below.

Remember this is just what the code may look like and not the exact page. You should try to understand the logic rather than focus on the syntax. Also remember that these are all dynamic pages being generated using some language such as JSP.
<b>Search results for books</b>
<form method="post" action="serverprogram.jsp">
<input type="checkbox" name="bookID" value="100">Java Servlet Programming<br>
<input type="checkbox" name="bookID" value="101">Professional JSP<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>

Suppose a page similar to the above one was generated when the user searched for some books. The above page has only 2 search results. There is a Form with 2 checkboxes, each next to the name of a book and a Submit button to add any selected books to the Cart.

Now suppose the user clicks on the checkbox next to the book named 'Java Servlet Programming', and then clicks on the Submit button. Note that the value of a checkbox is used in this case to store the bookID. Generally, when you have many checkboxes that each represent one of many entities of the same kind, the value of each checkbox is what tells them apart. In our case, since all the checkboxes represent books, each value is a different bookID and thus a different book. You would be familiar with this idea in case you have done web programming.

Now coming back to the point: in case the user checked the checkbox next to the book named 'Java Servlet Programming' and then clicked the Submit button, the contents of the form are all bundled together and sent to the server side program. In our case the program is named serverprogram.jsp (the action of the form). Now suppose that later the same user searches for more books, and on a search result he is presented with a page such as the one shown below. Remember that he has already selected a book previously, so that book should be present in his Cart, and now he would like to add more books.

<b>Search results for books</b>
<form method="post" action="serverprogram.jsp">
<input type="hidden" name="bookID" value="100">
<input type="checkbox" name="bookID" value="150">Teach yourself WML Programming<br>
<input type="checkbox" name="bookID" value="160">Teach yourself C++<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>

Those of you who are experts in programming must have already figured out how hidden fields help in session tracking. For the rest of you who are like me and take more time to figure out what is happening, let me explain.

The new search once again produced 2 new books: one named 'Teach yourself WML Programming' with a bookID of 150 and another named 'Teach yourself C++' with a bookID of 160. So a form was generated with the names of these 2 books and with 2 checkboxes, so that the user may select any of these books and add them to the Cart. But there is one more important thing in the form that was generated. There is a hidden input field named bookID with a value of 100. You might have noticed that 100 was the bookID of the book named 'Java Servlet Programming' which the user had initially selected. The line describing a hidden input makes no difference to the HTML page displayed in the browser. It is totally invisible to the user. But within the form it makes a world of difference.

As the user keeps adding more and more books, there would be many hidden input fields, each with a different value and each representing a previously selected book. When this form is submitted to the server side program, that program would fetch not only the newly selected checkboxes (newly selected books) but also these hidden fields, each representing a book previously selected by that user. Note that all the input fields have the same name bookID but different values. Within the server side program you would simply expect a parameter called bookID which would be an array with different values. You could extract all the values and then use them as required. It is the job of the server side program to add the lines for these hidden fields whenever it generates a new page.
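
As a rough illustration of the "array of values" idea, here is a small Python sketch. The posted body shown is an assumption: it is what a browser would typically send if the user ticked 'Teach yourself WML Programming' on the second form, with the hidden field coming first because it appears first in the form.

```python
from urllib.parse import parse_qs

# Assumed body of the POST request for the second form:
# the hidden field (100) plus one newly ticked checkbox (150).
posted_body = "bookID=100&bookID=150&Submit=Add+to+Cart"

params = parse_qs(posted_body)

# Every field named bookID ends up in one list, old and new alike.
book_ids = params.get("bookID", [])
print(book_ids)  # ['100', '150']
```

Your own server side technology will have its own way of giving you all the values of a repeated parameter; the point is only that the hidden and the visible fields arrive together.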

Once again, the main concept to be understood is that a hidden field displays nothing ON the HTML page. So the user who is browsing the page sees nothing unusual, but the values associated with these hidden fields can be used to hold any kind of data that you want. The only care to be taken is that every time your server side program generates a new form, it should read all the parameters passed to it from the previous form and then add all these values as hidden fields in the new form. Thus you can carry information from one HTML page to another and maintain a connection between 2 pages.
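
The "read the old values, write them back as hidden fields" step could be sketched as follows. This is only one possible way to do it, written in Python for illustration; the function name render_search_form and its arguments are made up for this example, and a JSP or ASP page would do the same thing with its own syntax.

```python
from html import escape

def render_search_form(previous_ids, results):
    """Build the search-results form, carrying earlier picks as hidden fields.

    previous_ids: book IDs received from the previous form submission.
    results: (book_id, title) pairs for the new search results.
    """
    lines = ['<form method="post" action="serverprogram.jsp">']
    # One hidden field per previously selected book.
    for book_id in previous_ids:
        lines.append(
            f'<input type="hidden" name="bookID" value="{escape(book_id, quote=True)}">'
        )
    # One checkbox per new search result.
    for book_id, title in results:
        lines.append(
            f'<input type="checkbox" name="bookID" value="{escape(book_id, quote=True)}">'
            f'{escape(title)}<br>'
        )
    lines.append('<input type="submit" name="Submit" value="Add to Cart"><br>')
    lines.append('</form>')
    return "\n".join(lines)

page = render_search_form(
    ["100"],
    [("150", "Teach yourself WML Programming"), ("160", "Teach yourself C++")],
)
print(page)
```

The values are escaped before being written into the page because they came from the previous request, and anything that comes from the browser should not be trusted to be clean HTML.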

The disadvantage of hidden fields is that in case you do not want the user to know what information is being passed around to maintain a session (in case that information is somewhat sensitive, maybe a password or something), then this method is not the best one, since the user can simply choose to View the Source of the HTML page and see all the hidden fields present in the Form.



URL Rewriting
This is another popular session tracking method used by many. It has a few bad points associated with it, but in spite of that I like to use it. It doesn't require a lot of understanding to get the work done. URL Rewriting basically means that when the user is presented with a link to a particular resource, instead of presenting the URL as you normally would, you modify the URL for that resource so that more information is passed along when that resource is requested. I can see puzzled faces trying to make sense of what is written above. Read on and things shall become clearer.

I will try explaining URL Rewriting with the same Shopping Cart example used in the hidden field method. Actually I could have shown simpler examples, but for you to compare the 2 methods I shall take up the same example once again.

So once again assume that a user has searched for some books and has been presented with a search result that has 2 books listed. It is basically a Form with 2 checkboxes, one for each book, and a Submit button to add any of these books to his Cart.
<b>Search results for books</b>
<form method="post" action="serverprogram.jsp">
<input type="checkbox" name="bookID" value="100">Java Servlet Programming<br>
<input type="checkbox" name="bookID" value="101">Professional JSP<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>

Now once again suppose the user selects the book named 'Java Servlet Programming' and then clicks on the Submit button. This would pass the contents of the form to the server side program called serverprogram.jsp, which should read the selected checkboxes and do the necessary (i.e. make some arrangement to keep track of the selected books, which basically means implement session tracking). Now suppose the user continues browsing, searches for more books and is presented with a new search result just like in the previous example. For better understanding I shall once again give you the same 2 results as shown in the hidden fields method: the 2 books named 'Teach yourself WML Programming' and 'Teach yourself C++'.

<b>Search results for books</b>
<form method="post" action="serverprogram.jsp?bookID=100">
<input type="checkbox" name="bookID" value="150">Teach yourself WML Programming<br>
<input type="checkbox" name="bookID" value="160">Teach yourself C++<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>

You should be able to guess by now what URL rewriting is all about. In the above HTML source, the action of the form has been changed from serverprogram.jsp to serverprogram.jsp?bookID=100 . This is exactly what URL Rewriting means. The original URL, which was only serverprogram.jsp, has now been rewritten as serverprogram.jsp?bookID=100 . The effect is that the part of the URL after the ? (question mark) is treated as extra parameters passed to the server side program. They are commonly known as GET parameters, because a form submitted with the GET method sends its fields in exactly this way, as part of the URL. Now when serverprogram.jsp fetches the parameters named bookID, it would be presented with the one that was present after the ? in the URL as well as the checkboxes newly selected by the user in that Form.
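
Here is a rough Python sketch of what the server side program ends up with for such a request. The function name collect_book_ids and the posted body are assumptions for this example. Some server side technologies do this merging for you, while in others you read the URL parameters and the form fields separately and combine them yourself, as shown here.

```python
from urllib.parse import urlsplit, parse_qs

def collect_book_ids(request_url, posted_body):
    """Combine bookID values from the rewritten URL and from the form body."""
    query_params = parse_qs(urlsplit(request_url).query)
    body_params = parse_qs(posted_body)
    # Values carried in the URL are the older selections, so list them first.
    return query_params.get("bookID", []) + body_params.get("bookID", [])

ids = collect_book_ids(
    "serverprogram.jsp?bookID=100",      # rewritten action of the form
    "bookID=150&Submit=Add+to+Cart",     # newly ticked checkbox
)
print(ids)  # ['100', '150']
```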

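Consider a general example where a user has selected 2 values. Whenever the program generates a new Form, the action for that form should look something like the line below. Each name=value pair is separated from the next by an & (written as &amp; inside an HTML attribute).

<form method="post" action="serversideprogram.jsp?name1=value1&amp;name2=value2">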

This sort of URL would keep on increasing as more and more values have to be carried on from one page to another.

The basic concept of URL Rewriting is that the server side program must keep modifying all the URLs it generates, making them longer as more and more data has to be maintained between pages. The user does not see anything on the page as such, but when he clicks on a link he not only asks for that resource but, because of the information after the ? in the URL, is also sending the previous data back to the program.
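
A minimal sketch of the rewriting step itself, in Python for illustration. The helper name rewrite_url and the page search.jsp are hypothetical; the idea is simply to append every tracked value to whatever URL the page was going to use anyway.

```python
from urllib.parse import urlencode

def rewrite_url(base_url, selected_ids):
    """Append the tracked book IDs to base_url as query parameters."""
    if not selected_ids:
        return base_url
    # urlencode also takes care of characters that are not safe in a URL.
    query = urlencode([("bookID", book_id) for book_id in selected_ids])
    # Use & if the URL already has parameters, ? otherwise.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + query

print(rewrite_url("serverprogram.jsp", ["100"]))
# serverprogram.jsp?bookID=100
print(rewrite_url("search.jsp?topic=java", ["100", "150"]))
# search.jsp?topic=java&bookID=100&bookID=150
```

Every link and form action on the generated page would have to pass through a function like this. When the result is written into an HTML attribute, the & characters should additionally be escaped as &amp;.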

The disadvantage of URL Rewriting (though it's a minor one) is that the URL displayed in the browser is of course the rewritten URL. Thus the clean, simple URL that was seen when hidden fields were used is replaced with one that has a ? followed by many parameter values. This doesn't suit those who want the URL to look clean. Another disadvantage is that some browsers impose a limit on the length of a URL. So once the data being tracked grows beyond a certain limit, you may no longer be able to use URL Rewriting to implement session tracking. But that limit is generally large enough, so don't feel afraid to use this method. Do note, though, that actually rewriting all the URLs within your program is not a simple task and requires some experience.
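
If you are worried about the length limit, a program could check the rewritten URL before using it, along the lines of the Python sketch below. The limit of 2000 characters is only an illustrative assumption and not the documented limit of any particular browser; the name fits_in_url is also made up for this example.

```python
# Illustrative limit only; real limits differ between browsers and servers.
ASSUMED_MAX_URL_LENGTH = 2000

def fits_in_url(rewritten_url, max_length=ASSUMED_MAX_URL_LENGTH):
    """Return True if the rewritten URL stays within the assumed limit."""
    return len(rewritten_url) <= max_length

short_url = "serverprogram.jsp?bookID=100"
# A Cart with 1000 books makes for a very long URL.
long_url = "serverprogram.jsp?" + "&".join(f"bookID={n}" for n in range(1000))

print(fits_in_url(short_url))  # True
print(fits_in_url(long_url))   # False
```

When the check fails, the program would have to carry the data some other way, for example with hidden fields.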

In case you are confused with what we have been doing with hidden fields and URL Rewriting, I shall sum it up once again for you. We are trying to learn methods that allow us to carry information from one HTML page to another since by default you cannot pass information from one HTML page to another. So to carry data from one page to another, we are either using hidden fields invisible to normal users or rewriting all the links on a page so that the server side program receives the old as well as new data. Thus we can maintain a session (a connection between multiple pages) for every user.



Cookies
This is one of the most famous methods and the one used by almost all professional sites. It gives you complete flexibility as far as session tracking is concerned. But it is not as easy as the other 2 methods. Besides, some applications may not allow cookies, in which case you have to fall back on the other 2 methods. I had designed websites using WML (Wireless Markup Language) which worked on WAP based cell phones. Unfortunately the cell phones did not have enough memory to support cookies, so I had to use hidden fields to get session tracking working. Cookies would work on almost every computer, except when a user has blocked all cookies for security reasons, in which case you would once again have to use either of the other 2 methods.

There will be no code here to explain cookie usage. Using cookies is probably the best and the neatest of all the methods to maintain sessions. Cookies are basically small pieces of text that are stored on the user's computer by the browser. A cookie holds information pertaining to that user. Once the cookie is created on the user's computer, it is sent along with every further request made by that user to that website in that session. For session tracking, each user is typically given a cookie with a unique value, so the server side program can differentiate between various users.

The way to program cookies differs from language to language. Most languages provide some class that covers all the details of cookie creation and maintenance. For example, in Java you have the javax.servlet.http.Cookie class that is used to work with cookies. Since I have decided to keep this article language neutral and had not planned to discuss cookies in depth, I will not go into the details of cookie programming.


Finally...
For beginners, however, I suggest either of the first two methods to implement session tracking. Rather than facing the learning curve associated with cookies, you could manage with one of the above 2 techniques, which you can implement using any language. My first preference is always for hidden fields. But in cases where I am not dealing with forms as such (which generally doesn't happen), I use URL Rewriting instead.

I hope this article gave you a sound introduction to session tracking. I am sure you can use the knowledge presented here for your personal programming needs. However, in case you plan to implement a professional website, I would suggest that you look into APIs specifically designed for session tracking, which do all of the above for you automatically without you having to worry about the nitty-gritty details.


Article published Saturday, 11th June 2005
© 2008 NetVisits, Inc. All rights reserved.