Caching Tutorial
By Mark Nottingham
2007-12-22
Writing Cache-Aware Scripts
By default, most scripts won’t return a validator (a Last-Modified
or ETag response header) or freshness information (Expires or Cache-Control).
While some scripts really are dynamic (meaning that they return a different
response for every request), many (like search engines and database-driven
sites) can benefit from being cache-friendly.
Generally speaking, if a script produces output that is reproducible with
the same request at a later time (whether it be minutes or days later), it
should be cacheable. If the output of the script depends only on
what’s in the URL, it is cacheable; if the output depends on a cookie,
authentication information or other external criteria, it probably isn’t.
- The best way to make a script cache-friendly (as well as perform
better) is to dump its content to a plain file whenever it changes. The
Web server can then treat it like any other Web page, generating and
using validators, which makes your life easier. Remember to only write
files that have changed, so the Last-Modified times are preserved.
- Another way to make a script cacheable in a limited fashion is to set
an age-related header for as far in the future as practical. Although
this can be done with Expires, it’s probably easiest to do so with
Cache-Control: max-age, which will make the response fresh for an amount
of time after the request.
- If you can’t do that, you’ll need to make the script generate a
validator, and then respond to If-Modified-Since and/or If-None-Match
requests. This can be done by parsing the HTTP headers, and then
responding with 304 Not Modified when appropriate. Unfortunately, this is
not a trivial task.
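As a rough illustration of the last two approaches, the following Python sketch sets Cache-Control: max-age, generates both validators, and answers conditional requests with 304 Not Modified. The function names (make_etag, respond), the lower-cased request_headers dictionary, and the 300-second default are assumptions made for this example; they are not part of any particular server or framework, and a real script would need to adapt them to its environment.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def make_etag(body: bytes) -> str:
    """Derive a validator from the response body (quoted, as HTTP requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes,
            last_modified: datetime, max_age: int = 300):
    """Return (status, headers, body) for a GET, honouring conditional requests.

    Assumptions: request_headers uses lower-case header names, and
    last_modified is a timezone-aware UTC datetime.
    """
    etag = make_etag(body)
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {
        "ETag": etag,
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Cache-Control": f"max-age={max_age}",
    }

    not_modified = False
    if_none_match = request_headers.get("if-none-match")
    if_modified_since = request_headers.get("if-modified-since")

    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # The comparison is weak, so a W/ prefix on the client's tag is ignored.
        candidates = [t[2:] if t.startswith("W/") else t for t in candidates]
        not_modified = "*" in candidates or etag in candidates
    elif if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            not_modified = last_modified <= since
        except (TypeError, ValueError):
            # Unparseable or zone-less date: ignore the header.
            not_modified = False

    if not_modified:
        # A 304 carries the validators and freshness headers but no body.
        return 304, headers, b""
    headers["Content-Length"] = str(len(body))
    return 200, headers, body
```

A first call such as respond({}, page_bytes, modified_time) yields a 200 with the validators attached; a later call that passes the returned ETag back as if-none-match yields a 304 with an empty body. Hashing the body is only one way to build an ETag, and it still requires generating the full output, so it saves bandwidth rather than server work.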
Some other tips:
- Don’t use POST unless it’s appropriate. Responses to
the POST method aren’t kept by most caches; if you send information in the
path or query (via GET), caches can store that information for the
future.
- Don’t embed user-specific information in the URL unless
the content generated is completely unique to that user.
- Don’t count on all requests from a user coming from the same
host, because caches often work together.
- Generate Content-Length response headers. It’s easy to
do, and it will allow the response of your script to be used in a
persistent connection. This allows clients to request
multiple representations on one TCP/IP connection, instead of setting up a
connection for every request. It makes your site seem much faster.
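For the Content-Length tip, one detail worth showing is that the header counts bytes, not characters. The Python sketch below encodes the output first and measures the result; the helper name with_content_length and the text/html content type are assumptions for illustration only.

```python
def with_content_length(text: str, charset: str = "utf-8"):
    """Encode a script's output and return (headers, body) with Content-Length set."""
    body = text.encode(charset)
    headers = {
        "Content-Type": f"text/html; charset={charset}",
        # Measure the encoded bytes; len(text) would undercount
        # any character that needs more than one byte.
        "Content-Length": str(len(body)),
    }
    return headers, body
```

This requires buffering the whole response before sending it, which is usually acceptable for scripts that produce modest amounts of output.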
See the Implementation Notes for more specific
information.