Oct 30, 2009

Gzip challenges: browser compatibility, static gzip, graceful degradation

Gzip (GNU zip) is the most fundamental way to compress content on the web. It has been around for more than a decade and is supported today by almost every agent (browsers, robots, parsers, etc.). So how can it be used?

The first challenge: to gzip or not to gzip?

Well, it's OK to gzip all textual content on your website. But there were (or are) a few troubles with old versions of IE (which didn't understand this Content-Encoding), Safari (with gzip for CSS/JS files), and other browsers. All of them can be prevented (more or less), but their existence stops developers and admins from using this technique in their projects and websites. Moreover, there are a few less documented issues with some proxy servers, Content-Encoding, and chunked responses, which can bring the whole server down (implementation bugs). So should we use gzip?

We must! Gzipped content is usually 50-85% smaller, and this helps tremendously in speeding up your web pages. So let's review the ways to prevent the known bugs. For IE6 we can use these approaches:

  • don't gzip if there is no SV1 token in the User-Agent string (SV1 marks IE6 on Windows XP Service Pack 2, which has some gzip issues fixed),
  • add 2048 spaces at the beginning of the gzipped file,
  • or just use deflate (with content flushing) instead of gzip.
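The first rule in the list can be sketched as a small check. This is only an illustration in Python: the function name `should_gzip` is hypothetical, and the Accept-Encoding test is simplified (it ignores quality values such as `gzip;q=0`).

```python
import re


def should_gzip(accept_encoding: str, user_agent: str) -> bool:
    """Return True if it looks safe to send gzip to this client."""
    # The client must advertise gzip support at all.
    if "gzip" not in accept_encoding.lower():
        return False
    # IE6 without the SV1 token: skip compression (first rule in the list).
    is_ie6 = re.search(r"MSIE 6\.", user_agent) is not None
    if is_ie6 and "SV1" not in user_agent:
        return False
    return True
```

The padding and deflate workarounds from the list are not covered by this sketch; it only decides whether to compress at all.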

To prevent issues with Safari (and some other browsers), we can just force the Content-Type of gzipped content to match that of the original file.
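As a rough illustration of keeping the original Content-Type, Python's standard `mimetypes` module treats a trailing `.gz` as an encoding rather than a type. The helper name `headers_for` is an assumption made for this sketch.

```python
import mimetypes


def headers_for(path: str) -> dict:
    """Build response headers that keep the type of the uncompressed file."""
    # guess_type returns (type, encoding); a ".gz" suffix is reported as
    # the encoding, so the type still comes from the inner extension.
    content_type, encoding = mimetypes.guess_type(path)
    headers = {"Content-Type": content_type or "application/octet-stream"}
    if encoding:
        headers["Content-Encoding"] = encoding
    return headers
```

For a file named `style.css.gz`, the Content-Type stays `text/css` and the compression is declared separately in Content-Encoding.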

So most of the troubles can be solved.

The second challenge: CPU overhead

Gzip adds very little to the actual server processing time, but if you use powerful caching systems, proxying, etc., the overhead can become noticeable. This is where static gzip comes in.

Static gzip is a way to store gzipped content somewhere (usually on a hard disk, but it can also be any memory cache), so that we cache the gzip results and serve them to end users, saving CPU time.
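A minimal sketch of the idea, assuming the cache is simply a `.gz` file stored next to the original on disk. The function name `ensure_static_gzip` and the staleness check by modification time are choices made for this example.

```python
import gzip
import os
import shutil


def ensure_static_gzip(path: str) -> str:
    """Create or refresh `path + '.gz'` and return its name."""
    gz_path = path + ".gz"
    # Recompress only when the cached copy is missing or older than the source.
    if (not os.path.exists(gz_path)
            or os.path.getmtime(gz_path) < os.path.getmtime(path)):
        with open(path, "rb") as src, \
                gzip.open(gz_path, "wb", compresslevel=9) as dst:
            shutil.copyfileobj(src, dst)
    return gz_path
```

Since the file is compressed once rather than on every request, the highest compression level costs nothing at serving time.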

Does everything work well here? Actually, no. But that's...

The third challenge: graceful degradation

Apache (like many web servers) has no built-in support for static gzip. Sometimes we can use other directives (e.g. mod_rewrite rules) to redirect the initial requests for static (or cached) files to their gzipped versions. A classic example is this mod_rewrite usage:

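# Serve a pre-compressed .css.gz when the client accepts gzip
RewriteEngine On
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{HTTP_USER_AGENT} !Konqueror
RewriteCond %{REQUEST_FILENAME}.gz -f
RewriteRule ^(.*)\.css$ $1.css.gz [QSA,L]
<FilesMatch "\.css\.gz$">
ForceType text/css
# Declare the body as gzip-encoded (requires mod_headers)
Header set Content-Encoding gzip
Header append Vary Accept-Encoding
</FilesMatch>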

Here we check that the user agent supports gzip encoding, that it's not Konqueror (which doesn't understand gzipped CSS/JS files), and that a statically gzipped file (.css.gz) exists. After this we redirect the request (internally) to this file and force the correct Content-Type for it.

But what if there is no mod_rewrite support? No problem! We can do the same with a light PHP proxy, which receives a request, checks all the conditions, and serves the prepared file.
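The decision logic of such a proxy might look like the sketch below. It is written in Python rather than PHP purely for illustration; the function name `pick_file` and its parameters are hypothetical, and a real proxy would still have to read the file and send the response.

```python
import os


def pick_file(docroot: str, url_path: str, accept_encoding: str, user_agent: str):
    """Return (file_path, headers) for a CSS request, preferring a .gz copy."""
    # Resolve the request inside docroot and refuse paths that escape it.
    root = os.path.realpath(docroot)
    plain = os.path.realpath(os.path.join(root, url_path.lstrip("/")))
    if not plain.startswith(root + os.sep) or not plain.endswith(".css"):
        raise ValueError("unsupported path")
    headers = {"Content-Type": "text/css", "Vary": "Accept-Encoding"}
    gz = plain + ".gz"
    # Same three conditions as the rewrite rules above.
    if ("gzip" in accept_encoding.lower()
            and "Konqueror" not in user_agent
            and os.path.isfile(gz)):
        headers["Content-Encoding"] = "gzip"
        return gz, headers
    # Graceful degradation: fall back to the uncompressed file.
    return plain, headers
```

If any condition fails, the client simply gets the uncompressed file.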

The last challenge: what to gzip?

This one is the easiest. Obviously we need to gzip text types (XML, HTML, CSS, JS) and leave graphics and media files alone. But what else can be gzipped?

Some articles in this area recommend also gzipping the ICO file type (which can be 3 times smaller after compression) and some font types (SVG, which is actually XML, OTF, TTF, EOT). All of this is already handled by Web Optimizer (.htaccess rules for all file types, a PHP proxy with static gzip, etc.): all in one brilliant acceleration package.
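The file types named in this section can be collected into a simple allow-list. The sketch below is illustrative only: the names `COMPRESSIBLE` and `is_compressible` are made up for the example, and it is not how Web Optimizer itself is implemented.

```python
import os

# Extensions taken from the text above; everything else is left uncompressed.
COMPRESSIBLE = {
    ".html", ".xml", ".css", ".js",   # text types
    ".ico",                           # icons
    ".svg", ".otf", ".ttf", ".eot",   # font types
}


def is_compressible(filename: str) -> bool:
    """Return True if the file extension is on the gzip allow-list."""
    ext = os.path.splitext(filename)[1].lower()
    return ext in COMPRESSIBLE
```

An allow-list keeps formats that are already compressed, such as JPEG or PNG images, from being gzipped a second time.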
