Dec 31, 2009

WEBO Site SpeedUp - configuration sets & debugging

In the upcoming version of WEBO Site SpeedUp we have prepared a lot of technical innovations and UI enhancements. One of them is configuration sets - now you can simply choose one of the predefined sets of settings, or create your own based on one of them.

Configuration sets

As you can see from the screenshot above, there can be an unlimited number of configuration sets, with 3 predefined ones: safe (completely safe for all environments, but doesn't provide a lot of acceleration), optimal (a balanced one), and extreme (which should be reconfigured before it can be used on a live website).

Application debugging

With the new concept of debug / live mode for the WEBO Site SpeedUp application you can simply tune / debug the current configuration set and save it. There are a lot of options to configure, so the basic process can be a bit hard for newbies.

But we have prepared a lot of hints throughout the user interface - they should help you to speed up your website with WEBO Site SpeedUp.

Configuration export / import

After the basic configuration is ready you can prepare one or two more configuration sets (if you have time, or just on a test website) and copy them to the live website. After this you can painlessly apply any of the available sets 'on the fly' (also in debug mode, to make sure everything is OK with your website).

With the new dashboard, refreshing the cache and debugging the application is much easier. You just need to press 'Enable' or 'Refresh cache' - all the other actions will be performed automatically.

Get WEBO Site SpeedUp

All these options are available in the latest alpha version - http://code.google.com/p/web-optimizator/downloads/detail?name=webo.site.speedup.v0.9.1b.zip. It already has an update-to-beta procedure: you just need to go to "System Status -> Updates" and check 'Show information about beta versions'. After this just press 'Install' (below the current beta version change log) - all files will be downloaded from the beta repository and applied automatically.

Dec 23, 2009

WEBO Site SpeedUp alpha version

Right on Christmas eve we prepared an alpha preview of WEBO Site SpeedUp (formerly Web Optimizer). You can get it from the official repository here (link updated to alpha2): http://code.google.com/p/web-optimizator/downloads/detail?name=webo.site.speedup.v0.9.1b.zip. The main feature is the new interface and a few more tools + improved overall stability. Some screenshots:

Control Panel

Here are all the important blocks which should help you to increase the load speed of your website - Settings overview, System status, Cache, Load speed, Optimization tools, Updates, News, etc.

Cache

On the cache screen you can refresh the overall cache and get all information about the current files.

On cache reload you can see all performed actions.

System status

Here all warnings / troubles with the server environment are listed, along with some useful information about the application.

Optimization tools

Optimization tools (static gzip and image optimization) allow you to compress your files in interactive mode. Image optimization is performed via the smush.it / punypng.com services (in the alpha only the former is available).

Get alpha2 version http://code.google.com/p/web-optimizator/downloads/detail?name=webo.site.speedup.v0.9.1b.zip and leave your feedback.

Any issues can be also submitted to issue tracker.

Dec 16, 2009

Web Optimizer for CDN

After a couple of questions (how Web Optimizer can be used to move all assets to a CDN) we prepared a quick guide to let you use Web Optimizer with your CDN easily.

So step-by-step

  • First you need to have the CDN host set up as a mirror for your website. I.e. if you have an image www.site.com/img/image.png you must have cdn.site.com/img/image.png (right now Web Optimizer correctly supports only 1 subdomain for CDN - cdn.yourwebsite.com).
  • After this initial CDN setup you need to set only 2 options in the Web Optimizer configuration. The first is 'host' (the very first screen of settings, near the license key field). Set it to cdn.site.com. This turns on CDN usage for merged CSS / JS assets.
  • Then you should enable multiple hosts. For this purpose, in the group "Multiple hosts" just enable their usage + disable the option "Check hosts' availability automatically" (this doesn't seem to work properly in the case of a CDN). And enter 'cdn' (without quotes) into the field "Allowed hosts, i.e. img i1 i2".

That's all! So you will have the following options:

Website host (to include before static resources), i.e. site.com ->
cdn.yoursite.com
Enable multiple hosts -> Yes
Check hosts' availability automatically -> No
Allowed hosts, i.e. img i1 i2 -> cdn

So this will turn on CDN usage for your website quickly and easily.

A good example of a CDN implementation is www.maxcdn.com.

Dec 11, 2009

Web Optimizer in Bitrix

This night Web Optimizer was added to the official Bitrix repository (Bitrix provides a large variety of solutions to create corporate websites / intranet systems / e-stores). Now you can install it for any Bitrix product (with the same license policy as for the other plugins / the standalone Web Optimizer version - Community, Lite, and Premium Editions).

Web Optimizer is also listed in the Wordpress and Joomla! extensions directories. Feel free to leave your comment / vote for us there.

Dec 9, 2009

Rocket Boost your site

To make your website ready for the Xmas shopping season we offer a 20% discount for all editions of Web Optimizer.

Simply use promotional code webo_nwy20092010 and get 20% off. Site optimization was never that easy!

Also you can order basic or advanced installation — our engineers will help to make your website the fastest one.

The promotion will end on December 31, 2009.

Dec 3, 2009

Conditional caching: several approaches

Last month we talked about various caching layers. The last 'line of defense' is conditional caching.

What is Conditional Caching?

The browser usually has a lot of resources in its local cache. Some of them can be expired, but the browser can check whether they can still be used. So there is a way (frankly speaking, two different ways) to check if a resource can be used once more.

First we can use Last-Modified (specified in HTTP/1.0) and its pair — If-Modified-Since. The server sends the Last-Modified header with the resource modification date. The browser can ask whether the resource has changed since the last request by sending this date back in the If-Modified-Since header. If there were no modifications the server responds with a 304 answer.
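
A minimal PHP sketch of this handshake could look like this (the file name is illustrative; this is not the actual Web Optimizer code):

<?php
// Send Last-Modified and answer 304 if the browser already has this version.
$filename = 'main.css';
$mtime    = filemtime($filename);
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
    strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $mtime) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}
readfile($filename);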

The situation is almost the same with the ETag and If-None-Match (or If-Match) headers. The only difference is that the Entity Tag can be any string (a date, a set of numbers, a file name, etc). So this allows you to define it any way you like. But ETag belongs to the HTTP/1.1 specification.

Benefits

With 304 answers the server actually doesn't send any content to the browser, so you save transfer time for these resources.

It can also be useful for the kind of dynamic content which can't be cached for a long time but doesn't change often. So the browser re-requests such content (if the cache is expired) but receives the answer 'Not Modified'.

It also helps with page reloads - when a visitor presses Ctrl+R in his/her browser, the latter must request all resources from the server. So this kind of refresh can be made faster with conditional caching.

Practical examples

The Apache server usually sends an ETag along with Last-Modified. The mod_headers module can be used to manage these headers (and it seems to be involved in 304 answers too). You can make the ETag header reflect the file modification time only this way:

FileETag MTime

You can also unset each of these headers:

Header unset Last-Modified
Header unset ETag

With PHP you can send the equivalent of these headers by:

@date_default_timezone_set(@date_default_timezone_get());
header("Last-Modified: " . gmdate("D, d M Y H:i:s", $time));
header("ETag: \"" . md5(gmdate("D, d M Y H:i:s", $time)) . "\"");

If you want to emulate the 'classic' ETag which the Apache server sends by default you need to use:

header("ETag: \"" . dec2hex(@fileinode($filename)) . '-' .
dec2hex(@filesize($filename)) . '-' . dec2hex($mtime) . "\"");

where dec2hex is a helper that converts numbers from decimal to hexadecimal.
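
Such a helper is not a built-in PHP function; a minimal sketch can simply wrap the built-in dechex():

function dec2hex($number)
{
    // PHP already ships dechex(), so the helper just casts and delegates.
    return dechex((int) $number);
}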

Issues

First of all you need to send different conditional headers for compressed and uncompressed content. The generic approach here is to add '-gzip' to the end of the ETag (nothing similar can be done with Last-Modified, so it's a bit less usable).

Then you need to make such headers identical across all servers you use to serve content. The common ETag header in Apache includes information about the inode (which is server-related, not actually file-related), so it must be eliminated or replaced.

Then there are reports (not confirmed yet) about excessive requests from browsers caused by conditional headers. Please be careful with this.

For static resources Web Optimizer unsets the Last-Modified header and sets an ETag based on the modification time. For dynamic ones (if HTML documents are cached) it sets an ETag based on a content hash, and sets Last-Modified via the static PHP proxy.

Dec 1, 2009

Web Optimizer 0.6.7 released

The last 0.x version before the major 1.0 release has been published today - 0.6.7 aka 'frost'. We tuned a lot of minor stuff and improved a number of parts of the application.

  • Added delayed load for iframes. Now you can exclude iframes (i.e. ads) from general waterfall of website load (to prevent their blocking nature) — the same way as for Unobtrusive logic.
  • Improved behavior of the 'Uniform cache files' option. Now conditional comments are stripped for all browsers except IE, and HTML cache files relate to the current set of options.
  • Enabled optimization for cached content in Drupal. There was a bug in the native Drupal module with content not being optimized when cached. Now Web Optimizer works in any case, no matter whether you have cached your website on the server side or not.
  • Separated Upgrade / Install into stable / beta branches. One more big change before the 1.0 release — now you can upgrade either to the latest stable branch, or to the latest beta (the most feature-rich) version of the product. And switch between them.
  • Added Web Optimizer module for Bitrix. Bitrix is the most known CMS for corporate websites, and now Web Optimizer supports it in native mode.
  • Improved unobtrusive logic (added some ads).
  • Improved files combine (added exclusions, general logic).

Download the latest version of Web Optimizer

Nov 26, 2009

Static gzip is your best friend

After touching on several aspects of caching, let's return to gzip and review a very simple and powerful technique: static gzip.

What is Static Gzip?

Static gzip is a way to serve compressed content without actually compressing it 'on the fly' (here is a blog post about gzip). Very roughly, we have pre-gzipped files and serve them instead of the plain ones. How can we do this?

General algorithm

  • First of all we need to have gzipped versions of the initial files. Usually they are named with a .gz postfix, i.e. main.css.gz. Since these files are static, we can have them compressed at the maximum level. On Linux you can do the following
    gzip -c -n -9 main.css > main.css.gz
    to get the smallest compressed file from the initial one.
  • Secondly we must have a way to route HTTP requests to the compressed version of the file. With Apache and mod_rewrite we can add the following rules to the .htaccess file or the Apache configuration.
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{HTTP_USER_AGENT} !Konqueror
    RewriteCond %{REQUEST_FILENAME}.gz -f
    RewriteRule ^(.*)\.css$ $1.css.gz [QSA,L]
    <FilesMatch \.css\.gz$>
    ForceType text/css
    </FilesMatch>
    </IfModule>
    <IfModule mod_mime.c>
    AddEncoding gzip .gz
    </IfModule>

Ripping off the veil of mystery

What does this set of rules actually do?

  • First of all we enable RewriteEngine (it may have been already enabled in your configuration).
  • Then
    RewriteCond %{HTTP:Accept-encoding} gzip
    rule selects all HTTP requests with
    Accept-Encoding: gzip
    header and allows Apache to perform other rules.
  • Then we skip Konqueror browser
    RewriteCond %{HTTP_USER_AGENT} !Konqueror
    because it seems it doesn't understand compressed content for CSS / JavaScript files.
  • Then we check if there is a physical file with the .gz postfix
    RewriteCond %{REQUEST_FILENAME}.gz -f
  • And only after all these checks do we perform the actual internal rewrite.
    RewriteRule ^(.*)\.css$ $1.css.gz [QSA,L]
    We redirect all requests to .css files to .css.gz files.
  • After this rewrite (which is the last one — the [L] flag) we force the Content-Type of such files to be text/css. Most servers send .gz files with the default archive MIME type, and browsers can't properly detect such content.
  • Finally, the AddEncoding gzip .gz rule sets the proper Content-Encoding header for the compressed content.

Of course this logic can be applied not only to CSS files but to all files which can be efficiently compressed — JavaScript, fonts, favicon.ico, etc.

That's all?

Graceful Degradation

What if Apache doesn't have the mod_rewrite module? Or we need to serve CSS content via PHP? Or the environment doesn't support such logic (CGI / IIS)?

Well, we can actually perform the whole algorithm via PHP too. All we need here is the following (a minimal PHP sketch follows the list).

  • Form the compressed file name (usually just with a .gz postfix) and check for this file's existence.
  • If no such file exists, create a compressed version of the file (i.e. via the gzcompress function).
  • Then write this compressed content to a new file (with the already defined 'gzipped' file name).
  • And set this file's time of change (mtime) to the same value as the initial (non-compressed) one. Why is this required? Because along with the existence check we can also check whether the gzipped version of the current file has the same mtime and skip its re-creation if they are equal. If the current file and its compressed version have different modification times — we need to re-create the latter.
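
A minimal PHP sketch of this fallback (the file name and Content-Type are illustrative; gzencode is used here instead of gzcompress because it produces the gzip wrapper expected by Content-Encoding: gzip):

<?php
// Serve a statically gzipped copy of a CSS file, creating it on demand.
$filename = 'cache/main.css';
$gzname   = $filename . '.gz';
$mtime    = filemtime($filename);

// Re-create the .gz copy if it is missing or belongs to an older version.
if (!file_exists($gzname) || filemtime($gzname) !== $mtime) {
    file_put_contents($gzname, gzencode(file_get_contents($filename), 9));
    touch($gzname, $mtime); // keep mtimes in sync for the next check
}

$encodings = isset($_SERVER['HTTP_ACCEPT_ENCODING']) ? $_SERVER['HTTP_ACCEPT_ENCODING'] : '';
header('Content-Type: text/css');
if (strpos($encodings, 'gzip') !== false) {
    header('Content-Encoding: gzip');
    readfile($gzname);
} else {
    readfile($filename);
}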

So with this logic we just check for the file's existence, perform 2 mtime checks (all such checks can be cached at the file system level, or the cache folder can be mapped to shared memory) and serve the gzipped version. CPU time is saved (along with 80-85% of the transferred content size)!

Web Optimizer has all such approaches integrated, and with version 0.6.7+ it allows you to create .gz versions of CSS / JS files within any folder of your website.

Nov 20, 2009

Web Optimizer on Facebook

Yeah, now you can join us on Facebook:

http://www.facebook.com/pages/Web-Optimizer/183974322020

Also there is a Twitter account

Nov 19, 2009

Several layers of caching

The last two topics were about various aspects of client side caching. There was also a topic about the cache integrity check. Now let's review the overall caching scheme:

  1. Web Optimizer can get optimized HTML code from its own cache and output it to the browser. It's the first layer of caching — on the server side.
  2. Otherwise Web Optimizer gets raw results from the CMS engine (HTML code). But they can already be cached (in the CMS). Web Optimizer doesn't touch the internal CMS caching logic, it only uses it if it's available. So here can be the second layer of caching logic — also on the server side.
  3. While performing client side optimization, Web Optimizer usually checks (view a complete description of this logic) if there are any files ready to be served (merged and combined ones). If yes — all is OK, Web Optimizer uses them. It's one more server side caching layer.
  4. Also, when serving files, Web Optimizer (or, usually, the Apache web server) checks if there are gzipped versions of files — .gz ones — and uses them (via mod_rewrite and static gzip) rather than 'gzip on the fly'. That's the fourth level of server side caching. And the overall website configuration can have 1-2 more levels (i.e. a frontend proxy, nginx or squid, or a shared memory virtual disk).
  5. But before files are served the browser receives the ready HTML code with the assets' URLs and tries to find them in its own cache. If they are there (Web Optimizer sets strong caching headers) - such files aren't requested from the server at all. It's a client side caching layer.
  6. If there are no such files in the local cache the browser requests them, but a local (or not very local, but intermediate) proxy server can have them cached (read more about caching on proxies). So the proxy server serves such files faster than the original website (and the request doesn't reach the website at all). It's effectively a client side caching layer.
  7. If the request reaches the original server but has conditional caching headers (ETag or Last-Modified), the server can respond with a 304 answer (and not send any content). So it's a client side caching layer too — all content is taken from the client side.

Wow! It seems that's all. As you can see, there are about 7-9 different caching layers spread between the end client and the end server (the various links in a request's way from your browser to the website and back to you).

In the next posts we will try to shed light on some of these layers in detail.

Nov 16, 2009

Web Optimizer 0.6.6 released

A lot of new features and bug fixes in new version.

  • Added separation into Community / Lite / Premium Editions. The free version of Web Optimizer now becomes the Community Edition and is prohibited from being used on commercial websites. For that purpose you can buy the Web Optimizer Lite Edition (data:URI + performance included, $19.99). Version comparison.
  • Added option to move all scripts (w/o merging) to </body>. If you choose to move all scripts to </body> but Minify JavaScript option is disabled — all scripts will be just moved one-by-one to the end of the document. Very useful.
  • Added option 'Uniform cache files for all browsers'. In some cases (i.e. if you use a caching engine other than Web Optimizer) it's incorrect to vary the HTML code across browsers (yes, this disables the data:URI group of options but allows you to cache any page for any browser only once).
  • Added option 'Cache external files'. Now light PHP proxy can be applied to external files too. They can be downloaded, gzipped, and cached — to improve your own website load speed.
  • Added option 'Enable chained optimization'. There are a few issues with the chained optimization algorithm (due to buggy server side environments or mixed file permissions). So it can be disabled to prevent any Web Optimizer attempts to pre-optimize your pages. With this option disabled, website pages will be optimized on their first visit only.
  • Added events onBeforeOptimization, onAfterOptimization, onCache to plugins API. These events can be used for standalone version to add any dynamic code to PHP, apply any internal logic, or add any dynamic pieces to cached HTML documents.
  • Improved behavior (content skipping). For content other than (X)HTML and for Ajax requests it's very tricky to apply any optimization techniques (in most cases such content has already been optimized, so it's just skipped).
  • Improved general logic in case of </body> absence. Sometimes HTML document has no </body> tag (yes, that's true). So we need to apply all logic in such cases too.
  • Improved multiple hosts behavior (especially for dynamic images).
  • Improved unobtrusive logic (minor fixes, added counters).
  • Improved Apache modules detection on CGI environments (especially mod_rewrite).
  • Fixed several tiny bugs in files fetching (after dozens of unit tests integrated).
  • VaM Shop and Gekklog added to supported systems.

Download the latest Web Optimizer.

Nov 14, 2009

Client Side Caching: proxy servers and forced reload

In the previous blog post we talked about caching basics. Now let's review how proxies actually work and how we can force a cache reload.

Cache reload

The main issue with far future Expires headers is that the browser doesn't re-request a resource but takes it from its local cache. So if you have made any changes on your website they won't be visible to 'old' users (those with cached styles and scripts; HTML documents usually aren't cached so aggressively).

So what can we do with this trouble? How can we tell browsers to re-request such resources?

Main cache reload patterns

There are two main patterns to force browsers (user agents) to request the current asset once more.

  • Add any GET parameter to the file name (which should indicate the new state of this asset). For example
    styles.css -> styles.css?20091114
  • Change the "physical" file name. For example
    styles.css -> styles.v20091114.css

Both approaches change the URL of the asset and force the browser to re-request it.

Cache reload and proxy servers

As you can see, the first approach is simpler than the second. But there are a few possible issues with it. First of all, some proxy servers don't cache URLs with GET parameters (i.e. our styles.css with a query string). So if you have a lot of visitors from a network behind one firewall we will serve this asset to each visitor separately, without it being cached by the proxy server. This will slow down overall website speed, and sometimes this can be critical.

But how can we apply a new file name without actual changes on the file system? Is there any way to do this with only a change in the HTML code? Yes!

Apache rewrite rules

The Apache web server has a powerful tool to perform 'hidden' redirects for local files (these are called 'internal redirects'). We can handle the file name approach with just one predefined rule for all files (in our case it's a set of numbers after .v):

RewriteEngine On
RewriteRule ^(.*)\.v[0-9]+\.css$ $1.css

So all such files will be redirected to their physical equivalents, but you can change the .v part of the URL at any time — and browsers will request this asset once more.

Automated cache reload

There are several ways to automate the cache reload process for all changed files. Since Web Optimizer combines all resources into 1 file, it's required to re-check the mtime (time of change) of all files and re-combine all resources.

Issues with re-checking all combined files have already been described last month, so it's generally not good to check them all on every web page visit. We can cache all previous checks into 1 file and check only its mtime. That is done by default: we check the time of change of just one file (the combined CSS or JS one) and add it as a GET parameter or as a part of the file name.

So this is applied to all such files (that should be cached on the client side) and results in the following:

For /cache/website.css you can see there can be two timestamps: one goes as a GET parameter, the other — as a part of the URL (and with the Apache mod_rewrite rule it is transformed back to /cache/website.css).

Overall schema

So what is the overall caching algorithm for the website?

  1. Check if we have the combined file. If not — create it.
  2. Check the mtime of the combined file. If required, add the mtime to the URL (using one of the described ways).
  3. The browser receives HTML code with the URL of the combined file.
  4. The browser checks if it has this URL cached. If yes, everything finishes here.
  5. If not, the browser requests the file (which is already prepared on the server or cached on a proxy). A short PHP sketch of the server side part follows.
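
A minimal sketch of steps 1-2 (file names are illustrative; this is not the actual Web Optimizer code):

<?php
// Build a versioned URL for the combined file based on its mtime.
$combined = 'cache/website.css';

if (!file_exists($combined)) {
    // Step 1: (re)create the combined file here - omitted in this sketch.
}

$mtime = filemtime($combined);

// Either append the timestamp as a GET parameter...
$url_with_query = '/cache/website.css?' . $mtime;

// ...or embed it into the file name (mod_rewrite maps it back to the real file).
$url_with_name = '/cache/website.v' . $mtime . '.css';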

Nov 8, 2009

Client Side Caching: basics, automation, challenges

It was a long way to set up a correct scheme to handle all issues with client side caching. Let's review it step by step.

Cache Basics

It's very simple to cache a single file - we just need to add the Cache-Control header (for HTTP/1.1) and the Expires header (for HTTP/1.0). The headers look like this:

Expires: Thu, 31 Dec 2019 23:55:55 GMT
Cache-Control: max-age=315360000

Is it so straightforward? Well, almost.

Caching on proxy

The Internet consists of different servers. Some of them are common websites. Some of them are transport nodes which just relay traffic from the user to the website and vice versa. Providers are interested in decreasing traffic. So they turn on caching on their servers to serve common requests from the inner network instead of transferring them outside.

How can we affect this caching? RFC 2616 defines a way to do this: Cache-Control: public. So our example turns into:

Expires: Thu, 31 Dec 2037 23:55:55 GMT
Cache-Control: max-age=315360000, public

Caching automation

But how can all this be applied to all files on the server? Apache has a special module to handle such situations: mod_expires. We can turn it on in the Apache configuration with these directives:

ExpiresActive On
ExpiresDefault "access plus 10 years"

This seems to be very easy. What is the trouble?

What must be cached?

Since almost all resources on a website are static (except maybe HTML documents) we can cache them all. The first approach is to cache everything but skip caching for HTML documents. It can be done this way (in the Apache configuration):

ExpiresActive On
ExpiresDefault "access plus 10 years"
<FilesMatch \.(html|xhtml|xml|shtml|phtml|php)$>
ExpiresActive Off
</FilesMatch>

So we cache everything except HTML documents. But what if we have dynamic CSS / JS files or images? I.e. thumbnails generated via PHP, or a number of styles merged 'on the fly'.

ExpiresByType is a magic wand!

We can use another directive to cache all required files by their MIME types (which are usually properly defined on server). This way:

ExpiresActive On
ExpiresByType text/css A315360000

It's generally better because in the case of dynamic images we can apply cache headers by their type, not their extension. Well, is that all?

Different challenges

There can be a lot of more or less common issues with this approach:

  • There are no common MIME types for a number of resources (i.e. for font files). So we can apply cache headers to them only by extension, with FilesMatch. Because we can't be sure about all server environments it's better to combine both approaches: set headers with FilesMatch and with ExpiresByType.
  • How to define what files must be cached? Web Optimizer has a great number of pre-defined MIME types and extensions to handle most of the static resources on the server. It also has several options to manage caching behavior.
  • If there is no mod_expires on the server we can emulate all the caching behavior with a light PHP proxy. If mod_rewrite is available we can just redirect all required resources internally to this proxy and get the files cached.
  • If neither mod_expires nor mod_rewrite is available on the server we need to parse the HTML code and replace all calls to static files with their PHP proxy equivalents. That's all.

Web Optimizer not only has outstanding support for various cache techniques in different environments, but also provides a number of ways to force a cache reload (if a resource has changed). We will discuss this in the next posts.

Nov 4, 2009

Web Optimizer 0.6.5 Swift released!

We have finally tested and improved (a lot of minor stuff) the last stable product version before the 1.0 release - 0.6.5 aka 'swift'. List of changes:

  • Added a setting for static CSS/JS file names. This can be useful for small websites with a single CSS/JS set for all pages - so it can be given a fixed name (instead of using a hash).
  • Improved chained optimization behavior. After a lot of additional tests chained optimization (on settings save and product activation) has been significantly improved. Now it wastes even fewer server side resources. For the WordPress and Drupal plugins we added the ability not only to completely refresh the cache on options save, but also to just save the options. In some cases the latter won't lead to cache renewal.
  • Improved cache clean up. Fixed a number of minor issues with cache directories usage (generally for standalone application, this doesn't affect plugins / modules as they have cache directories hard coded).
  • Improved static assets proxy for compressed files. Added a number of performance improvements (logic simplified) and fixed wrong behavior on static gzip cache usage.

Download the latest Web Optimizer.

Oct 30, 2009

Gzip challenges: browser compatibility, static gzip, graceful degradation

Gzip (GNU Zip) is the most fundamental way to compress information on the web. It has existed for a decade (or more) and is supported today by almost every agent (browsers, robots, parsers, etc). So how can it be used?

The first challenge: to gzip or not to gzip?

Well, it's OK to gzip all textual content on your website. But there were (or are) a few troubles with old versions of IE (which didn't understand this Content-Encoding), Safari (with gzip for CSS/JS files), and other browsers. All of them can be worked around (more or less), but their existence stops developers and admins from using this technique throughout their projects and websites. Moreover there are a few less documented issues with some proxy servers, Content-Encoding, and chunks, which can lead to a whole server shutdown (implementation bugs). So should we use gzip?

We must! Gzipped content is usually 50-85% smaller, and this tremendously helps in accelerating your web pages. So let's review the ways to prevent known bugs. For IE6 we can use these approaches:

  • do not gzip if there is no SV1 token in the User Agent string (Service Pack 1, which has some gzip issues fixed),
  • add 2048 spaces at the beginning of the gzipped file,
  • or just use deflate (with content flushing) instead of gzip.

To prevent issues with Safari (and some other browsers) we can just force the Content-Type of gzipped content to its initial value.

So most of the troubles can be solved.

The second challenge: CPU overhead

Gzip adds very little to the actual server processing time, but if you use powerful caching systems, proxying, etc, it can become noticeable. Here we come to static gzip.

Static gzip is a way to store gzipped content somewhere (usually on a hard disk, but it can also be any memory cache), so we cache gzip results and serve them to end users, saving CPU time.

Does everything work well here? Actually, no. But it's...

The third challenge: graceful degradation

There is no native Apache / web server support for static gzip. Sometimes we can use static gzip directives (i.e. special mod_rewrite ones) to redirect initial requests from static (or cached) files to their gzipped versions. A classic example is this mod_rewrite usage:

RewriteCond %{HTTP:Accept-encoding} gzip
RewriteCond %{HTTP_USER_AGENT} !Konqueror
RewriteCond %{REQUEST_FILENAME}.gz -f
RewriteRule ^(.*)\.css$ $1.css.gz [QSA,L]
<FilesMatch \.css\.gz$>
ForceType text/css
</FilesMatch>

Here we check if the user agent supports gzip encoding, if it's not Konqueror (which doesn't understand gzipped CSS/JS files), and if a statically gzipped file (.css.gz) exists. After this we redirect the request (internally) to this file and force the correct Content-Type for it.

But what if there is no mod_rewrite support? No problem! We can do the same with a light PHP proxy, which receives the request, checks all the conditions, and serves the prepared file.
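
A minimal PHP sketch of such a proxy, mirroring the mod_rewrite conditions above (the file name is illustrative):

<?php
// Check the same conditions as the rewrite rules and serve the .gz file if they all hold.
$filename  = 'main.css';
$gzname    = $filename . '.gz';
$encodings = isset($_SERVER['HTTP_ACCEPT_ENCODING']) ? $_SERVER['HTTP_ACCEPT_ENCODING'] : '';
$agent     = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

header('Content-Type: text/css');               // ForceType text/css
if (strpos($encodings, 'gzip') !== false &&     // Accept-Encoding: gzip
    strpos($agent, 'Konqueror') === false &&    // skip Konqueror
    file_exists($gzname)) {                     // .css.gz exists
    header('Content-Encoding: gzip');           // AddEncoding gzip .gz
    readfile($gzname);
} else {
    readfile($filename);                        // graceful degradation
}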

The last challenge: what to gzip?

This one is the easiest. Obviously we need to gzip text types (XML, HTML, CSS, JS) and not touch graphics or media files. But what else can be gzipped?

Some articles in this area recommend also gzipping the ICO file type (which can be 3 times smaller after compression) and some font types (SVG, which is actually XML, OTF, TTF, EOT). All of this is already handled by Web Optimizer (.htaccess rules for all file types, a PHP proxy with static gzip, etc). All in one brilliant acceleration package.

Oct 28, 2009

Version 0.6.4 released

Before 'major' 0.6.5 we improved Web Optimizer a bit with the following things.

  • Added cache re-generation instead of cache clean-up. Now all files in the cache are cleaned up, then Web Optimizer tries to re-generate them. While cache files are being generated the website doesn't use Web Optimizer - to prevent CPU overload with an empty cache. Only after all files are ready to be served do we completely activate Web Optimizer.
  • Added 5 cache file sets generation on options' save. The approach described above can be used when options are saved.
  • Added an option to force deflate over gzip for IE6/7. It seems IE6/7 handles deflate page encoding a bit better than gzip. So you can now force Web Optimizer to use this encoding for these browsers.
  • Added options for better placement of Web Optimizer stamp and link. There was a complete post about Web Optimizer stamp placement.
  • Added IfModule to all .htaccess rules. Since there were a few complaints about broken websites after .htaccess usage it's much safer to add IfModule to all sections. This will block the normal behavior of Web Optimizer in case of incorrect server side functionality detection, but won't break the whole website.
  • Improved static assets proxy for images. Added graceful degradation to redirect to the resource and fixed a few issues.
  • Improved mhtml behavior for Vista. Just disabled with a new set of cache files.
  • Improved CSS Sprites behavior for IE6. Added BackgroundImageCache directive for this browser to prevent CSS Sprites blinking.
  • Improved Apache modules detection on CGI environments.
  • Improved Joomla! plugin / mambot behavior on a few environments.
  • Improved dynamic styles' loading on DOMready event.
  • Fixed issues with performance / gzip / footer options incompatibility.

You can always download the latest version of Web Optimizer here.

Oct 26, 2009

Customizable Web Optimizer stamp

In the nightly builds we added the possibility to completely change the Web Optimizer stamp (a small picture placed in the bottom right corner of the website by default). We had a few requests about this, and maybe somebody will be happier being able not only to disable this stuff, but also to style it.

Image setup

Now there are three images that can be placed as a stamp:

All of them are included in the Web Optimizer packages and you can just enter their file names to enable any of them (by default the first is used): web.optimizer.stamp.png, web.optimizer.stamp.black.png, and web.optimizer.stamp.white.png. You can also create any other image, place it in web-optimizer/images and enter its name in the options. It will be copied to the cache folder and used on the website.

Text link

Now you can also put any text about acceleration pointing back to the Web Optimizer website. Just use this option in the 'Backlink and spot' group. This will work if you leave the settings for the Web Optimizer image blank (so no image will be used).

Styles

The last part of customizing your optimized website is setting the styles for this stamp. The HTML code is included right before </body> (so it affects overall website load as little as possible), but you can use the full power of CSS to place this spot anywhere on the page.

For example you can place it in the middle of the bottom edge, in the top left corner, or wherever you want. The CSS is set with a string in this group of Web Optimizer options. By default the stamp is placed as a 'float' with a number of negative margins, to emulate a 'bottom right' position (which is buggy in IE6).

Oct 21, 2009

Really parallel CSS files - possible?

For about a year we have been dealing with, optimizing, and fighting the inclusion of background images into CSS files. This technique was successfully implemented in Data:URI [CSS] Sprites (and in Web Optimizer last week), but there we faced the IE7@Vista bug (Aptimize doesn't handle it well, it just skips any caching, which is terrible). But this post isn't about such bugs.

Parallel CSS files and FOUC

Everybody who deals with web page performance optimization should know about FOUC - Flash Of Unstyled Content. Every browser tries to prevent such behavior (except maybe Opera): they delay painting anything on screen until they are sure they can show 'styled' content. Even if CSS files are being loaded over 2, 3, 4, or more channels - it doesn't matter. The whole page will be shown only after all CSS files have been loaded (and if there are any scripts, the situation will be even worse - blocking scripts in action).

No Parallel CSS files?

Actually, no... It's possible to force the browser to render the page, and only then add a CSS file. It's tricky but it works. Why should we use this approach?

data:URI (or mhtml) content can be very large in size. Generally (even with compression) it's about 2-5x larger than the CSS rules themselves. So we can render the page about 2-3 times faster with this technique alone!

Magic, magic!

After long testing of various ways to inject a CSS file into the document the following approach was invented. We can add one CSS file (with rules) at the very beginning of the page. Then we should add the second CSS file (with resources), but where? And how do we load it truly unobtrusively?

We can add a bit of scripting (inside head) which will:

  1. Hide the plain CSS file include from browsers with JavaScript enabled.
  2. Add this file to the loading queue on the DOMready event.
  3. And degrade gracefully for browsers with JavaScript disabled (or broken for some reason).

OK. The first point is simple:

<script type="text/javascript">
document.write("\x3c!--");
</script>
<link type="text/css" href="resource.css" rel="stylesheet"/>
<!--[if IE]><![endif]-->

After this CSS file there is a conditional comment for IE: it acts as the closing part of the comment opened via JavaScript, or as just a bit of junk code for browsers with JavaScript disabled.

The second point isn't very hard:

<script type="text/javascript">
function _weboptimizer_load(){
load_our_css_here();
}
add_event('DOMready', _weboptimizer_load);
</script>

So after the browser renders the content we start to load the additional CSS file with background images.

The third point has been already implemented in the first piece of code.

Browsers' support?

This code plays well in all browsers. A number of them try to get the resource file earlier than the DOMready event is called. But even in this case content is rendered as soon as the first CSS file (raw rules only) has been loaded. So everything seems to be OK.

This approach has been in Web Optimizer since version 0.6.3. The option is called 'Load images on DOMready event'.

Oct 20, 2009

Version 0.6.3 released

Here we are with the latest version of wonderful performance optimization package:

  • Added separate CSS resource file loading on the DOMready event. This forces the browser to render content while it is still loading background images (in the resource file).
  • Added .htaccess functionality detection on CGI systems. The blog post about tricky Apache modules shows how complicated the simple task of detecting server side feature support can be. Now this is detected wherever possible.
  • Added gzip and cache for fonts. After Steve's post about @font-face we added complete coverage for fonts files.
  • Added support for gzip in the static assets proxy library. Now it can be used as a 'furious Web Optimizer' - just adding Cache / Gzip headers 'on the fly' as fast as possible (no minify, static gzip supported). There is still room for improvement - i.e. adding usage of PHP cache subsystems.
  • Added 'cache version' option to skip cache files' mtime check. More about this option here.
  • Added 'quick check' option to skip complete parsing of external files. More about this option here.
  • Added 'skip RegExp' option to allow standard-complaint website be faster. More about this option here.
  • Added textarea to set any CSS code over compressed files. Can help to rewrite some 'broken' styles quickly.
  • Spot in title (lang="wo") made optional (in full version only). A number of people have been concerned about this spot and SEO influence. Actually there is no correlation, but now it can be made optional.
  • Improved multiple hosts behavior in a few cases.
  • Improved 'Separate CSS files' behavior (data:URI group of settings).
  • And a few minor fixes.

You can always download the latest version of Web Optimizer from its download page.

Oct 17, 2009

Cache integrity vs. website speed

Well, the last few days we were working on improving the very core logic of the Web Optimizer algorithms - fetching files and checking cache integrity. Why is it so important? After all cache files are created (all merged JavaScript, CSS, all CSS Sprites and data:URI resources, gzipped versions, etc) the website should load as fast as possible. It's not good if client side optimization wastes server time.

And right now we have reached 10x better performance for the full version (with unobtrusive logic and multiple hosts disabled - up to 20x better). How is this possible?

General flow

Here is pie chart for time consumption for Web Optimizer logic:

Time consumption

This chart is valid for both versions - demo and full - but this time can be optimized for your website only with the full version. The demo version doesn't have the performance group of options.

Cache integrity

Why is cache integrity so important? Because we need to be sure that all merged and minified files are up to date. It would be very bad if we could create cache files only once, and if every small change in the website design or layout led to re-creation of the whole website cache. Web Optimizer can check cache integrity 'on the fly', and does this perfectly.

But there is a huge performance cost: with every hit to your website Web Optimizer checks all files that are listed in the HTML document and re-calculates the checksum. Then it checks if such cache files exist. And only then does it serve the optimized content. It's very reliable, but it's excessive. Usually websites don't change for months or years. So we don't need to check these files a thousand times a day.

The first point: do not check file changes

We can skip re-calculation of the files' content, and this can bring us about 2-3 times acceleration due to the elimination of very expensive file system calls. This does require a cache clean-up every time the physical files are changed. But on a live website that doesn't happen often, and it saves 50-70% of server time (on Web Optimizer actions).

The option "Ignore file modification time stamp" in the Performance section is responsible for this logic. The difference between this option and the fourth one below (which just skips file system calls) is the following.

If you change file content (i.e. add a few styles to main.css) with this option disabled (i.e. the check of file modification time enabled), Web Optimizer will fetch the content of main.css and compare it with the previous one (by checksum). If the checksum is different - a new cache file will be created. With this option enabled (the check of file modification time disabled), Web Optimizer will take into account only the file names (usually all content of the head section), but not their content.
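
A rough illustration of the difference (not the actual Web Optimizer code; file names are made up):

<?php
// The cache key is built either from the file names only or from their contents.
function cache_key($files, $ignore_mtime)
{
    $data = '';
    foreach ($files as $file) {
        // Cheap variant: only the file name takes part in the key.
        // Expensive variant: the content is read, so any edit changes the key.
        $data .= $ignore_mtime ? $file : file_get_contents($file);
    }
    return md5($data);
}

$files = array('main.css', 'print.css');
$key   = cache_key($files, true);  // "Ignore file modification time stamp" enabled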

The second point: exclude regular expressions

Further investigation of what was slow in the Web Optimizer core logic shed light on a lot of regular expressions. Well, RegExps are very good if you need to do something quickly and be sure of the result. But they are also very expensive. And in most cases they can be replaced with raw string functions (i.e. strpos, substr, etc). Well-formed, standards-compliant websites can be parsed very quickly, so why must Web Optimizer be slow for them? Let it be slower only for old-school websites that can't be parsed in such a straightforward way.

This logic is managed by the "Do not use regular expressions" option. This approach saves about 15-20% across the whole Web Optimizer core.
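
To illustrate the idea (a toy example, not the real parser): extracting an href value from well-formed, double-quoted markup with string functions instead of a regular expression:

<?php
$tag = '<link rel="stylesheet" type="text/css" href="/css/main.css"/>';

// Regular expression variant: flexible but comparatively expensive.
preg_match('/href="([^"]+)"/', $tag, $matches);
$href_via_regexp = $matches[1];

// Raw string functions variant: faster when the markup is predictable.
$start = strpos($tag, 'href="') + strlen('href="');
$end   = strpos($tag, '"', $start);
$href_via_strpos = substr($tag, $start, $end - $start);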

The third point: quick check

So we have reduced calls to the file system (from dozens to 2-3), we have replaced regular expressions with string functions, what else? The next step should be reducing the overall number of logic operations. While fetching all styles and scripts we perform a lot of operations: get tags, get attributes, correct attributes, check options, check values, etc.

All this can be skipped for a plain cache integrity check. So reducing this logic to a minimum (just enough to be sure that we can serve the same cached files for the same pack of styles and scripts) can bring us an additional 10% in performance. Not much, but together with the other approaches it is enough to provide the fastest client side optimization solution.

This is the 'Check cache integrity only with head' option.

The fourth point: reduce even more

OK, but a reasonable question arises: why do we need to perform any calls to the file system at all? The answer is simple: we need to force a cache reload on the client side if we have the same cache file name (i.e. the same set of scripts, but their content has changed, so it needs to be reloaded on the client side). Cache reload is forced by an additional GET parameter (in the demo version) or a changed file name (with mod_rewrite in the full version). These operations (checking for cache file existence and its mtime) can be avoided if we hard-code a 'version' of our website. So calls to the file system can be reduced to 0.

But this is generally dangerous: we can't check cache integrity properly, and can serve files which don't exist. This should be done only after all cache files have been created, when we are just tuning the server side for the best performance.

This is the 'Cache version number' option (a zero value disables its usage).

Finally

For now we have the following picture (in relative numbers):

Web Optimizer logic: full version

As you can see, almost all parts are balanced to achieve exceptional server side performance (usually 1-2-5 ms in the full version versus 20-40-100 ms for the demo version). All options are included in the nightly builds and will be available in 0.6.3 after complete testing.

Oct 14, 2009

Tricky Apache modules

Every day we work on improving our techniques to cover more and more systems. To implement more and more methods. To test and choose the best techniques to speed up the websites around us.

Apache modules

When we speak about a PHP proxy to serve your pages fast, a reasonable question arises - why should we care about Apache at all? Is it a PHP solution or something else? The answer is very simple: we try to reduce server side load in every possible way. If Apache can gzip content - let it do this. If Apache can cache - let it do that too (Apache can be used as a backend for a complex web system, so this doesn't change much even if there is an nginx / lighttpd / squid frontend).

Of course we can't write to the Apache configuration file. It's very dangerous, and it's restricted by all hosting providers. But Apache has a very useful feature in this area - the .htaccess file. It provides full access to directives which work at the Location level (which is enough in almost all cases).

Well... What's wrong with .htaccess?

Nothing is wrong with it. We even have a Wiki page with recommended rules for .htaccess (or the Apache configuration file).

Troubles begin when we are working not with mod_php environments but with CGI ones.

Apache modules detection

So to use all Web Optimizer features properly we need to know for sure which modules exist, and skip part of the PHP processing to save server resources, i.e. gzip via Apache or via PHP, cache via Apache or via PHP, and so on. To detect all available Apache modules we use the apache_get_modules function. It can give us all the information about the server if mod_php is there. Otherwise we need to use another approach. What is it?

First of all we need to check whether .htaccess works at all (just try to download a file restricted by .htaccess; if we can do this - .htaccess doesn't work). Then, module by module, we can write the required rules (one test rule just to check the module's existence) to a temporary .htaccess and fetch any file (via curl) from the folder with this .htaccess. If we can do this - all is OK, the Apache module is enabled. If we can't (i.e. a 500 error occurs) - we disable that module in the configuration and provide the same functionality via PHP.
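
A simplified sketch of this probing (the rule, URL and result handling are illustrative, not the exact ones Web Optimizer uses):

<?php
// Write one test rule to a temporary .htaccess, fetch a probe file via curl
// and treat a 500 answer as "this module is not available".
function apache_module_works($test_rule, $probe_url)
{
    file_put_contents('.htaccess', $test_rule . "\n");

    $ch = curl_init($probe_url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    unlink('.htaccess');
    return $code !== 500;
}

$has_expires = apache_module_works('ExpiresActive On',
    'http://www.site.com/test-folder/probe.txt');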

That's all?

Yes, the whole process looks like this:

  • Check for .htaccess possibility (it also can be restricted in Apache configuration).
  • Check for Apache modules (via API or raw approach with curl).
  • Write all possible rules to .htaccess.
  • Provide all missed functionality via PHP.

Oct 13, 2009

Version 0.6.2 released

We are here with the new build of Web Optimizer application. Main changes:

  • Added optional mhtml support. This eliminates the necessity of IE7- CSS hacks in the case of data:URI and provides the same optimization behavior for these browsers.
  • Added mhtml size restriction and mhtml exclude files list. The same as for data:URI because files' sets and size limitations can be different for both techniques.
  • Added an option to separate CSS files (into CSS rules and base64-encoded background images). Since CSS files can be loaded in parallel (in most browsers) we can split all rules. In the future it will be possible to load the CSS file with background images on the DOMready event, which will provide the fastest 'first view' of any web page.
  • Separated inline code merging from external files. Now you can merge either inline CSS (or JavaScript) code inside the body, or external files, or both of these chunks.
  • Added the possibility to parse <body>. Occasionally there can be a number of CSS or JavaScript includes in the document body that can be harmlessly moved to the head section (CSS files for sure, JavaScript ones very carefully). But this approach is disabled by default (because CSS files inside the body are not standards-compliant, and JavaScript includes can very rarely be moved from their places) and can be resource-greedy.
  • Added static images proxy for CSS images. Only if CSS Sprites, or data:URI, or mhtml techniques are used (because full CSS tree is parsed via CSS Tidy).
  • Added separate CSS/JS files for IE and other browsers. This allows us to include files from conditional comments in the cached versions, but leads to 5x more files in the cache folder (only CSS / JavaScript ones; CSS Sprites are created on the basis of the actual background-related rules, and in most cases such rules are the same for IE and the other browsers).
  • Released module for Drupal 6.x. Now Drupal users can apply all Web Optimizer techniques through their administrative interface.
  • Improved compatibility with disabled curl library. Now Web Optimizer degrades gracefully if there is no curl library extension enabled on the server (just pass external or dynamic files).
  • Improved relative directories calculation. This leads to complete independence of Web Optimizer installation from the website root folder, document root folder (for the given host) and other environment-related features. Also this improves Joomla / Wordpress plugins behavior for websites in relative folders.
  • Improved unobtrusive logic. Added calculation of height of advertisement blocks for AdWords, added several counters and improved performance of content parsing.
  • And a lot of minor performance and logic improvements.

You can download the latest Web Optimizer here.

Oct 8, 2009

Data:URI + Mhtml = cross browser inline CSS images

About a year ago I presented an approach to create completely cross-browser base64 inline CSS images. After 2 months it became DURIS.ru (Data:URI [CSS] Sprites).

What are data:URI images?

data:URI was introduced by the W3C years ago as an alternative way to include background images. It allows you to include not just a call to an external file but the file itself (binary data in base64 format). This significantly reduces the number of HTTP requests but can lead to a CSS file size increase (even with gzip enabled).

The worst thing

data:URI isn't supported by IE7 and below, and this restricts its possible gain for your website. But for these browsers we can use the mhtml approach, which is generally the same.

data:URI + mhtml

With correct server side browser detection (just MSIE plus the version number in the User Agent string) we can separate the client side optimization behavior and provide 2 different techniques for our users: data:URI for everything except IE7 and below, and mhtml for IE7 and below. So that's the solution!
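
A minimal sketch of this server side switch (the style sheet names are illustrative):

<?php
// IE 7 and below get the mhtml style sheet, everything else gets data:URI.
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$is_old_ie = preg_match('/MSIE (\d+)/', $agent, $m) && (int) $m[1] <= 7;

$stylesheet = $is_old_ie ? 'styles.mhtml.css' : 'styles.datauri.css';
echo '<link rel="stylesheet" type="text/css" href="/' . $stylesheet . '"/>';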

Side effects

There are some minor side effects related to IE7@Vista, such as :hover not working and cached mhtml images disappearing. We are working to prevent them, but the general logic is already integrated into Web Optimizer.

Large CSS file size

Of course the main side effect is the large size of CSS files (with included base64 images). This can be mitigated by splitting the CSS into 2 files: one with CSS rules and the other with CSS background images (data:URI or mhtml). The second file (with inline CSS images) can be loaded right after the page is shown in the browser (to reduce the delay before styled content is shown). And the overall page load will be the fastest.

Naturally we need to degrade gracefully for cases with JavaScript disabled (just load the 2 files).

Oct 7, 2009

Conditional comments parsing

What are conditional comments?

Conditional comments allow you to hide (or show) a part of the HTML code for IE-based / non-IE browsers. You can find a complete guide to this technique here or in MSDN. A small example:

<!--[if IE 8]>
<p>Welcome to Internet Explorer 8.</p>
<![endif]-->

In all browsers (except IE8) we don't see anything for this code; it behaves as a standard HTML comment. IE8 will parse this chunk as a paragraph with text. There is one more wonderful technique from Microsoft - conditional compilation inside JavaScript - but it's out of the scope of this article.

Why are such comments evil?

Usually webmasters create website templates for all browsers except IE, and then add IE-specific styles with conditional comments. They include separate files to improve the caching of these styles. That's great. But IE7- uses only 1 connection to load JS files (2 for CSS, which doesn't change much), so these are additional HTTP requests and additional payload (in the case of 1 merged global CSS file plus 2 files in conditional comments some time will also be wasted). We can prevent this.

Show me the magic!

We can detect the browser on the server (by the User Agent string) and just strip out conditional comments if they match the current browser (IE). So comments that must work in IE are simply uncommented in the document head, and the latter is parsed in the usual order.

All this leads to a difference between the "general" version of the cached files and the "IE" ones. So we just need to separate cached files by browser - each browser receives its own cache file (right now 4 for IE and 1 for the others), and all will be OK.
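
A simplified sketch of the stripping step for the plain "[if IE]" case (real version-specific expressions such as "[if lt IE 8]" need more parsing):

<?php
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$head  = '<!--[if IE]><link rel="stylesheet" href="ie.css"/><![endif]-->';

if (strpos($agent, 'MSIE') !== false) {
    // For IE simply drop the comment markers, so the enclosed markup becomes
    // ordinary HTML and can be merged with the rest of the head.
    $head = str_replace(array('<!--[if IE]>', '<![endif]-->'), '', $head);
}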

All this has been included into Web Optimizer and will be available in version 0.6.2.

Oct 6, 2009

Version 0.6.1 released

All tests have passed (no new issues). Version 0.6.1 is ready. List of changes:

  • Added support for caching static assets via PHP. The killer feature for a web server without support for the Expires header.
  • Added cross-website multiple hosts distribution. Now you can use multiple Web Optimizer installations and distribute images across them completely transparently.
  • Improved CSS Sprites logic (background-position on icons, less image size, merging logic in a few cases).
  • Fixed 'white screen' in IE7 with gzip via PHP. Now compressed HTML goes to the end user without failure.
  • Fixed paths calculation for installation inside a subdirectory. Also for the Joomla! plugin and Wordpress.
  • Added a debug edition (for the main branch and for Wordpress) to find out issues with product installation / logic.
  • Fixed a number of 500 errors in the Wordpress plugin. Don't parse empty content, avoid double headers.
  • Added support for .htaccess in local directory. RewriteBase is calculated now.
  • Added username/password for private testing via Basic Authorization (for curl). Now you can hide a dev website with .htaccess and get a fully working Web Optimizer copy there.
  • Fixed multiple hosts distribution in HTML.
  • Added unobtrusive support for AddThis widget.
  • Improved page rendering on unobtrusive load.

Download the latest Web Optimizer

PHP proxy for static assets

The last several days we were implementing mod_expires + mod_headers emulation in PHP for those hosts which don't have these modules installed. The idea is simple - just forward all static resources (images for now) to a PHP script that will set all the correct headers (10 years of cache) + check for a conditional cache hit (ETag). In short, the maximum caching possibilities with minimal CPU wasted.

OK. First of all we need to redirect all <img> requests to such a PHP script (wo.static.php). This can be done:

  • Via mod_rewrite. All images with a dot (.) at the end (the minimal change in the file name, provided by the Web Optimizer core) are internally redirected to wo.static.php?FILENAME.
  • Or just place a raw call to this script instead of the initial image (if we don't have mod_rewrite). The file names will be ugly but the images will be cached.

Protect from hacking

A long time ago, maybe in the last century, there were a lot of injections (NULL-byte for Perl, maybe the same for PHP) that allowed any user to get the raw text of any file on the server (i.e. /etc/passwd or script sources with database credentials hardcoded). We restrict this by checking the extension and discarding unsupported (= dangerous) files.

Also we can't allow users (= possible hackers) to access any system resources except images for this website. So we can just check if the file name (its full path) includes the document root, and only then serve the file.
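
A minimal sketch of these safety checks (the extension whitelist is illustrative):

<?php
// wo.static.php?FILENAME: allow only known image extensions inside the document root.
$requested = $_SERVER['QUERY_STRING'];
$allowed   = array('png', 'gif', 'jpg', 'jpeg', 'ico');

$extension = strtolower(pathinfo($requested, PATHINFO_EXTENSION));
$full_path = realpath($_SERVER['DOCUMENT_ROOT'] . '/' . $requested);

if (!in_array($extension, $allowed) ||
    $full_path === false ||
    strpos($full_path, realpath($_SERVER['DOCUMENT_ROOT'])) !== 0) {
    header('HTTP/1.1 403 Forbidden');
    exit;
}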

File name and type

To improve support for the received content on the user agents' side we must:

  1. Add Content-Type header with already computed (and supported!) extension.
  2. Add Content-Disposition header to rewrite actual file name of served static file (from wo.static.php to requested file).

Conditional caching

Back to the cache logic. First we need to calculate (and check) the file's checksum. Google uses a combination of the file's name, its time of change and something else. We can't use the time of change (Web Optimizer can be used on a cluster of servers), so the most general approach is to provide an ETag with a checksum (calculated from the file's content).

This plays well for small files, but will lead to excessive server side load on large resources (but we are speaking about web page images; they are always less than 150 Kb, usually less than 20 Kb). And we can use a light function for this - crc32 - just to get a hash of the content which will change after any modification of this file.
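
A minimal sketch of the conditional part (the file name is illustrative):

<?php
// A cheap crc32-based ETag calculated from the file content, with a 304 answer on match.
$filename = 'img/logo.png';
$etag     = '"' . dechex(crc32(file_get_contents($filename))) . '"';

header('ETag: ' . $etag);
if (isset($_SERVER['HTTP_IF_NONE_MATCH']) &&
    trim($_SERVER['HTTP_IF_NONE_MATCH']) === $etag) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}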

Non-conditional caching

After this check (with the hash calculated and no match for it from the browser's side) we can send a pair of standard cache headers:

  • header("Cache-Control: public, max-age=" . $timeout);
  • header("Expires: " . gmdate('D, d M Y H:i:s', $_SERVER['REQUEST_TIME'] + $timeout). ' GMT');

One for HTTP/1.1, the other for HTTP/1.0. And only after this do we send the content of the requested file. The magic is over!

Just to remind you: Web Optimizer uses standard web server facilities (mod_expires) to cache static content. Only if this is unavailable does it apply the described algorithm.

Oct 5, 2009

Web Optimizer 0.6.0 Sailfish is out!

Web Optimizer Benefits

Web Optimizer is a complete acceleration solution for your website. Customers like fast websites, so make yours pleasant to use. Moreover, Web Optimizer can significantly reduce traffic and CPU expenses for your website.

Happy visitors

Every 100 ms of delay negatively influences the conversion rate of your website, and thus it is measurable. Stop losing your money. Integrate Web Optimizer into your website in a few clicks and increase your revenue right now. Download this perfect solution.

Less traffic

Web Optimizer applies a lot of proven compression and caching techniques and sends up to 98% less traffic from your website (for text files; up to 94% less traffic for images)*. Simple actions such as gzip, or cache, or minify can't give you such a gain. Only the right combination of them all, tuned to high industry standards and best practices, can provide the mentioned rate. And Web Optimizer implements this combination. See actual results of traffic reduction.

Less CPU overhead

Web Optimizer integrates a lot of different cache approaches to remove load from your server. This includes client side caching (which cuts down CPU time by reducing the number of requests to serve), server side caching (easily configurable), early flush of content (which doesn't cache dynamic pages, but prevents a 'white screen' in the browser), etc. Web Optimizer also injects the static gzip technique wherever possible. It eliminates CPU overhead on content gzipping. Check how Web Optimizer can reduce server delay for Joomla!.

An innovative fast cache verification technique ensures cache file integrity and doesn't waste any additional resources.

Less time to integrate acceleration

A step-by-step interface allows you to install the application, configure it for your needs, clear the cache, and monitor traffic and visitors' time savings. 'Express install' loads all pre-defined configuration options and makes overall website acceleration fast and easy. You can save dozens of hours with Web Optimizer — it just works for you! The complete installation HowTo manual helps you to get into the product easily.

A pre-check for existing updates and extended security via a protected installation mode are also available.

Browsers' support

Web Optimizer has outstanding support for both client and server side technologies. Besides all the cutting edge techniques (i.e. data:URI or deflate compression) it implements strong backward compatibility for all other browsers through graceful degradation. If a browser supports all the standard approaches, its owner gets the fastest version of the website. If anything isn't available, the visitor still gets a very fast website. All system requirements.

Download or Buy Now Web Optimizer.

* Traffic reduction is caused both by data compression (up to an 88% decrease for text files and up to 60% for GIF images) and caching methods (which reduce the number of requests down to 15% of the initial value).