PSAS/ SitePerformance

I was chatting with Larry earlier today, and he pointed out that we are likely to get a wee bit of attention from our presence at USENIX. He and I agree that we should make some effort to be able to handle a slashdotting. Slashdot apparently serves 40-50 pages per second during peak hours(slash), and one 1999 paper(seff) documents hit rates on a slashdotted site peaking near 160 hits per minute.

In my own test requesting the Introduction page 100 times(test), run on our server to eliminate network latency, the response took a minimum of 1.27 s, an average of 1.316 s, and a maximum of 1.7 s; the same test, run 1,000 times against our copy of the NASA logo, took a minimum of 0.03 s, an average of 0.03071 s, and a maximum of 0.16 s. Overload sets in when sustained load keeps more than one request pending at a time. I expect the latency added by TWiki, besides being annoying during normal use, to bring the server down under sustained load much above 45 hits per minute, while static pages should let us keep serving up to roughly 600 hits per minute without other configuration changes. In addition, we can expect visitors to be interested in our larger multimedia content, which has problems beyond raw hit rate.
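
Those thresholds are back-of-the-envelope figures derived from the measured latencies; here is the arithmetic as a quick shell check (the 0.1 s per static hit is my own rough allowance for per-request overhead such as logging and the network stack, not a measurement):

echo "scale=1; 60 / 1.316" | bc   # about 45 TWiki pages per minute at the measured average latency
echo "scale=1; 60 / 0.1" | bc     # about 600 static hits per minute if each costs roughly 0.1 s in total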

There are a few basic issues to attack: the per-request cost of running TWiki's CGI, extra round trips such as redirects, and the raw volume of our larger files.

An obvious solution is to upgrade the server, using our copious R&D budget. I doubt it will be worth the cost, especially since there are quite a few other improvements to try first.

My first proposal is to install the TWiki PublishAddOn(pub). It produces a cache of static files that are served directly by the web server, without the need for a CGI script. If we go this route, I have some comments on efficiently keeping the cache up to date, though I am sure James or Bart would make the same observations.

The PublishAddOn has been activated. --james 15-Apr-2003

At the same time, we should get rid of all redirects. I expect lots of short visits when under heavy load, and for short visits the number of redirect hits is not much smaller than the number of content hits. Meanwhile, redirects are useless from the visitor's point of view, cost an extra round trip, are a bit of a pain to mirror, and are fairly easy to avoid. While doing this, I would like to revamp the structure of our site to produce shorter, more meaningful URLs. We would have to do some sort of restructuring anyway to ensure that the cache is hit by default, rather than the CGI.
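
As a rough illustration of "cache by default, CGI as fallback", something along these lines in the Apache configuration should do; the /var/www/psas-cache path and the layout of the published files are guesses to be adjusted to whatever the PublishAddOn actually writes:

# serve a published static copy of a topic when one exists
Alias /cache/ /var/www/psas-cache/
RewriteEngine on
RewriteCond /var/www/psas-cache/$1.html -f
RewriteRule ^/bin/view/(.+)$ /cache/$1.html [PT,L]
# requests without a cached copy fall through to TWiki's view script as before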

Once we have a static, redirect-free version of the site, we can make good use of more powerful servers owned and maintained by others. These servers could actively mirror the cache, or run HTTP accelerators using Squid or Apache. These require very little administration, making them an easy sell to owners of these other resources. (For example, I do not think there would be a problem with using linus.linuxlab.cs.pdx.edu.) They are also easy to load-balance, using round-robin DNS, even if some servers mirror and some cache. Our server would still handle all the dynamic content, including CVS browsing, mailing list management, and page edits.
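
For the mirroring variant, the administration really can be as small as one cron job per donated machine; here is a sketch, with the hostname and paths invented for illustration (each mirror would then also get an A record under a shared name to join the round-robin DNS):

# hypothetical crontab entry on a mirror box: refresh its copy of the static cache every 15 minutes
*/15 * * * *  rsync -az --delete -e ssh psas.pdx.edu:/var/www/psas-cache/ /var/www/psas-mirror/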

An interesting strategy for reducing the cost of delivering our really big files is to use BitTorrent(bt). It is popular among fans of anime, and I recently also heard praise for it from CATs downloading Red Hat 9 CD images. We can use this tool to spread the load of requests for our MPEGs, ZIP archives of photos, and any other large content across machines scattered all over the Internet.

Another strategy, applicable to files of all sizes, is to gzip-compress them ahead of time. Some web browsers, Mozilla included, list gzip in the "Accept-Encoding" HTTP header on their requests, and Apache can deliver the compressed copy with an appropriate Content-Encoding header, as long as the compressed file is already available on disk. (The uncompressed content still needs to be available too: Apache will not decompress content for browsers that do not support gzip encoding.) The benefit of this approach lies in the reduction of time spent waiting for the network transport to deliver each file.
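
A sketch of how that could look, assuming the static pages live under /var/www/psas-cache (a made-up path) and that mod_rewrite is available. First make compressed copies alongside the originals, re-running this whenever the cache is regenerated:

for f in `find /var/www/psas-cache -name '*.html'`; do
        gzip -9 -c "$f" > "$f.gz"
done

Then, in the .htaccess file (or the matching Directory block) for that tree, tell Apache to hand the .gz copy to clients that advertise gzip support; mod_mime combines the text/html type from the .html extension with the encoding from .gz:

AddEncoding x-gzip .gz
RewriteEngine on
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{REQUEST_FILENAME}.gz -f
RewriteRule ^(.+\.html)$ $1.gz [L]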

The Apache 1.3 Performance Notes(perf) give some interesting suggestions. Pointing out that most Apache installations are bandwidth-limited rather than CPU-bound, the document observes that "Most sites have less than 10Mbits of outgoing bandwidth, which Apache can fill using only a low end Pentium-based webserver." PSU, however, has an Internet2 connection and can easily sustain five times that traffic over a long interval, so we are more likely to run out of CPU than bandwidth. If we do get slashdotted and Apache really is dying, we can turn off logging to minimize I/O; if that is not enough, we can try some more dramatic changes.
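
For reference, turning off the access log is a one-line change in httpd.conf (the log path and format name below are typical defaults, not necessarily what our box uses):

# comment out the access log under extreme load to avoid a disk write per hit
#CustomLog /var/log/apache/access.log combined
# error logging is cheap by comparison and worth keeping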

We might try replacing Apache, for instance. Boa(boa) is a web server that has been kept simple with the twin goals of improving response time and easing security audits. Since it does not support HTTP authentication, it is not usable for much of TWiki, but it would do a fine job of serving the cache contents.

Operating systems are, in a sense, pure overhead: they are designed to support arbitrary applications, and so include code (for security, for instance) that is not needed if one knows in advance which applications will run. To reduce this overhead without loss of generality, Kernel Mode Linux(kml) runs unmodified binaries - Apache has been tested - in ring 0 instead of ring 3, so that the overhead of a system call is reduced to that of a function call. Apparently this only slightly reduces the protection the kernel has from badly-written applications: segfaults and other signals are still delivered without a kernel panic, the scheduler works the same way, and the address space is still protected. Boa and KML together might be especially interesting.

There are also two patches to the Linux kernel that put trivial web servers in kernel-land: khttpd(khd) and TUX(tux). khttpd was integrated into Linux kernel 2.3.14, but people keep trying to get it pulled out on the grounds that it is poorly implemented(kt), and its mailing list shows little recent activity. I would prefer KML on the grounds that existing user-land web servers have received much better testing than these kernel-land implementations have.

In short, we have quite a few avenues to explore for increased web service performance at no cost to PSAS or its members, and we can expect to need some of them in the near future.

[slash] http://slashdot.org/~CmdrTaco/journal/27736

[seff] http://ssadler.phy.bnl.gov/adler/SDE/SlashDotEffectAddendum.html

[test] This bit of shell ran the test for me:

# time 100 requests; %E prints elapsed time as M:SS.ss on stderr, hence the 2>&1
for t in `seq 100`; do
        /usr/bin/time -f %E wget -q -O/dev/null http://twiki.psas.pdx.edu/bin/view/PSAS/Introduction 2>&1
done | awk -F: 'BEGIN { min = 200 }
        # convert M:SS.ss to seconds, then track count, sum, minimum, and maximum
        { cur = 60 * $1 + $2; ++cnt; sum += cur;
        if (cur < min) { min = cur }
        if (cur > max) { max = cur } }
        END { printf "min %f avg %f max %f\n", min, sum / cnt, max }'

[pub] http://twiki.org/cgi-bin/view/Plugins/PublishAddOn

[bt] http://bitconjurer.org/BitTorrent/

[perf] http://httpd.apache.org/docs/misc/perf-tuning.html

[boa] http://www.boa.org/

[kml] http://web.yl.is.s.u-tokyo.ac.jp/~tosh/kml/; see also Toshiyuki Maeda, "Kernel Mode Linux," May 2003, Linux Journal, pp. 62-67.

[khd] http://www.fenrus.demon.nl/

[tux] http://www.redhat.com/docs/manuals/tux/TUX-2.1-Manual/

[kt] http://kt.zork.net/kernel-traffic/kt20020520_167.html#1

-- JameySharp - 12 Apr 2003