buffered IO performance
February 11, 2007
Besides raw-IO performance, which matters for heavy static file transfers, buffered IO performance is the more interesting number for sites with a small set of static files that can be kept in the fs-caches.
As we are using hot caches for this benchmark, the “lightness” of the server becomes important: the fewer syscalls it has to make, the better.
The test case consists of two sets of 100MByte each: one made up of 10MByte files and one made up of 100kByte files.
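A test set like this can be generated with a few lines of Python. This is only a sketch and not the script used for the benchmark: the directory name, the file names, the base URL and the helper names `make_test_set` and `write_url_file` are all assumptions made for illustration. The URL list is written one URL per line, which is the format `http_load` reads.

```python
import os


def make_test_set(root, file_size, total_size, base_url):
    """Create total_size // file_size files of file_size bytes each under
    root and return the URLs that point at them (hypothetical layout)."""
    os.makedirs(root, exist_ok=True)
    urls = []
    for i in range(total_size // file_size):
        name = f"file-{i:04d}.bin"
        with open(os.path.join(root, name), "wb") as fh:
            # Random content, so the files cannot be compressed away.
            fh.write(os.urandom(file_size))
        urls.append(f"{base_url}/{name}")
    return urls


def write_url_file(path, urls):
    """Write one URL per line, as expected by http_load."""
    with open(path, "w") as fh:
        fh.write("\n".join(urls) + "\n")


# Example (assumed paths and host, adjust to the local docroot):
#   urls = make_test_set("docroot/100k", 100 * 1024, 100 * 1024 * 1024,
#                        "http://localhost/100k")
#   write_url_file("http-load.100k.urls-100M", urls)
```

Fetching every file once before the measured run is one way to make sure the set is served from hot caches.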
Benchmark
100kByte
100MByte of 100kByte files served from the hot caches:
lighttpd | |||
---|---|---|---|
backend | MByte/s | req/s | user + sys |
writev | 82.20 | 802.71 | 90% |
linux-sendfile | 70.27 | 686.32 | 56% |
gthread-aio | 75.39 | 736.23 | 98% |
posix-aio | 73.10 | 713.88 | 98% |
linux-aio-sendfile | 31.32 | 305.90 | 35% |
others | |||
Apache 2.2.4 (event) | 70.28 | 686.38 | 60% |
LiteSpeed 3.0rc2 | 70.20 | 685.65 | 50% |
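As a quick consistency check on the table, the two throughput columns can be divided to get the average number of bytes per request. The sketch below copies the values from the table; that one MByte means 10^6 bytes here is an assumption, not something the benchmark states.

```python
# (backend, MByte/s, req/s) copied from the 100kByte table
ROWS = [
    ("writev", 82.20, 802.71),
    ("linux-sendfile", 70.27, 686.32),
    ("gthread-aio", 75.39, 736.23),
    ("posix-aio", 73.10, 713.88),
    ("linux-aio-sendfile", 31.32, 305.90),
    ("Apache 2.2.4 (event)", 70.28, 686.38),
    ("LiteSpeed 3.0rc2", 70.20, 685.65),
]

# Assumption: MByte is decimal (10^6 bytes).
ASSUMED_BYTES_PER_MBYTE = 1_000_000


def bytes_per_request(mbyte_per_s, req_per_s):
    """Average transfer size per request implied by the two columns."""
    return mbyte_per_s * ASSUMED_BYTES_PER_MBYTE / req_per_s


for name, mbyte_per_s, req_per_s in ROWS:
    size = bytes_per_request(mbyte_per_s, req_per_s)
    print(f"{name:22s} {size:10.0f} bytes/request")
```

Under that assumption every row comes out within a fraction of a percent of 102400 bytes, which is 100 * 1024, so the MByte/s and req/s columns agree with each other for all servers.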
- `linux-aio-sendfile` is losing most of its performance as it has to use `O_DIRECT` to operate, which always is an unbuffered read.
- Apache, LiteSpeed and `linux-sendfile` are using the same syscall, `sendfile()`, and end up with the same performance values
- gthread-aio and posix-aio perform better than `sendfile()`
- `write()` performs better than the threaded AIO and `sendfile()`

I can’t explain that right now :)
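For readers who have not compared the two transmit paths before, the sketch below shows the difference in principle: a user-space copy loop (one read plus one or more sends per chunk) next to a `sendfile()` loop that lets the kernel move the data. It is not lighttpd's implementation; the helper names, the chunk size and the socket pair standing in for a client connection are assumptions, and `os.sendfile` is used with its Linux signature.

```python
import os
import socket
import tempfile


def send_with_read_write(in_fd, sock, size, chunk=64 * 1024):
    """Copy through user space; returns (bytes sent, calls made)."""
    calls = 0
    sent = 0
    while sent < size:
        data = os.pread(in_fd, min(chunk, size - sent), sent)
        calls += 1
        if not data:
            break
        view = memoryview(data)
        while view:
            n = sock.send(view)  # may be a partial send
            calls += 1
            view = view[n:]
        sent += len(data)
    return sent, calls


def send_with_sendfile(in_fd, sock, size):
    """Let the kernel copy file -> socket; returns (bytes sent, calls made)."""
    calls = 0
    offset = 0
    while offset < size:
        n = os.sendfile(sock.fileno(), in_fd, offset, size - offset)
        calls += 1
        if n == 0:
            break
        offset += n
    return offset, calls


def drain(sock, size):
    """Read exactly size bytes (or until EOF) from the receiving end."""
    chunks = []
    remaining = size
    while remaining > 0:
        data = sock.recv(remaining)
        if not data:
            break
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)


def demo(size=16 * 1024):
    """Send a small temp file over a socket pair with both methods."""
    payload = os.urandom(size)
    with tempfile.TemporaryFile() as fh:
        fh.write(payload)
        fh.flush()
        a, b = socket.socketpair()
        try:
            sent_rw, calls_rw = send_with_read_write(fh.fileno(), a, size, chunk=4096)
            got_rw = drain(b, sent_rw)
            sent_sf, calls_sf = send_with_sendfile(fh.fileno(), a, size)
            got_sf = drain(b, sent_sf)
        finally:
            a.close()
            b.close()
    return payload, got_rw, got_sf, calls_rw, calls_sf
```

The payload is kept small on purpose so that it fits into the socket buffer and the single-threaded demo cannot block. The call counts only illustrate the difference in syscall pattern; they say nothing about which path is faster, which is exactly what the numbers above leave open.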
10MByte
100MByte of 10MByte files served from the hot caches. As in the other benchmarks, the benchmark command has been changed:
$ http_load -verbose -timeout 40 -parallel 100 -fetches 500 http-load.10M.urls-100M
`http_load` does a hard cut when the `-seconds` option is used, and we might lose some MByte/s due to incomplete transfers, so the run is limited with `-fetches` instead.
lighttpd | |||
---|---|---|---|
backend | MByte/s | req/s | user + sys |
writev | 82.20 | 8.76 | 80% |
linux-sendfile | 53.95 | 5.65 | 40% |
gthread-aio | 83.02 | 8.66 | 90% |
posix-aio | 82.31 | 8.60 | 93% |
linux-aio-sendfile | 70.17 | 7.35 | 60% |
others | |||
Apache 2.2.4 (event) | 50.92 | 5.33 | 40% |
LiteSpeed 3.0rc2 | 55.58 | 5.80 | 40% |
- all the `sendfile()` implementations seem to have the same performance problem.
- `writev()` and the threaded AIO backends utilize the network as expected
- `linux-aio-sendfile` is faster than the buffered `sendfile()` even if it has to read everything from disk … strange