raw IO performance
February 03, 2007
In lighttpd 1.5.0 we support several network backends. Their job is to fetch static data from disk and send it to the client.
We want to compare the performance of these backends and work out when you want to use which.
- writev
- linux-sendfile
- gthread-aio
- posix-aio
- linux-aio-sendfile
The impact of the stat-threads shall also be checked.
We use a minimal configuration file:
server.document-root = "/home/jan/wwwroot/servers/grisu.home.kneschke.de/pages/"
server.port = 1025
server.errorlog = "/home/jan/wwwroot/logs/lighttpd.error.log"
server.network-backend = "linux-aio-sendfile"
server.event-handler = "linux-sysepoll"
server.use-noatime = "enable"
server.max-stat-threads = 2
server.max-read-threads = 64
iostat, vmstat and http_load
We used iostat and vmstat to see how the system is handling the load.
$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 506300 12900 20620 437968 0 0 14204 17 5492 3323 4 23 9 63
0 1 506300 11212 20620 439888 0 0 17720 4 6713 3966 3 29 3 66
0 1 506300 11664 20632 440356 0 0 14460 8 5416 3120 2 24 2 71
1 0 506300 18916 20612 433168 0 0 13180 50 5505 3088 2 23 2 72
0 1 506300 11960 20628 440188 0 0 15860 6 5485 3307 2 24 3 71
$ iostat -xm 5
avg-cpu: %user %nice %system %iowait %steal %idle
2.20 0.00 24.40 70.40 0.00 3.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 67.40 0.40 68.00 0.80 13600.00 14.40 6.64 0.01 197.88 12.87 176.83 11.23 77.28
sdb 82.80 0.40 84.60 0.80 17000.00 14.40 8.30 0.01 199.23 23.18 280.16 11.61 99.12
md0 0.00 0.00 302.20 0.80 30520.00 6.40 14.90 0.00 100.75 0.00 0.00 0.00 0.00
Our http_load process returned:
$ http_load -verbose -timeout 40 -parallel 100 -seconds 60 urls.100k
9117 fetches, 100 max parallel, 9.33581e+008 bytes, in 60 seconds
102400 mean bytes/connection
151.95 fetches/sec, 1.55597e+007 bytes/sec
msecs/connect: 5.47226 mean, 31.25 max, 0 min
msecs/first-response: 144.433 mean, 3156.25 max, 0 min
HTTP response codes:
code 200 -- 9117
We will use the same benchmark and the same configuration to compare the different backends.
Comparison
As a comparison I tried to add other web-servers to the ring. As always, benchmarks have to be taken with a grain of salt. Don’t trust them, try to repeat them yourself.
- lighttpd 1.5.0-svn, config is above
- litespeed 3.0rc2
- epoll and sendfile are enabled. All the other options are defaults.
- Apache 2.2.4 event-mpm from the OpenSUSE 10.2 packages
- MinSpareThreads = 25
- MaxSpareThreads = 75
- ThreadLimit = 64
- ThreadsPerChild = 25
I’ll try to get a shrunk-down, text-based config file which contains only the options necessary for others to repeat the benchmark.
Expectations
The benchmark is supposed to show that async file-IO is a real win for single-threaded webservers. We expect that:
- the blocking network-backends are slow
- Apache 2.2.4 offers the best performance as it is threaded + event-based
- lighttpd + async file-io gets into the range of Apache2
The problem with blocking file-IO is that a single-threaded server can do nothing else while it waits for the syscall to finish.
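To make the difference concrete, here is a minimal sketch (not lighttpd’s actual code) of a read via POSIX AIO, the mechanism behind the posix-aio backend; the file name and buffer size are invented for illustration:

/* sketch: submit a read with POSIX AIO instead of blocking in read().
 * compile with: gcc -o aio-demo aio-demo.c -lrt */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    static char buf[64 * 1024];
    int fd = open("/tmp/example.dat", O_RDONLY); /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }

    /* blocking variant: read(fd, buf, sizeof(buf)) would stall the
     * whole event loop until the disk returns the data */

    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;

    /* submit the read; the call returns immediately */
    if (aio_read(&cb) < 0) { perror("aio_read"); return 1; }

    /* ... a real server would serve other connections here ... */

    while (aio_error(&cb) == EINPROGRESS)
        usleep(1000); /* stand-in for one event-loop iteration */

    printf("read %zd bytes asynchronously\n", aio_return(&cb));
    close(fd);
    return 0;
}

glibc implements POSIX AIO with an internal thread pool, which may explain why the posix-aio and gthread-aio results below end up so close to each other.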
Benchmarks
Running http_load against the different backends shows the impact of async-io vs. sync-io.
100k files
backend | throughput | requests/s
---|---|---
lighttpd 1.5.0-svn | |
writev | 6.11MByte/s | 59.77
linux-sendfile | 6.50MByte/s | 63.62
posix-aio | 12.88MByte/s | 125.75
gthread-aio | 15.04MByte/s | 147.08
linux-aio-sendfile | 15.56MByte/s | 151.95
others | |
litespeed 3.0rc2 (writev) | 4.35MByte/s | 42.78
litespeed 3.0rc2 (sendfile) | 5.49MByte/s | 53.68
apache 2.2.4 | 15.04MByte/s | 146.93
For small files you can gain around 140% more throughput (15.56MByte/s with linux-aio-sendfile vs. 6.50MByte/s with linux-sendfile).
with and without noatime
To show the impact of server.use-noatime = "enable" we compare the vmstat output for the gthread-aio backend with and without noatime:
With O_NOATIME enabled:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 62 506300 9192 20500 470064 0 0 12426 5 7005 6324 3 27 7 63
0 63 506300 10188 19768 469732 0 0 14154 2 8252 7614 3 30 0 67
0 64 506300 10488 19124 470492 0 0 13589 0 8261 7483 3 27 0 69
0 64 506300 10196 17952 473092 0 0 13062 8 7388 6560 3 25 8 65
0 64 506300 10656 16836 474720 0 0 11790 0 6378 5074 2 23 11 64
With O_NOATIME disabled:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 21 506300 10408 15452 491680 0 0 10515 326 5362 1619 2 22 19 57
3 7 506300 11116 17888 487588 0 0 11020 493 6056 2400 7 25 10 58
0 0 506300 10200 19704 488004 0 0 8840 365 4506 1622 2 20 29 49
2 14 506300 10460 21624 485288 0 0 12422 428 6464 1986 2 26 11 60
0 1 506300 9436 24116 485316 0 0 12640 513 7159 2109 3 28 2 67
0 21 506300 11864 25768 481588 0 0 8760 5 4436 1571 2 19 39 41
0 21 506300 10352 24892 483412 0 0 11941 339 6005 1913 3 24 12 6
You can see how bo (blocks out) goes up and bi (blocks in) goes down in the same way: without O_NOATIME every read of a file causes a write to update its atime. As you usually don’t need the atime (file access time), you should either mount the file-system with noatime, nodiratime or use the setting server.use-noatime = "enable". By default this setting is disabled to be backward compatible.
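The setting essentially maps to the O_NOATIME flag of open(). A minimal sketch, assuming Linux (the flag is a GNU extension and is only permitted if you own the file or have CAP_FOWNER, hence the fallback):

/* sketch: open a file for reading without touching its atime; fall
 * back to a plain open if the kernel refuses the flag */
#define _GNU_SOURCE /* O_NOATIME is a GNU/Linux extension */
#include <errno.h>
#include <fcntl.h>

int open_noatime(const char *path) {
    int fd = open(path, O_RDONLY | O_NOATIME);
    if (fd < 0 && errno == EPERM) /* not our file: accept the atime update */
        fd = open(path, O_RDONLY);
    return fd;
}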
10MByte files
backend | throughput | requests/s | %disk-util | user + sys
---|---|---|---|---
lighttpd 1.5.0-svn | | | |
writev | 17.59MByte/s | 2.35 | 50% | 25%
linux-sendfile | 33.13MByte/s | 3.77 | 70% | 30%
posix-aio | 50.61MByte/s | 5.69 | 98% | 60%
gthread-aio | 47.97MByte/s | 5.51 | 100% | 50%
linux-aio-sendfile | 44.15MByte/s | 4.95 | 90% | 40%
others | | | |
litespeed 3.0rc2 (sendfile) | 22.18MByte/s | 2.72 | 65% | 35%
Apache 2.2.4 | 42.81MByte/s | 4.73 | 95% | 40%
For larger files the win from async-io is still around 50%.
stat() threads
For small files the performance is largely influenced by the stat() sys-call. Before a file is open()ed for reading, we first check if the file exists, if it is a regular file and if we can read from it. This sys-call is not async by itself.
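A minimal sketch of those checks (not the exact lighttpd code; the helper name is invented):

/* sketch: the checks done before serving a file -- exists, regular,
 * readable (helper name invented for illustration) */
#include <sys/stat.h>
#include <unistd.h>

int can_serve(const char *path) {
    struct stat st;

    if (stat(path, &st) < 0) return 0;    /* does not exist or not reachable */
    if (!S_ISREG(st.st_mode)) return 0;   /* directory, device, ... */
    if (access(path, R_OK) < 0) return 0; /* no read permission */
    return 1;
}

If the inode is not in the cache each of these checks can block on a disk seek, which is exactly what the stat-threads are there to hide.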
We will use the gthread-aio backend and the set of 100k files and run the benchmark again, this time changing server.max-stat-threads from 0 to 16.
threads | throughput |
---|---|
0 | 8.55MByte/s |
1 | 13.60MByte/s |
2 | 14.18MByte/s |
4 | 12.33MByte/s |
8 | 12.62MByte/s |
12 | 13.10MByte/s |
16 | 12.71MByte/s |
For optimal performance you should set the number of stat-threads equal to the number of disks (here: two threads for the two disks behind md0).
read-threads
You can also tune the number of read-threads. Each disk-read request is queued and then executed in parallel by a pool of readers. The goal is to keep the disk utilized at 100% and to hide the seeks for a stat() and a read() in the time that lighttpd spends sending the data to the network.
threads | throughput |
---|---|
1 | 6.83MByte/s |
2 | 11.61MByte/s |
4 | 13.02MByte/s |
8 | 13.61MByte/s |
16 | 13.81MByte/s |
32 | 14.04MByte/s |
64 | 14.87MByte/s |
It looks like 2 reads per disk are already a good value.
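Such a reader pool might look roughly like the sketch below (all names are invented; the hand-off of the finished buffer back to the event loop, e.g. through a pipe, is left out):

/* sketch of a reader pool: the event loop enqueues read requests and a
 * fixed number of threads performs the blocking pread() calls in
 * parallel (invented names, not lighttpd's internals) */
#define _XOPEN_SOURCE 500 /* for pread() */
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

struct read_req {
    int fd;
    off_t offset;
    size_t length;
    struct read_req *next;
};

static struct read_req *pending; /* LIFO for brevity */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

/* called from the event loop */
void enqueue_read(struct read_req *req) {
    pthread_mutex_lock(&lock);
    req->next = pending;
    pending = req;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

static void *reader_thread(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (pending == NULL)
            pthread_cond_wait(&cond, &lock);
        struct read_req *req = pending;
        pending = req->next;
        pthread_mutex_unlock(&lock);

        char *buf = malloc(req->length);
        ssize_t n = pread(req->fd, buf, req->length, req->offset);
        (void)n; /* a real server hands buf back to the event loop here */
        free(buf);
        free(req);
    }
    return NULL;
}

/* nthreads would come from server.max-read-threads */
void start_readers(int nthreads) {
    for (int i = 0; i < nthreads; i++) {
        pthread_t t;
        pthread_create(&t, NULL, reader_thread, NULL);
        pthread_detach(t);
    }
}

That throughput still creeps up slightly towards 64 threads is presumably because the kernel's IO scheduler can merge and reorder more outstanding requests.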