Compiling a new kernel on an Intel N270 Atom-based Linux system is painfully slow, so setting up distcc or ccache starts to make real sense.
However, ccache and the Linux kernel build process don't play nicely together: using its masquerade setup, I didn't get a single hit from the cache.
Distcc, on the other hand, has a dramatic effect on the time taken to produce a new kernel, especially when pre-processing is distributed too via the distcc-pump invocation (it's very easy to fire up a stripped-down Debian virtual machine to use as an extra distcc node). Here's how I set this up on a Debian Squeeze 6.0.1 installation:
Firstly, install the kernel source of your choice and distcc using apt:
# apt-get install distcc-pump linux-source-2.6.32
Secondly, fix the distcc-pump script, as it looks for Python modules in the version 2.5 path, whereas Debian Squeeze ships with 2.6 out of the box:
# vi +43 /usr/bin/distcc-pump
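The edit in question is just a version bump in the hard-coded module path. Something like this hypothetical sed one-liner should be equivalent to the manual fix (it keeps a backup of the original script):

```shell
# Bump the hard-coded Python module path from 2.5 to 2.6 in the
# distcc-pump wrapper, keeping a .bak copy of the original.
sed -i.bak 's|python2\.5|python2.6|g' /usr/bin/distcc-pump
```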
Thirdly, set up some hosts to distribute the compiling to. Here I limit my faster 'compute' node to eight simultaneous jobs, and the slower Atom system to just one job at a time (the cpp,lzo options tell distcc to push pre-processing jobs to that host as well, and to compress source files across the wire):
# cat /etc/distcc/hosts
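Going by the description above, the hosts file might look something like this ('compute' is a placeholder hostname, not the author's; the local Atom box runs a single job without pump pre-processing):

```
compute/8,cpp,lzo localhost/1
```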
Finally, as we can't pass a '-j8' option to the make command to request eight compile threads at once, we set the CONCURRENCY_LEVEL environment variable instead. The distcc-pump startup creates a socket which distcc can then talk to, but it borks the PATH, so we change that back, putting /usr/lib/distcc at the beginning so its masquerading as a compiler works correctly. That's all you need to do before compiling and installing your kernel the Debian way:
# export DISTCC_VERBOSE=0
# export CONCURRENCY_LEVEL=8
# eval `distcc-pump --startup`
# echo $PATH
# export PATH=/usr/lib/distcc:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# cd /usr/src/linux-source-2.6.32
# make menuconfig
# fakeroot make-kpkg --initrd --append-to-version=.testbuild --revision=0.1 kernel_image
# distcc-pump --shutdown
# dpkg -i ../linux-image-2.6.32.testbuild_0.1_i386.deb
There are some anomalies in the previous graphs, notably the incongruous results for stripe cache sizes of 768 and 32768. So I ran some more tests with a smoother range of I/O blocksizes.
NB: Each variable was tested three times, with all caches synced and flushed before and after each run. The average of the three runs was used to plot each data point on the graphs.
Caveat: I also reduced the test file size to 650MB (I know, I know, very bad practice to change multiple variables whilst testing).
Results from a basic synthetic benchmark of I/O performance whilst varying the stripe_cache_size tunable under the sysfs tree.
Tests were performed on a QNAP SS-439 NAS:
- Intel Atom N270 (1.6GHz), 82801G (ICH7) chipset
- 2GB RAM
- Linux 2.6 kernel
- CFQ I/O elevator
- RAID5 (128KB chunk size), 4× WD10TPVT 1TB drives (4KB physical sectors, aka Advanced Format)
- EXT3 filesystem (noatime)
Whilst reading and writing in blocks of just 512 bytes, there is no discernible benefit to setting a larger stripe cache size, with read performance dropping marginally as the cache size is increased.
The first interesting results appear when the blocksize is increased to 4096 bytes: read performance drops off sharply as the cache grows, though write performance gains a small amount.
At a blocksize of 1MB, the earlier findings are reinforced. Read performance decreases significantly once past very low cache sizes, though write performance benefits a small amount from larger stripe_cache_size values.
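One plausible factor in the diminishing returns: the stripe cache is not free. Per the kernel's md documentation, each stripe_cache_size entry pins one PAGE_SIZE (4KB here) buffer per member disk, so the largest setting tested pins a sizeable chunk of this machine's 2GB of RAM. A quick back-of-the-envelope check:

```shell
# Memory pinned by the md stripe cache:
#   stripe_cache_size entries × PAGE_SIZE (4KB) × member disks
stripe_cache_size=32768
disks=4
page_kb=4
echo "$(( stripe_cache_size * disks * page_kb / 1024 )) MB"
```

That works out to 512 MB on this four-drive array, a quarter of the system's memory, before the page cache gets a look-in.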