Jan 26, 2011
 

I was recently setting up DBStats for a Bcfg2 installation and ran into some serious performance problems whenever a client uploaded statistics to the server.
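For context, DBStats is enabled through the server's plugin list and the [statistics] section of bcfg2.conf. The setup looked roughly like this (a sketch; the exact plugin list and the database path are assumptions, not copied from this installation):

[server]
plugins = Bundler,Cfg,Metadata,Probes,Rules,DBStats

[statistics]
database_engine = sqlite3
database_name = /bcfg2/bcfg2.sqlite

With that in place, the server log for a single client run looked like this: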

hwfwrv003.web.e.uh.edu:probe:current-kernel:['2.6.18-194.26.1.el5']
hwfwrv003.web.e.uh.edu:probe:groups:['group:rpm', 'group:linux', 'group:redhat', 'group:redhat-5Server', 'group:redhat-5', 'group:x86_64']
Generated config for hwfwrv003.web.e.uh.edu in 0.044s
Handled 1 events in 0.000s
Client hwfwrv003.web.e.uh.edu reported state clean
Imported data for hwfwrv003.web.e.uh.edu in 139.942095041 seconds

This is drastically slower than normal, so as a first test I moved the sqlite database onto a ramdisk.

# losetup /dev/loop0 /bcfg2/bcfg2.sqlite
# mount -t ramfs /dev/loop0 /bcfg2/
# mount | grep ramfs
/dev/loop0 on /bcfg2 type ramfs (rw)
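Note that ramfs ignores the device argument, and mounting it over /bcfg2 hides the files that were there, so the database still has to be copied into the new mount before the test. An equivalent and perhaps more conventional setup (a sketch; the mount point and size are assumptions, and tmpfs is used because, unlike ramfs, it enforces a size limit):

# mkdir /mnt/dbtest
# mount -t tmpfs -o size=256m tmpfs /mnt/dbtest
# cp /bcfg2/bcfg2.sqlite /mnt/dbtest/

Either way, the database now lives entirely in RAM.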

Here is the time it took once I moved the sqlite database to a ramdisk.

hwfwrv003.web.e.uh.edu:probe:current-kernel:['2.6.18-194.26.1.el5']
hwfwrv003.web.e.uh.edu:probe:groups:['group:rpm', 'group:linux', 'group:redhat', 'group:redhat-5Server', 'group:redhat-5', 'group:x86_64']
Generated config for hwfwrv003.web.e.uh.edu in 0.074s
Handled 1 events in 0.000s
Client hwfwrv003.web.e.uh.edu reported state clean
Imported data for hwfwrv003.web.e.uh.edu in 1.16791296005 seconds

That’s faster by a factor of almost 120 (139.94 seconds down to 1.17)! Clearly something is very odd about the performance hit we take when the database lives on the original ext4 filesystem. Just for comparison, I created an ext3 partition on a loopback device to hold the sqlite database.

# mount | grep foo
/dev/loop1 on /foo type ext3 (rw)
# ls /foo/
bcfg2.sqlite
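The transcript above only shows the end result; the loopback setup was roughly the following (a sketch; the backing-file path and size are assumptions):

# dd if=/dev/zero of=/root/ext3.img bs=1M count=512
# losetup /dev/loop1 /root/ext3.img
# mkfs.ext3 /dev/loop1
# mkdir -p /foo
# mount /dev/loop1 /foo
# cp /bcfg2/bcfg2.sqlite /foo/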

Here is the same client update again when using ext3 to hold the sqlite database.

hwfwrv003.web.e.uh.edu:probe:current-kernel:['2.6.18-194.26.1.el5']
hwfwrv003.web.e.uh.edu:probe:groups:['group:rpm', 'group:linux', 'group:redhat', 'group:redhat-5Server', 'group:redhat-5', 'group:x86_64']
Generated config for hwfwrv003.web.e.uh.edu in 0.037s
Handled 1 events in 0.000s
Client hwfwrv003.web.e.uh.edu reported state clean
Imported data for hwfwrv003.web.e.uh.edu in 1.60297989845 seconds

I was finally able to track this down to a change in the default kernel configuration used by Ubuntu for ext4 filesystems. The change is detailed at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/588069. Ubuntu apparently decided it was a good idea to turn on write barriers by default in 10.04 (Lucid). Barriers hurt a sqlite workload especially badly: sqlite issues an fsync() for every transaction, and with barriers enabled each of those fsyncs forces a full flush of the drive's write cache. Luckily, I was able to remount the ext4 partition without barriers (-o barrier=0), and the import time dropped back down to something more reasonable.
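The remount is a one-liner (the mount point is assumed to be /bcfg2 here; to make the change survive a reboot, add barrier=0 to the options field of the matching /etc/fstab entry):

# mount -o remount,barrier=0 /bcfg2

With barriers off, the same client run looked like this: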

hwfwrv003.web.e.uh.edu:probe:current-kernel:['2.6.18-194.26.1.el5']
hwfwrv003.web.e.uh.edu:probe:groups:['group:rpm', 'group:linux', 'group:redhat', 'group:redhat-5Server', 'group:redhat-5', 'group:x86_64']
Generated config for hwfwrv003.web.e.uh.edu in 0.038s
Handled 1 events in 0.000s
Client hwfwrv003.web.e.uh.edu reported state clean
Imported data for hwfwrv003.web.e.uh.edu in 6.47736501694 seconds
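To summarize the four runs of the same client upload:

Filesystem            Import time
ext4 (barriers on)    139.94 s
ramfs                   1.17 s
ext3                    1.60 s
ext4 (barrier=0)        6.48 s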

That’s still about four times slower than ext3, but it’s at least acceptable in this particular case.

While I can understand the reasoning behind a change like this, it does not seem like a good idea to drastically reduce the performance of an LTS release without at least warning people VERY LOUDLY.

More information about write barriers and why they are so expensive can be found at http://lwn.net/Articles/283161/.
