2009-10-25. Since the upgrade to a newer gentoo with a 2.6.30 kernel, in September, the fileserver has had occasional long (ten-second or so) delays in responding to particular client connections. It seems this is mainly (only?) on mkdir calls, and that it can happen whether the call comes from an (NFSv3) client or is made locally in e.g. an ssh session. The main space is a ~1.8TB RAID5 array of 3 new disks, with an ext3 filesystem. The kernel has been upgraded from 2.6.30 to 2.6.31 as a first precaution, but with no change noticed. There's no particularly heavy load associated with the times of trouble, though I haven't yet any evidence that the problem has ever happened at a time with no other load at all -- there may, for example, be some cron-job using rdiff-backup briefly.

Searching on Google for

    ages OR eternity OR long time mkdir linux raid OR md OR ext3 2.6.3

the first interestingly relevant find was http://insights.oetiker.ch/linux/fsopbench/ -- ``Note the HUGE max delay for mkdir seen in this example.'' (19s) -- where (albeit under considerable r/w load) the following /long/ mkdir delays had been seen:

    2.6.31.2-test relatime barrier=0 fs=ext3 disk=hdd journal=int data=writeback scheduler=cfq
        I  mkdir  cnt 1004  min 0.018 ms  max 19592.233 ms  avg 29.447 ms  med 0.026 ms  std 644.768

    2.6.24-24-server atime barrier=0 fs=ext3 disk=areca RAID6 journal=ext data=ordered scheduler=deadline
        I  mkdir  cnt 706   min 0.018 ms  max 5190.903 ms   avg 33.380 ms  med 0.025 ms  std 349.304

The ~20s delay seemed quite reminiscent of our server... That worst case was with the cfq (completely fair queueing, current default) scheduler and an ext3 filesystem. Ours uses cfq for the disks of the array, and no scheduler for the array device itself (naturally enough). Our situation with a /software/ array, rather than a simple disk or a hardware RAID (which looks like a simple disk), is admittedly a little different from theirs.
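If cfq does turn out to be implicated, the member disks can be switched to the deadline scheduler on the fly, without unmounting anything; something like the following (a sketch only, assuming sda, sdb and sdc are the three array members -- see the listing below, where cfq is currently selected):

    for d in sda sdb sdc; do
        echo deadline > /sys/block/$d/queue/scheduler
        cat /sys/block/$d/queue/scheduler   # the newly selected scheduler shows in [brackets]
    done

(The array device md0 itself has no I/O scheduler to change, as noted above.)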
/sys/block/hda/queue/scheduler   noop anticipatory deadline [cfq]
/sys/block/hdc/queue/scheduler   noop anticipatory deadline [cfq]
/sys/block/sda/queue/scheduler   noop anticipatory deadline [cfq]
/sys/block/sdb/queue/scheduler   noop anticipatory deadline [cfq]
/sys/block/sdc/queue/scheduler   noop anticipatory deadline [cfq]

# tune2fs -l /dev/md0
tune2fs 1.41.3 (12-Oct-2008)
Filesystem volume name:   serv_home_raid
Last mounted on:          <not available>
Filesystem UUID:          5832561e-34cd-432f-ae13-2fbc1cef9162
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags:         signed_directory_hash
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              122101760
Block count:              488375936
Reserved block count:     4883759
Free blocks:              13697351
Free inodes:              121436197
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      907
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
RAID stride:              16
Filesystem created:       Fri Aug 21 20:28:49 2009
Last mount time:          Tue Oct 20 20:51:19 2009
Last write time:          Tue Oct 20 20:51:19 2009
Mount count:              11
Maximum mount count:      -1
Last checked:             Fri Aug 21 20:28:49 2009
Check interval:           0 (<none>)
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
First orphan inode:       91868594
Default directory hash:   half_md4
Directory Hash Seed:      929343b3-40c2-4ad3-8f00-7d78f28e1680
Journal backup:           inode blocks

Other, much earlier kernels have also been complained about as regressions, e.g. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/131094: a regression in speed (under competing processes) from ~2.6.15 to 2.6.22.

A little test, 2009-10-26, 00:21:

    tune2fs -e remount-ro /dev/md0             (shouldn't matter at all -- just a forgotten preference)
    mount -o remount,data=writeback /home/     (the default is claimed to be `ordered' mode)
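To catch the delay if it recurs under these settings, a crude probe could be left running, timing a mkdir on the array every so often and logging the elapsed time (a sketch only -- the /home/tmp/ test path and the log file name are just placeholders):

    while true; do
        t0=$(date +%s.%N)
        mkdir /home/tmp/probe.$$ && rmdir /home/tmp/probe.$$
        t1=$(date +%s.%N)
        echo "$(date '+%F %T') $(echo "$t1 - $t0" | bc)" >> /root/mkdir-times.log
        sleep 10
    done

Any multi-second entry in the log could then be matched against whatever cron jobs (rdiff-backup etc.) were running at the time.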