#21
Unimpressive performance of large MD raid
kkkk wrote:
> In my case dd pushes 5 seconds of data before the disks start writing (dirty_writeback_centisecs = 500). dd always stays at least 5 seconds ahead of the writes. This should fill all stripes completely, causing no reads. I even tried raising dirty_writeback_centisecs, with no measurable performance benefit. Where is this 5 seconds of data stored? Is it at the ext3 layer, or at the LVM layer (I doubt this one; I also notice there is no LVM kernel thread running), or at the MD layer?

Most likely it's in the page cache, above both layers.

> Why do you think dd stays at 100% CPU (with the disk/3ware caches enabled)? Shouldn't that be 0%? Do you think the CPU is high due to a memory-copy operation? If it were that, I would expect dd from /dev/zero to /dev/null to go at 200MB/sec; instead it goes at 1.1GB/sec (with 100% CPU occupation indeed, 65% of which is in kernel mode). That would mean the number of copies performed by dd while copying to the ext3 RAID is five times greater than when copying from /dev/zero to /dev/null. Hmmm... a bit difficult to believe. There must be other work performed in the ext3 case that hogs the CPU. Is the ext3 code running within the dd process when dd writes?

Copying from /dev/zero to /dev/null is a special case, as it doesn't have to do any filesystem work; it's basically measuring memory bandwidth. When copying to an actual file there will be work to arrange the filesystem, allocate disk blocks, etc. I wouldn't have expected it to happen within the context of the dd process, but I'm not a filesystem guy.

> Hmm, probably not, because kjournald had significant CPU occupation. What is the role of the journal during file overwrites?

I suspect the journal will be involved on any filesystem access. Just curious: how is your ext3 filesystem configured for data journalling (journal/ordered/writeback)? Have you tried mounting it with "noatime"?

Lastly, in your original email you asked about "sync".
When run from the command line, that command simply flushes all filesystem changes out to disk and waits for that process to complete. Depending on the disk, the data may or may not have actually hit the platters by the time sync returns.

Chris
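The 5-second figure kkkk mentions comes from the kernel's writeback tunables, which live under /proc/sys/vm on Linux. A quick sketch of how to inspect the knobs discussed above (values shown will vary per system):

```shell
# Inspect the page-cache writeback tunables (Linux).
cat /proc/sys/vm/dirty_writeback_centisecs   # how often the flusher threads wake up
cat /proc/sys/vm/dirty_expire_centisecs      # how old dirty data must be before writeback
cat /proc/sys/vm/dirty_ratio                 # % of memory dirty before writers are throttled
```

Raising dirty_writeback_centisecs only delays when flushing starts; it does not change the array's sustained write throughput, which matches kkkk's observation that tuning it made no measurable difference.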
#22
Unimpressive performance of large MD raid
In comp.os.linux.development.system kkkk wrote:
> Why do you think dd stays at 100% CPU (with the disk/3ware caches enabled)? Shouldn't that be 0%? Do you think the CPU is high due to a memory-copy operation? If it were that, I would expect dd from /dev/zero to /dev/null to go at 200MB/sec; instead it goes at 1.1GB/sec (with 100% CPU occupation indeed, 65% of which is in kernel mode). That would mean the number of copies performed by dd while copying to the ext3 RAID is five times greater than when copying from /dev/zero to /dev/null. Hmmm... a bit difficult to believe. There must be other work performed in the ext3 case that hogs the CPU. Is the ext3 code running within the dd process when dd writes?

I did not check the kernel code, but logically, writing to /dev/null you do not need to copy data, so normally I would expect about 2 times more copying. I would try the bs parameter to dd. For example, on my machine

dd if=/dev/zero of=/dev/null count=1000000

needs 0.560571s, while

dd if=/dev/zero of=/dev/null count=100000 bs=10240

(which copies twice as much data) needs 0.109896s. By default dd uses a 512-byte block, which means you do a lot of system calls (each block is copied using a separate call to read and write). And yes, when dd is doing a system call, work done in the kernel is accounted as work done by dd. That includes many operations done by ext3 (some work is done by kernel threads, and some is done from interrupts and accounted to whatever process is running at the given time).

Coming back to dd CPU usage: as long as there is enough space to buffer the write, dd should have 100% CPU utilization. Simply put, dd is copying data to kernel buffers as fast as it can. Once the kernel buffers are full, dd should block; however, what you wrote suggests that you have enough memory to buffer the whole write. Using large blocks dd should be faster than the disks, but for small blocks the cost of system calls may be high (and it does not help that you have many cores, because dd is single-threaded and much of the kernel work is done in the same thread).
-- Waldek Hebisch
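Waldek's point about system-call overhead is easy to reproduce: copying the same order of data null-to-null with two block sizes shows the per-call cost dominating at bs=512. A minimal sketch (the count values are arbitrary):

```shell
# Same data volume, two block sizes: many small syscalls vs. few large ones.
time dd if=/dev/zero of=/dev/null bs=512 count=2000000   # ~1 GB in 512 B chunks
time dd if=/dev/zero of=/dev/null bs=1M  count=1000      # ~1 GB in 1 MiB chunks
```

On most machines the second command finishes several times faster, even though it moves the same amount of data, because each read/write pair crosses the kernel boundary once per block.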
#27
Unimpressive performance of large MD raid
kkkk wrote:
> I am writing a sequential 14GB file with dd: time dd if=/dev/zero of=zerofile count=28160000 conv=notrunc ; time sync

What's your block size for dd? I'm guessing it's the default 512 bytes from your figures above, so you're doing lots of little writes. What happens with a much bigger block size, such as 1MB or more?

Guy
--
Guy Dawson   I.T. Manager   Crossflight Ltd
#28
Unimpressive performance of large MD raid
kkkk wrote:
> This guy http://lists.freebsd.org/pipermail/f...er/005170.html is doing basically the same as I am doing, with software RAID done with ZFS in FreeBSD (RAID-Z2 is basically RAID-6), writing and reading 10GB files. His results are a heck of a lot better than mine with default settings, and not very distant from the bare hard disks' throughput (he seems to get about 50MB/sec per non-parity disk). This shows that software RAID is indeed capable of doing good stuff in theory. Just Linux MD + ext3 seems to have some performance problems :-(

The key line in that link is "dd bs=1m" for a 10GB file. Note the 1MB block size setting for his test. Waldek Hebisch's post makes the same point about block size too.

Guy
--
Guy Dawson   I.T. Manager   Crossflight Ltd
#29
Unimpressive performance of large MD raid
In comp.os.linux.development.system kkkk wrote:
> We are using an ext3 filesystem with default mount options on top of LVM + MD RAID 6. I am writing a sequential 14GB file with dd: time dd if=/dev/zero of=zerofile count=28160000 conv=notrunc ; time sync

Try using /dev/md/X directly as the target for dd, to keep filesystem overhead out of your measurement. (Please note that your filesystem will be destroyed by this.)
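The raw-device test would look something like the sketch below; /dev/md0 is a hypothetical name (check /proc/mdstat for the real one). A non-destructive variant writes an ordinary file with conv=fdatasync, so the flush is included in the measured time instead of being hidden in the page cache:

```shell
# DESTRUCTIVE: writes directly over the array, bypassing LVM and the filesystem.
# /dev/md0 is a hypothetical device name -- check /proc/mdstat for yours.
# dd if=/dev/zero of=/dev/md0 bs=1M count=10000

# Non-destructive alternative: write a file and make dd flush it before exiting,
# so the reported rate reflects the disks rather than the page cache.
dd if=/dev/zero of=/tmp/ddtest bs=1M count=256 conv=fdatasync
rm -f /tmp/ddtest
```

conv=fdatasync is a GNU dd option; on older dd builds, "time dd ... ; time sync" (as kkkk is already doing) achieves the same end.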
#30
Unimpressive performance of large MD raid
Hi everybody,
Thanks for your suggestions. I have seen the suggestions by Guy and Patrick to raise the bs for dd. I had already tried various values for this, up to a very large value, and I even tried the exact bs value that would fill one complete RAID stripe in one write: no measurable performance improvement.

Regarding remounting the partition with data=writeback, I will try this ASAP (possibly tomorrow: I need to find a moment when nobody is using the machine).

Regarding dd directly to the raw block device, I will also try this ASAP. Luckily I have an unused LVM device located on the same MD RAID 6.

Stay tuned... check back in 1-2 days. Thanks everybody for your help.
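For reference, the full-stripe bs kkkk mentions is just the MD chunk size times the number of data disks. The geometry below is hypothetical (64 KiB chunks, 14 data disks, i.e. a 16-drive RAID-6); substitute the values from mdadm --detail for the real array:

```shell
# Hypothetical geometry: 64 KiB chunk, 16-drive RAID-6 => 14 data disks per stripe.
chunk_bytes=$((64 * 1024))
data_disks=14
stripe_bytes=$((chunk_bytes * data_disks))
echo "$stripe_bytes"   # prints 917504: bytes in one full stripe, a candidate dd bs
```

Writes sized and aligned to a full stripe let RAID-6 compute parity without first reading the old data and parity blocks, which is why stripe-sized bs is worth testing even when, as here, it turns out not to be the bottleneck.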