Writing to block device is *slower* than writing to the filesystem!?



 
 
  #1  
Old August 7th 09, 01:30 PM posted to comp.os.linux.development.system,comp.arch.storage
kkkk
external usenet poster
 
Posts: 17
Default Writing to block device is *slower* than writing to the filesystem!?

Hi all,
we have a new machine with 3ware 9650SE controllers and I am testing
hardware RAID and Linux software MD RAID performance.
For now I am on hardware RAID. I have set up a RAID-0 with 14 drives.

If I create an xfs filesystem on it (whole device, no partitioning,
stripe-aligned during mkfs, etc.) and then write to a file with dd (or
with bonnie++) like this:

  sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/mnt/tmp/ddtry bs=1M count=6000 conv=fsync ; time sync

about 540 MB/s comes out (the last sync takes 0 seconds). This is close to
the 561 MB/s that 3ware declares for this controller:
http://www.3ware.com/KB/Article.aspx?id=15300

However, if I instead write directly to the block device like this:

  sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/dev/sdc bs=1M count=6000 conv=fsync ; time sync

the performance is 260 MB/s!? (the last sync again takes 0 seconds)

I tried many times and this is the absolute fastest I could obtain. I
tweaked the bs and the count, I removed the conv=fsync... I ensured the
3ware caches are ON for the block device, I set the anticipatory
scheduler... No luck. I am positive that creating the xfs filesystem and
writing to it is definitely faster than writing to the block device
directly.
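
(For reference, a quick sketch of the OS-side checks for the settings
mentioned above -- the array is /dev/sdc here; the 3ware write cache
itself is configured through the controller's own tw_cli tool and is not
shown:)

  cat /sys/block/sdc/queue/scheduler    # the active I/O scheduler is the one in [brackets]
  blockdev --getra /dev/sdc             # current readahead, in 512-byte sectors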

How could that be!? Does anyone know what's happening?

Please note that the machine is absolutely clean and there is no other
workload. I am running kernel 2.6.31 (Ubuntu 9.10 alpha, live).

Thank you
  #2  
Old August 7th 09, 08:30 PM posted to comp.arch.storage
Mark F[_2_]
external usenet poster
 
Posts: 164
Default Writing to block device is *slower* than writing to the filesystem!?

On Fri, 07 Aug 2009 14:30:11 +0200, kkkk wrote:

> Hi all,
> we have a new machine with 3ware 9650SE controllers and I am testing
> hardware RAID and Linux software MD RAID performance.
> For now I am on hardware RAID. I have set up a RAID-0 with 14 drives.
>
> If I create an xfs filesystem on it (whole device, no partitioning,
> stripe-aligned during mkfs, etc.) and then write to a file with dd (or
> with bonnie++) like this:
>   sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/mnt/tmp/ddtry bs=1M count=6000 conv=fsync ; time sync
> about 540 MB/s comes out (the last sync takes 0 seconds). This is close to
> the 561 MB/s that 3ware declares for this controller:
> http://www.3ware.com/KB/Article.aspx?id=15300
>
> However, if I instead write directly to the block device like this:
>   sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/dev/sdc bs=1M count=6000 conv=fsync ; time sync
> the performance is 260 MB/s!? (the last sync again takes 0 seconds)

I haven't played with UNIX since I retired in 1994, but here are
some suggestions:

- Does dd buffer correctly? (It probably does, but it is good to check.)
- Compare the I/O counts for both methods, and compare the CPU time for
  both methods (a rough sketch for doing this follows the list). I
  remember issues where the dummy devices used a small {block size,
  record size, or some such}, so that there were high I/O counts and
  therefore high CPU use when we didn't expect it.
- Can you report the results for each of the various bs values that you
  used? You say that you tried various values and are just reporting the
  best result, but it would be nice to see how things are affected as
  the block size changes.
- Perhaps the filesystem is smart enough to avoid some movement between
  buffers that writing to the block device has to do. A difference in
  CPU use might be an indication of this, but a difference could also
  have other causes.
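
A rough way to get both numbers in one go might look like this (a
sketch; it assumes the array is /dev/sdc and, like the original test, it
overwrites the start of it):

  # snapshot the per-device write counters, run one timed test, snapshot again
  grep ' sdc ' /proc/diskstats          # note the writes-completed / writes-merged columns
  sync ; echo 3 > /proc/sys/vm/drop_caches
  time dd if=/dev/zero of=/dev/sdc bs=1M count=6000 conv=fsync
  grep ' sdc ' /proc/diskstats          # diff against the first snapshot
  # repeat with of=/mnt/tmp/ddtry to get the filesystem-path numbers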

(Don't laugh at my experience being so old: I've seen a couple of
problems reported this year [2009] that were the same as ones I saw
before 1975. And that doesn't count all of the buffer-overflow crap
that was solved in hardware before 1961.)



  #3  
Old August 8th 09, 01:25 AM posted to comp.os.linux.development.system,comp.arch.storage
David Schwartz
external usenet poster
 
Posts: 5
Default Writing to block device is *slower* than writing to the filesystem!?

On Aug 7, 5:30 am, kkkk wrote:

> If I create an xfs filesystem on it (whole device, no partitioning,
> stripe-aligned during mkfs, etc.) and then write to a file with dd (or
> with bonnie++) like this:
>   sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/mnt/tmp/ddtry bs=1M count=6000 conv=fsync ; time sync
> about 540 MB/s comes out (the last sync takes 0 seconds). This is close to
> the 561 MB/s that 3ware declares for this controller:
> http://www.3ware.com/KB/Article.aspx?id=15300
>
> However, if I instead write directly to the block device like this:
>   sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/dev/sdc bs=1M count=6000 conv=fsync ; time sync
> the performance is 260 MB/s!? (the last sync again takes 0 seconds)
>
> I tried many times and this is the absolute fastest I could obtain. I
> tweaked the bs and the count, I removed the conv=fsync... I ensured the
> 3ware caches are ON for the block device, I set the anticipatory
> scheduler... No luck. I am positive that creating the xfs filesystem and
> writing to it is definitely faster than writing to the block device
> directly.
>
> How could that be!? Does anyone know what's happening?

There could be a lot of reasons, but the most likely is that the two
tests are writing to opposite ends of the array. To check, put a 'seek='
in the 'dd' that writes to the block device and see whether larger
offsets give higher speeds.
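
Something like this, as a sketch (the seek= value is only illustrative
and should be derived from the real size of the array; it also
overwrites data near the end of it):

  # write the same 6000 MiB near the end of the array instead of at the start
  SIZE_MB=$(( $(blockdev --getsize64 /dev/sdc) / 1024 / 1024 ))
  sync ; echo 3 > /proc/sys/vm/drop_caches
  dd if=/dev/zero of=/dev/sdc bs=1M count=6000 seek=$(( SIZE_MB - 6000 )) conv=fsync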

DS

  #4  
Old August 8th 09, 07:47 PM posted to comp.os.linux.development.system,comp.arch.storage
kkkk
external usenet poster
 
Posts: 17
Default Writing to block device is *slower* than writing to the filesystem!?

David Schwartz wrote:
> There could be a lot of reasons, but the most likely is that the two
> tests are writing to opposite ends of the array. To check, put a 'seek='
> in the 'dd' that writes to the block device and see whether larger
> offsets give higher speeds.


Nope, it's not that. I seeked to the end of the device as you suggested
and the speed is not significantly different: writing to the device goes
from 239 MB/s at the beginning to 233 MB/s at the end (so it is actually
a bit faster at the beginning).

I am positive that the seek value I used for dd is correct, because when
I tried to raise it a bit further dd gave me an error:
dd: `/dev/sdc': cannot seek: Invalid argument

Next idea...?

Thank you!
  #5  
Old August 10th 09, 01:01 AM posted to comp.os.linux.development.system,comp.arch.storage
kkkk
external usenet poster
 
Posts: 17
Default Writing to block device is *slower* than writing to the filesystem!?

kkkk wrote:
> Hi all,
> we have a new machine with 3ware 9650SE controllers and I am testing ...


I found it! I found it!

dd apparently does not buffer writes correctly (good catch, Mark): it
apparently disregards the bs value and submits very small writes. It
needs oflag=direct to really honour bs, and even then there is a limit.
Also, the elevator's merging of the small writes does not try hard
enough and cannot achieve good throughput. More details tomorrow.
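
(A sketch of the direct-I/O form of the same test, keeping the 6000 MB
total and the same device:)

  sync ; echo 3 > /proc/sys/vm/drop_caches
  dd if=/dev/zero of=/dev/sdc bs=1M count=6000 oflag=direct conv=fsync ; time sync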
  #6  
Old August 10th 09, 04:00 PM posted to comp.os.linux.development.system,comp.arch.storage
Robert Nichols[_2_]
external usenet poster
 
Posts: 63
Default Writing to block device is *slower* than writing to the filesystem!?

kkkk wrote:
:kkkk wrote:
: Hi all,
: we have a new machine with 3ware 9650SE controllers and I am testing ...
:
:I found it! I found it!
:
:dd apparently does not buffer writes correctly (good catch, Mark): it
:apparently disregards the bs value and submits very small writes. It
:needs oflag=direct to really honour bs, and even then there is a limit.
:Also, the elevator's merging of the small writes does not try hard
:enough and cannot achieve good throughput. More details tomorrow.

Curious. I'm not seeing that behavior in either CentOS 5 or Fedora 11
(coreutils-5.97-19.el5, coreutils-7.2-2.fc11). In both of those, when I
run:

strace dd if=/dev/zero bs=1M count=1 of=somefile conv=fsync

I see exactly one read and write, each of size 1048576.
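
The same check can be pointed at the block device itself, to see what dd
actually submits there (a sketch; it needs root and overwrites the first
few MB of sdc):

  strace -e trace=write dd if=/dev/zero of=/dev/sdc bs=1M count=4 conv=fsync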

--
Bob Nichols AT comcast.net I am "RNichols42"
  #7  
Old August 10th 09, 06:40 PM posted to comp.os.linux.development.system,comp.arch.storage
kkkk
external usenet poster
 
Posts: 17
Default Writing to block device is *slower* than writing to the filesystem!?

Robert Nichols wrote:
> Curious. I'm not seeing that behavior in either CentOS 5 or Fedora
> 11 (coreutils-5.97-19.el5, coreutils-7.2-2.fc11). In both of those,
> when I run:
>
>   strace dd if=/dev/zero bs=1M count=1 of=somefile conv=fsync
>
> I see exactly one read and write, each of size 1048576.



I haven't straced it, but this is what iostat -x 1 shows while dd is
running:

Without oflag=direct (bs=1M):

Device:  rrqm/s     wrqm/s  r/s       w/s  rsec/s     wsec/s  avgrq-sz  avgqu-sz  await  svctm   %util
sdc        0.00  559294.00  0.00  14384.00    0.00  570550.00     39.67    143.98   9.96   0.07  100.00

With oflag=direct (bs=1M):

Device:  rrqm/s     wrqm/s  r/s       w/s  rsec/s     wsec/s  avgrq-sz  avgqu-sz  await  svctm   %util
sdc        0.00       0.00  0.00   3478.00    0.00  890368.00    256.00      5.77   1.66   0.28   98.40



You see, without direct there are a whole lot of wrqm/s (which probably
means lots of wasted CPU cycles), and the average submitted size is
still 143.98 < 256.0 (I suppose 143.98 is after the merges, correct?).

With direct there are no wrqm/s, and the submitted request size is
exactly 256 sectors.


With oflag=direct, performance increases with increasing bs, like this
(a loop that reproduces this kind of sweep is sketched after the list):

(3ware 9650SE-16ML hardware RAID-0, 256K chunk size, 14 disks [1TB 7200RPM SATA])
bs - speed:
512B - 4.9 MB/s
1K - 13.3 MB/s
2K - 26.6 MB/s
4K - 54.1 MB/s
8K - 96 MB/s
16K - 157 MB/s
32K - 231 MB/s
64K - 300 MB/s
128K - 359 MB/s (from this point on avgrq-sz does not increase any more,
but performance still increases)
256K - 404 MB/s
512K - 430 MB/s
1M - 456 MB/s
2M - 466 MB/s
4M - 473 MB/s
3584K (the full stripe width) - 494 MB/s
8M - 542 MB/s !! A big performance jump!!
16M - 543 MB/s
32M - 568 MB/s ! Another big performance jump
64M - 603 MB/s ! Again !! (CPU usage for this run: real 0m11.213s,
user 0m0.004s, sys 0m3.880s)
128M - 641 MB/s
256M - 676 MB/s
512M - 645 MB/s (performance starts dropping)
1G - 620 MB/s
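
(The sketch of the loop mentioned above -- bs is in bytes, the total is
kept at roughly 6000 MB per run, and it overwrites /dev/sdc:)

  total=$(( 6000 * 1024 * 1024 ))
  for bs in 512 4096 65536 1048576 8388608 67108864 268435456; do
      count=$(( total / bs ))
      echo "bs=$bs"
      sync ; echo 3 > /proc/sys/vm/drop_caches
      dd if=/dev/zero of=/dev/sdc bs=$bs count=$count oflag=direct conv=fsync
  done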

avgrq-sz apparently cannot go over 256 sectors; is this a hardware limit
imposed by the device / the 3ware driver?

Notwithstanding this, performance still increases up to bs=256M. From
iostat the only apparent change (apart from wsec/s increasing, obviously)
is avgqu-sz, which stays at about 1.0 up to bs=128K and then rises to
about 20.0 at bs=256M. Do you think this can be the reason for the
performance increase up to 256M?
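
(One more data point that may help: 256 sectors is 128 KiB, and the
block layer's per-request cap can be read -- and sometimes raised --
from sysfs. A sketch, assuming the array is still sdc; whether the 3ware
driver accepts a larger value is another matter:)

  cat /sys/block/sdc/queue/max_sectors_kb      # current per-request cap in KiB (128 would explain the 256 sectors)
  cat /sys/block/sdc/queue/max_hw_sectors_kb   # hard limit reported by the driver
  cat /sys/block/sdc/queue/nr_requests         # how many requests the elevator will queue
  echo 256 > /sys/block/sdc/queue/max_sectors_kb   # try raising the cap (must stay <= the hw limit)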

Thanks for any thoughts.
 



