#1 teckytim
RAID level write verification
Is it true that RAID levels like 1, 1+0, 3, and 4 verify on the fly that both the data write and the ECC write match, while levels like RAID 5 don't (with a few exceptions at the very high end)? Or is this dependent instead on make and model? TIA
#2 Bill Todd
teckytim wrote:
: Is it true that RAID levels like 1, 1+0, 3, and 4 verify on the fly that both the data write and the ECC write match, while levels like RAID 5 don't (with a few exceptions at the very high end)?

No. No RAID level *requires* any kind of read-after-write verification, though any RAID *implementation* could offer it as an additional feature.

However, I think some RAID-3 implementations verify on the fly that the parity information matches the stripe being *read* (since that has no impact on performance, save for the CPU cycles required by the comparison and the bus cycles consumed by reading the parity). Though I don't recall that the accepted RAID-3 definition *requires* this.

- bill
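As a rough illustration of the read-time parity check Bill mentions: a minimal Python sketch, assuming a simplified byte-striped RAID-3 layout with one dedicated XOR parity drive. The function names are illustrative only, not from any real controller.

    from functools import reduce

    def xor_strips(strips):
        """XOR equal-length strips together, byte by byte."""
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*strips))

    def read_stripe(data_strips, parity_strip):
        """Return the stripe's data, checking stored parity against what was read.

        The check costs only the CPU cycles for the comparison plus the bus
        cycles spent fetching the parity strip; the data strips were being
        read anyway, so throughput is otherwise unaffected.
        """
        if xor_strips(data_strips) != parity_strip:
            raise IOError("parity mismatch: stripe is inconsistent")
        return b"".join(data_strips)

For example, read_stripe([b"\x01\x02", b"\x04\x08"], b"\x05\x0a") passes the check (0x01^0x04 = 0x05, 0x02^0x08 = 0x0a) and returns the concatenated data.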
#3 teckytim
Bill Todd wrote:
: No. No RAID level *requires* any kind of read-after-write verification, though any RAID *implementation* could offer it as an additional feature. [...]

Thanks. I was afraid that was the answer, as so many RAID details are nonstandard, or rather manufacturer-specific.

So if I *require* a RAID implementation that does this, where do I have to look? Are there PCI card products (SATA & SCSI) or RAID boxes, or is this only available in high-end non-DAS RAID like EMC, etc.?

Are background media scans sufficient protection against failing/flaky media, so that the verify feature discussed above is not necessary?

Thanks again.
#4 Bill Todd
teckytim wrote:
: So if I *require* a RAID implementation that does this, where do I have to look? Are there PCI card products (SATA & SCSI) or RAID boxes, or is this only available in high-end non-DAS RAID like EMC, etc.?

Someone else here might know, but I don't.

: Are background media scans sufficient protection against failing/flaky media, so that the verify feature discussed above is not necessary?

I don't think the two are all that closely related. All read-after-write does is verify that the data written was what you intended to write: while this does guard against very low-probability errors like silently-failing null writes or 'wild' writes (though with the latter you have to worry about what got clobbered as well), it isn't likely to be any kind of substitute for background 'scrubbing' to catch deteriorating sectors (which I think are orders of magnitude more likely than unheralded write failures, but that's just my impression).

Sun claims that its new ZFS file system for Solaris has supplementary checksum information that guards data from main memory to disk and back again - you might find a look there interesting. But that's not specifically RAID-related.

- bill
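A minimal sketch of the read-after-write verification under discussion. The 'dev' object here is a hypothetical block device offering write_block/flush/read_block; no real driver API is implied.

    def verified_write(dev, lba, data):
        """Write a block, then re-read it and compare against the buffer."""
        dev.write_block(lba, data)
        dev.flush()  # try to defeat the drive's write cache; a readback
                     # served from cache would mask a failed media write
        if dev.read_block(lba) != data:
            # Catches a silent null write (the target block was never
            # updated) and, indirectly, a wild write that landed elsewhere
            # -- though not *where* it landed, which is Bill's caveat.
            raise IOError("read-after-write mismatch at LBA %d" % lba)

Note what this does not do: a sector that verifies fine today and decays next month is invisible to it, which is why it is no substitute for background scrubbing.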
#5 teckytim
Bill Todd wrote:
: I don't think the two are all that closely related. All read-after-write does is verify that the data written was what you intended to write [...] it isn't likely to be any kind of substitute for background 'scrubbing' to catch deteriorating sectors.

I didn't think they were related, at least not in more than the most general sense. Read-after-write just seems to me a reasonable extra failsafe where data integrity/security trumps all else. That perception could be wrong, though.

I have occasionally read about transient write errors in RAID 5 implementations which writers/posters believe make RAID 5 less reliable than other levels. I have also read about some interesting data protection features in EMC and, I think, NetApp which I believe combat these fears.

It seems to me the likelihood of a flaky drive causing problems increases with array size (drive count), especially in larger ATA arrays. A weakening sector that causes a write to fail, but is not quite weak enough to be marked bad, would cause confusion on a defect scan. I have also seen a drive or two that was failing by corrupting data, yet still spinning and showing little or nothing in the way of bad sectors. It's rare, but I've seen it, and I wouldn't want one such drive to take a crap all over an array's stripes. Read-after-write in addition to background defect scanning makes sense to me, yet I usually see only the latter. That makes me wonder.

: Sun claims that its new ZFS file system for Solaris has supplementary checksum information that guards data from main memory to disk and back again - you might find a look there interesting. But that's not specifically RAID-related.

Very interesting. Will look. Thanks again for the response.
#6 Dave Sheehy
teckytim wrote:
: I have occasionally read about transient write errors in RAID 5 implementations which writers/posters believe make RAID 5 less reliable than other levels. I have also read about some interesting data protection features in EMC and, I think, NetApp which I believe combat these fears.

A block protection scheme (aka DIF) has recently been standardized by T10. That protection scheme has been implemented by a few of the silicon suppliers (including my employer). Look for it to become a pretty common feature in the next couple of years; it has been a proprietary feature of a few storage vendors for a number of years already. The recently announced SGI 4G FC array (OEMed from Engenio) is an example that has this new standardized feature built into it.

Dave
#7 Bill Todd
Dave Sheehy wrote:
: A block protection scheme (aka DIF) has recently been standardized by T10. [...]

If it is indeed now a standard, I suspect that given sufficient effort I could learn its details. But if you found it convenient to post them (at least to the degree that one could understand the technology involved - e.g., is it simply an additional checksum, does it live with the data or separate from it, etc.), it would save me and other curious individuals some time.

Thanks,

- bill
#8 Dave Sheehy
Bill Todd wrote:
: If it is indeed now a standard, I suspect that given sufficient effort I could learn its details. But if you found it convenient to post them [...] it would save me and other curious individuals some time.

The details can be found in the SBC-2 or -3 standard at t10.org; look for the section on "Protection Information". Also, some new 32-bit extended SCSI commands are being proposed to support this functionality. There are some rumblings about adding this to T13 as well, but I'm not familiar with the status of that.

Briefly, 8 bytes of information are appended to each block. There are three fields: a 2-byte CRC of the data, a 4-byte reference tag carrying the low 32 bits of the LBA, and a 2-byte application tag. Theoretically, the information can be applied end to end (i.e., generated at the server and sent to and returned from the array), but that is not a typical deployment (although a few HBA manufacturers are incorporating the feature). The typical deployment is to generate the information in the protocol controller on the front end of the array as the data is written to memory (i.e., data cache). It is written to disk by the back end. The information is validated by both the back end and the front end when the data is read by the protocol controller. When performed in this fashion the data is protected as it traverses the bus (e.g., PCI and PCI-X only have simple parity protection), while it resides in memory, and while it resides on the disk.

Dave
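To make Dave's 8-byte layout concrete, here is a small Python sketch of building and checking one protection-information tuple. It assumes the T10 DIF guard CRC (polynomial 0x8BB7, initial value 0) and the guard / application-tag / reference-tag field order given in SBC's "Protection Information" section; consult the standard itself before relying on either detail.

    import struct

    def crc16_t10dif(data):
        """Guard-tag CRC-16: polynomial 0x8BB7, init 0, no reflection."""
        crc = 0
        for byte in data:
            crc ^= byte << 8
            for _ in range(8):
                crc = (crc << 1) ^ 0x8BB7 if crc & 0x8000 else crc << 1
                crc &= 0xFFFF
        return crc

    def make_pi(block, lba, app_tag=0):
        """Build the 8 bytes appended to a block: 2-byte guard (CRC of
        the data), 2-byte application tag, 4-byte reference tag holding
        the low 32 bits of the LBA."""
        return struct.pack(">HHI", crc16_t10dif(block), app_tag,
                           lba & 0xFFFFFFFF)

    def check_pi(block, pi, expected_lba):
        """Validate a block against its protection information."""
        guard, _app, ref = struct.unpack(">HHI", pi)
        if guard != crc16_t10dif(block):
            raise IOError("guard tag mismatch: data corrupted in flight or at rest")
        if ref != (expected_lba & 0xFFFFFFFF):
            raise IOError("reference tag mismatch: block belongs to another LBA")

In the typical deployment Dave describes, make_pi runs in the array's front-end protocol controller on ingest and check_pi runs at both the back end and the front end on the way back out.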
#9 Bill Todd
Dave Sheehy wrote:
: Briefly, 8 bytes of information are appended to each block. [...] When performed in this fashion the data is protected as it traverses the bus (e.g., PCI and PCI-X only have simple parity protection), while it resides in memory, and while it resides on the disk.

Thanks. That's the kind of thing I thought might be useful a decade ago, though it seems a little stingy today - e.g., limiting the LBA address to 32 bits (common arrays below the level the host system may be aware of already exceed this size, though when used only as a sanity check the low 32 bits of the LBA may be sufficient) and the application-specific area to 16 bits (if both fields were longer, the application-specific area could be used, e.g., to hold a file identifier which would facilitate reconstruction of a file system - I have a vague recollection that IBM's iSeries boxes and their ancestors may have done this).

It should at least allow a host that cares enough to implement the functionality to generate the validation information before the data leaves main memory and check it after it returns. This will catch otherwise undetected bus errors and anything clobbered by a wild write, but unfortunately it still won't detect that the intended destination was never updated (or that a silent null write failure occurred). And the largest single potential market for such a feature could turn out to be SATA-based...

- bill
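A sketch of the host-side, end-to-end use Bill describes, reusing make_pi and check_pi from the previous sketch. The dev object and its *_with_pi methods are hypothetical, standing in for an HBA that passes protection information through.

    def host_write(dev, lba, block):
        # Generate the PI while the data is still in main memory, so
        # anything that mangles it between here and the platter is
        # detectable on the way back.
        dev.write_block_with_pi(lba, block, make_pi(block, lba))

    def host_read(dev, lba):
        block, pi = dev.read_block_with_pi(lba)
        # The guard tag catches bus errors and wild-write damage; the
        # reference tag catches a block returned from the wrong LBA.
        # A dropped write, though, hands back the *old* block with its
        # own perfectly valid PI -- exactly the undetected case Bill
        # notes above.
        check_pi(block, pi, lba)
        return block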
#10 teckytim
Thanks for the follow-up posts, Bill & Dave. Very helpful. In addition, I see a proposal for a "Write Read Verify" feature extension over at T13.org: http://www.t13.org/docs2005/e04129r5...ead_verify.pdf

I am specifically looking for SCSI & SATA DAS and controllers that utilize advanced protection mechanisms such as have been mentioned here. Any product recommendations along those lines? Thanks again for your time.