#21
"Bill Todd" wrote in message ... "Ron Reaugh" wrote in message ... ... Does a bad sector that happens to be detected during a RAID 1 HD failure and replacement constitute any reflection on the efficacy of that recovery? I say no. And you're wrong - utterly. When you have a disk failure in your RAID-1 pair, and only *then* discover that a data sector on the surviving disk is also bad, you've lost data - i.e., 'failed'. That's not my definition of failed. Does undetected "silent sector deterioration" actually much of a threat to real world current two drive RAID 1 reliability? I say no. Same degree of wrongness here as well. Nope. |
#22
Bill Todd wrote:
> "ohaya" wrote in message ...
>> Before I begin: I was really looking for just a "ballpark" rule of thumb for now, with as many assumptions/caveats as needed to make it simple - i.e., assume the drives are in their useful life (the flat part of the Weibull/bathtub curve), ignore software, etc.
>
> The drives *have* to be in their nominal service life: once you go beyond that, you won't get any meaningful numbers (because they have no significance to the product, and thus the manufacturer won't have performed any real testing in that life range).
>
>> Think of it like this: I just gave you two SCSI drives. I guarantee you their MTBF is 1.2 Mhours, which won't vary over the time period that they'll be in service; no other hardware will ever fail (i.e., don't worry about the processor board or RAID controller); and it takes ~0 time to repair a failure. Given something like that, and assuming I RAID 1 these two drives, what kind of MTBF would you expect over time?
>
> Infinite.
>
>> - Is it the square of the individual drive MTBF? See: http://www.phptr.com/articles/article.asp?p=28689
>
> No. This example applies to something like an unmanned spacecraft, where no repairs or replacements can be made. Such a system has no meaningful MTBF beyond its nominal service life (which will usually be much less than the MTBF of even a single component, when that component is something as reliable as a disk drive).
>
>> Or: http://tech-report.com/reviews/2001q...d/index.x?pg=2 (this one doesn't make sense - if MTTR=0, then MTBF=infinity?)
>
> That's how it works, and this is the applicable formula to use. For completeness, you'd need to factor in the fact that drives have to be replaced not only when they fail but when they reach the end of their nominal service life, unless you reserved an extra slot to use to build the new drive's contents (effectively, temporarily creating a double mirror) before taking the old drive out.
>
>> Or: http://www.teradataforum.com/teradat...107_214543.htm (again, I don't know how MTTR=0 would work)
>
> The same way: though the explanation for RAID-5 MTBF is not in the usual form, it's equivalent.
>
>> - Is it 150% of the individual drive MTBF? See: http://www.zzyzx.com/products/whitep...ility_primer.pdf
>
> No: the comment you saw there is just some half-assed rule of thumb that once again assumes no repairs are effected (and is still wrong even under that assumption, though the later text that explains the value of repair is qualitatively valid).
>
>> - Is it double the individual drive MTBF? (I don't remember where I saw this one.)
>
> No. The second paper that you cited has a decent explanation of why the formula is what it is. If you'd like a more detailed one, check out Transaction Processing: Concepts and Techniques by Jim Gray and Andreas Reuter.

Hi, I'm back, and I'm bottom-posting to one of the earlier posts so that everything is in one place, as this thread is getting a little long - I hope that's ok.

I'm still a little puzzled about your (and I think Ron's) earlier comments on the phptr.com article that I linked above, and I've been trying to reconcile that approach/methodology with the ones from tech-report.com and teradataforum.com. If I run the equivalent (hypothetical) numbers through both, I get vastly different results.

For example, if I:
- assume 100,000 hours MTBF for a single drive/device, and
- have 3 drives in RAID 1, and
- assume 24 hours MTTR, and
- use the tech-report.com/teradataforum.com method,

I get: MTTF(RAID 1) ~ 20 TRILLION hours+

And if I follow the method from phptr.com with the same data, I get:

AFR(1 drive) = 8760/100,000 = 0.0876
AFR(3 drives, RAID 1) = (0.0876)^3 ~ 0.0006722
MTBF(3 drives, RAID 1) = 8760/AFR(3 drives, RAID 1) ~ 13 MILLION hours+

Using the method from the phptr.com page, the MTBF result is WAY less than with the other method. Assuming that the tech-report.com/teradataforum.com method is the more correct one, and that the phptr.com method is this wrong for even a relatively simple RAID 1 configuration, is ANY of the rest of the methodology described on the phptr.com page a valid approach?

The reason for my question is that the next thing I wanted to do was use the method described in the rest of the phptr.com page (i.e., in the case study) to do some ballpark figuring for a more extended system (with more than just the RAIDed drives), similar to what was in the case study, using MTBF numbers that I have for components.

If any of you might be able to shed some (more) light on this, I'd really appreciate it.

Thanks again,
Jim
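For concreteness, the two calculations above can be run side by side in a short Python sketch. The repairable-array formula used here is the standard Gray & Reuter approximation MTTF^N / (N! * MTTR^(N-1)); whether the tech-report.com/teradataforum.com articles use exactly this constant is an assumption, so the first number need not match Jim's 20-trillion figure.

    from math import factorial

    # Assumptions from the post: 100,000 h per-drive MTTF, 24 h MTTR,
    # a 3-way RAID 1 mirror, 8760 hours per year.
    MTTF = 100_000.0
    MTTR = 24.0
    N = 3
    HOURS_PER_YEAR = 8760.0

    # Method 1: repairable array (Gray & Reuter style). Data is lost only
    # if all N drives fail within overlapping repair windows:
    #   MTTF_array ~ MTTF^N / (N! * MTTR^(N-1))
    # Note that MTTR -> 0 drives this to infinity, matching Bill's "Infinite".
    mttf_repairable = MTTF**N / (factorial(N) * MTTR**(N - 1))
    print(f"repairable-array estimate: {mttf_repairable:.3e} hours")

    # Method 2: the phptr.com AFR approach, which models no repairs.
    afr_drive = HOURS_PER_YEAR / MTTF        # = 0.0876
    afr_array = afr_drive**N                 # ~ 0.0006722
    mtbf_afr = HOURS_PER_YEAR / afr_array    # ~ 13 million hours
    print(f"AFR-method estimate:       {mtbf_afr:.3e} hours")

The gap Jim observes is exactly the modeling difference: the first method credits the array for repairs (every drive must fail before the earlier failures are fixed), while the AFR method quietly treats each year's failures as permanent, which is why its number is orders of magnitude lower.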
#23
"Ron Reaugh" wrote in message ...
And you're wrong - utterly. When you have a disk failure in your RAID-1 pair, and only *then* discover that a data sector on the surviving disk is also bad, you've lost data - i.e., 'failed'. That's not my definition of failed. I wrote data to the disk. It didn't come back. Sounds like failure to me. |
#26
In article , Ron Reaugh wrote:
> ... A first stab at that process is called nightly backup and the second stab is scheduled defrags. "Silent sector deterioration" can happen but is usually an isolated sector here or there and is quite uncommon.

Yes, good arrays all have scrubbing capabilities (or should have them). But life isn't quite so easy. Many disk workloads show very high locality: for long stretches, the actuator stays at or near the same position. If you start scrubbing carelessly while a low-intensity foreground workload is running, the response time for real I/Os can increase quite precipitously. So the trick with implementing scrubbing is to forecast when the foreground workload will be idle. Like all forecasting of the future, this is quite difficult (if I knew how to do it, I would play the stock market and get out of the storage business).

Note that good scrubbing has to be done internally to the array, because external scrubbing (for example, a full backup, or just reading the block device end to end) will not touch all sectors on all disks. And depending on how the array is implemented, it may never touch some sectors (for example, as long as no disk has failed, most arrays will never read the parity block in a RAID-5 group). So this isn't something the user of a disk array can take care of himself.

> Good RAID 1 will fill the new/replacement drive in spite of such a sector read error, and then one is left with an operable system with an isolated read error that may be dealt with. Depending on the definition of "data loss", this issue may not count, and it is relatively obscure. Modern HDs are quite good at being able to read/recover their data.

Well, the promise of RAIDed disks is that there is NO data loss. I personally think that as soon as I lose a sector, I have violated my contract with the end user. Clearly, losing one sector is better than losing a whole LUN or a whole array. But if that sector is in an allocated area (of the file system or the database that sits above), the array has corrupted or invalidated data. That's why to many customers the first bit error invalidates the whole LUN - as soon as you lose a single sector, you'll have some explaining to do (often in the form of a C-level executive calling the customer to apologize, followed by massive price cuts or rebates).

If you look at the introduction history and market penetration of the big disk arrays (EMC Symmetrix, Hitachi Lightning, IBM Shark, and so on), you'll see that the "public perception" of data reliability has been a big factor in selling and pricing; I don't want to go into details, as they are sure to step on someone's toes. Whether the "public perception" of data reliability is actually correlated with the real incidence of data loss is an interesting study in mass psychology and the power of marketing over engineering. But what is clear is that there are many customers who are perfectly willing to pay a lot of extra money (a factor of 2, 3, or 10 more than the lowest bidder) to select a vendor that gives them a warm and fuzzy feeling (and maybe also real technical advantages, or even contractual guarantees) about the quality and reliability of the disk array.

--
The address in the header is invalid for obvious reasons. Please reconstruct the address from the information below (look for _).
Ralph Becker-Szendy _firstname_@lr _dot_ los-gatos _dot_ ca.us
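A toy Python sketch of the idle-aware, array-internal scrub loop described above. Everything here (ToyDisk, the 0.5 s idle threshold, read_ok/repair) is illustrative, not any real array's firmware interface:

    import time

    # Toy model of an array-internal scrub pass: walk every sector of every
    # member disk (including parity sectors that normal reads never touch),
    # but only issue scrub reads when the foreground workload looks idle,
    # so real I/Os don't see the actuator dragged away.

    IDLE_THRESHOLD_S = 0.5  # assumed: no foreground I/O for this long == idle

    class ToyDisk:
        def __init__(self, sectors):
            self.sectors = sectors
            self.bad = set()            # sectors whose ECC check would fail

        def read_ok(self, sector):      # stand-in for a verified media read
            return sector not in self.bad

        def repair(self, sector):       # stand-in for rewrite-from-mirror/parity
            self.bad.discard(sector)

    class ToyScrubber:
        def __init__(self, disks):
            self.disks = disks
            self.last_foreground_io = time.monotonic()

        def note_foreground_io(self):
            # The real I/O path calls this; the scrubber backs off afterwards.
            self.last_foreground_io = time.monotonic()

        def foreground_idle(self):
            return time.monotonic() - self.last_foreground_io > IDLE_THRESHOLD_S

        def scrub_pass(self):
            for disk in self.disks:
                for sector in range(disk.sectors):
                    while not self.foreground_idle():
                        time.sleep(0.05)        # yield to real I/O
                    if not disk.read_ok(sector):
                        disk.repair(sector)     # latent error found and fixed

Both of Ralph's points are visible in the sketch: the scrubber iterates raw member sectors (including parity a healthy array never reads, which is why it must live inside the array), and its only real intelligence is the foreground_idle() guess - the forecasting problem he jokes about.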
#27
In article , Ron Reaugh wrote:
> Does a bad sector that happens to be detected during a RAID 1 HD failure and replacement constitute any reflection on the efficacy of that recovery? I say no.

For an enterprise-class disk array, this is catastrophic (see previous message). It will cause an alert to field service personnel. Often, the customer will have to be told officially (even if the customer has not detected the read failure yet).

For a small RAID array (for example, a RAID card on the PCI bus with 2 or 4 drives, and a single-system file system like NTFS or ext3 on top): most of the time, nobody cares. The performability expectation for such a system is sufficiently low that the loss of sectors can often be tolerated - in particular because in typical file system workloads (excluding databases), much of the data is written, read maybe for a short period after being written (for example, by the next nightly backup), and never touched again.

> Does undetected "silent sector deterioration" actually pose much of a threat to real-world, current, two-drive RAID 1 reliability? I say no.

Sorry, but for large disk arrays (which typically have many hundreds or a few thousand disks in them) this is right up there among the top failure modes (excluding the ones that can't be dealt with anyhow, like meteorites, fire, or software bugs) - together with complete failure of the 2nd disk, and failure of the 2nd disk induced by the extra stress of RAID recovery. With the very large disks of today, the detected and undetected failure of individual sectors is beginning to be a very significant worry, and I can assure you that the large companies in this sector (their names are typically 2- or 3-letter abbreviations, for example [IEHS][BMPu][CMn], plus Hitachi and NetApp) are putting significant research and development effort into new forms of redundant storage that can survive such problems better.

By the way, I keep saying "RAID-1" and "2nd disk", even though a lot of the large arrays are actually formatted to RAID-5 or other parity- or erasure-code-based schemes; the examples are just easier for RAID-1.

One particularly worrisome trend is "off-track writes", which are rumored to be more common in consumer-grade disks (typically IDE disks): if mechanical vibration occurs during a write, the head might wander off and write the new data slightly off the track, without completely overwriting the old data on the track. If you now seek away and come back to read later, you can get lucky and by coincidence settle on the new data, or you can get unlucky, hit the old track, and read old data (which is still there with perfectly valid ECCs - maybe not for a whole track, but for a few sectors). You can see how this can be quite catastrophic, even in a non-redundant system. It gets really juicy if this happens during a RAID-5 reconstruction, because now you will take this old data, XOR it with the other disks in the RAID group, create absolute gibberish, and then write the gibberish back to disk, thinking that it is valid. In a RAID-1 system, an off-track read at least returns data that used to be valid (small consolation).

What you might detect here is a certain mindset. We all know that individual disks are fallible, and we've learned to live with it (the operative word here is "backup"). For small RAID arrays (often based on motherboards or PCI cards, or hidden in the back end of NAS servers), a few simple steps give a huge improvement in reliability, but the systems are still considered somewhat unreliable. For most personal and small-business users, these small RAID systems give you a huge bang for the buck. But once you enter the realm of the big enterprise storage systems, things change, and you MUST NEVER EVER LOSE DATA (in all upper case), because if you do, high-level executives will have their busy schedules interrupted, and your behind as the engineer will be on the line, or toast. The reason the enterprise storage systems are so expensive (in terms of $/GB) is that they are fantastically well built, and vendors go to extraordinary lengths to stand behind them. One of these days, if you buy me a few beers, I'll tell you the story of the big array vendor who offered to truck in pallets full of batteries every 24 hours to keep his disk array running through a multi-day power outage (because shutting it down was considered to increase the risk of data loss).

--
The address in the header is invalid for obvious reasons. Please reconstruct the address from the information below (look for _).
Ralph Becker-Szendy _firstname_@lr _dot_ los-gatos _dot_ ca.us
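The RAID-5 reconstruction accident Ralph describes is easy to demonstrate. In this Python sketch (made-up eight-byte blocks), an off-track write leaves stale data with valid ECCs on disk 0; when disk 1 later dies, the rebuild XORs the stale block with the up-to-date parity and writes back gibberish:

    # RAID-5 toy: parity is the XOR of all data blocks in a stripe.
    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    d0_old = b"OLD DATA"      # stale data the off-track write failed to replace
    d0_new = b"NEW DATA"      # what the host believes is on disk 0
    d1     = b"BLOCK_D1"      # data on disk 1
    parity = xor(d0_new, d1)  # parity was correctly updated for the new data

    # Disk 1 dies. The rebuild reads disk 0, but the head settles on the
    # stale track, which still carries perfectly valid ECCs:
    rebuilt_d1 = xor(d0_old, parity)

    print(rebuilt_d1)          # gibberish, silently written back as "valid"
    print(rebuilt_d1 == d1)    # False - the stripe is now corrupt

In the RAID-1 version of the same accident, the rebuild at least copies data that used to be valid - the "small consolation" above.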
#28
"Robert Wessel" wrote in message om... "Ron Reaugh" wrote in message ... And you're wrong - utterly. When you have a disk failure in your RAID-1 pair, and only *then* discover that a data sector on the surviving disk is also bad, you've lost data - i.e., 'failed'. That's not my definition of failed. I wrote data to the disk. It didn't come back. Sounds like failure to me. A single sector lost does not constitute RAID 1 failure. Does RAID 1 operate whereby each read is redundant and then the two read datasets are compared in OS buffers? NO! There is a failure rate that such would catch although obscure. Does that constitute a RAID 1 failure? Folks are grasping into obscurity and very low probabilities. |
#29
"Ralph Becker-Szendy" wrote in message news:1089998384.642616@smirk... In article , Ron Reaugh wrote: ... A first stab at that process is called nightly backup and the second stab is scheduled defrags. "silent sector deterioration" can happen but is usually an isolated sector here or there and is quite uncommon. Yes, good arrays all have scrubbing capabilities (or should have them). But life isn't quite so easy. Many disk workloads show very high locality: For long stretches, the actuator stays at or near the same position. If you start scrubbing carelessly while a low-intensity foreground workload is running, the response time for real IOs can increase quite precipitously. So the trick with implementing scrubbing is to forecast when the foreground workload will be idle. Like all forecasting of the future, this is quite difficult (if I knew how to do it, I would play the stock market, and get out of the storage business). Note that good scrubbing has to be done internally to the array, because external scrubbing (for example a full backup, or just reading the block device end to end) will not touch all sectors on all disks. And depending on how the array is implemented, it may never touch some sectors (for example, as long as no disk has failed, most arrays will never read the parity block on a RAID-5 group). So this isn't something the user of a disk array can take care of himself. Good RAID 1 will fill the new/replacement drive inspite of such a sector read error and then one is left with an operable system with an isolated read error that may be dealt with. Depending on the definition of "data loss" this issue may not count and is relatively obscure. Modern HDs are quite good at being able to read/recover their data. Well, the promise of RAIDed disks is that there is NO data loss. Well, one has to define that very carefully. Firstly differentiating "loss" and "error". I personally think that as soon as I lose a sector, I have violated my contract with the end user. Remember that this discussion was about two drive RAID 1. Clearly losing one sector is better than losing a whole LUN or a whole array. But if that sector is in an allocated area (of the file system or the database that sits above), the array has corrupted or invalidated data. That's why to many customers the first bit error invalidates the whole LUN - as soon as you lose a single sector, you'll have some explaining to do (often takes the form that a C-level executive has to call the customer and apologize, followed by massive price cuts or rebates. And what percentage of "bit error" goes undetected overall system wise? If you look at the introduction history and market penetration of the big disk arrays (EMC Symmetrix, Hitachi Lightning, IBM Shark and so on), you'll see that the "public perception" of data reliability has been a big factor in selling and pricing; I don't want to go into details, as they are sure to step on someones foot. Whether the "public perception" of data reliability is actually correlated with the real incidence of data loss is an interesting study in mass psychology and the power of marketing over engineering. But what is clear is that there are many customer who are perfectly willing to pay a lot of extra money (a factor of 2, 3 or 10 more than the lowest bidder) and select a vendor that gives them a warm and fuzzy feeling (and maybe also real technical advantages, or even contractual guarantees) about the quality and reliability of the disk array. 
Two drive modest configuration RAID 1 arrays are the issue. |
#30
Ralph Becker-Szendy wrote in message news:1089999993.239394@smirk...
>> Does a bad sector that happens to be detected during a RAID 1 HD failure and replacement constitute any reflection on the efficacy of that recovery? I say no.
>
> For an enterprise-class disk array, this is catastrophic (see previous message).

See the thread title, the thread itself, and what the issue is.

> It will cause an alert to field service personnel. Often, the customer will have to be told officially (even if the customer has not detected the read failure yet).
>
> For a small RAID array (for example, a RAID card on the PCI bus with 2 or 4 drives, and a single-system file system like NTFS or ext3 on top): most of the time, nobody cares.

Now we're back to our thread and my point.

> The performability expectation for such a system is sufficiently low that the loss of sectors can often be tolerated - in particular because in typical file system workloads (excluding databases), much of the data is written, read maybe for a short period after being written (for example, by the next nightly backup), and never touched again.
>
>> Does undetected "silent sector deterioration" actually pose much of a threat to real-world, current, two-drive RAID 1 reliability? I say no.
>
> Sorry,

No - read it again: "Does undetected 'silent sector deterioration' actually pose much of a threat to real-world, current, two-drive RAID 1 reliability? I say no."