#11
Bill Todd wrote:

> "ohaya" wrote in message ...
>
> > BTW, re. the "0" MTTR, see my post back to Bill Todd. I had given 4
> > hours as an example in that post, but after posting and thinking about
> > it, given the scenario that I posed, it really seems like the MTTR
> > would be more like "0" than like 4 hours, since with my scenario, the
> > "system" never really fails (since the drives are hot-swappable).
> > Comments?
>
> If you've learned how to repopulate on the order of 100 GB of failed
> drive in zero time, especially while not seriously degrading on-going
> processing (so don't just assert that you can use anything like the full
> bandwidth of its partner to restore it), I suspect that there are many
> people who would be very interested in talking with you.

Bill,

You're right: in my mind, at least, I was ignoring any effect of restoring to a replacement drive in the case of a failed drive. But I am looking mainly at FAILURE rates (MTBF), and assuming hot-swappable drives, wouldn't the system continue to run (possibly with some performance degradation because of the restore)? Is the period of time during which the new/replacement drive is being restored normally considered "downtime", i.e., is it included in MTTR?

Jim
#12
Hi,

BTW, I wanted to mention that I really appreciate the patience you all have shown with my questions, some of which might admittedly have appeared stupid or naive, but this discussion has been VERY helpful to me, at least. So again, thanks!!

Jim
#13
In article , ohaya wrote:
> Is the period of time where the new/replacement drive is being restored
> normally considered "downtime", i.e., is it included in MTTR?

Yes. Think of it this way: if a second drive failed in that period, would the system as a whole fail? Yes. Therefore, that time has to be included in the calculation, so it must be included in the MTTR.

-- 
I've seen things you people can't imagine. Chimneysweeps on fire over the roofs of London. I've watched kite-strings glitter in the sun at Hyde Park Gate. All these things will be lost in time, like chalk-paintings in the rain. `-_-' Time for your nap. | Peter da Silva | Har du kramat din varg, idag? 'U`
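The vulnerability window described here can be put in rough numbers. As a sketch (using the 1.2 Mhour MTBF and 4 hour MTTR figures from this thread, and treating drive failures as independent with a constant rate), the chance that the surviving mirror dies during any one rebuild is about MTTR/MTBF:

```python
# Rough per-incident risk during the unmirrored rebuild window.
# Figures are the ones used in this thread; independence is assumed.
MTBF = 1.2e6  # hours, per drive
MTTR = 4.0    # hours the array runs without redundancy

p_second_failure = MTTR / MTBF  # chance the survivor fails in the window
print(f"{p_second_failure:.1e}")  # prints 3.3e-06
```

So any single rebuild is very unlikely to lose data; the later posts are about why that per-incident number is not the whole story.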
#14
"ohaya" wrote in message ... It's kind of funny, but when I first started looking, I thought that I'd find something simple. That was this weekend ... As I said in my prior post. Maintained RAID 1 failure(of the cases included) can be ignored as it's swamped by other failures in the real world. It's a great academic exercise with little practical application here. Ron, Thanks again. I'm starting to understand your 2nd sentence above . If I'm understanding what you're saying, with a RAID1 setup, with 2 drives with reasonable (i.e., 1.2Mhours) MTBF, from a design standpoint, you wouldn't be worried about failures of the drives themselves, because there are other failures/components (e.g., the processor board, etc.) that would have an MTBF much lower than the raid'ed drives themselves. Did I get that right? And many more failure sources, EXACTLY. BTW, re. the "0" MTTR, see my post back to Bill Todd. I had given 4 hours as an example in that post, but after posting and thinking about it, given the scenario that I posed, it really seems like the MTTR would be more like "0" than like 4 hours, since with my scenario, the "system" never really fails (since the drives are hot-swappable). Comments? Except for the possibility that the second drive fails before the first is replaced. But in that 4 hours I'd be more concerned about gaint meteroid impact. |
#15
In article , ohaya wrote:
> ... If the above calculation is in fact a good estimate, and just so
> that I'm clear, if:
>
> - I had a RAID 1 setup with two SCSI drives that really have an MTBF of
>   1.2 Mhours, and
> - The drives are within their "normal" lifetime (i.e., not in infant
>   mortality or end-of-life), and
> - The processor board/hardware was such that it supported hot swap, so
>   that if one of the drives failed it could be replaced without halting
>   the system, and
> - We estimated (for planning purposes) that, worst-case, it took someone
>   4 hours to detect the failure, get another identical drive, and
>   replace it (so MTTR ~4 hours),
>
> then a reasonable ballpark estimate for the "theoretical" MTTF (which is
> ~MTBF) would be:
>
>     (1.2 Mhours) x (1.2 Mhours)
>     --------------------------- = MTTF(RAID1)
>            2 x 4 hours
>
> Is that correct?

Yes. But irrelevant. And non-intuitive to boot.

First, the MTTR (repair time) has to be in there because, while a failed drive (1/2 the pair) is being repaired, the array is no longer redundant. So the only failure mode considered in this formula is the following: one drive fails; while that drive is being repaired, the second drive also fails. By "repair", we mean the time it takes to prepare another drive and copy the data from the surviving (good) drive onto it, so redundancy is restored.

By the way, you can immediately see why it is good to have a hot spare drive ready to go: if you have to wait for a human to remove the dead drive and add a new one, the typical MTTR is at least a few hours, often a day (the time it takes to alert the human and get him into the room with the spare drive). If the spare is powered up and ready to go, the typical MTTR is a few hours (it can be as short as 1 hour) to copy the data onto it.

Obviously, the simple formula (it comes from the appendix of the original Berkeley RAID paper, and already caused much hilarity back then) ignores all real-world problems, addressing only uncorrelated single-drive failure.
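The formula quoted above works out as follows (a direct sketch of the thread's own numbers, nothing more):

```python
# MTTDL estimate for a mirrored (RAID 1) pair, per the Berkeley RAID
# paper's appendix formula discussed in this post:
# MTTF(RAID1) ~= MTBF^2 / (2 * MTTR), for independent drive failures.
HOURS_PER_YEAR = 8766  # 365.25 * 24

def raid1_mttf_hours(mtbf_hours: float, mttr_hours: float) -> float:
    return mtbf_hours ** 2 / (2 * mttr_hours)

mttf = raid1_mttf_hours(1.2e6, 4)  # the thread's figures
print(f"{mttf:.3g} hours = {mttf / HOURS_PER_YEAR:.3g} years")
# 1.8e11 hours, roughly twenty million years
```

Which is why the number looks absurdly good on paper, and why the rest of this post argues it is irrelevant in practice.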
Second, as many other people have said, this reliability calculation is completely irrelevant. Real storage systems based on RAID fail, and they do so all the time. Some fail because of simultaneous failure of two drives (some slang calls this a "RAID kill"). Some fail because, during reconstruction after a single drive failure, the surviving drive is found to have bad sectors or to be unreadable, or the extra stress of the reconstruction causes the surviving drive to fail (slang sometimes calls this a "strip kill" or "repair kill"). Many more fail due to correlated failures (for example, a faulty power supply manages to kill all the drives simultaneously).

The real source of failures, which is much, much higher than the above academic calculation, is systems issues. Within a disk array, firmware or hardware faults are commonly the source of data loss (examples: the array forgot to write dirty data back from cache, or the SCSI bus has a double-bit error that's not caught by parity checking, or in a RAID-5 XOR engine, which is sometimes implemented in hardware, the byte counter can be off). Even more realistic: the best RAIDed array in the world doesn't help you if your filesystem or database corrupts data for fun, except that the corrupt data is now stored extremely reliably.

There is a story of a company that had a complete second computer center, with all their data continuously replicated between the two. In the event of a disaster, the second computer center could, with a few seconds' notice, take over for the first one and keep running nearly seamlessly. The second computer center was located in the other tower of the World Trade Center. Oops.
If you really care about your data surviving for a long time, and maybe being continuously accessible, and maybe even being continuously accessible with good performance, you have to look at the overall design, and you have to study techniques such as logging, HSM, backup, remote mirroring, transactional storage systems, data dispersion a la Oceanstore ... In the meantime, get yourself two disks, set them up as RAID 1, and you have already made the largest single step towards a reliable system.

-- 
The address in the header is invalid for obvious reasons. Please reconstruct the address from the information below (look for _). Ralph Becker-Szendy _firstname_@lr _dot_ los-gatos _dot_ ca.us
#16
"ohaya" wrote in message ... Then a reasonable ballpark estimate for the "theoretical" MTTF (which is ~MTBF) to be: (1.2Mhours)(1.2Mhours) ---------------------- = MTTF(RAID1) 2 x 4 hours Is that correct? Wow!!! Somehow, this seems "counter-intuitive" (sorry) .... Hey, *single* disks are pretty damn reliable in the kind of ideal service conditions you postulate: mirrored disks are just (reliable) squared. A 2,000,000-year RAID-1-pair MTBF sounds great, until you recognize that if you have 2,000,000 installations, about one of them will fail each year. If each site has 100 disk pairs rather than just one, then someone will lose data every 3+ days (or you'll need only 20,000 sites for about one to lose data every year). Bill, Thanks for the perspective. But, so that I'm clear, if the individual drives really have 1.2Mhours MTBF (and I think the Atlas 15K II spec sheet actually claims 1.4Mhours), then the "squared" MTBF would indicate that RAID 1 pair would be something like 1+ TRILLION hours MTBF, not 1+ MILLION hours. Have I misinterpreted something? Yes: the figures I gave above were in years, not hours. Still, I dropped a 0 while doing the calcs in my head (I think I used 10^5 rather than 10^4 for approximating hours per year): they should all be 10x as large. Ralph made a very significant comment, by the way: at such probabilities, you really have to take silent sector deterioration seriously, so the array needs to 'scrub' its data in the background to detect such deterioration while you still have a good copy left to fix it with. Otherwise, the system's mean time to data loss drops precipitously. - bill |
#17
"Bill Todd" wrote in message ... "ohaya" wrote in message ... Then a reasonable ballpark estimate for the "theoretical" MTTF (which is ~MTBF) to be: (1.2Mhours)(1.2Mhours) ---------------------- = MTTF(RAID1) 2 x 4 hours Is that correct? Wow!!! Somehow, this seems "counter-intuitive" (sorry) .... Hey, *single* disks are pretty damn reliable in the kind of ideal service conditions you postulate: mirrored disks are just (reliable) squared. A 2,000,000-year RAID-1-pair MTBF sounds great, until you recognize that if you have 2,000,000 installations, about one of them will fail each year. If each site has 100 disk pairs rather than just one, then someone will lose data every 3+ days (or you'll need only 20,000 sites for about one to lose data every year). Bill, Thanks for the perspective. But, so that I'm clear, if the individual drives really have 1.2Mhours MTBF (and I think the Atlas 15K II spec sheet actually claims 1.4Mhours), then the "squared" MTBF would indicate that RAID 1 pair would be something like 1+ TRILLION hours MTBF, not 1+ MILLION hours. Have I misinterpreted something? Yes: the figures I gave above were in years, not hours. Still, I dropped a 0 while doing the calcs in my head (I think I used 10^5 rather than 10^4 for approximating hours per year): they should all be 10x as large. Ralph made a very significant comment, by the way: at such probabilities, you really have to take silent sector deterioration seriously, so the array needs to 'scrub' its data in the background to detect such deterioration while you still have a good copy left to fix it with. Otherwise, the system's mean time to data loss drops precipitously. A first stab at that process is called nightly backup and the second stab is scheduled defrags. "silent sector deterioration" can happen but is usually an isolated sector here or there and is quite uncommon. 
Good RAID 1 will fill the new/replacement drive inspite of such a sector read error and then one is left with an operable system with an isolated read error that may be dealt with. Depending on the definition of "data loss" this issue may not count and is relatively obscure. Modern HDs are quite good at being able to read/recover their data. |
#18
"Ron Reaugh" wrote in message ... "Bill Todd" wrote in message ... .... Ralph made a very significant comment, by the way: at such probabilities, you really have to take silent sector deterioration seriously, so the array needs to 'scrub' its data in the background to detect such deterioration while you still have a good copy left to fix it with. Otherwise, the system's mean time to data loss drops precipitously. A first stab at that process is called nightly backup Nope: this will read only one of the two copies of the data, and thus decrease the probability that one is bad only by a factor of 2 (unless the array is wise enough to choose a random copy for each read, or load considerations encourage it to). Besides, the vast majority of the data will usually be known to be unchanged and hence won't be backed up at all frequently. and the second stab is scheduled defrags. Better, but there'll still often be some data that doesn't need to be moved (at least if the defrag algorithm has any brains). "silent sector deterioration" can happen but is usually an isolated sector here or there and is quite uncommon. It doesn't have to be very common or at all extensive to decrease the mean time to data loss of a RAID-1 pair from tens of millions of years to tens of thousands of years. As I noted earlier, when the number of disk pairs gets high, such a reduction becomes significant. - bill |
#19
"Bill Todd" wrote in message ... "Ron Reaugh" wrote in message ... "Bill Todd" wrote in message ... ... Ralph made a very significant comment, by the way: at such probabilities, you really have to take silent sector deterioration seriously, so the array needs to 'scrub' its data in the background to detect such deterioration while you still have a good copy left to fix it with. Otherwise, the system's mean time to data loss drops precipitously. A first stab at that process is called nightly backup Nope: this will read only one of the two copies of the data, Well, "stab" and which it will read is not necessarily always clear and may change. and thus decrease the probability that one is bad only by a factor of 2 (unless the array is wise enough to choose a random copy for each read, or load considerations encourage it to). Besides, the vast majority of the data will usually be known to be unchanged and hence won't be backed up at all frequently. Assuming incremental backups but two drive RAID 1 may very well get imaged each night. and the second stab is scheduled defrags. Better, but there'll still often be some data that doesn't need to be moved (at least if the defrag algorithm has any brains). Right but this is all about probability reducttion. "silent sector deterioration" can happen but is usually an isolated sector here or there and is quite uncommon. It doesn't have to be very common or at all extensive to decrease the mean time to data loss of a RAID-1 pair from tens of millions of years to tens of thousands of years. As I noted earlier, when the number of disk pairs gets high, such a reduction becomes significant. Does a bad sector that happens to be detected during a RAID 1 HD failure and replacement constitute any reflection on the efficacy of that recovery? I say no. Does undetected "silent sector deterioration" actually much of a threat to real world current two drive RAID 1 reliability? I say no. |
#20
"Ron Reaugh" wrote in message ... .... Does a bad sector that happens to be detected during a RAID 1 HD failure and replacement constitute any reflection on the efficacy of that recovery? I say no. And you're wrong - utterly. When you have a disk failure in your RAID-1 pair, and only *then* discover that a data sector on the surviving disk is also bad, you've lost data - i.e., 'failed'. Does undetected "silent sector deterioration" actually much of a threat to real world current two drive RAID 1 reliability? I say no. Same degree of wrongness here as well. You really need to write less and read more. - bill |