#1
So Seagate and other makers are getting ready to introduce 20 TB HDD's to the market. According to Seagate, its fastest drives are capable of sustained 250 MB/s transfers (if you believe them). It would take 30+ hours to entirely fill such a drive with data at maximum speed! Is that too much time, no matter how much capacity you are getting? Is that basically unusable capacity?

I know you can say that a drive that large would be filled over a number of years, and no one would be filling it all up in one go. But that's probably true in a home environment, but what about an enterprise environment? What if that drive were part of a RAID array, and one of those drives failed and needed to be replaced? In RAID parity, the entire drive has to be written to, because the parity is required on all drives at once. Imagine you start synchronizing a replacement drive like that, and it takes 30 hours to do that? That's a long enough time that it's conceivable another drive within that array would fail too, before it's had a chance to completely resync with the array.

So sure, you can get that capacity with an HDD, but should you really be storing your data on something that slow? HDD's can't get much faster.
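For anyone who wants to check the fill-time arithmetic, here is a minimal sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant transfer rate; the function name `fill_time_hours` and the 180 MB/s figure are illustrative assumptions, not Seagate numbers. Drives usually slow down toward the inner tracks, so the whole-drive average tends to sit below the headline rate.

```python
def fill_time_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Hours needed to write capacity_tb terabytes at a constant rate_mb_s."""
    capacity_bytes = capacity_tb * 1e12      # decimal terabytes
    rate_bytes_per_s = rate_mb_s * 1e6       # decimal megabytes per second
    return capacity_bytes / rate_bytes_per_s / 3600


if __name__ == "__main__":
    # 250 MB/s is the headline figure quoted in the post;
    # 180 MB/s is an assumed whole-drive average, used only for comparison.
    for rate in (250, 180):
        print(f"20 TB at {rate} MB/s: {fill_time_hours(20, rate):.1f} h")
```

At a constant 250 MB/s the idealized figure is a bit over 22 hours; an average closer to 180 MB/s is what pushes it past 30 hours.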
#2
On 5/13/2020 7:17 PM, Yousuf Khan wrote:
> So Seagate and other makers are getting ready to introduce 20 TB HDD's to the market. According to Seagate, its fastest drives are capable of sustained 250 MB/s transfers (if you believe them). It would take 30+ hours to entirely fill such a drive with data at maximum speed! Is that too much time, no matter how much capacity you are getting? Is that basically unusable capacity?
>
> I know you can say that a drive that large would be filled over a number of years, and no one would be filling it all up in one go. But that's probably true in a home environment, but what about an enterprise environment? What if that drive were part of a RAID array, and one of those drives failed and needed to be replaced? In RAID parity, the entire drive has to be written to, because the parity is required on all drives at once. Imagine you start synchronizing a replacement drive like that, and it takes 30 hours to do that? That's a long enough time that it's conceivable another drive within that array would fail too, before it's had a chance to completely resync with the array.
>
> So sure, you can get that capacity with an HDD, but should you really be storing your data on something that slow? HDD's can't get much faster.

Backblaze wrote an article about replacing failing drives:
https://www.backblaze.com/blog/life-...ze-hard-drive/

Lynn
#3
On 5/13/20 6:17 PM, Yousuf Khan wrote:
> So Seagate and other makers are getting ready to introduce 20 TB HDD's to the market. According to Seagate, its fastest drives are capable of sustained 250 MB/s transfers (if you believe them). It would take 30+ hours to entirely fill such a drive with data at maximum speed! Is that too much time, no matter how much capacity you are getting?

No, that is not too much time for some, if not many, use cases.

> Is that basically unusable capacity?

Absolutely not.

> I know you can say that a drive that large would be filled over a number of years, and no one would be filling it all up in one go.

There will be some people that will fill it in almost one go.

> But that's probably true in a home environment, but what about an enterprise environment?

Some enterprises (think Backblaze) will fill drives in a few days. They have very specialized ways to write data to hundreds / thousands of drives. They write things to each drive to capacity and then move on to the next drive. (There is obviously redundancy elsewhere in the application stack.) As such, they fill the drives in what some would consider one go.

> What if that drive were part of a RAID array, and one of those drives failed and needed to be replaced? In RAID parity, the entire drive has to be written to, because the parity is required on all drives at once.

It depends on what type of RAID technology is used. ZFS's RAID has the unique ability to only re-synchronize the amount of the drive that was used, not the entire drive.

Aside: ZFS is very impressive.

> Imagine you start synchronizing a replacement drive like that, and it takes 30 hours to do that?

And? This happens. I have a friend & colleague that's waiting on a RAID array to rebuild and has estimates of nearly 200 hours.

> That's a long enough time that it's conceivable another drive within that array would fail too, before it's had a chance to completely resync with the array.

This is a real concern, and has been for 10-20 years. That's one of the reasons that RAID-6 and higher RAID levels are popular.

> So sure, you can get that capacity with an HDD, but should you really be storing your data on something that slow?

Sure. Anything that only needs to access a subset of the content but wants a deep catalog is a perfect use for such a drive.

> HDD's can't get much faster.

What is a HDD? Why does an SSD /not/ qualify as a HDD? What about the various holographic storage methods that IBM (and others) have experimented with over the last 30 years?

There have been multiple times in the past that hard drive manufacturers have experimented with, and shipped to customers, drives that have multiple sets of heads for increased performance.

Also, I'm quite certain that each and every time that someone has said that something can't get faster, it does.

--
Grant. . . .
unix || die
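To put a rough number on the "another drive fails during the rebuild" worry, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: failures are treated as independent with a constant annualized failure rate (AFR), and the 2% AFR, the 7 surviving drives, and the function name `p_additional_failure` are made up for the example. Real arrays can do worse, since drives from the same batch tend to age together and a rebuild stresses the survivors.

```python
import math

HOURS_PER_YEAR = 8760


def p_additional_failure(n_remaining: int, afr: float, rebuild_hours: float) -> float:
    """Probability that at least one of n_remaining drives fails during the rebuild.

    Assumes independent drives with a constant (exponential) failure rate
    derived from the annualized failure rate `afr` (e.g. 0.02 for 2%).
    """
    # Per-drive hourly hazard rate implied by the annual failure probability.
    hazard_per_hour = -math.log1p(-afr) / HOURS_PER_YEAR
    # P(at least one failure) = 1 - exp(-n * hazard * t); expm1 keeps precision.
    return -math.expm1(-n_remaining * hazard_per_hour * rebuild_hours)


if __name__ == "__main__":
    # Compare the 30-hour rebuild from the original post with the
    # ~200-hour rebuild mentioned above, for a hypothetical 8-drive array.
    for hours in (30, 200):
        p = p_additional_failure(n_remaining=7, afr=0.02, rebuild_hours=hours)
        print(f"{hours:>3} h rebuild: {p:.4%} chance of another failure")
```

Under these assumptions the risk grows roughly in proportion to the rebuild time, which is the argument for a second parity drive as rebuild windows stretch.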
#4
Yousuf Khan wrote:
> So Seagate and other makers are getting ready to introduce 20 TB HDD's to the market. According to Seagate, its fastest drives are capable of sustained 250 MB/s transfers (if you believe them). It would take 30+ hours to entirely fill such a drive with data at maximum speed! Is that too much time, no matter how much capacity you are getting? Is that basically unusable capacity?
>
> I know you can say that a drive that large would be filled over a number of years, and no one would be filling it all up in one go. But that's probably true in a home environment, but what about an enterprise environment? What if that drive were part of a RAID array, and one of those drives failed and needed to be replaced? In RAID parity, the entire drive has to be written to, because the parity is required on all drives at once. Imagine you start synchronizing a replacement drive like that, and it takes 30 hours to do that? That's a long enough time that it's conceivable another drive within that array would fail too, before it's had a chance to completely resync with the array.
>
> So sure, you can get that capacity with an HDD, but should you really be storing your data on something that slow? HDD's can't get much faster.

HDDs are not only used by consumers that have 1 to 4 units in their computers. They are also used by datacenters that have THOUSANDS of them at their site, and then THOUSANDS more at other datacenters to provide for catastrophic physical disaster (flood, tsunami, earthquake, meteor, falling aircraft and space junk, terrorism, etc).

Google has datacenters in 13 locations: N. and S. Carolina, Iowa, Georgia, Oklahoma, Oregon, Hong Kong, Singapore, Taiwan, Finland, Belgium, Ireland, and Chile. Through subsidiaries, they have datacenters elsewhere, too: Virginia, Atlanta GA (multiple), Netherlands (2 locations), Hungary, and Poland.

https://www.backblaze.com/blog/hard-...tats-for-2019/
"As of December 31, 2019, Backblaze had 124,956 spinning hard drives."

https://www.computerworld.com/articl...the-world.html
Ranked by square footage. The Citadel (www.switch.com/the-citadel) is the largest.

It's hard to get them to concretely expose their total storage capacity. The estimate for Google is 10 exabytes which, using your 20TB HDD example, would consume 500,000 HDDs. At Google, an HDD dies every few minutes due to the sheer number of drives they employ.

Just because YOU don't have that much data to retain or archive doesn't mean no one else does. The average 4K movie consumes about 100GB. It would only take 200 movies to fill up a 20TB drive. According to AllFlicks back in 2016, Netflix had 6,494 movies back then (and 1,609 TV shows). While Netflix discards movies after a while, I'm sure they've grown since then.

The more disks you have spinning, even when adding to a RAID config, the more fragile the setup becomes. Putting the same amount of data on fewer mechanicals means less chance of physical failure.
#5
20 terrorbites would be an "archive" drive with shingles. The sensible drives go up to 16 TB? Even that is going to take ages for a scandisk.
#6
#7
On Sat, 16 May 2020 07:50:32 -0400, Yousuf Khan wrote:
> On 5/16/2020 5:31 AM, wrote:
>> 20 terrorbites would be an "archive" drive with shingles. The sensible drives go up to 16 TB? Even that is going to take ages for a scandisk.

I haven't done a scandisk in quite a few years, and prior to that it was another few years since the previous one. It's not something I worry about, nor do I worry about how long it takes to fill a drive with data. My primary concerns are how many SATA ports and drive bays I have on hand. Those are the limiting factors.

> I think even 16 TB is way too large, shingles or not. It would still take nearly 18 hours.

We all have different needs. My server has 16 SATA ports and 15 drive bays, so the OS lives on an SSD that lies on the floor of the case. The data drives are 4TB x5 and 2TB x10, for a raw capacity of 40TB, formatted to 36.3TB. I use DriveBender to pool all of the drives into a single volume. Windows is happy with that. Since there are no SATA ports or drive bays available, upgrading for more storage means replacing one or more of the current drives. External drives aren't a serious long-term option.

The PC that I'm typing on, which I consider my workstation, has 6 SATA ports native to the mobo, 3 NVMe sockets, and 10 drive bays. I use an NVMe drive for the OS and 4TB x3 plus 12TB x2 for data, giving me 36TB raw and 32.7TB formatted. I also use DriveBender here, so Windows sees a single 32.7TB volume. With one SATA port available (and 5 drive bays), I can expand the storage by adding one drive. Beyond that, since I'll be out of SATA ports and don't really want to use a PCIe SATA card, my next move would be to replace the 4TB drives with something bigger.

At the moment, I'm looking at 12TB and 14TB drives as possible system upgrades. The 16TB drives are still expensive, with most being north of $400 apiece.

> What would be a type of HDD that a system could handle practically now? I think perhaps the upper limit is 8 TB? That would take nearly 9 hours to fill. 6 TB would take 6.5 hours, 4 TB would take 4.5 hours.

Mine are 36.3TB and 32.7TB. I've never filled a volume that size all at once.
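For what it's worth, the gap between the raw and formatted figures above looks mostly like a units difference rather than lost space: drives are sold in decimal terabytes (10^12 bytes), while Windows reports binary units (2^40 bytes) and still labels them TB. A minimal sketch of the conversion, assuming file system overhead is small enough to ignore; the function name `tb_to_tib` is just for illustration:

```python
def tb_to_tib(capacity_tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return capacity_tb * 10**12 / 2**40


if __name__ == "__main__":
    # The two raw pool sizes mentioned in the post above.
    for raw_tb in (40, 36):
        print(f"{raw_tb} TB raw is about {tb_to_tib(raw_tb):.2f} TiB")
```

The results land close to the 36.3 and 32.7 figures quoted, with the small remainder plausibly down to file system metadata and exact drive sizes.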
#8
On 5/16/2020 5:02 PM, Mark Perkins wrote:
> On Sat, 16 May 2020 07:50:32 -0400, Yousuf Khan wrote:
>> On 5/16/2020 5:31 AM, wrote:
>>> 20 terrorbites would be an "archive" drive with shingles. The sensible drives go up to 16 TB? Even that is going to take ages for a scandisk.
>
> I haven't done a scandisk in quite a few years, and prior to that it was another few years since the previous one. It's not something I worry about, nor do I worry about how long it takes to fill a drive with data. My primary concerns are how many SATA ports and drive bays I have on hand. Those are the limiting factors.

Well, nobody does Scandisks more than once in several years. I'm sure Pedro meant that as an extreme example, but not something that is unreasonable to expect to do occasionally.

>> I think even 16 TB is way too large, shingles or not. It would still take nearly 18 hours.
>
> We all have different needs. My server has 16 SATA ports and 15 drive bays, so the OS lives on an SSD that lays on the floor of the case. The data drives are 4TB x5 and 2TB x10, for a raw capacity of 40TB, formatted to 36.3TB. I use DriveBender to pool all of the drives into a single volume. Windows is happy with that. Since there are no SATA ports or drive bays available, upgrading for more storage means replacing one or more of the current drives. External drives aren't a serious long-term option.

But the point is, neither are internal ones these days, it seems. Even assuming these are mainly used in enterprise settings, they would likely be part of a RAID array. Now if the RAID array is new and all of these drives were put in new as part of the initial setup, there's nothing to worry about: you fill it up to whatever level of data you have. Hopefully your array holds at least twice the amount of data that someone's old setup had, so it can keep growing before it too needs to be replaced or upgraded.

Now as this array ages, it's reasonable to assume that one of the drives may die, and it would need to be replaced. By the time this event happens, this array is probably at least 80% full. Inserting a replacement drive into the array will require a massive amount of time to resync, even if it is a smart resync, doing only the blocks that actually have data on them.

Now, looking up what Drive Bender is, it seems to be a virtual volume concatenator. So it's not really a RAID: when an individual drive dies, only the data on it is lost, unless it is backed up. So even in that case, if one of these massive drives is part of your DB setup, replacing that drive will be a major pain in the butt even while restoring from backups. It really raises the question of how long you are willing to wait for a drive to get repopulated, knowing that while this is happening it's also going to be maxing out the rest of your system for the number of hours that the restore operation takes.

My point is that I think people will only be willing to wait a few hours, perhaps 4 or 5 hours at most, before they say it's not worth it, in a home environment. In an enterprise environment, that tolerance may get extended out to 8 or 10 hours. So at some point, all of this capacity is useless, because it's impractical to manage with the current drive and interface speeds.

If SSD's were cheaper per byte, then even SSD's running on a SATA interface would still be viable at the same capacities we see HDD's at right now. So a 16 or 20 TB SSD would be a usable device, but 16 or 20 TB HDD's aren't.

Yousuf Khan
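The waiting-time argument above can be turned around: given a time budget, how much data can actually be rewritten? A minimal sketch, assuming a constant 250 MB/s and decimal units; the function names `max_capacity_tb` and `resync_hours` are made up for the example, and the 80%-full figure is the one assumed in the post.

```python
def max_capacity_tb(budget_hours: float, rate_mb_s: float = 250.0) -> float:
    """Terabytes that can be written within budget_hours at a constant rate."""
    return budget_hours * 3600 * rate_mb_s * 1e6 / 1e12


def resync_hours(capacity_tb: float, fill_fraction: float, rate_mb_s: float = 250.0) -> float:
    """Hours for a 'smart' resync that copies only the used fraction of a drive."""
    used_bytes = capacity_tb * 1e12 * fill_fraction
    return used_bytes / (rate_mb_s * 1e6) / 3600


if __name__ == "__main__":
    # Home tolerance (about 5 h) and enterprise tolerance (about 10 h) from the post.
    for hours in (5, 10):
        print(f"{hours} h budget: up to {max_capacity_tb(hours):.1f} TB")
    # A 20 TB drive in an array that is 80% full.
    print(f"20 TB at 80% full: {resync_hours(20, 0.8):.1f} h to resync")
```

With these assumptions a 5-hour budget covers about 4.5 TB and a 10-hour budget about 9 TB, which lines up with the 4 TB and 8 TB figures floated earlier in the thread.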
#9
On Sun, 17 May 2020 22:06:35 -0400, Yousuf Khan wrote:
> On 5/16/2020 5:02 PM, Mark Perkins wrote:
>> On Sat, 16 May 2020 07:50:32 -0400, Yousuf Khan wrote:
>>> On 5/16/2020 5:31 AM, wrote:
>>>> 20 terrorbites would be an "archive" drive with shingles. The sensible drives go up to 16 TB? Even that is going to take ages for a scandisk.
>>
>> I haven't done a scandisk in quite a few years, and prior to that it was another few years since the previous one. It's not something I worry about, nor do I worry about how long it takes to fill a drive with data. My primary concerns are how many SATA ports and drive bays I have on hand. Those are the limiting factors.
>
> Well, nobody does Scandisks more than once in several years. I'm sure Pedro meant that as an extreme example, but not something that is unreasonable to expect to do occasionally.
>
>>> I think even 16 TB is way too large, shingles or not. It would still take nearly 18 hours.
>>
>> We all have different needs. My server has 16 SATA ports and 15 drive bays, so the OS lives on an SSD that lays on the floor of the case. The data drives are 4TB x5 and 2TB x10, for a raw capacity of 40TB, formatted to 36.3TB. I use DriveBender to pool all of the drives into a single volume. Windows is happy with that. Since there are no SATA ports or drive bays available, upgrading for more storage means replacing one or more of the current drives. External drives aren't a serious long-term option.
>
> But the point is, neither are internal ones these days, it seems.

I don't follow what you're saying. To me, internal drives are the primary data storage option.

> Assuming even if these are mainly used in enterprise settings, they would likely be part of a RAID array. Now if the RAID array is new and all of these drives were put in new as part of the initial setup,

snip

No, I'm not assuming that (Enterprise and RAID) at all. I'm assuming use in the home market, and specifically the subset of the home market where people want to keep large amounts of data accessible. RAID is relatively rare in that setting, isn't it? I don't know anyone who uses it, but that doesn't mean much.

> Now, looking up what Drive Bender is, it seems to be a virtual volume concatenator. So it's not really a RAID, individual drives die and only the data on them are lost, unless they are backed up. So even in that case, if one of these massive drives is part of your DB setup, replacing that drive will be a major pain in the butt even while restoring from

Restoring just the missing files is a major pain? Why does that have to be the case? FWIW, I haven't found that to be true. It's much faster than doing a full restore, for example.

> backups. It really begs the question how long are you willing to wait for a drive to get repopulated, knowing that while this is happening it's also going to be maxing out the rest of your system for the amount of hours that the restore operation is happening?

If there's something you need right away, you prioritize that. Otherwise, let the restore run and do its thing. It's not like disk access brings a modern system to its knees, right? Performance-wise, you wouldn't even know it's happening. So in general, there's no significant waiting, and remember that failed drives are not an every day/week/month/year occurrence. Most drives last longer than I'm willing to use them, getting replaced when the data has outgrown their capacity.

> My point is that I think people will only be willing to wait a few hours, perhaps 4 or 5 hours at most, before they say it's not worth it, in a home environment.

I don't follow that at all.

> In an enterprise environment, that tolerance may get extended out to 8 or 10 hours. So at some point, all of this capacity is useless, because it's impractical to manage with the current drive and interface speeds.

??? How often are you clearing and refilling an entire drive?

> If SSD's were cheaper per byte, then even SSD's running on a SATA interface would still be viable at the same capacities we see HDD's at right now. So a 16 or 20 TB SSD would be usable devices, but 16 or 20 TB HDD's aren't.

That sounds like nonsense. If 100TB HDD's were available at a reasonable price and reasonably reliable, many people would find them to be perfectly usable. I'd love to replace all of my smaller drives with fewer larger drives and in fact that's exactly what I've been doing since the mid-1980's.