#11
How would you store 100TB data?
flux wrote:
> In article , HVB wrote:
>> On Tue, 28 Feb 2006 00:14:40 -0500, flux wrote:
>>> I wonder if there are ANY data centers that store 100 TB let alone...
>> ?!?! I know one data centre that has 1PB (yes, you read that right) of usable storage, with over half of it actually consumed. I'm currently designing a new data centre which will require over 2.5PB of usable storage capacity. The amount of actual raw storage required is very much higher than this.
> These sound like special exceptions.
>> So, yes, plenty of data centres actually store more than 100TB of data.
> You seem to be saying that the 2 cases you just quoted count as plenty.

While 100TB shops are certainly at the larger end of the spectrum, they're hardly uncommon. If they were, why would the big storage vendors all sell single arrays that can store more than that? For example, IBM's DS8000 can store 192TB; EMC's Symmetrix DMX2000 holds 118TB, the DMX3000 230TB, and the DMX-3 1052TB; and HP's StorageWorks XP12000 supports 332TB.
#12
On Tue, 28 Feb 2006 00:14:40 -0500, flux wrote:
> In article 1140933623.229616@smirk, wrote:
>> Since disks are so unreliable (they are typically the least reliable
> Oh really?

Really. Rarely do I need to have a motherboard or power supply swapped. Same goes for CPUs, although a few bugs have caused more RAM swaps than I would like. But disks fail every day. I manage a decent sized NAS environment, and of the 400TB of usable storage I've only once had to have a motherboard replaced, twice RAM, and the occasional power supply/cable/misc. But drives are replaced by the shipment every week.

>> computer thing in a data center, excluding the air conditioning, which
> What about electronics, motherboards, CPUs, memory?

Rarely any issues with these, unless you are unlucky enough to run into a bug.

>> In today's data centers, 100TB systems are common;
> I wonder if there are ANY data centers that store 100 TB let alone...

I wonder if you have a clue.

~F
#13
In article .com, wrote:
> I already have right at a TB of storage in my home and I do not consider myself unique.

Oh - you have two disk drives, I see. :-)

That joke was a little glib and cruel; not that many 500GB drives have shipped in the consumer channel yet. I'm still at 300-some GB in my server (3 drives), but the disks are not full yet, so I haven't seen a need to upgrade for a few years. I know several people who have multi-TB systems at home. The easy way to need and fill that disk space is to build your own PVR, or to rip all your DVDs onto disk, which makes it easier for the kids to watch the movies they want to watch (like Nemo or Toy Story) without risk of the DVDs getting scratched.

Clearly, the way consumers use disk space at home and the way corporations use disk space are very different. Interestingly, digital movie production is a large consumer of disk space; supposedly, making a feature film today consumes many PB of temporary space.

I'm not sure how close Google and Yahoo are to an EB of storage, but I suspect one or both will reach that level soon. There are no firm numbers in public about their storage capacities; those are closely guarded secrets. From usually reliable sources (lots of people live in the Bay Area, and people talk), I hear that Google had at minimum several times 3PB in the Mountain View data center alone about 2 years ago; if you include their remote data centers, they are probably at dozens or hundreds of PB today.

100TB? As mentioned earlier, this is now only two standard 19" racks of storage. How many corporate data centers have only two racks in those beautiful computer rooms they built and manage? There are pictures of large data centers around the web; google for them. They typically have hundreds of racks. A good fraction of that is storage. It is not uncommon to see a dozen Sharks, Lightnings or Symmetrix in one room; with up-to-date models, that is a PB right there.

This is not even counting racks and racks of 1U or 2U servers being used as storage devices. The largest single file system I know of (and I probably missed a few) is over 2PB (single file system meaning you can mount it at a single mount point and access it as a single name space with a single data space). Google for "ASCI Purple". Quite a few other customers have storage plants that size, just not in a single file system.

In article , Faeandar wrote:
> On Tue, 28 Feb 2006 00:14:40 -0500, flux wrote:
>>> In today's data centers, 100TB systems are common;
>> I wonder if there are ANY data centers that store 100 TB let alone...
> I wonder if you have a clue.

Possibly he doesn't. Which is OK: anyone in the storage industry who claims that 100TB systems don't exist will be irrelevant in a short period. Or possibly he is a troll with a clue. Either way is fine with me.

To be honest, I've not built a 100TB system myself yet. Somewhere on the public web is a picture of a 30TB system I built 2.5 years ago, with another guy and me standing proudly in front of it; it took 4 racks back then (using SCSI disks). But then, I don't work with real customers in real data centers.

--
The address in the header is invalid for obvious reasons. Please reconstruct the address from the information below (look for _).
Ralph Becker-Szendy
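The "100TB is only two racks" claim above is easy to sanity-check with back-of-the-envelope arithmetic. This sketch is not from the thread; the shelf geometry (42 drives per 4U enclosure, 500GB drives, 40U usable per rack) is an assumption based on the densest 2006-era boxes mentioned later in the thread:

```python
import math

def racks_needed(target_tb, drive_gb=500, drives_per_shelf=42,
                 shelf_u=4, usable_u_per_rack=40):
    """Rough rack count for a target raw capacity (no RAID overhead)."""
    tb_per_shelf = drives_per_shelf * drive_gb / 1000   # 21 TB per 4U shelf
    shelves = math.ceil(target_tb / tb_per_shelf)
    total_u = shelves * shelf_u
    racks = math.ceil(total_u / usable_u_per_rack)
    return shelves, total_u, racks

print(racks_needed(100))    # 5 shelves, 20U: raw 100TB fits in half a rack
print(racks_needed(2000))   # a 2PB plant: ~10 racks of shelves alone
```

With these assumed numbers, raw 100TB needs only half a rack of dense SATA shelves; the "two racks" figure leaves comfortable room for RAID overhead, controllers, and switches.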
#14
In article , Faeandar wrote:
>>> Since disks are so unreliable (they are typically the least reliable
>> Oh really?
> Really. Rarely do I need to have a motherboard or power supply swapped. Same goes for CPUs, although a few bugs have caused more RAM swaps than I would like.

My experience is essentially the opposite.

> But disks fail every day. I manage a decent sized NAS environment and of the 400TB of usable storage I've only once had to have a

Well, 400 TB is an awful lot of storage. I really have to wonder what's on there.

> motherboard replaced, twice ram, and the occasional power supply/cable/misc. But drives are replaced by the shipment every week.

400 TB at 500 GB per drive is 800 drives. So how many motherboards are there?
#16
flux wrote:
> My experience is essentially the opposite.
>> But disks fail every day. I manage a decent sized NAS environment and of the 400TB of usable storage I've only once had to have a
> Well, 400 TB is an awful lot of storage. I really got to wonder what's on there.
>> motherboard replaced, twice ram, and the occasional power supply/cable/misc. But drives are replaced by the shipment every week.
> 400 TB at 500 GB drives is 800 drives. So how many motherboards are there?

He's almost certainly not using all 500GB drives. Assuming it's 150GB drives, you'd expect a failed drive every 2.5 weeks based on the (optimistic) published MTBFs (typically 1.2M hours for high-end SCSI/FC drives, divided by 2700 drives). If a chunk of his array is performance-critical, he may well be using 36 or 72GB drives in that portion.

While you start with the 2.5 weeks per "real" failure, remember that all high-end arrays do a considerable amount of monitoring and tend to call for drive replacements when correctable error counts start increasing (and whatever other events they're monitoring), hopefully *before* the drives actually fail. The typical process is that the drive that's acting up is migrated to the hot spare, the questionable drive is remarked as the hot spare, and its replacement is scheduled. The high-end arrays will all phone home to let the support folks know, to make sure a new drive gets shipped.

Construction of the big arrays varies considerably, but you typically have 7-15 drives plugging into a single backplane. The backplane isn't usually too smart, but does have the power management and isolation circuitry needed to isolate and hot-swap the drives, plus various indicators and whatnot (usually a few LEDs for each drive, sometimes powered locks for each drive). Those backplanes are typically plugged into controller boards, which contain the actual smarts of the array. Controllers in big arrays typically handle 4-16 backplanes each. Then you have some interconnect, I/O cards for the host interface, and often a higher level of management hardware. Again, actual implementations are all over the place.
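The failure-interval arithmetic above can be written out explicitly. This is just a restatement of the post's own numbers (400TB usable on 150GB drives, 1.2M-hour published MTBF), treating drive failures as independent so the fleet's expected time between failures is MTBF divided by the drive count:

```python
# Expected drive-failure interval for a large fleet, from vendor MTBF.
# Figures are the ones quoted in the post; real failure rates in the
# field are typically worse than published MTBFs.

MTBF_HOURS = 1.2e6   # published MTBF for high-end SCSI/FC drives
usable_tb = 400
drive_gb = 150

drives = usable_tb * 1000 / drive_gb            # ~2667 drives, call it 2700
hours_between_failures = MTBF_HOURS / drives    # fleet-wide expected interval
weeks = hours_between_failures / (24 * 7)

print(round(drives), round(hours_between_failures), round(weeks, 1))
# ~2667 drives -> one expected failure every ~450 hours, i.e. ~2.7 weeks
```

That matches the "failed drive every 2.5 weeks" estimate; rounding 2667 drives up to 2700 is what shaves it from ~2.7 to ~2.5 weeks.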
#17
HVB writes:
> When you ring a call centre and they tell you that your call "may be monitored", what they really mean is "your call *is* being recorded". This data is kept for a long time, if not forever.
> One manufacturing client of mine creates huge amounts of video data. For quality control purposes they use video to check their production runs. They keep this for a long time, in case they need to check for faults. An outsider to the business would probably never consider storing data like this. They used to use video tape, but that has its own problems, and for them the advantages of online storage outweighed the costs.
> Again, those are just two examples.

Applications like that probably tend to mostly use the most recent data. Is it really worth keeping so much older, rarely used data spinning all the time, instead of having some big tape robots (or even cabinets full of tape cartridges) like we used to see before disks got so cheap?
#18
HVB writes:
> They keep the data on ATA drives, so they get relatively low cost storage and practically instant visual access to any manufacturing run.

Do they keep those hundreds (thousands?) of ATA drives spinning all the time in case of some rare access to any particular one, or do they have some way of powering them up only when needed? The dozen or so seconds of latency from that would probably be tolerable.
#19
Paul Rubin wrote:
> HVB writes:
>> They keep the data on ATA drives, so they get relatively low cost storage and practically instant visual access to any manufacturing run.
> Do they keep those hundreds (thousands?) of ATA drives spinning all the time in case of some rare access to any particular one, or do they have some way of powering them up only when needed? The dozen or so seconds of latency from that would probably be tolerable.

Spinning down the disks would presumably have both benefits and drawbacks (spin-up can cause failures...), but there are ready-made products that seem to be designed for scenarios like this and will automatically manage the disks.

I see that the densest storage box I know of (Nexsan ATABeast/SATABeast, 42 disks in 4U!) is now available in a SATA version, which seems to have added something they call AutoMAID(TM) (Massive Array of Idle Disks), which seems tailored for this. There's no mention of this for the older ATABeast; I wonder if they have, or potentially could, add it via new firmware.

They list 210 TB in a standard rack (40U, leaving 2U for two FC switches), but that's the raw capacity before removing RAID overhead or hot spares (and using 500GB disks). Say 150 TB usable, perhaps (somewhere in the 100-170TB range depending on RAID array size and hot spares). So if 100+ TB on multiple (FC/SAN) volumes is OK, it can actually be done in less than a rack (29-40U depending on the degree of redundancy required).
#20
In article , Paul Rubin wrote:
> HVB writes:
> Do they keep those hundreds (thousands?) of ATA drives spinning all the time in case of some rare access to any particular one, or do they have some way of powering them up only when needed? The dozen or so seconds of latency from that would probably be tolerable.

Nearly all disk drives have been kept on and spinning. Traditionally, there has been a lot of scepticism towards spinning disks down, as it is not clear that they will ever spin back up. The old "stiction" problems come to mind. There are also questions about what happens to spindle lubricants if the spindle isn't rotating for long periods.

In spite of these questions, systems are now being built in which the bulk of all (SATA) disks are kept spun down; this technology has even acquired a new acronym, namely MAID (Massive Array of Idle Disks, as mentioned above). Please google for Copan Systems. To my knowledge (which is guaranteed to be only partial), the Copan system has the highest density of storage in TB per square foot of floor space, or TB per cubic foot of data center volume, or TB/kW of power used, or some metric like that (I don't remember the details). Probably Copan's website will have such information. I would bet that a Copan system is several hundred disks in a rack. There may be other vendors providing disk systems that spin down or have similar densities.

--
The address in the header is invalid for obvious reasons. Please reconstruct the address from the information below (look for _).
Ralph Becker-Szendy