#11
Very Large Filesystems
In article . com, Aknin wrote:

> On May 1, 3:15 am, Ernst S Blofeld wrote:
>
>> Jan-Frode Myklebust wrote:
>>
>>> Is there any other solution than backups, if neither the fs nor the two snaps can be trusted?
>>
>> I would argue that making your fs's as small as possible, to confine the damage, and keeping good backups is the best option. Why would tape backup be "totally impractical even for sizes much smaller than 4TB"?
>>
>> Who said don't make backups? ZFS is not a backup solution but a filesystem with checksumming and redundancy features. I've never heard anyone seriously suggest that ZFS obviated the need for backups, not in this thread or anywhere else. Rant about non-issues elsewhere, please.
>>
>> As already pointed out, increasing the number of filesystems does not increase the protection, because you still have all the common modes of failure (including the software bugs that you are so apparently keen on). How much better off are a million files on a single filesystem than the same files spread across a thousand filesystems, if everything else remains equal? There is no meaningful difference at all.
>>
>> Moreover, backups do not address the OP's point: silent corruption. If you aren't checking your files, how can you have any confidence in your backups? A backup is as problematic in terms of integrity as the filesystem it is read from, and backing up a corrupt file doesn't fix it. You cannot avoid the need for checksumming to detect errors and redundancy to fix them. Putting these features directly in your filesystem is a good idea: integrity is maintained and recovery is fast. The fact that there will be teething problems in ZFS or an equivalent filesystem is not a sound basis for rejecting these features. There will still be backups in the future too.
>>
>> ESB
>
> I've cross-posted this question in several places, and practically all answers switched immediately to backup/restore issues. It seems that no one puts any kind of trust in filesystems, in the sense that even

Filesystems are not the problem. Hardware is.

I've worked with many thousands of PC disks, starting with the first release of NTFS almost 15 years ago. I have never seen NTFS "corrupt" itself; all failures were traced to dying hardware. Sh*t happens. I have to admit that my experience with RAID is much more limited, so I'd like to hear of documented cases of such NTFS problems.

In any case, you need a strategy for backup and recovery of your data. Even if the filesystem is fine, the building can burn down.

--
a d y k e s @ p a n i x . c o m
Don't blame me. I voted for Gore. A Proud signature since 2001
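To make ESB's checksumming point concrete: below is a minimal Python sketch (my illustration, not anything posted in the thread) of an out-of-band integrity check. It builds a manifest of SHA-256 digests and later re-verifies it, so silent corruption is flagged before a corrupt file makes it into a backup. Paths and the manifest filename are hypothetical.

    #!/usr/bin/env python3
    """Minimal sketch: detect silent corruption with a checksum manifest.

    Assumptions: paths and manifest location are hypothetical; a real
    deployment would also need to handle permissions, symlinks, and
    files that legitimately changed since the manifest was written.
    """
    import hashlib
    import json
    import os
    import sys

    def sha256_of(path, bufsize=1 << 20):
        """Stream a file through SHA-256 so large files don't exhaust RAM."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(bufsize), b""):
                h.update(chunk)
        return h.hexdigest()

    def build_manifest(root, manifest_path):
        """Record a digest for every file under root."""
        digests = {}
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                p = os.path.join(dirpath, name)
                digests[p] = sha256_of(p)
        with open(manifest_path, "w") as f:
            json.dump(digests, f, indent=1)

    def verify_manifest(manifest_path):
        """Re-hash every recorded file; report mismatches and missing files."""
        with open(manifest_path) as f:
            digests = json.load(f)
        bad = 0
        for p, expected in digests.items():
            if not os.path.exists(p):
                print("MISSING", p)
                bad += 1
            elif sha256_of(p) != expected:
                print("CORRUPT", p)
                bad += 1
        return bad

    if __name__ == "__main__":
        # Usage: python checkfs.py build <root> <manifest>
        #        python checkfs.py verify <manifest>
        if len(sys.argv) >= 4 and sys.argv[1] == "build":
            build_manifest(sys.argv[2], sys.argv[3])
        elif len(sys.argv) >= 3 and sys.argv[1] == "verify":
            sys.exit(1 if verify_manifest(sys.argv[2]) else 0)

This is exactly what ZFS does inline, minus the automatic repair from redundant copies; done externally, you detect errors but still need a good copy somewhere to fix them.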
#12
Very Large Filesystems
In article .com, Aknin wrote:

> What's the maximum filesystem size you've used in a production environment? How did the experience come out?

In the NCAR Mass Storage Service (MSS), a tape archive approaching 3 PB in size, we currently have a disk cache of 48 TB on 4 FC-SATA RAIDs. I have it configured as 24 logical units of just under 2 TB each, each carrying a single Irix XFS file system. When a disk in a RAID fails, the controller can rebuild the RAID group in about 4-6 hours. Files written to the disk cache (between 112 KB and 1 GB in size) are usually written to tape within 24 hours, and residency in the cache varies between 30 and 60 days. We've not had any problems with XFS.
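As an aside, the 24-way split described above is straightforward to drive from software. Here is a rough Python sketch (not NCAR's actual tooling; the mount-point names are invented) that places each incoming cache file on whichever of the small filesystems currently has the most free space:

    #!/usr/bin/env python3
    """Sketch: spread a disk cache across many small filesystems.

    Hypothetical illustration of the layout described above, assuming
    mount points /cache00 .. /cache23 exist. New files go to whichever
    filesystem currently has the most free space.
    """
    import os
    import shutil

    MOUNTS = [f"/cache{i:02d}" for i in range(24)]  # 24 LUNs of ~2 TB each

    def freest_mount():
        """Pick the mount point with the most free bytes right now."""
        return max(MOUNTS, key=lambda m: shutil.disk_usage(m).free)

    def store(src_path):
        """Copy an incoming file onto the least-full cache filesystem."""
        dest = os.path.join(freest_mount(), os.path.basename(src_path))
        shutil.copy2(src_path, dest)
        return dest

The payoff is the one the post implies: a failed RAID group or a long fsck takes out one ~2 TB filesystem, not the whole 48 TB cache.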
#13
Very Large Filesystems
One option is to go with segmented filesystems: www.ibrix.com.
Instead of having one monolithic filesystem, break it up across several segments; Ibrix still provides a single namespace. Back up the segments separately, and recover them separately.

On Apr 28, 2:21 am, Aknin wrote:

> Following some research I've been doing on the matter across newsgroups and mailing lists, I'd be glad if people could share numbers about real-life large filesystems and their experience with them. I'm slowly coming to the realization that regardless of theoretical filesystem capabilities (1 TB, 32 TB, 256 TB or more), people more or less across the enterprise filesystem arena recommend keeping practical filesystems to 1 TB or less, for manageability and recoverability.
>
> What's the maximum filesystem size you've used in a production environment? How did the experience come out?
>
> Thanks,
> -Yaniv
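For readers unfamiliar with the segmented approach, the core idea is that one logical namespace maps deterministically onto several independently managed segments. The Python sketch below is a hypothetical illustration of such a mapping, not Ibrix's actual design; the segment count and hash scheme are assumptions.

    #!/usr/bin/env python3
    """Sketch of the segmented-namespace idea (not Ibrix's design).

    One logical namespace, N backing segments: a stable hash of the
    path decides which segment owns each file, so each segment can be
    backed up and restored independently while clients still see a
    single tree.
    """
    import hashlib

    NUM_SEGMENTS = 8  # hypothetical segment count

    def segment_for(path):
        """Map a namespace path to a segment deterministically."""
        digest = hashlib.md5(path.encode()).digest()
        return int.from_bytes(digest[:4], "big") % NUM_SEGMENTS

    if __name__ == "__main__":
        for p in ("/home/yaniv/data.bin", "/var/log/app.log"):
            print(p, "-> segment", segment_for(p))

Because the mapping is stable, losing one segment means restoring only that segment's files; the rest of the namespace stays online.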
#14
Very Large Filesystems
> Filesystems are not the problem. Hardware is.
>
> I've worked with many thousands of PC disks, starting with the first release of NTFS almost 15 years ago. I have never seen NTFS "corrupt" itself; all failures were traced to dying hardware. Sh*t happens. I have to admit that my experience with RAID is much more limited, so I'd like to hear of documented cases of such NTFS problems.
>
> In any case, you need a strategy for backup and recovery of your data. Even if the filesystem is fine, the building can burn down.

Here's a well-known NTFS example: http://support.microsoft.com/kb/229607

I also have another from personal experience. When you get way outside the norm, you are much more likely to encounter problems that have nothing to do with your hardware. I had a case open with MS in which I was told they had internal documentation suggesting limits that, while beyond what you'd likely ever see in "normal" scenarios, are not out of the realm of possibility for poorly designed applications... of which I inherited one.

Put nine figures' worth of dirs/files on a single NTFS volume in a heavily write-intensive environment and tell me all is well. It's very scary, and **** starts to break down: write failures, etc. I know -- I've been there, and I'm doing it now until such time as things get rewritten. I've lost it all more than once, and it takes weeks to restore. Yes, the app is broken, but it's what I'm stuck with for now.

Also, ask your vendors (any of them) for documented studies of heavy I/O in that type of environment. None of them have any, because for the most part they do not test to those levels. Even MS only tests to 100M dirs/files for milestone releases (SPs) of Win2K3, and this is the first release where they went that high. If you want to be safe, you'd better stay below 10M dirs/files on a single volume. That's realistically the highest you can go and still count on all of your vendors having tested to it (I'm talking file systems and file-based replication software). Beyond that, you uncover the bugs their normal stress-testing doesn't.

Believe me, in trying to deal with my mess, I've done a lot of talking to vendors. Their sales reps tell you everything is "not a problem." Their technical guys get really quiet when you ask for proof or customer examples you can speak with.
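For what it's worth, the usual mitigation for file counts like that is to fan files out across hashed subdirectories and multiple volumes, so no single directory or volume carries the whole population. A hypothetical Python sketch (not from the thread; the volume names are invented):

    #!/usr/bin/env python3
    """Sketch: fan files out across hashed subdirectories and volumes.

    Hypothetical mitigation for the huge-file-count problem described
    above: cap the population of any one volume or directory by
    deriving a two-level path from a hash of the file's logical key.
    """
    import hashlib
    import os

    VOLUMES = ["D:\\store", "E:\\store"]  # hypothetical volume roots

    def placed_path(key):
        """Turn a logical key into volume/aa/bb/key, bounding fan-out."""
        h = hashlib.sha1(key.encode()).hexdigest()
        vol = VOLUMES[int(h[:8], 16) % len(VOLUMES)]
        return os.path.join(vol, h[0:2], h[2:4], key)

    if __name__ == "__main__":
        # 2 volumes x 256 x 256 leaf dirs keeps each directory small
        # even at ~100M files (roughly 760 files per leaf on average).
        print(placed_path("invoice-000123456.pdf"))

It doesn't raise any vendor-tested ceiling, but it keeps individual directories and volumes inside the ranges vendors actually exercise.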
#15
Very Large Filesystems
In article , Kraft Fan! numberoneKraftfan@littlerockarkansasmassageschool.com wrote:

>> Filesystems are not the problem. Hardware is. [...] I'd like to hear of documented cases of such NTFS problems.
>
> Here's a well-known NTFS example: http://support.microsoft.com/kb/229607
>
> I also have another from personal experience. When you get way outside the norm, you are much more likely to encounter problems that have nothing to do with your hardware. I had a case open with MS in which I was told they had internal documentation suggesting limits that,

TNX, I knew there would be something, and I figured it would be a case of pushing some scale limit.

--
a d y k e s @ p a n i x . c o m
Don't blame me. I voted for Gore. A Proud signature since 2001
#16
Very Large Filesystems
Kraft Fan! wrote:

> ... here's a well-known NTFS example: http://support.microsoft.com/kb/229607

Since that particular bug was fixed just over 8 years ago, something a bit more recent might be a more convincing argument for not trusting a reasonably mature file system.

- bill
#17
Very Large Filesystems
Aknin proclaimed:

> Following some research I've been doing on the matter across newsgroups and mailing lists, I'd be glad if people could share numbers about real-life large filesystems and their experience with them. I'm slowly coming to the realization that regardless of theoretical filesystem capabilities (1 TB, 32 TB, 256 TB or more), people more or less across the enterprise filesystem arena recommend keeping practical filesystems to 1 TB or less, for manageability and recoverability.
>
> What's the maximum filesystem size you've used in a production environment? How did the experience come out?

I'd guess the biggest problem with very large file systems comes when you need to run a file system check against them and don't have a few days to spare for checking 100 terabytes or so. Some scale better than others, particularly when they are practically full. Backups and restores can be helped by delta-style technology; a sketch of the idea follows.
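A minimal Python sketch of that delta idea, assuming hypothetical source and backup paths (an illustration, not any particular product): keep a manifest of each file's size and mtime, and on the next run copy only the entries that changed.

    #!/usr/bin/env python3
    """Sketch of delta-style backup (hypothetical, not a named product).

    Instead of re-copying 100 TB every run, keep a manifest of
    (size, mtime) per file and copy only entries that changed since
    the last run. Deletions are not handled in this sketch.
    """
    import json
    import os
    import shutil

    def scan(root):
        """Map relative path -> (size, mtime) for every file under root."""
        state = {}
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                p = os.path.join(dirpath, name)
                st = os.stat(p)
                state[os.path.relpath(p, root)] = (st.st_size, st.st_mtime)
        return state

    def delta_backup(src, dst, manifest="manifest.json"):
        """Copy only files that are new or changed since the last run."""
        old = {}
        if os.path.exists(manifest):
            with open(manifest) as f:
                old = {k: tuple(v) for k, v in json.load(f).items()}
        new = scan(src)
        for rel, sig in new.items():
            if old.get(rel) != sig:
                target = os.path.join(dst, rel)
                os.makedirs(os.path.dirname(target), exist_ok=True)
                shutil.copy2(os.path.join(src, rel), target)
        with open(manifest, "w") as f:
            json.dump(new, f)

    if __name__ == "__main__":
        delta_backup("/data", "/backup")  # hypothetical paths

The first run copies everything; subsequent runs touch only changed files, which is what makes regular backups of multi-terabyte filesystems tractable.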