#1
Open source storage
So in the past few months there have been some interesting moves towards Open Source Storage: ZFS on Solaris, and Nexenta's software appliance. Has anyone out there deployed it somewhere it actually does anything useful? The cost savings are phenomenal, but nothing is truly free; you pay for it one way or another. On the flip side, it's the LAST part of the stack which is still proprietary, and a part of me thinks it's inevitable.

SC
#2
Open source storage
S writes:
> So in the past few months there have been some interesting moves towards Open Source Storage: ZFS on Solaris, and Nexenta's software appliance.

Of course, Linux has had more sophisticated file systems, including several clustered file systems, available as open source for some time....

> Has anyone out there deployed it somewhere it actually does anything useful? The cost savings are phenomenal, but nothing is truly free; you pay for it one way or another.

Fundamentally, you pay by doing the support and maintenance yourself, and by not having as much focused tuning expertise, formal testing, and relationships with database, operating-system, and backup vendors. Of course, you also take on the responsibility of making sure that whatever disks/tapes you buy work reliably with the controllers, motherboard, and operating system. (Does that SYNCHRONIZE CACHE command really work?) If you're saving very much, you've probably also lost the hardware redundancy that's built into a hardware RAID system -- dual-ported access to disks, independent buses (not sharing a controller chip), etc.

> On the flip side, it's the LAST part of the stack which is still proprietary, and a part of me thinks it's inevitable.

Actually it's not; the firmware on the controllers is proprietary in nearly all cases, and the firmware in the drives is as well.

-- Anton
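[Editor's aside: Anton's parenthetical about SYNCHRONIZE CACHE is testable. Below is a minimal sketch -- not from this thread -- of issuing SCSI SYNCHRONIZE CACHE(10) through the Linux SG_IO ioctl, which is how tools like sg_sync from sg3_utils do it. The device path is a placeholder and it needs appropriate privileges; if the drive or a lying controller ignores the command, data in its write-back cache can still be lost on power failure.]

/* Sketch: issue SCSI SYNCHRONIZE CACHE(10) via Linux SG_IO. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

int main(int argc, char **argv)
{
    const char *dev = (argc > 1) ? argv[1] : "/dev/sda"; /* placeholder */
    unsigned char cdb[10] = { 0x35 };  /* SYNCHRONIZE CACHE(10), whole medium */
    unsigned char sense[32];
    struct sg_io_hdr io;

    int fd = open(dev, O_RDWR);
    if (fd < 0) { perror(dev); return 1; }

    memset(&io, 0, sizeof io);
    io.interface_id    = 'S';
    io.cmd_len         = sizeof cdb;
    io.cmdp            = cdb;
    io.dxfer_direction = SG_DXFER_NONE;  /* no data phase */
    io.sbp             = sense;
    io.mx_sb_len       = sizeof sense;
    io.timeout         = 20000;          /* ms */

    if (ioctl(fd, SG_IO, &io) < 0) { perror("SG_IO"); return 1; }
    if ((io.info & SG_INFO_OK_MASK) != SG_INFO_OK)
        fprintf(stderr, "SYNCHRONIZE CACHE failed (status 0x%x)\n", io.status);
    else
        printf("drive reports cache synchronized\n");
    close(fd);
    return 0;
}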
#3
Open source storage
In article , Anton Rang wrote:
> Of course, Linux has had more sophisticated file systems, including several clustered file systems, available as open source for some time....

More sophisticated than ZFS?
#4
Open source storage
the wharf rat wrote:
> In article , Anton Rang wrote:
>> Of course, Linux has had more sophisticated file systems, including several clustered file systems, available as open source for some time....
>
> More sophisticated than ZFS?

ReiserFS (especially Reiser4) is beyond question more sophisticated than ZFS - not only in concept (generic data-clustering ability, for example) but in execution (e.g., it incorporates batch-update mechanisms somewhat similar to ZFS's without losing sight of the importance of on-disk file contiguity).

Extent-based XFS also does a significantly better job of promoting on-disk contiguity than ZFS does (even leaving aside the additional depredations caused by ZFS's brain-damaged 'RAID-Z' design) - and contributed the concept of allocate-on-write to ZFS (and Reiser) IIRC. GFS (and perhaps GPFS) support concurrent device sharing among the clustered systems that Anton mentioned (last I knew, ZFS had no similar capability).

ZFS is something of a one-trick pony. Its small-write performance is very good (at least when RAID-Z is not involved), but with access patterns that create fragmented files its medium-to-large read performance is just not competitive - and last I knew it didn't even have a defragmenter to alleviate that situation (defragmenting becomes awkward when you perform snapshots at the block level). And despite its hype about eliminating the LVM layer, as soon as you need to incorporate redundancy in your storage up it pops again in the form of device groups - so there's relatively little net gain in that respect over a well-designed LVM interface (not that a ZFS-like approach *couldn't* have done a better job of eliminating LVM-level management, mind you).

I wouldn't be so critical of ZFS if its marketeers and accompanying zealots hadn't hyped it to the moon and back: it's a refreshing change from the apparent complete lack of corporate interest in file-system development over the last decade or so, even if its design leaves a bit to be desired and its implementation is less than full-fledged - and it should be very satisfactory for environments that don't expose its weaknesses. (And yes, I do like its integral integrity checksums, but their importance has been over-hyped as well - given the number of significantly higher-probability hazards that data is subject to.)

- bill
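[Editor's aside: for concreteness on those "integral integrity checksums" -- ZFS computes a Fletcher-style checksum (or SHA-256) over every block and stores it in the parent block pointer rather than with the block itself, so corruption anywhere between memory and platter, including misdirected writes, is caught on read. Below is a from-scratch sketch of the Fletcher-4 variant; it is not Sun's code, and the mismatch demo is illustrative.]

/* Sketch of end-to-end block checksumming in the ZFS style. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct cksum { uint64_t w[4]; };

static void fletcher4(const void *buf, size_t size, struct cksum *c)
{
    const uint32_t *ip = buf, *end = ip + size / sizeof(uint32_t);
    uint64_t a = 0, b = 0, cc = 0, d = 0;

    for (; ip < end; ip++) {
        a += *ip;   /* running sums of increasing order */
        b += a;
        cc += b;
        d += cc;
    }
    c->w[0] = a; c->w[1] = b; c->w[2] = cc; c->w[3] = d;
}

/* On read: recompute and compare with the checksum stored in the parent. */
static int block_ok(const void *blk, size_t size, const struct cksum *stored)
{
    struct cksum actual;
    fletcher4(blk, size, &actual);
    return memcmp(&actual, stored, sizeof actual) == 0;
}

int main(void)
{
    uint32_t block[1024] = { 1, 2, 3 };  /* stand-in for a 4 KB file block */
    struct cksum c;
    fletcher4(block, sizeof block, &c);

    block[7] ^= 1;                       /* simulate silent corruption */
    printf("block ok after bit flip? %s\n",
           block_ok(block, sizeof block, &c) ? "yes" : "no");
    return 0;
}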
#5
Open source storage
One might argue that Reiser has done a pretty slick job of marketing his FS as well. I have heard that he hasn't really run his FS on any enterprise-class storage. Understandable, considering he's a small shop. Maybe this has changed.

XFS has had its own issues. Yes, you have on-disk contiguity, but if you lose power while XFS is building its extent, you've got data corruption.

I don't have any direct experience with ZFS... I'm trying to talk one of my buddies into letting me play with it on a system at his site, though.

So I really think the issues stopping people from deploying open source storage are:

1. Lack of snapshots, which may not be an issue if ZFS gains traction.
2. No coherent DR strategy. I don't consider rsync a mirroring solution if it needs to walk the tree each time (see the sketch after this post).
3. It seems like storage admins still need to have that support hotline printed out and pinned next to their workstation :-)

Anyone think any different?

On Feb 17, 4:59 pm, Bill Todd wrote:
> [full quote of post #4 snipped]
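[Editor's aside: on point 2, the complaint about rsync is that it must stat() every file on every pass just to discover what changed, whereas a block-level mirror forwards only the writes. A toy sketch of that tree walk follows; the path and cutoff time are hypothetical, and real rsync also compares sizes and optionally checksums.]

/* Toy illustration of why tree-walking replication scales badly:
 * even if nothing changed, every entry must be stat()ed on every run. */
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <time.h>

static time_t last_sync;           /* time of the previous "sync" */
static long scanned, changed;

static int visit(const char *path, const struct stat *st,
                 int type, struct FTW *ftw)
{
    (void)ftw;
    scanned++;
    if (type == FTW_F && st->st_mtime > last_sync) {
        changed++;                 /* rsync would transfer this one */
        printf("changed: %s\n", path);
    }
    return 0;                      /* keep walking */
}

int main(void)
{
    last_sync = time(NULL) - 3600; /* pretend we synced an hour ago */
    if (nftw("/srv/data", visit, 64, FTW_PHYS) != 0)
        perror("nftw");
    printf("stat()ed %ld entries to find %ld changed files\n",
           scanned, changed);
    return 0;
}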
#6
Open source storage
S wrote:
> One might argue that Reiser has done a pretty slick job of marketing his FS as well.

Yes, he has - but there's more relative substance behind that marketing than there is behind ZFS's (after all, when you promote yourself as "The Last Word In File Systems" it's easy to fall quite embarrassingly short).

> I have heard that he hasn't really run his FS on any enterprise-class storage.

The subject was not breadth of existing deployment but sophistication.

....

> XFS has had its own issues. Yes, you have on-disk contiguity, but if you lose power while XFS is building its extent, you've got data corruption.

I'd like to see a credible reference for that allegation (unless you're simply referring to the potential inconsistency that virtually all update-in-place file systems have when *updating* - rather than writing for the first time - multiple sectors at once).

....

> So I really think the issues stopping people from deploying open source storage are:
> 1. Lack of snapshots, which may not be an issue if ZFS gains traction.

My impression is that snapshots have been available in Linux, BSD, and for that matter Solaris itself for many years, in various forms associated with LVMs and/or file systems.

> 2. No coherent DR strategy. I don't consider rsync a mirroring solution if it needs to walk the tree each time.

Synchronous mirroring at the driver level has been available for ages, and is entirely feasible across distances of at least 100 miles - enough to survive any disaster which your business is likely to survive, as long as your remote site is reasonably robust. If write performance requirements can be relaxed a bit, distances can be significantly greater. I haven't looked recently, so I don't know how well those facilities deal with temporary link interruptions and subsequent catch-up (if you've got dedicated fiber to a robust back-up site that may not be too likely to occur, but in other circumstances it would be very desirable).

- bill
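[Editor's aside: to pin down what "synchronous mirroring" means here -- the write is acknowledged only after both replicas have it durably, so the remote copy never lags. Below is a stripped-down user-space sketch of just those semantics; real implementations such as md RAID-1 or DRBD sit below the file system, and the files here stand in for block devices.]

/* Sketch of synchronous-mirroring semantics: acknowledge a write only
 * after BOTH replicas have it durably. Paths are placeholders. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int mirrored_write(int local, int remote,
                          const void *buf, size_t len, off_t off)
{
    /* Issue to both replicas... */
    if (pwrite(local,  buf, len, off) != (ssize_t)len) return -1;
    if (pwrite(remote, buf, len, off) != (ssize_t)len) return -1;

    /* ...and only report success once both are stable. The added
     * round trip is why distance directly costs write latency. */
    if (fsync(local) < 0 || fsync(remote) < 0) return -1;
    return 0;
}

int main(void)
{
    int local  = open("/tmp/replica.local",  O_RDWR | O_CREAT, 0600);
    int remote = open("/tmp/replica.remote", O_RDWR | O_CREAT, 0600);
    if (local < 0 || remote < 0) { perror("open"); return 1; }

    const char block[512] = "some block of data";
    if (mirrored_write(local, remote, block, sizeof block, 0) < 0) {
        perror("mirrored_write");
        return 1;
    }
    puts("write acknowledged: both replicas durable");
    return 0;
}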
#7
Open source storage
On Feb 18, 4:52 pm, Bill Todd wrote:
>> One might argue that Reiser has done a pretty slick job of marketing his FS as well.
>
> Yes, he has - but there's more relative substance behind that marketing than there is behind ZFS's (after all, when you promote yourself as "The Last Word In File Systems" it's easy to fall quite embarrassingly short).

That's pretty funny, and I would have to agree :-)

>> I have heard that he hasn't really run his FS on any enterprise-class storage.
>
> The subject was not breadth of existing deployment but sophistication.

Right, but if Reiser hasn't run his FS on any enterprise-class storage, how can we assume it's ready for prime-time, enterprise-class deployment?

>> XFS has had its own issues. Yes, you have on-disk contiguity, but if you lose power while XFS is building its extent, you've got data corruption.
>
> I'd like to see a credible reference for that allegation (unless you're simply referring to the potential inconsistency that virtually all update-in-place file systems have when *updating* - rather than writing for the first time - multiple sectors at once).

See section 6.1: Delaying allocation
http://oss.sgi.com/projects/xfs/pape...nix/index.html

I remember reading another paper with detailed descriptions of causing data corruption on XFS through power manipulation, but of course I can't find it anymore.

>> 1. Lack of snapshots, which may not be an issue if ZFS gains traction.
>
> My impression is that snapshots have been available in Linux, BSD, and for that matter Solaris itself for many years, in various forms associated with LVMs and/or file systems.

I believe you can only have one snapshot at a time in LVM. Nowhere near the sophistication of WAFL snapshots.

>> 2. No coherent DR strategy. I don't consider rsync a mirroring solution if it needs to walk the tree each time.
>
> Synchronous mirroring at the driver level has been available for ages, and is entirely feasible across distances of at least 100 miles [...]

Can you name some examples of synchronous mirroring at the driver level? Is it open source? Easy to deploy?

Bottom line: I'd like to see people deploy Open Source Storage in their data centers. I'm just wondering why it hasn't happened yet and offering possible reasons.

S
#8
Open source storage
S wrote:
> .... if Reiser hasn't run his FS on any enterprise-class storage, how can we assume it's ready for prime-time, enterprise-class deployment?

Because any failure of enterprise-class storage to faithfully mimic (e.g.) SCSI behavior should be considered an enterprise-storage bug rather than any problem with the file system?

>>> XFS has had its own issues. Yes, you have on-disk contiguity, but if you lose power while XFS is building its extent, you've got data corruption.
>>
>> I'd like to see a credible reference for that allegation (unless you're simply referring to the potential inconsistency that virtually all update-in-place file systems have when *updating* - rather than writing for the first time - multiple sectors at once).
>
> See section 6.1: Delaying allocation
> http://oss.sgi.com/projects/xfs/pape...nix/index.html

There's nothing there that even remotely hints at data corruption on power loss: the defined semantics of any normal Unix-style file system (including ZFS) specify that any user data which hasn't been explicitly flushed to disk may or may not be on the disk, in whole or in part, should power fail. That's what write-back caching is all about: if you want atomic on-disk persistence, you use fsync or per-request write-through - though even those won't necessarily guarantee full-request, let alone multi-request, atomicity beyond the individual file-block level should power fail before the request completes, even on ZFS. About the only difference with ZFS is that individual file-block disk writes are guaranteed to be atomic, rather than just the near-guarantee that disks provide that individual sector writes will be atomic.

It's been many years since I read that paper, though, and it provided a pleasant trip down memory lane. XFS did a lot of interesting things for the early '90s, even if not all of them were necessarily optimal.

....

> I believe you can only have one snapshot at a time in LVM. Nowhere near the sophistication of WAFL snapshots.

But that's all you need to do an on-line backup, one of the most important consumers of snapshot technology. Other uses of snapshots tend to be more like inferior substitutes for 'continuous data protection' facilities, though the advent of writable snapshots (clones) has opened up new uses (at least new imaginable uses: how much actual utility they have I'm not sure).

The old Solaris fssnap mechanism may have been limited to a single snapshot. Peter Braam et al. produced alpha and beta releases of a more general snapshot facility called snapfs in 2001, which I thought either got further developed or was replaced with another product of the same name, but I didn't find further information on it. The Linux LVM and LVM2 support snapshots (the latter including writable snapshots) - and a quick glance at the documentation didn't seem to indicate that they support only one at a time.

> Can you name some examples of synchronous mirroring at the driver level? Is it open source? Easy to deploy?

I'm not all that familiar with the offerings, but my impression is that DRBD may be the current Linux standard in this area; a 2003 description can be found at http://www.linux-mag.com/id/1502 , and it's still being developed (just Google it). You may have been able to roll your own remote replication before DRBD by using a remote disk paired (RAID-1-style) with a local disk under local LVM facilities.

- bill
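[Editor's aside: Bill's point about defined semantics in the corruption-vs.-caching argument above is easy to demonstrate -- a plain write() only reaches the OS write-back cache, and durability requires an explicit flush. A minimal sketch, with a hypothetical filename:]

/* A plain write() lands in the write-back cache and may vanish on
 * power loss; durability comes only from fsync()/fdatasync() or an
 * O_DSYNC open. That is the contract, not corruption. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/tmp/important.dat", O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); return 1; }

    const char rec[] = "committed record\n";
    if (write(fd, rec, sizeof rec - 1) != (ssize_t)(sizeof rec - 1)) {
        perror("write");          /* still only in the page cache here */
        return 1;
    }

    /* Without this, the file system is allowed to lose the data,
     * in whole or in part, on power failure. */
    if (fsync(fd) < 0) { perror("fsync"); return 1; }

    /* Alternative: open with O_DSYNC for per-request write-through.
     * Neither approach promises multi-request atomicity. */
    close(fd);
    return 0;
}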
#9
Open source storage
Bill Todd wrote:
> There's nothing there that even remotely hints at data corruption on power loss: the defined semantics of any normal Unix-style file system (including ZFS) specify that any user data which hasn't been explicitly flushed to disk may or may not be on the disk, in whole or in part, should power fail. That's what write-back caching is all about: [...]

Sorry to go off on a tangent, but I think it is somewhat relevant since S was talking about enterprise storage: how common is it for enterprise storage vendors to have disks with firmware that makes it impossible to enable the write-back cache? We have an SGI NAS (IS4500) where this is the case, and it took me a little by surprise, although it does make a lot of sense when you have 100 TB of storage. Does most or all enterprise storage permanently disable the write-back cache?

Thanks,
Steve
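[Editor's aside: one way to check what Steve describes from the host side -- the write-cache-enable (WCE) bit lives in the SCSI Caching mode page (0x08) and is readable with MODE SENSE; this is the same bit sdparm reports. A sketch via Linux SG_IO; the device path is a placeholder and error handling is minimal.]

/* Sketch: read the Caching mode page with MODE SENSE(10) and report
 * the WCE (write cache enable) bit. Needs appropriate privileges. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

int main(int argc, char **argv)
{
    const char *dev = (argc > 1) ? argv[1] : "/dev/sda"; /* placeholder */
    unsigned char cdb[10] = { 0x5A, 0, 0x08 }; /* MODE SENSE(10), page 8 */
    unsigned char buf[256], sense[32];
    struct sg_io_hdr io;

    cdb[7] = sizeof buf >> 8;
    cdb[8] = sizeof buf & 0xFF;                /* allocation length */

    int fd = open(dev, O_RDONLY);
    if (fd < 0) { perror(dev); return 1; }

    memset(&io, 0, sizeof io);
    io.interface_id    = 'S';
    io.cmd_len         = sizeof cdb;
    io.cmdp            = cdb;
    io.dxfer_direction = SG_DXFER_FROM_DEV;
    io.dxferp          = buf;
    io.dxfer_len       = sizeof buf;
    io.sbp             = sense;
    io.mx_sb_len       = sizeof sense;
    io.timeout         = 10000;                /* ms */

    if (ioctl(fd, SG_IO, &io) < 0 || (io.info & SG_INFO_OK_MASK) != SG_INFO_OK) {
        fprintf(stderr, "MODE SENSE failed\n");
        return 1;
    }

    /* 8-byte mode parameter header; bytes 6-7 give the block
     * descriptor length, and the caching page follows that. */
    int bdlen = (buf[6] << 8) | buf[7];
    unsigned char *page = buf + 8 + bdlen;
    printf("write-back cache (WCE): %s\n",
           (page[2] & 0x04) ? "enabled" : "disabled");
    close(fd);
    return 0;
}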
#10
Open source storage
Steve Cousins wrote:
> [...] How common is it for enterprise storage vendors to have disks with firmware that makes it impossible to enable the write-back cache? We have an SGI NAS (IS4500) where this is the case [...]

Ha, I'm more shocked that anything from SGI is still in use.