#31
EMC to IBM SAN LUN replication
Arne Joris wrote:
> Good grief Bill, are you one of those people that type more as they grow angrier? Thank you for all your replies, but I'm starting to get a bit ticked off by your rather abrasive style.

Then shape up, pay a *lot* better attention to what you've (at least presumably) read before responding to it, and you'll stop deserving that style.

> If we can get back to what started this whole discussion, I stated that ... if your secondary site is not just a remote data vault but an actual production site where people need access to the data, it is pretty lame to have servers at the secondary site go over the WAN to read the data from the primary storage when they have a copy of the data right there! With a distributed block cache on top of asynchronous data replication, you could have both sites do I/O to the same volumes and access their local storage. I never heard you make any successful arguments against this,

Then you just weren't listening. The better (software-based) alternatives that I described allow users to access data with no more inter-site synchronous (or asynchronous) activity than the low-level kludge that you're touting - and in many cases considerably less such activity. They also allow far more efficient updating, by supporting file- (or application-) level write-back caching *above* the storage level, protected by established single-site persistence mechanisms (such as journaling or careful update) and coordinated between sites by the same mechanisms that multiple cooperating instances of a file system (or application of any complexity) have to use *anyway*. Keep reading what I wrote until you at least *begin* to understand it, or stop bleating about how little respect your babbling has received.
> your only beef with me is that you claim it would be stupid to use a distributed block cache appliance (yes, I know you don't like that wording) to do this, seeing how the only useful way for apps at the two sites to use the data requires synchronous messaging between them anyway.

Thus obviating the need for any such proprietary, special-purpose, and typically somewhat expensive hardware.

> Then there was a whole lot of talk about VMS, I'm not sure why that was relevant.

Primarily (as I've already noted) because of your drivel about inter-site synchronization of this ilk being 'emerging technology' rather than very old hat with a new shiny ribbon around it.

> Despite increasing levels of toxicity (unilaterally on your part, I might add), I have kept on trying to argue that for certain solutions, the appliance solution makes sense.

And you've failed in that attempt, save in cases where no better solution already exists and there's some real need for what YottaYotta offers. Those cases at a minimum *don't* include Oracle RAC (it has its own, better, and more comprehensive software mechanisms to deploy here), or systems like VMS (and IIRC HACMP on AIX), or situations where those systems can be used as inter-site local file or database servers for less competent clients - nor, of course, other situations where moderately lackadaisical remote-site access is not that big a deal. In other words, that space of "certain solutions [where] the appliance solution makes sense" is at best a very narrow one, and YottaYotta's rather limited success tends to reflect this.

> I think you have been thinking about clustered applications with tight coherency requirements and can't or won't see beyond them.

As usual, you think incorrectly.
I guess you just can't wrap what passes for your intellect around the fact that clustering, even with tight coherency, does not necessarily imply that site-local data can't be accessed at all sites, as long as said local data is not actively being updated at the time of use (that's what VMS's distributed lock management and distributed file system are largely about). Of course, accessing site-local data is often not particularly important at all unless you're accessing it in bulk: as long as you can safely access the *contents* of a large file reliably from local storage, the fact that the file open operation may have taken an extra second because it was performed 3,000 miles away tends not to be all that significant. (VMS doesn't do things that way - I just mention it to indicate that a *range* of effective approaches exists, between a comprehensive ability to migrate *all* synchronization control and a more limited ability for a centralized mechanism to farm out specific revocable permissions to where they'll be useful.)

> I'm fine with the fact you think I deserve a newsgroup lashing for arguing for something you are convinced is impossible or impracticable,

Except that, once again, you just haven't been paying attention: you already propped up that straw man before, and I already responded that what you're describing is eminently possible and practical, just not particularly useful in any general sense compared with the available alternatives.

By the way, I notice that you didn't choose to respond to my implied question about whether you have (or had) any relationship with YottaYotta. Inquiring minds want to know...

- bill
#32
EMC to IBM SAN LUN replication
Bill Todd wrote:
> ... Of course, accessing site-local data is often not particularly important at all, unless you're accessing it in bulk: as long as you can safely access the *contents* of a large file reliably from local storage, the fact that the file open operation may have taken an extra second because it was performed 3,000 miles away tends not to be all that significant

Rats - classic math-in-my-head off-by-one (order of magnitude) error: 6,000-mile round-trip small-message latency should be more like 0.1 second, not 1 second (having allowed a factor of 3 for fibre and switching overheads compared with raw light velocity, though that factor is just a SWAG) - at least if small messages get reasonable priority compared with bulk transfers, since prioritizing them has minimal effect on the latter anyway.

The result is that synchronous replication with multi-site access, using somewhat simpler distributed-lock mechanisms than VMS's, is for most workloads eminently feasible over such distances, as long as one is willing to tolerate bandwidth-limited performance at the instants when large updates must be made persistent:

1. Small operations (say, a file look-up plus open: I'm going to use files in this example since most of the real world does, but similar mechanisms can be used to coordinate access to block ranges) can just get sent from remote sites to the primary site to execute there. Many will execute at speeds at least somewhat comparable to what they'd see if they used locally-stored data, since their targets (if of any general interest) will be cached in RAM at the primary site (a 0.1-second round-trip message latency is thus comparable to having to make a dozen or so local disk accesses to complete the operation).

2. Sufficiently small files may just be served up from the primary site along with the associated open operation when this makes sense (e.g., if their content is already cached at the primary site); otherwise, the primary site just gives the interested remote site revocable permission to read the file content (regardless of its size) from its own local storage.

3. Small updates performed at a remote site can be efficiently communicated to the primary site for application there (and forwarding to any other remote replicas elsewhere); since the initiating remote site already has a copy, it can just hold onto it until the primary site gives it the go-ahead to update its local storage. This does entail temporarily revoking any read permissions on the affected byte range of the file that have been granted to other sites, so it may incur as much as two round-trip small-message latencies (but any other form of distributed coherence mechanism incurs at least one such round-trip latency).

4. Large updates can be performed at a remote site in much the same manner: only permissions for the byte range being updated need be revoked from other sites (if any have been handed out). A mechanism to reserve bulk-update permission (incrementally revocable after use) for regions beyond the current EOF is also required for append operations.

5. When inter-site bandwidth is constrained, large updates can't become persistent across all the sites very quickly, but (again) that's true regardless of the inter-site synchronization mechanism used.
However, many intermediate-sized updates (e.g., those that use large on-disk pages to cluster related metadata for access efficiency, but which normally get updated only one small piece at a time) can be written back lazily (as distinct from asynchronously, though accomplishing a very similar result), because all the information needed to make the related changes persistent after any service interruption has already been captured on disk in small logical log records. Another way to view this is that inter-site bandwidth limitations simply cause large updates to be written out to (multi-site) disks slowly (again, as distinct from asynchronously) - but since the data already exists in memory at the originating site, it's available there and will become available at other sites as soon as it can get there, so there's no difference in either accessibility or in persistent redundancy between this approach to bulk updates and any other (such as YottaYotta's) on those scores.

6. That leaves the question of what happens when a remote site wants data that it does not have current permission to access locally. Clearly, it must ask for that permission - and this request is a signal to the primary site that if any relevant updates have not yet made it to the remote site, they should be included as additional information along with the permission, rather than left to migrate there lazily at the storage level as would otherwise have been the case. This would be equivalent to the hardware-level so-called 'distributed cache' mechanism that's been bruited about here, save for the advantage that it executes at the software caching level and thus allows deferring hardware writes (such that more updates can accumulate in actively-updated pages before they're finally forced - once - back to disk).

Let's look at things another way: how does an approach like YottaYotta's meld with existing shared-storage file systems (or something like Oracle RAC) to maintain inter-site consistency?
Clearly, it must lie to the systems at each end of the inter-site link to make them think that they're accessing the *same* storage rather than two different instances of it, since that's all such software understands (transparency being YottaYotta's main rationale for existence). This means that it can't return storage-level 'write complete' to such software until either all sites have been updated, or at least all sites have been notified that an update to the relevant data is occurring (so that sites not yet updated can stall any local requests for that data until their update completes).

But doesn't such software already cooperate between instances at the multiple sites, such that anyone trying to access the data *knows* that it's in the process of being updated? Yes - but if YottaYotta *didn't* also coordinate (somewhat redundantly) at a lower level, and instead returned 'write complete' as soon as the local storage had been updated and just let the update propagate lazily elsewhere, the higher-level software might heave an immediate sigh of relief, release all interlocks, and the remote end might then try to access the data (as a new request for it raced in) before the update had fully migrated there. (Remember: since synchronous interlocking is required anyway, it's not the inter-site *latency* that's the issue here so much as the inter-site *bandwidth* - small synchronization messages may buzz around the complex relatively quickly, but it may take quite a while for bulk updates to propagate.)
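The revocable-permission scheme from the numbered points above can be sketched in miniature. Everything here is a hypothetical illustration - the class and method names are mine, not from any real product or from the file system under discussion: a primary site hands out revocable read permissions on byte ranges, and an update must first revoke every overlapping grant before it may proceed.

```python
# Hypothetical sketch of a primary-site permission manager: remote sites
# hold revocable read permissions on byte ranges of a file, and an update
# must first revoke every overlapping grant - the source of the "up to
# two round-trip small-message latencies" cost noted in point 3 above.

class PermissionManager:
    def __init__(self):
        self.grants = {}            # site -> list of (start, end) ranges readable locally
        self.revocations_sent = []  # (site, range) messages that would cross the WAN

    @staticmethod
    def _overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]

    def grant_read(self, site, start, end):
        """Let 'site' read file bytes [start, end) from its local replica."""
        self.grants.setdefault(site, []).append((start, end))

    def begin_update(self, updating_site, start, end):
        """Revoke overlapping grants at other sites; return how many sites
        needed a revocation round-trip before the update may proceed."""
        rng = (start, end)
        revoked_sites = 0
        for site, ranges in self.grants.items():
            if site == updating_site:
                continue
            keep = [r for r in ranges if not self._overlaps(r, rng)]
            if len(keep) != len(ranges):
                self.grants[site] = keep
                self.revocations_sent.append((site, rng))
                revoked_sites += 1
        return revoked_sites

# Demo: only the grant that overlaps the updated range is revoked.
pm = PermissionManager()
pm.grant_read("siteB", 0, 4096)
pm.grant_read("siteC", 8192, 16384)
revoked = pm.begin_update("siteA", 0, 1024)
```

A real implementation would, per point 6 above, also piggyback any not-yet-propagated update data on the next re-grant rather than letting it migrate lazily at the storage level.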
So the major difference between something like YottaYotta and the system that I've described above (leaving aside the cost of the otherwise-superfluous YottaYotta hardware and its potential for creating a bottleneck in its own right) is that (assuming YottaYotta does not in fact propagate large updates synchronously - the main 'advantage' apparently being touted for it in this discussion) a local large update may get 'write complete' back sooner with YottaYotta (i.e., rather than making the local updater wait for synchronous propagation, any subsequent remote accessor instead has to wait until the update gets there).

This has a down-side (which I mentioned before) in that it makes the local updater think that the write is now safely distributed redundantly across the remote sites, when in fact only a single copy may yet exist, and a local storage failure before it gets fully propagated could lose it - not a legitimate situation at all if you're using YottaYotta to create a supposedly site-disaster-tolerant configuration, though perhaps OK if all you're using it for is to make access at remote sites more efficient (though, as noted, you can do as well using a well-designed distributed file system, and without the associated down-side).

But is there any significant compensating up-side to making the local update appear to complete early? Not really: accessors who are interested in reading only persistent data likely also want it to be replicated to the expected redundancy level, whereas accessors who don't care can just access it via the distributed file system from the originating instance's cache (in toto at the local site where it was written, and at least in small amounts that can be sent efficiently across the inter-site links elsewhere - though for bulk access elsewhere they'll have to wait until the entire update arrives there, just as they do with the YottaYotta approach).
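The size of that single-copy exposure can be put in rough numbers with a back-of-envelope calculation. The update size and link speed below are illustrative assumptions of mine, not figures from the discussion: until a bulk update crosses the inter-site link, only one copy exists, so the window of vulnerability is roughly the update size divided by the inter-site bandwidth.

```python
# Back-of-envelope single-copy exposure window for an early-acked bulk
# update: until the data crosses the inter-site link, only one copy
# exists, and a local storage failure loses it.  Figures below are
# illustrative assumptions, not numbers from the discussion.

update_bytes = 1 * 10**9           # assume a 1 GB bulk update
link_bits_per_sec = 100 * 10**6    # assume a 100 Mbit/s inter-site link

window_sec = update_bytes * 8 / link_bits_per_sec
print(f"single-copy exposure window: about {window_sec:.0f} seconds")  # about 80 seconds
```

Eighty seconds is a long time to be one disk failure away from losing a supposedly replicated write - which is the crux of the down-side described above.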
Does YottaYotta at least do better as inter-site latencies increase (say, due to use of satellite links)? Nope: small synchronous inter-site interlocking messages take the same amount of time to propagate regardless of whether they're doing so in the service of YottaYotta or of a better solution, and for large updates bandwidth is bandwidth, regardless.

Are there times when *real* asynchronous replication has advantages? Certainly (though not if you want concurrent access to anything beyond snapshot-style data at the remote end): once you don't have to wait for any synchronous inter-site interlocking, you can do things like put your transaction log into NVRAM and get update response times down to tens of microseconds rather than close to 10 milliseconds. Even there, though, it doesn't help as much with throughput, since the higher-latency log writes that you get with synchronous inter-site replication just let more log entries accumulate to be destaged in the next log write (i.e., where transaction throughput is concerned, your real limit is log write bandwidth rather than log write latency, even though response times may suffer some).

I wouldn't have bothered to go into this level of detail for friend Arne, since I seriously doubt that he could follow it even if he put in much more effort than he appears capable of. But it's an area of the file system that I'm developing that I've never thought through in quite this level of detail before, because I've never considered inter-continental site separation to be a major goal for it - so it's been a useful exercise for my own benefit, and I hope of some interest to a few people here.

- bill
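To put numbers on the log-throughput point above, here is a toy group-commit model (the arrival rate and flush latencies are illustrative assumptions, not figures from the post): each log flush carries every transaction that arrived since the previous flush, so a slower flush just means a bigger batch, not a lower transaction rate - until log *bandwidth* runs out.

```python
# Toy group-commit model: transactions arrive at a fixed rate; each log
# flush takes one synchronous round-trip and writes out everything that
# accumulated since the previous flush.  Sustained throughput therefore
# tracks the arrival rate regardless of flush latency (until log write
# bandwidth is exhausted); only response time and batch size change.
# All figures are illustrative assumptions.

def steady_state(arrival_rate_tps, flush_latency_sec):
    """Return (throughput in txns/s, transactions carried per flush)."""
    batch = arrival_rate_tps * flush_latency_sec
    return batch / flush_latency_sec, batch

# Local NVRAM log: flushes complete in ~50 microseconds.
nvram_tps, nvram_batch = steady_state(10_000, 50e-6)
# Synchronous inter-site log: flushes take ~10 milliseconds.
wan_tps, wan_batch = steady_state(10_000, 10e-3)

# Throughput is pinned at the 10,000/s arrival rate in both cases;
# only per-transaction response time (and the batch size, ~200x larger
# for the inter-site log) differs.
```

This is of course a steady-state sketch: it ignores the log-bandwidth ceiling and per-record overheads, which is exactly where the real limit bites once batches get large enough.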