EMC to IBM SAN LUN replication



 
 
#31 - October 24th 06, 11:58 PM - Bill Todd (posted to comp.arch.storage)

Arne Joris wrote:
> Good grief Bill, are you one of those people who type more as they
> grow angrier? Thank you for all your replies, but I'm starting to get
> a bit ticked off by your rather abrasive style.


Then shape up, pay a *lot* better attention to what you've (at least
presumably) read before responding to it, and you'll stop deserving that
style.


> If we can get back to what started this whole discussion, I stated that
>
> ... if your secondary site is not just a remote data vault but an
> actual production site where people need access to the data, it is
> pretty lame to have servers at the secondary site go over the WAN to
> read the data from the primary storage when they have a copy of the
> data right there! With a distributed block cache on top of
> asynchronous data replication, you could have both sites do I/O to the
> same volumes and access their local storage.
>
> I never heard you make any successful arguments against this,


Then you just weren't listening. The better (software-based)
alternatives that I described allow users to access data with no more
inter-site synchronous (or asynchronous) activity than the low-level
kludge that you're touting - and in many cases considerably less such
activity. They also allow far more efficient updating, by supporting
file-(or application-)level write-back caching *above* the storage level
protected by established single-site persistence mechanisms (such as
journaling or careful update) and coordinated between sites by the same
mechanisms that multiple cooperating instances of a file system (or
application of any complexity) have to use *anyway*.

Keep reading what I wrote until you at least *begin* to understand it,
or stop bleating about how little respect your babbling has received.

> your only beef with me is that you claim it would be stupid to use a
> distributed block cache appliance (yes I know you don't like that
> wording) to do this, seeing how the only useful way for apps at the
> two sites to use the data requires synchronous messaging between them
> anyway.


Thus obviating the need for any such proprietary, special-purpose, and
typically somewhat expensive hardware.


> Then there was a whole lot of talk about VMS, I'm not sure why that
> was relevant.


Primarily (as I've already noted) because of your drivel about
inter-site synchronization of this ilk being 'emerging technology'
rather than very old hat with a new shiny ribbon around it.

> Despite increasing levels of toxicity (unilaterally on your part, I
> might add), I have kept on trying to argue that for certain
> solutions, the appliance solution makes sense.


And you've failed in that attempt, save in cases where no better
solution already exists and there's some real need for what YottaYotta
offers. Those cases at a minimum *don't* include Oracle RAC (it has its
own, better, and more comprehensive software mechanisms to deploy here),
or systems like VMS (and IIRC HACMP on AIX), or situations where those
systems can be used as inter-site local file or database servers for
less competent clients - nor, of course, other situations where
moderately lackadaisical remote-site access is not that big a deal.

In other words, that space of "certain solutions [where] the appliance
solution makes sense" is at best a very narrow one, and YottaYotta's
rather limited success tends to reflect this.

> I think you have been thinking about clustered applications with tight
> coherency requirements and can't or won't see beyond them.


As usual, you think incorrectly. I guess you just can't wrap what
passes for your intellect around the fact that clustering, even with
tight coherency, does not necessarily imply that site-local data can't
be accessed at all sites as long as said local data is not actively
being updated at the time of use (that's what VMS's distributed lock
management and distributed file system are largely about).

Of course, accessing site-local data is often not particularly important
at all, unless you're accessing it in bulk: as long as you can safely
access the *contents* of a large file reliably from local storage, the
fact that the file open operation may have taken an extra second because
it was performed 3,000 miles away tends not to be all that significant
(VMS doesn't do things that way - I just mention it to indicate that a
*range* of effective approaches exists between a comprehensive ability
to migrate *all* synchronization control and a more limited ability for
a centralized mechanism to farm out specific revocable permissions to
where they'll be useful).


> I'm fine with the fact you think I deserve a newsgroup lashing for
> arguing for something you are convinced is impossible or impracticable,


Except that, once again, you just haven't been paying attention: you
already propped up that straw man before, and I already responded that
what you're describing is eminently possible and practical, just not
particularly useful in any general sense compared with the available
alternatives.

By the way, I notice that you didn't choose to respond to my implied
question about whether you have (or had) any relationship with
YottaYotta. Inquiring minds want to know...

- bill
#32 - October 25th 06, 04:59 AM - Bill Todd (posted to comp.arch.storage)

Bill Todd wrote:

> ...
>
> Of course, accessing site-local data is often not particularly
> important at all, unless you're accessing it in bulk: as long as you
> can safely access the *contents* of a large file reliably from local
> storage, the fact that the file open operation may have taken an
> extra second because it was performed 3,000 miles away tends not to
> be all that significant


Rats - classic math-in-my-head off-by-one (order of magnitude) error:
6,000 mile round-trip small-message latency should be more like 0.1
second, not 1 second (having allowed a factor of 3 for fibre and
switching overheads compared with raw light velocity, though that factor
is just a SWAG) - at least if small messages get reasonable priority
compared with bulk transfers because this has minimal effect on the
latter anyway.
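
For the record, here's that arithmetic spelled out (a back-of-the-envelope
check in Python, using the same factor-of-3 SWAG as above):

    # Back-of-the-envelope check of the round-trip latency estimate.
    miles = 6000.0              # round-trip distance
    km = miles * 1.609          # ~9,656 km
    c_vacuum = 300000.0         # speed of light in vacuum, km/s
    effective = c_vacuum / 3.0  # factor of 3 for fibre/switching overheads
    print(km / effective)       # ~0.097 s, i.e. about 0.1 second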

The result being that synchronous replication with multi-site access
using somewhat simpler distributed-lock mechanisms than VMS's is for
most workloads eminently feasible over such distances, as long as one is
willing to tolerate bandwidth-limited performance at the instants when
large updates must be made persistent:

1. Small operations (say, a file look-up plus open: I'm going to use
files in this example since most of the real world does, but similar
mechanisms can be used to coordinate access to block ranges) can just
get sent from remote sites to the primary site to execute there. Many
will execute at speeds at least somewhat comparable to what they'd see
if they used locally-stored data, since their targets (if of any general
interest) will be cached in RAM at the primary site (and a 0.1 second
round-trip message latency is thus comparable to having to make a dozen
+/- local disk accesses to complete the operation).

2. Sufficiently small files may just be served up from the primary site
along with the associated open operation when this makes sense (e.g., if
their content is already cached at the primary site); otherwise, the
primary site just gives the interested remote site revocable permission
to read the file content (regardless of its size) from its own local
storage (see the sketch after this list for what such grants might look
like).

3. Small updates performed at a remote site can be efficiently
communicated to the primary site for application there (and forwarding
to any other remote replicas elsewhere); since the initiating remote
site already has a copy, it can just hold onto it until the primary site
gives it the go-ahead to update its local storage. This does entail
temporarily revoking any read permissions on the affected byte range of
the file that have been granted to other sites, so may incur as much as
two round-trip small-message latencies (but any other form of
distributed coherence mechanism incurs at least one such round-trip
latency).

4. Large updates can be performed at a remote site in much the same
manner: only permissions for the byte range being updated need be
revoked from other sites (if any have been handed out). A mechanism to
reserve bulk-update permission (incrementally revocable after use) for
regions beyond current EOF is also required for append operations.

5. When inter-site bandwidth is constrained, large updates can't become
persistent across all the sites very quickly, but (again) that's true
regardless of the inter-site synchronization mechanism used. However,
many intermediate-sized updates (e.g., those that use large on-disk
pages to cluster related metadata for access efficiency, but which
normally get updated only one small piece at a time) can be written back
lazily (as distinct from asynchronously, but accomplishing a very
similar result), because all the information needed to make the related
changes persistent after any service interruption has already been
captured on disk in small logical log records. Another way to view this
is that inter-site bandwidth limitations simply cause large updates to
be written out to (multi-site) disks slowly (again, as distinct from
asynchronously) - but since the data already exists in memory at the
originating site, it's available there and will become available at
other sites as soon as it can get there, so there's no difference in
either accessibility or in persistent redundancy between this approach
to bulk updates and any other (such as YottaYotta's) on those scores.

6. That leaves the question of what happens when a remote site wants
data that it does not have current permission to access locally.
Clearly, it must ask for that permission - and this is a signal to the
primary site that if any relevant updates have not yet made it to the
remote site they should be included as additional information, along
with the permission, rather than left to migrate there lazily at the
storage level as would otherwise have been the case. This would be
equivalent to the hardware-level so-called 'distributed cache' mechanism
that's been bruited about here, save for the advantage that it executes
at the software caching level and thus allows deferring hardware writes
(such that more updates can accumulate in actively-updated pages before
they're finally forced - once - back to disk).
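
To pin points 2 and 3 down a bit, here's a minimal single-process sketch
of the read-grant bookkeeping at the primary site. It's Python, every
name in it is hypothetical illustration rather than anything shipping,
and a real implementation would obviously need actual messaging,
timeouts, and crash recovery:

    # Sketch of the primary site's revocable read-permission logic.
    class PrimarySite:
        def __init__(self, site_ids):
            self.site_ids = site_ids
            # (file_id, byte_range) -> sites allowed to read that range
            # from their own local storage
            self.grants = {}
            self.revocations = []   # revoke messages we'd have sent

        def grant_read(self, site, file_id, byte_range):
            # Point 2: hand a remote site revocable permission to read
            # the range from its local replica (no bulk data on the WAN).
            self.grants.setdefault((file_id, byte_range), set()).add(site)

        def update(self, site, file_id, byte_range):
            # Point 3: before the updating site touches its local copy,
            # revoke read grants other sites hold on the affected range -
            # at most two round-trip small-message latencies in all.
            key = (file_id, byte_range)
            for other in self.grants.get(key, set()) - {site}:
                self.revocations.append((other, key))  # 'revoke' message
            self.grants[key] = {site}
            return 'go-ahead'   # initiator updates local storage; the
                                # bulk data migrates elsewhere lazily

    primary = PrimarySite({'A', 'B'})
    primary.grant_read('B', 'f1', (0, 4096))
    print(primary.update('A', 'f1', (0, 4096)))  # -> go-ahead
    print(primary.revocations)       # -> [('B', ('f1', (0, 4096)))]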

Let's look at things another way: how does an approach like
YottaYotta's meld with existing shared-storage file systems (or
something like Oracle RAC) to maintain inter-site consistency? Clearly,
it must lie to the systems at each end of the inter-site link to make
them think that they're accessing the *same* storage rather than two
different instances of it, since that's all such software understands
(transparency being YottaYotta's main rationale for existence). This
means that it can't return storage-level 'write complete' to such
software until either all sites have been updated or at least all sites
have been notified that an update to the relevant data is occurring (so
that sites not yet updated can stall any local requests for that data
until their update completes).
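
In code form, that completion rule looks something like this (again
Python, and again purely illustrative of the notify-every-site-first,
acknowledge-second ordering rather than anyone's actual product):

    # A write may be acknowledged only after every site has either
    # applied it or been told to stall reads of the affected block.
    class Site:
        def __init__(self):
            self.blocks, self.pending = {}, set()
        def read(self, block):
            if block in self.pending:
                return 'stall until the update arrives'  # never stale
            return self.blocks.get(block)

    def distributed_write(sites, block, data):
        # Phase 1 (synchronous small messages): mark the block as
        # 'update in flight' everywhere so local reads of it stall.
        for s in sites:
            s.pending.add(block)
        ack = 'write complete'   # only now is the ack safe to return
        # Phase 2 (bandwidth-limited, possibly slow): bulk propagation.
        for s in sites:
            s.blocks[block] = data
            s.pending.discard(block)
        return ack

    a, b = Site(), Site()
    print(distributed_write([a, b], 7, 'new data'))  # -> write complete
    print(a.read(7))                                 # -> new data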

But doesn't such software already cooperate between instances at the
multiple sites such that anyone trying to access the data *knows* that
it's in the process of being updated? Yes - but if YottaYotta *didn't*
coordinate (somewhat redundantly) at a lower level as well, but
instead returned 'write complete' as soon as the local storage had been
updated and just let the update propagate lazily elsewhere, the
higher-level software might heave an immediate sigh of relief, release
all interlocks, and the remote end might then try to access the data (as
a new request for it raced in) before the update had fully migrated
there (remember: since synchronous interlocking is required anyway,
it's not the inter-site *latency* that's the issue here so much as the
inter-site *bandwidth*, so while small synchronization messages may buzz
around the complex relatively quickly, it may take quite a while for bulk
updates to propagate).

So the major difference between something like YottaYotta and the system
that I've described above (leaving aside the cost of the
otherwise-superfluous YottaYotta hardware and its potential for creating
a bottleneck in its own right) is that (assuming that YottaYotta does
not in fact propagate large updates synchronously - the main 'advantage'
apparently being touted for it in this discussion) a local large update
may get 'write complete' back sooner with YottaYotta (i.e., rather than
make the local updater wait for synchronous propagation, instead any
subsequent remote accessor has to wait until it gets there). This has a
down-side (which I mentioned before) in that it makes the local updater
think that the write is now safely distributed redundantly across the
remote sites, when in fact only a single copy may yet exist and a local
storage failure before it gets fully propagated could lose it - not a
legitimate situation at all if you're using YottaYotta to create a
supposedly site-disaster-tolerant configuration, though perhaps OK if
all you're using it for is to make access at remote sites more efficient
(though, as noted, you can do as well using a well-designed distributed
file system, and without the associated down-side).

But is there any significant compensating up-side to making the local
update appear to complete early? Not really: accessors who are
interested in reading only persistent data likely also want it to be
replicated to the expected redundancy level, whereas accessors who don't
care can just access it via the distributed file system from the
originating instance's cache (in toto at the local site where it was
written, and at least in small amounts that can be sent efficiently
across the inter-site links elsewhere, though for bulk access elsewhere
they'll have to wait until the entire update arrives there - just as
they do with the YottaYotta approach).

Does YottaYotta at least do better as inter-site latencies increase
(say, due to use of satellite links)? Nope: small synchronous
inter-site interlocking messages take the same amount of time to
propagate regardless of whether they're doing so in the service of
YottaYotta or of a better solution, and for large updates bandwidth is
bandwidth, regardless.

Are there times when *real* asynchronous replication has advantages?
Certainly (though not if you want concurrent access to anything beyond
snapshot-style data at the remote end): once you don't have to wait for
any synchronous inter-site interlocking, you can do things like put your
transaction log into NVRAM and get update response times down to tens of
microseconds rather than close to 10 milliseconds. Even there, though,
it doesn't help as much with throughput, since the higher-latency log
writes that you get with synchronous inter-site replication just let
more log entries accumulate to be destaged in the next log write (i.e.,
where transaction throughput is concerned your real limit is log write
bandwidth rather than log write latency, even though response times may
suffer some).
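
A quick worked example of that latency-vs-throughput point (all the
numbers below are assumptions for illustration, not measurements):

    # Why log-write latency caps response time but not throughput,
    # given group commit.  Illustrative numbers only.
    latency_s = 0.010      # ~10 ms synchronous inter-site log write
    log_bw = 100e6         # assumed 100 MB/s of log-write bandwidth
    entry = 500.0          # assumed 500-byte log entry per transaction

    # Each 10 ms log write carries everything that accumulated meanwhile:
    per_write = log_bw * latency_s / entry   # ~2,000 entries per commit
    tps = per_write / latency_s              # ~200,000 transactions/sec
    print(per_write, tps)  # tps == log_bw / entry: latency cancels out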

I wouldn't have bothered to go into this level of detail for friend
Arne, since I seriously doubt that he could follow it even if he put in
much more effort than he appears capable of. But it's an area of the
file system that I'm developing that I've never thought through in quite
this level of detail before, because I've never considered
inter-continental site separation to be a major goal for it - so it's
been a useful exercise for my own benefit, and I hope of some interest
to a few people here.

- bill
 



