A computer components & hardware forum. HardwareBanter

If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

Go Back   Home » HardwareBanter forum » General Hardware & Peripherals » Storage & Hardrives
Site Map Home Register Authors List Search Today's Posts Mark Forums Read Web Partners

EMC to IBM SAN LUN replication



 
 
Thread Tools Display Modes
  #21  
Old October 22nd 06, 05:48 PM posted to comp.arch.storage
Arne Joris
external usenet poster
 
Posts: 14
Default EMC to IBM SAN LUN replication

Bill Todd wrote:
Perhaps you should actually learn something about VMS clustering before
presuming to expound on the deficiencies that you imagine it has. VMS's
distributed file system has, since the mid-'80s, provided exactly what
you describe above: an environment in which multiple instances of dumb
applications executing at different sites can concurrently share (and
update) the same file(s), with the same integrity guarantees that they'd
enjoy if they were all running on a single machine.


Jeez Bill, are you trying to play the old geezer claiming anything new
is at best a poor imitation of something they've been using for 20
years :-)
You are right, I know nothing about VMS. But it makes me think of
mainframes and a proprietary file system, which would mean you can't
just take your existing SAP, Exchange and StorNext filesystem services,
and make them run on a VMS cluster, am I right ? If a storage appliance
does all this, you can connect the WAN link and the SAN at each site to
the storage appliances and every service can run on it's own server
with whatever OS it prefers, connecting to the local SAN, using
whatever block acces it prefers (raw block device, file system,...)
That's what I meant by the app doesn't need to know anything about the
distributed storage, nor is there any requirement for server hardware,
OS or communication middle-ware.

And exactly how do you think that the secondary site's portion of your
hypothetical distributed cache would *know* that its local data was
stale, without some form of synchronous communication with the primary
site? Your hands are waving pretty fast, but I suspect you really don't
know much about this subject at the level of detail that you're
attempting to discuss.


I think you know the answer as well as I do : by having a distributed
cache manager, one that tries to put the cache directory for a given
block range at the site with the most intensive I/O pattern. If you
make those directory entries large enough you can reduce the overhead
of having to ask the cache manager for every I/O.
Yes there will of course we some synchronous traffic between the sites,
for communication between the parts of the distributed cache manager.

I find it curious you suspect I don't know anything about the subject.
Do you think it's impossible or impractical and therefor anyone
claiming the opposite must be full of it ?

If one assumes a major bandwidth bottleneck between the sites, such that
small communications about data currency can be synchronous but large
updates cannot easily be, there might be *some* rationale for the kind
of system that you seem to be describing. But in that case,
secondary-site accessors aren't going to get very good service when they
want something that has changed recently.


If they want something that has changed recently *by another site* they
won't get good service, that's right. There is no way around the fact
that if you want access to the data just written at the other site, you
need to pull it over the WAN and it will take a while.
....

There is no lock when the storage appliance maintains the inter-site
coherence;


It sounds as if you may be confused again: applications don't *see* any
locking using inter-site VMS distributed file access any more than they
see it doing local access: the 'locking' involved is strictly internal
to the file system (just as it is for a local-only file system), and is
largely involved in guaranteeing the atomicity of data updates (if two
accessors try to update the same byte range, for example, the result
really ought to be what one or the other wrote, rather than some mixture
of the two) or internal file system updates (say, an Open and a Delete
operation racing each other, where they really need to be serialized
rather than mixed together).


Well like I said, I know squat about VMS. I wouldn't want to argue VMS
with someone who's clearly a VMS guru, with such ability to produce
page-long technical arguments at the slightest hint of ignorance :-)
Let me retract my earlier statement about high-end services needing to
be designed to run on VMS, I have offended you.

Or perhaps you're suggesting that with block-level inter-site
replication no inter-site locking is required to support inter-site
file-level access. If so, you're simply wrong: even if the remote site
applies updates in precisely the same order in which they're applied at
the primary site, lack of access to the primary site's in-memory file
system context remains a problem (i.e., the secondary site is still
stale in that sense unless the primary site's file system does something
special to ensure that it isn't, outside the confines of your 'storage
appliance').


No I thought I had said that applications requiring a file system need
to run a distributed file system on top of the block service. Some
distributed file systems have their own distributed locking mechanism
to lock blocks or files, others use a single server at the primary
site. But Oracle for example can use raw block devices and does it's
own locking and synchronisation (with RAC), so there's not always a
need to provide one as part of the storage system.

if applications need a lock manager (like distributed file
systems for example), they'll need to provide their own.


What's this "if"? The only instances in which they will *not* require
something like a distributed lock manager are exactly those which Nick
described: using snapshot-style facilities at the secondary site to
create an effectively separate (usually read-only) environment which can
then be operated upon in whatever manner one wants.


How about the migrating services that can move between the sites but
will only access data at a single site at any time ? There is no data
locking required, the service can simply assume it's data is always
there and it can always access it. And like I said, most applications
requiring locks already provide them themselves.

You can only avoid latency penalties by not accessing the same data at
both sites within the async replication window.


Same problem I noted above: unless you've got synchronous oversight to
catch such issues (even if the actual updates can be somewhat
asynchronous), it's all too easy to get stale data as input to some
operation that will then use it to modify other parts of the system.


But that is exactly what a distributed cache does, keeping track of
dirty blocks and fetching them automatically across sites when required
! That cache flushes to an asynchronous data replication service, so
that if the cache doesn't have the blocks you want to do I/O to, it's
ensured to be on your local storage.

Luckily there are many
I/O patterns that can avoid this. My example of agent migration for
example, would have a piece of data being accessed from a single site
only, until the agent moves to the other side.


That may work OK for read-only access, but then so does a block-level
snapshot (just take one when you move the agent). Where data gets
updated (or, perhaps worse, appended to), you've got allocation activity
at the new site which must be carefully coordinated with the primary site.


Creating a snapshot and making it available on the other site requires
the migration service to integrate with the storage at both sites,
which is not just a trivial matter ! With a distributed block-level
cache it needs to know nothing about the storage, and in fact the
storage can be from different vendors at both sites (what started this
thread !), something that wouldn't work with your snapshot example.

For any application
where data is produced only once, rarely modified but read often, the
latency hits are minimal, and the ease of just being able to read
up-to-date data at all sites at any moment is a big benefit.


If the updates really are that rare, then synchronous replication and
VMS-style site-local access will work fine.


That all depends on your definition of rare, and the latencies you are
talking about. If you have say 2000 km worth of latency, synchronous is
not an option even for very modest updates.

The apps
at all sites can be completely unaware of synchronisation windows and
do not need to be made aware of each other.


Just as has been the case with VMS for decades - as I said.


Alright, VMS rocks and so do distributed cache appliances, but VMS has
been rocking for 20 years longer :-) Sort of a Neil Young versus Green
Day argument we're having here.

Arne

  #22  
Old October 22nd 06, 06:11 PM posted to comp.arch.storage
Arne Joris
external usenet poster
 
Posts: 14
Default EMC to IBM SAN LUN replication

Nik Simpson wrote:
Arne Joris wrote:
Nik Simpson wrote:
If you want to do processing of data for applications like data mining
or backup, then you can take a snapshot of the remote mirror (both have
the ability to do that) and serve the snapshot up to an application
server for processing. What else did you have in mind, and if "there are
more options" then why not share you knowledge with us?


Well for example if your secondary site is not just a remote data vault
but an actual production site where people need access to the data, it
is pretty lame to have severs at the secondary site go over the WAN to
go read the data from the primary storage when they have a copy of the
data right there ! With a distributed block cache on top of
asynchronous data replication, you could have both sites do I/O to the
same volumes and access their local storage.


If that's what you want to do, then it makes a hell of a lot more sense
to do the replication at the file system level not at the LUN, because
you need something in the replication layer that understands
synchronization issues at the file system.


If you assume every data block in every file could be being updated at
any moment, you are right.
But take for example seismic data files, which get produced as an
hours-long sequential dump of data. Why should a processing app at site
2 need to wait until the entire file had been written at site 1 ? It
could simply start reading at block 0, and perhaps even do some
modifications to the data, as long as it hadn't crossed site 1's
writing barrier.

Or take the example of web servers running at both sites, and at each
site there could be people writing new content (think a slashdot-like
situation) into their own data region. The web servers might not get
notified every time a new document is added at the remote site, but an
occasional directory refresh will show the new content and that is more
than enough for a lot of applications.
Obviously you'll need some form of agreement between the sites to only
use certain block ranges for new I/O so they won't all start writing to
the same regions.

Arne

  #23  
Old October 22nd 06, 08:20 PM posted to comp.arch.storage
Bill Todd
external usenet poster
 
Posts: 162
Default EMC to IBM SAN LUN replication

Arne Joris wrote:
Bill Todd wrote:
Perhaps you should actually learn something about VMS clustering before
presuming to expound on the deficiencies that you imagine it has. VMS's
distributed file system has, since the mid-'80s, provided exactly what
you describe above: an environment in which multiple instances of dumb
applications executing at different sites can concurrently share (and
update) the same file(s), with the same integrity guarantees that they'd
enjoy if they were all running on a single machine.


Jeez Bill, are you trying to play the old geezer claiming anything new
is at best a poor imitation of something they've been using for 20
years :-)


No: I'm an old geezer observing that every single damn thing that
you've characterized as 'emerging technology' is in fact very old hat:
VMS had it over two decades ago, IBM had it a decade ago in Parallel
Sysplex (and to a lesser extent in HACMP on AIX), other Unixes have been
developing it more recently, as well as third-parties (Mercury's SANergy
being one early example, a shared-storage/central metadata server
implementation supporting both Windows and Unix later bought by IBM for
Tivoli which could be used to achieve most of what you've described,
though not as flexibly as VMS facilities can): it's Windows that's the
real laggard (Microsoft was working with DEC in the '90s to try to move
in this direction, but came up far short and never rectified that: they
appear to have decided that concurrently-shared-storage architectures,
whether real or virtual, were not the way they wished to go).

You are right, I know nothing about VMS.


Then why in hell did you presume to try to characterize its facilities
as anything other than precisely what I told you they were? When I observed

"Since VMS has been doing exactly these kinds of things since the
mid-'80s, calling it 'emerging technology' (rather than, say, catch-up
implementations) seems a bit of a stretch."

you responded

"The difference is..."

without knowing a damn thing about whether any difference of the form
that you went on to describe actually existed (and, of course, it didn't).

But it makes me think of
mainframes and a proprietary file system, which would mean you can't
just take your existing SAP, Exchange and StorNext filesystem services,
and make them run on a VMS cluster, am I right ?


Well, SAP of course *used* to run natively on VMS, until it decided that
cHumPaq was insufficiently committed to the platform to make continuing
to do so worthwhile.

As for the rest, you can use the inter-site VMS cluster as a distributed
CIFS file server to serve Windows clients - in the manner that you
suggested using a distributed 'storage appliance'.

But none of them would run (whether on the 'storage appliance' that
you're imagining or on a VMS cluster) using multiple concurrent
instances unless they were designed to do so at the application level:
while the 'dumb Web server' that you originally mentioned might not
require any coordination between instances, things like SAP and Exchange
most definitely would.

If a storage appliance
does all this, you can connect the WAN link and the SAN at each site to
the storage appliances and every service can run on it's own server
with whatever OS it prefers, connecting to the local SAN, using
whatever block acces it prefers (raw block device, file system,...)


Only if a) that storage appliance is interlocking raw block access
synchronously (some VMS storage hardware does this and VMS cluster
software can also do it at a low level when handling the replication on
dumb hardware; I don't specifically remember whether they export raw
block-level inter-site access for application use, but see no reason why
they wouldn't) and b) you're using higher-level shared-storage
distributed file system software (executing by definition *outside* the
'storage appliance' if it indeed is exporting only block-level access,
as you state above) for file-level accesses.

That's what I meant by the app doesn't need to know anything about the
distributed storage,


Nor (once again) does it with the VMS facilities that I've described.

nor is there any requirement for server hardware, OS


All you've done is substitute this mythical 'storage appliance' for the
server hardware and OS.

or communication middle-ware.


*What* 'communication middle-ware', pray tell? Unless you're referring
to using something like CIFS to link Windows systems to the distributed
VMS servers, in which case if you want file-level access you'll instead
have to use shared-storage distributed file system 'middle-ware' on
those Windows servers to link to your block-level 'storage appliance'.


And exactly how do you think that the secondary site's portion of your
hypothetical distributed cache would *know* that its local data was
stale, without some form of synchronous communication with the primary
site? Your hands are waving pretty fast, but I suspect you really don't
know much about this subject at the level of detail that you're
attempting to discuss.


I think you know the answer as well as I do :


I suspect somewhat better, since I've actually designed and implemented
systems of this ilk a couple of times rather than just bloviated about them.

by having a distributed
cache manager, one that tries to put the cache directory for a given
block range at the site with the most intensive I/O pattern.


Ah, you're learning something from VMS already, I see - save that the
block level (rather than the file level) isn't the optimal place to do
it, and (if site-local access to data is as important as you seem to
think it is) using a distributed cache makes a lot less sense than using
distributed locks (since you'd rather access data locally - even if you
have to go to disk for it - as long as that's safe, regardless of
whether some other site might have it cached).

Now, if you're using 'distributed cache' (something I tend to use in the
context of allowing one system to benefit from the data in another's
cache rather than having to go to disk for it) to mean something much
more like 'distributed locking mechanism' (which tracks potential
synchronization issues such that they can be properly addressed should
they occur), then we're just using different terminology to describe the
same thing.

If you
make those directory entries large enough you can reduce the overhead
of having to ask the cache manager for every I/O.


Of course you have to interrogate the cache (or other inter-site
synchronization facility) on every I/O - the idea is that you only have
to interrogate the *site-local* portion of it, because it's being kept
synchronously up to date about any temporary inconsistencies.

Yes there will of course we some synchronous traffic between the sites,
for communication between the parts of the distributed cache manager.


Duh - you mean like the synchronous communication required between the
parts of VMS's distributed lock manager? Yup.


I find it curious you suspect I don't know anything about the subject.


Why would you find it curious that your earlier incompetent observations
about it would lead someone to that conclusion?

You do seem willing to learn, though - but you still appear a bit
confused about the level of inter-site coordination required by
concurrent distributed file-level access.

Do you think it's impossible or impractical and therefor anyone
claiming the opposite must be full of it ?


Au contrai it's eminently possible and practical (as I've noted, VMS
has been doing it for over 20 years, just somewhat better than the
approach which you sketched out would).

....

But Oracle for example can use raw block devices and does it's
own locking and synchronisation (with RAC), so there's not always a
need to provide one as part of the storage system.


If the storage system firmware (rather than the Oracle software) is
handling the inter-site block-level replication (as you seem to be
suggesting), then either that firmware needs to implement at least
short-term inter-site interlocks to ensure that regardless of which site
Oracle elects to obtain block-level data from the copy obtained is up to
date, or the Oracle code must ensure that until a write-complete ACK has
been received from the local storage hardware no remote access to that
block can occur (and the local hardware must not return completion
status until all copies have been updated).

By the way, the Oracle RAC DLM implementation is based on the VMS DLM
design given to them by DECpaq (the earlier Oracle Parallel Server
product originated on VMS, since that was the only cluster environment
that provided the required distributed lock management), and the AIX
HACMP DLM was a clone based on careful study of the VMS DLM design (not
surprising, since one of its main purposes was to allow OPS - with its
VMS DLM-based locking interface - to run in the HACMP environment).
Imitation often *is* the sincerest form of flattery.


if applications need a lock manager (like distributed file
systems for example), they'll need to provide their own.

What's this "if"? The only instances in which they will *not* require
something like a distributed lock manager are exactly those which Nick
described: using snapshot-style facilities at the secondary site to
create an effectively separate (usually read-only) environment which can
then be operated upon in whatever manner one wants.


How about the migrating services that can move between the sites but
will only access data at a single site at any time ?


No competent engineer would design a storage system claiming to be
multi-site accessible without appropriate interlocks (even supporting
reliable snapshot-style access at the remote site requires that remote
updates occur in the same *order* that primary-site updates do): at
most, they'd *allow* multi-site access with vehement and copious "on
your head be it" warnings about the potential perils of using it - which
would only be exacerbated (in terms of race windows) if the asynchronous
replication you've talked about were used.

As a specific example of a potential problem *above* the hardware level,
if your hypothetical service migrates at any speed exceeding that of
sneakernet, dirty data that it just wrote to the OS file system cache on
one site may not have been flushed to the underlying (distributed)
storage media by the time the service pops up on the other site (delays
of up to 30 seconds are common in Unix environments, for example).

There is no data
locking required, the service can simply assume it's data is always
there and it can always access it. And like I said, most applications
requiring locks already provide them themselves.


Horse****: most applications requiring locks don't need to think about
them at all, because the underlying file system is doing all that
transparently for them. Just *getting to* the data involves following a
multi-link path through the file system directory structure, an
operation that can't occur reliably on a 'secondary' site without some
degree of inter-site update coordination (for God's sake, if you're
using an optimized journaling file system the secondary site may have
the pertinent *log* entries but won't have the associated in-memory
update context: you can't use the file system there in anything
resembling up-to-date fashion without first performing a recovery from
the on-disk log).

....

Luckily there are many
I/O patterns that can avoid this. My example of agent migration for
example, would have a piece of data being accessed from a single site
only, until the agent moves to the other side.

That may work OK for read-only access, but then so does a block-level
snapshot (just take one when you move the agent). Where data gets
updated (or, perhaps worse, appended to), you've got allocation activity
at the new site which must be carefully coordinated with the primary site.


Creating a snapshot and making it available on the other site requires
the migration service to integrate with the storage at both sites,
which is not just a trivial matter !


*None* of this is as trivial a matter as you imagine it to be, as I hope
you're starting to learn (if you respond again, we should get a pretty
good idea of just how educable you are).

....

For any application
where data is produced only once, rarely modified but read often, the
latency hits are minimal, and the ease of just being able to read
up-to-date data at all sites at any moment is a big benefit.

If the updates really are that rare, then synchronous replication and
VMS-style site-local access will work fine.


That all depends on your definition of rare, and the latencies you are
talking about. If you have say 2000 km worth of latency, synchronous is
not an option even for very modest updates.


If update latency makes synchronous replication prohibitively slow
(VMS-style distributed lock management can allow the *reads* to proceed
at all sites without delay, as long as no nearby updates are occurring
at the time), then your only real option is to use ordered asynchronous
replication plus snapshots upon which any required recovery operations
are then performed before use (if you still don't understand why, reread
the existing material until the light dawns).

Unless, of course, you are using a 'careful update' file system such as
VMS's (Berkeley's 'soft update' mechanism may also qualify), which
should avoid the need for the recovery part (desirable in cases where
writable snapshots are not supported) - though you'll still need to use
a snapshot based on ordered underlying storage updates, not just wing it
with the underlying storage contents continuing to change beneath you.

- bill
  #24  
Old October 23rd 06, 01:35 AM posted to comp.arch.storage
Nik Simpson
external usenet poster
 
Posts: 73
Default EMC to IBM SAN LUN replication

Arne Joris wrote:
Nik Simpson wrote:
Arne Joris wrote:
Nik Simpson wrote:
If you want to do processing of data for applications like data mining
or backup, then you can take a snapshot of the remote mirror (both have
the ability to do that) and serve the snapshot up to an application
server for processing. What else did you have in mind, and if "there are
more options" then why not share you knowledge with us?
Well for example if your secondary site is not just a remote data vault
but an actual production site where people need access to the data, it
is pretty lame to have severs at the secondary site go over the WAN to
go read the data from the primary storage when they have a copy of the
data right there ! With a distributed block cache on top of
asynchronous data replication, you could have both sites do I/O to the
same volumes and access their local storage.

If that's what you want to do, then it makes a hell of a lot more sense
to do the replication at the file system level not at the LUN, because
you need something in the replication layer that understands
synchronization issues at the file system.


If you assume every data block in every file could be being updated at
any moment, you are right.


Thanks ;-)

But take for example seismic data files, which get produced as an
hours-long sequential dump of data. Why should a processing app at site
2 need to wait until the entire file had been written at site 1 ? It
could simply start reading at block 0, and perhaps even do some
modifications to the data, as long as it hadn't crossed site 1's
writing barrier.



Sounds good in theory, but if the replication is asynchronous with no
synchronous communication or lock manager, how does the application at
site remote site know how far the application at the local site's
"writing barrier" has moved at any given moment?

You also have to assume that the writing is sequential, i.e. at it's
absolute best this can only work for purely sequential I/O streams and
as such is somewhat similar to a streaming media server. As soon as any
element of randomness is added to the I/O the whole things goes to hell
in a hand basket because neither end knows what the hell is going on at
any given point, without maintaining some sort of synchronous communication.

Or take the example of web servers running at both sites, and at each
site there could be people writing new content (think a slashdot-like
situation) into their own data region. The web servers might not get
notified every time a new document is added at the remote site, but an
occasional directory refresh will show the new content and that is more
than enough for a lot of applications.


So, we are replicating at the block level write? And now I have
application #1 starting a new file, allocating an Inode and starting
allocate blocks from the free list, simultaneously or at around the same
time, application #2 at the remote site also opens a new file, gets an
inode and some free blocks, problem is they both got the same inode.
You'll note the obvious race conditions with two independent writers
allocating blocks on separate and asynchronous copies of the same
filesystem. I'd be interested to hear how you plan to get around this
with your appliances and distributed lock manager.


Obviously you'll need some form of agreement between the sites to only
use certain block ranges for new I/O so they won't all start writing to
the same regions.



A nice little hand wave there, sort of two separate file systems free
lists and inode maps, perhaps once site uses even numbers and the other
one sticks to odd numbers ;-)

BTW, in your original response to my post about DataCore/FalconStor you
said "there are more options" and then to Bill you claimed that this was
still an emerging market. To be an emerging market, I'd have to assume
that there is somebody somewhere that you think is doing something like
this, if so, please share.


--
Nik Simpson
  #25  
Old October 23rd 06, 05:00 PM posted to comp.arch.storage
Arne Joris
external usenet poster
 
Posts: 14
Default EMC to IBM SAN LUN replication

Nik Simpson wrote:
Sounds good in theory, but if the replication is asynchronous with no
synchronous communication or lock manager, how does the application at
site remote site know how far the application at the local site's
"writing barrier" has moved at any given moment?


There is synchronous messaging between the parts of the distributed
cache living at each site.
The cache is providing a way to transparently share the data that
hasn't been replicated yet between the two sites.
All writes go through the cache first where the data stays local unless
the other site needs it. Then the cache drains to the asynchronous
replication, which will write it to storage at both sites.

You also have to assume that the writing is sequential, i.e. at it's
absolute best this can only work for purely sequential I/O streams and
as such is somewhat similar to a streaming media server. As soon as any
element of randomness is added to the I/O the whole things goes to hell
in a hand basket because neither end knows what the hell is going on at
any given point, without maintaining some sort of synchronous communication.


Yeah this example for seismic data uses the knowledge of the purely
sequential stream to be able to start processing data at the other
site. For general access not based on any of these 'tricks', you'll
either need a distributed filesystem on top of the distributed cache if
you want files, or have the applications at both sites provide some
locking mechanisms themselves.

Or take the example of web servers running at both sites, and at each
site there could be people writing new content (think a slashdot-like
situation) into their own data region. The web servers might not get
notified every time a new document is added at the remote site, but an
occasional directory refresh will show the new content and that is more
than enough for a lot of applications.


So, we are replicating at the block level write? And now I have
application #1 starting a new file, allocating an Inode and starting
allocate blocks from the free list, simultaneously or at around the same
time, application #2 at the remote site also opens a new file, gets an
inode and some free blocks, problem is they both got the same inode.
You'll note the obvious race conditions with two independent writers
allocating blocks on separate and asynchronous copies of the same
filesystem. I'd be interested to hear how you plan to get around this
with your appliances and distributed lock manager.


I never said anything about a file system, but if you insist on using
files an not blocks, you could pre-allocate a file for each site, and
have each site use a database using it's own file to manage your
content. Both files are available at both sites, and a web server at
any site can extract the index from both files by starting a new
database instance on the remote file and querying against that.

BTW, in your original response to my post about DataCore/FalconStor you
said "there are more options" and then to Bill you claimed that this was
still an emerging market. To be an emerging market, I'd have to assume
that there is somebody somewhere that you think is doing something like
this, if so, please share.


I thought I already did but perhaps it wasn't explicit enough :
http://www.yottayotta.com/clusteredfilesys.html

Arne

  #26  
Old October 23rd 06, 06:54 PM posted to comp.arch.storage
Arne Joris
external usenet poster
 
Posts: 14
Default EMC to IBM SAN LUN replication

Bill Todd wrote:
No: I'm an old geezer observing that every single damn thing that
you've characterized as 'emerging technology' is in fact very old hat:
VMS had it over two decades ago, IBM had it a decade ago in Parallel
Sysplex (and to a lesser extent in HACMP on AIX), other Unixes have been
developing it more recently, as well as third-parties (Mercury's SANergy
being one early example, a shared-storage/central metadata server
implementation supporting both Windows and Unix later bought by IBM for
Tivoli which could be used to achieve most of what you've described,
though not as flexibly as VMS facilities can): it's Windows that's the
real laggard (Microsoft was working with DEC in the '90s to try to move
in this direction, but came up far short and never rectified that: they
appear to have decided that concurrently-shared-storage architectures,
whether real or virtual, were not the way they wished to go).


These are all server-based solutions. Putting it inside the SAN on a
storage appliance has benefits, but you don't seem to believe them.
That is the emerging technology part, not as much the code running on
those appliances or the ideas behind them, but the new solutions for
multi-site problems they offer.

rant about your offense taken at VMS ignorance deleted
Alright let's get over this VMS thing and move on...

But it makes me think of
mainframes and a proprietary file system, which would mean you can't
just take your existing SAP, Exchange and StorNext filesystem services,
and make them run on a VMS cluster, am I right ?


Well, SAP of course *used* to run natively on VMS, until it decided that
cHumPaq was insufficiently committed to the platform to make continuing
to do so worthwhile.

As for the rest, you can use the inter-site VMS cluster as a distributed
CIFS file server to serve Windows clients - in the manner that you
suggested using a distributed 'storage appliance'.

But none of them would run (whether on the 'storage appliance' that
you're imagining or on a VMS cluster) using multiple concurrent
instances unless they were designed to do so at the application level:
while the 'dumb Web server' that you originally mentioned might not
require any coordination between instances, things like SAP and Exchange
most definitely would.


Yup they do need it and in fact they already have their own
coordination mechanisms. So why not use the servers these services run
on natively and connect them to a SAN that provides a global cache
mechanism so they don't have to worry about moving data between the
sites ? What would VMS offer a service that was re-desgined to run on a
VMS cluster that this native solution wouldn't ?

...
All you've done is substitute this mythical 'storage appliance' for the
server hardware and OS.


Well a lot of people would rather buy some appliance boxes than
re-design their software to run on a new platform. If you can't see the
benefit of that, I don't know what to say anymore.

...
I think you know the answer as well as I do :


I suspect somewhat better, since I've actually designed and implemented
systems of this ilk a couple of times rather than just bloviated about them.


Oh now there's the grumpy old geezer again :-)

...
Now, if you're using 'distributed cache' (something I tend to use in the
context of allowing one system to benefit from the data in another's
cache rather than having to go to disk for it) to mean something much
more like 'distributed locking mechanism' (which tracks potential
synchronization issues such that they can be properly addressed should
they occur), then we're just using different terminology to describe the
same thing.


A distributed cache is a collection of caches on all the storage
appliance at all the sites. Perhaps I misunderstand your 'locking'; the
pieces of cache on all the appliances use sychronous messages to keep
coherency among them, and will send a chunk of dirty data at site 1 to
site 2 if a host at site 2 asks for it. Is that locking a cache region
? Yes something is keeping track of which appliance owns dirty data for
a given block range.

Of course you have to interrogate the cache (or other inter-site
synchronization facility) on every I/O - the idea is that you only have
to interrogate the *site-local* portion of it, because it's being kept
synchronously up to date about any temporary inconsistencies.


Right, all I/O goes through the distributed cache. If it knows a remote
appliance has an entry for a block you're doing I/O to, it will go get
it for you. If it knows there is no dirty data for a block and no local
read cache either, it will go read it from local storage for you.

I find it curious you suspect I don't know anything about the subject.


Why would you find it curious that your earlier incompetent observations
about it would lead someone to that conclusion?


Incompetent how ? By stating not just any app on any platorm can run on
VMS ? By failing to convince you I'm not an idiot ?

You do seem willing to learn, though - but you still appear a bit
confused about the level of inter-site coordination required by
concurrent distributed file-level access.


Thank you oh storage guru, I shall try to be worthy of your time. These
posts feel a lot like the wax-on/wax-off routine :-)
Seriously though, my main point has been that there are two levels of
inter-site coordination : one at the app level and one at the storage
level.

...
But Oracle for example can use raw block devices and does it's
own locking and synchronisation (with RAC), so there's not always a
need to provide one as part of the storage system.


If the storage system firmware (rather than the Oracle software) is
handling the inter-site block-level replication (as you seem to be
suggesting), then either that firmware needs to implement at least
short-term inter-site interlocks to ensure that regardless of which site
Oracle elects to obtain block-level data from the copy obtained is up to
date, or the Oracle code must ensure that until a write-complete ACK has
been received from the local storage hardware no remote access to that
block can occur (and the local hardware must not return completion
status until all copies have been updated).


Indeed. Until the storage says "yes I've accepted your I/O" (into
distributed cache in this case), Oracle must not allow other I/O to the
same block, or else there are no guarantees about what will be the
final data in there. But you see, the storage appliances don't lock
anything, it's Oracle that has to implement it this way.

...
As a specific example of a potential problem *above* the hardware level,
if your hypothetical service migrates at any speed exceeding that of
sneakernet, dirty data that it just wrote to the OS file system cache on
one site may not have been flushed to the underlying (distributed)
storage media by the time the service pops up on the other site (delays
of up to 30 seconds are common in Unix environments, for example).


Yes host caches are bad, if you want file systems you need a
distributed file system like CXFS, StorNext, PloyServe, ... They handle
host cache issues.

There is no data
locking required, the service can simply assume it's data is always
there and it can always access it. And like I said, most applications
requiring locks already provide them themselves.


Horse****: most applications requiring locks don't need to think about
them at all, because the underlying file system is doing all that
transparently for them. Just *getting to* the data involves following a
multi-link path through the file system directory structure, an
operation that can't occur reliably on a 'secondary' site without some
degree of inter-site update coordination (for God's sake, if you're
using an optimized journaling file system the secondary site may have
the pertinent *log* entries but won't have the associated in-memory
update context: you can't use the file system there in anything
resembling up-to-date fashion without first performing a recovery from
the on-disk log).


You are thinking about file systems and I was not. At the block level,
the most recent data is always present from all sites. No data locking
required, at least not at the block level. Apps need to implement
locking themselves.

....
Creating a snapshot and making it available on the other site requires
the migration service to integrate with the storage at both sites,
which is not just a trivial matter !


*None* of this is as trivial a matter as you imagine it to be, as I hope
you're starting to learn (if you respond again, we should get a pretty
good idea of just how educable you are).


Well I hope I'm not disappointing ! Did you think you scared me into
hiding with your biting sarcasm :-)
It is as trivial in this particular example of migrating agents. The
migration service only needs to ensure an agent is halted at site 1
before getting started at site 2 (and ofcourse save any of the agent's
state it needs to persist to resume service), and doesn't need to
bother with getting the data there.

That all depends on your definition of rare, and the latencies you are
talking about. If you have say 2000 km worth of latency, synchronous is
not an option even for very modest updates.


If update latency makes synchronous replication prohibitively slow
(VMS-style distributed lock management can allow the *reads* to proceed
at all sites without delay, as long as no nearby updates are occurring
at the time), then your only real option is to use ordered asynchronous
replication plus snapshots upon which any required recovery operations
are then performed before use (if you still don't understand why, reread
the existing material until the light dawns).


Are we talking about recovery now ? Yes you need write order coherency
in your asynchronous replication, and your app still needs to be able
to recover from partial but ordered loss of data.

Arne

  #27  
Old October 23rd 06, 09:34 PM posted to comp.arch.storage
Nik Simpson
external usenet poster
 
Posts: 73
Default EMC to IBM SAN LUN replication

Arne Joris wrote:
Nik Simpson wrote:
Sounds good in theory, but if the replication is asynchronous with no
synchronous communication or lock manager, how does the application at
site remote site know how far the application at the local site's
"writing barrier" has moved at any given moment?


There is synchronous messaging between the parts of the distributed
cache living at each site.


So the cache is synchronously mirrored between the two sites? If so, I
don't see how this is asynchronous in any conventional sense of the word
and will have the performance problems of any synchronous system when
the site-to-site latency is significant.

Also, your example of having a large sequential data set being written
at the source site and a reader at the remote site can be handled so
much more easily with a TCP/IP socket that I'm not sure what you think
this approach would buy given it's undoubted cost and complexity.

The cache is providing a way to transparently share the data that
hasn't been replicated yet between the two sites.


But in order for it to be seen at the remote site, there must be a copy
of the changed data at the remote site, so how does it allow for sharing
of "data that hasn't been replicated yet between the two sites."




I thought I already did but perhaps it wasn't explicit enough :
http://www.yottayotta.com/clusteredfilesys.html



No, you never mentioned Yotta in any post that I can recall. So now
we've got one clustered file system from a vendor with a history of not
delivering anything than marketing materials and implementing new (and
apparently efficient) ways of blowing through VC money at high speed.
This is actually a completely new incarnation of Yotta who managed to
spend the best part of $100M on a hi-end array that never shipped,
apparently there selling a different variety of snake oil now.

Based on what I can tell from the website, what they have appears to be
a distributed synchronous storage appliance, i.e. blocks are replciated
synchronously with users accessing a local filesystem. If so, that's
really not that special (as has been pointed out, it's nothing new)

Problems will be latency for writes as the various site lock-managers
negotiate for access which will be in addition to the "write" being
mirrored to each cache.


--
Nik Simpson
  #28  
Old October 23rd 06, 10:50 PM posted to comp.arch.storage
Bill Todd
external usenet poster
 
Posts: 162
Default EMC to IBM SAN LUN replication

Arne Joris wrote:
Bill Todd wrote:
No: I'm an old geezer observing that every single damn thing that
you've characterized as 'emerging technology' is in fact very old hat:
VMS had it over two decades ago, IBM had it a decade ago in Parallel
Sysplex (and to a lesser extent in HACMP on AIX), other Unixes have been
developing it more recently, as well as third-parties (Mercury's SANergy
being one early example, a shared-storage/central metadata server
implementation supporting both Windows and Unix later bought by IBM for
Tivoli which could be used to achieve most of what you've described,
though not as flexibly as VMS facilities can): it's Windows that's the
real laggard (Microsoft was working with DEC in the '90s to try to move
in this direction, but came up far short and never rectified that: they
appear to have decided that concurrently-shared-storage architectures,
whether real or virtual, were not the way they wished to go).


These are all server-based solutions. Putting it inside the SAN on a
storage appliance has benefits, but you don't seem to believe them.


Rather, trying to 'put it inside the SAN on a storage appliance' has
severe limitations, but you don't seem to understand them.

That is the emerging technology part,


No, it's the bull**** part. Try to follow along this time: I won't
bother attempting to educate you again.

....

Yup they do need it and in fact they already have their own
coordination mechanisms. So why not use the servers these services run
on natively and connect them to a SAN that provides a global cache
mechanism so they don't have to worry about moving data between the
sites ?


Because when you limit yourself to raw block-level access (as you claim
is your intent later on in your post) 'moving data between the sites' is
the *easy* part, and these coordinating instances of a
block-level-access application are in a better position to do so
intelligently than some generic (and somewhat mis-labeled, as should
become obvious) 'caching' mechanism in the hardware.

If any real intelligence is required due to significant geographical
site separation, that is. If not, then the application just
synchronously mirrors between sites - even easier.

If you posit a shared-storage file system to allow your applications
transparent file-level access, then the observations above about
applications apply equally to the file system's internal operation (and
since you already needed special shared-storage file system facilities,
having the file system handle the replication in software is far less
expensive than using custom hardware).

What would VMS offer a service that was re-desgined to run on a
VMS cluster that this native solution wouldn't ?


The point, which you seem to have forgotten, is exactly the opposite:
what would this so-called 'emerging technology' offer that VMS didn't
offer two decades ago - at the block level as well as at the file level
(I think that Oracle Parallel Server used the former)? The only
remotely novel feature appears to be the lazy replication with
synchronous cache-coherence (itself of somewhat debatable merit, unless
you're replicating remotely solely for remote access performance rather
than for guaranteed availability: letting the application use data that
has not yet been replicated entails some danger if the only copy is then
lost) - a rather narrow market niche upon which to base a product, and
(as should also become evident) definitely not the best way to handle
bandwidth-constrained long links (they have to be bandwidth-constrained,
since if they were only latency-constrained then the synchronous
communication required just by the cache-coherence mechanisms that you
describe wouldn't be feasible).


...
All you've done is substitute this mythical 'storage appliance' for the
server hardware and OS.


Well a lot of people would rather buy some appliance boxes than
re-design their software to run on a new platform.


You really do seem intent on forgetting that you were not talking about
anything platform-specific in your initial drivel but rather about
'emerging technology'. So I'll remind you once again of why it made you
appear incompetent (and also suggest that the longer you continue to try
to bluster your way out of it, the more incompetent you appear).

If you can't see the
benefit of that, I don't know what to say anymore.


You haven't known what to say all along, but that hasn't kept you from
saying it.


..
I think you know the answer as well as I do :

I suspect somewhat better, since I've actually designed and implemented
systems of this ilk a couple of times rather than just bloviated about them.


Oh now there's the grumpy old geezer again :-)


Nah - I'm just not very tolerant of people who are as unaware of the
limits of their knowledge as you are, yet insist on arguing about things
they don't understand rather than concentrating on remedying that
deficiency.

And this is in no way age-related: I've lacked tolerance for
incompetent and ineducable blowhards since I was young.

Now, 'incompetent and ineducable' are characterizations that are indeed
relative to the subject under discussion: if you were not bloviating at
such a detailed level, they might be less applicable. Unfortunately,
you've chosen to try to argue about details that you're just not
equipped to address (or, apparently, even understand - though you've now
got another shot at that here).


...
Now, if you're using 'distributed cache' (something I tend to use in the
context of allowing one system to benefit from the data in another's
cache rather than having to go to disk for it) to mean something much
more like 'distributed locking mechanism' (which tracks potential
synchronization issues such that they can be properly addressed should
they occur), then we're just using different terminology to describe the
same thing.


A distributed cache is a collection of caches on all the storage
appliance at all the sites.


'Fraid not: that's just a dumb bag of unconnected caches.

Perhaps I misunderstand your 'locking';


Seems likely.

the
pieces of cache on all the appliances use sychronous messages to keep
coherency among them,


That's a cache-coherency mechanism, rather than a 'distributed cache'
per se. VMS uses its distributed lock manager to (among many other
things) create such a distributed coherency mechanism for its
distributed storage.

and will send a chunk of dirty data at site 1 to
site 2 if a host at site 2 asks for it.


Now, *that's* starting to resemble a real distributed cache, rather than
just a bag o' caches connected by a coherence mechanism (e.g., of the
'invalidate' variety that passes updates only through the underlying
storage layer).

Of course, what you've described isn't a very broadly-useful cache, but
just a means of supporting lazy inter-site replication (more on that later).

....

my main point has been that there are two levels of
inter-site coordination : one at the app level and one at the storage
level.


Well, if you limit the discussion to apps that don't use a file system,
I suppose. But as I already noted above, not only is that a rather
narrow market niche upon which to base a product, but the amount of
value that storage-level cache-coherence adds is at best debatable (in
fact, there's a specific example just coming up below).


...
But Oracle for example can use raw block devices and does it's
own locking and synchronisation (with RAC), so there's not always a
need to provide one as part of the storage system.

If the storage system firmware (rather than the Oracle software) is
handling the inter-site block-level replication (as you seem to be
suggesting), then either that firmware needs to implement at least
short-term inter-site interlocks to ensure that regardless of which site
Oracle elects to obtain block-level data from the copy obtained is up to
date, or the Oracle code must ensure that until a write-complete ACK has
been received from the local storage hardware no remote access to that
block can occur (and the local hardware must not return completion
status until all copies have been updated).


Indeed. Until the storage says "yes I've accepted your I/O" (into
distributed cache in this case), Oracle must not allow other I/O to the
same block, or else there are no guarantees about what will be the
final data in there. But you see, the storage appliances don't lock
anything, it's Oracle that has to implement it this way.


And (along the lines that I mentioned above) it's trivial (and likely
necessary, for other reasons) for Oracle to handle this itself (since it
already has to coordinate synchronously between instances about any data
that may be in the process of being updated), rather than depend upon
some specialized, proprietary storage-level inter-site caching mechanism
(Oracle has always preferred to provide such facilities itself,
precisely so that they will be available in *all* the environments it
runs on). In fact, if Oracle handles it itself it need not even wait
for the update to occur at all (locally or remotely), but can just send
the up-to-date in-memory copy to the other site to use there.

And that, perhaps more than anything else, illustrates why trying to
shove inter-site cache coherence (you're actually talking more about
underlying data coherence than 'cache coherence', since any caching is
strictly short-term to cover things until the replication has completed)
down into the storage layer is ill-conceived and decidedly sub-optimal.
Oracle (using block-level access) and distributed file systems (using
block-level access underneath to support distributed file-level access
above) *know what they can afford to cache internally rather than force
immediately (even if lazily) to disk*. They can coordinate use of such
non-persistent data between sites, while the underlying storage
(including your proposed storage appliance 'caching' mechanism) *can't
even see it yet*.

When they use transaction logs, they can capture small logical updates
synchronously, propagate these small log updates to remote sites
synchronously (with negligibly greater latency than it takes just to
propagate the information that the log has been persistently updated -
if log persistence was required for that log record), and propagate the
larger related block updates lazily (since as long as the log
information has been made persistent, there's no rush about making the
related block updates persistent - they can always be redone from the
log data if necessary; for that matter, if inter-site bandwidth is a
major constraint, the remote block content can be reconstructed from the
log data there rather than have to be sent at all). Implementations
that aren't journaled can also send small updates (or other coordinating
information) synchronously with approximately the same cost as the
synchronous propagation of cache-coherence data that you're advocating -
and with far greater control over the inter-site sharing semantics than
a dumb lower-level persistent data-coherence protocol allows.

If you want to optimize lazy replication to a distant site where
concurrent access to the data is permitted, *that's* the way to go about
it. And with reasonably fine-grained distributed locking it supports
your example of reading gargantuan amounts of seismic data behind the
writer as well - even if it's done through a file system instead of with
raw blocks: while the writer (and the lazy remote updating facility)
will temporarily lock the end of the file while appending to it, the
rest of the file (and the path to it) can be accessed directly at the
remote site as long as it's not also being actively changed at the
originating site.

The bottom line is that there's no free lunch: if you want remote-site
access to its local copies of data at other than the snapshot level, you
need synchronous inter-site coordination of some kind for anything save
read-only access at *all* sites. You can use fine-grained revocable
distributed locking (coherence) mechanisms to minimize the need to check
with other sites on accesses (synchronous VMS clusters have been
successfully used at separations of up to 500 miles that I know of, and
that's certainly not a hard limit - in particular, special-case
situations such as your seismic data example might well tolerate much
larger separations), and where bulk access to data is concerned you can
(in the absence of updating conflict) gain access to a large amount of
it with only a single inter-site coherence check even without
distributed coherence locks (i.e., the latency of the inter-site
permission check can be small compared to the local bulk-transfer time);
in either case, the facilities are in most cases better implemented in
the inter-site software coordination layer than at a lower level - only
when they have *not* been flexibly implemented at that higher level
would less comprehensive lower-level facilities have any value.

Which is what YottaYotta (which you mentioned in your original post)
provides. Last I knew, they weren't doing all that well, which suggests
that their product (while it may admirably do what it says on the tin)
may be as limited in *general* applicability as I suggested above. I do
recall that they are (or at least began as) a Canadian company, and you
seem to have a Canadian email address - but wouldn't want to jump to any
conclusion based on that...


...
As a specific example of a potential problem *above* the hardware level,
if your hypothetical service migrates at any speed exceeding that of
sneakernet, dirty data that it just wrote to the OS file system cache on
one site may not have been flushed to the underlying (distributed)
storage media by the time the service pops up on the other site (delays
of up to 30 seconds are common in Unix environments, for example).


Yes host caches are bad, if you want file systems you need a
distributed file system like CXFS, StorNext, PloyServe, ... They handle
host cache issues.


And while doing so can handle the kinds of issues you'd like to stick
down in the hardware better (and less expensively) than they can be
handled there, as described above.


There is no data
locking required, the service can simply assume it's data is always
there and it can always access it. And like I said, most applications
requiring locks already provide them themselves.

Horse****: most applications requiring locks don't need to think about
them at all, because the underlying file system is doing all that
transparently for them. Just *getting to* the data involves following a
multi-link path through the file system directory structure, an
operation that can't occur reliably on a 'secondary' site without some
degree of inter-site update coordination (for God's sake, if you're
using an optimized journaling file system the secondary site may have
the pertinent *log* entries but won't have the associated in-memory
update context: you can't use the file system there in anything
resembling up-to-date fashion without first performing a recovery from
the on-disk log).


You are thinking about file systems and I was not.


Then what, exactly, were you referring to when you said

"If a storage appliance does all this, you can connect the WAN link and
the SAN at each site to the storage appliances and every service can run
on it's own server with whatever OS it prefers, connecting to the local
SAN, using whatever block access it prefers (raw block device, file
system,...)"

It certainly *sounded* as if you thought that your magical inter-site
block replication mechanism would allow use of any OS (and any related
file system - see your last words above) to access data on both ends of
the inter-site link.

And, of course, that's dead wrong.

Or when Nick said

"you need something in the replication layer that understands
synchronization issues at the file system"

and you responded

"If you assume every data block in every file could be being updated at
any moment, you are right.
But take for example seismic data files, which get produced as an
hours-long sequential dump of data. Why should a processing app at site
2 need to wait until the entire file had been written at site 1 ? It
could simply start reading at block 0, and perhaps even do some
modifications to the data, as long as it hadn't crossed site 1's writing
barrier."

and

"The web servers might not get notified every time a new document is
added at the remote site, but an occasional directory refresh will show
the new content and that is more than enough for a lot of applications."

That certainly *sounded* as if you a) were talking about file access and
b) didn't have a clue about other file system synchronization
dependencies (e.g., as in the mere path *to* the file that I mentioned
above) that were a problem.

In any event, if you are not *now* talking about file systems, then (as
I'm getting tired of observing) you're talking about a rather
specialized product niche - and still one where whatever the application
is doing internally to make up for the lack of file-level access
facilities may be in a much better position to manage inter-site
replication (as well as the other inter-site issues it must be aware of)
than your proposed box would be.

....

Creating a snapshot and making it available on the other site requires
the migration service to integrate with the storage at both sites,
which is not just a trivial matter !

*None* of this is as trivial a matter as you imagine it to be, as I hope
you're starting to learn (if you respond again, we should get a pretty
good idea of just how educable you are).


Well I hope I'm not disappointing !


Since I've encountered similar people frequently over the years, I'm
neither disappointed nor surprised (though I'm always somewhat hopeful -
guess I'm just an optimist at heart). But you now have another
opportunity to mitigate the impression that you've built up.

....

That all depends on your definition of rare, and the latencies you are
talking about. If you have say 2000 km worth of latency, synchronous is
not an option even for very modest updates.

If update latency makes synchronous replication prohibitively slow
(VMS-style distributed lock management can allow the *reads* to proceed
at all sites without delay, as long as no nearby updates are occurring
at the time), then your only real option is to use ordered asynchronous
replication plus snapshots upon which any required recovery operations
are then performed before use (if you still don't understand why, reread
the existing material until the light dawns).


Are we talking about recovery now ?


Perhaps you've not very familiar with how storage-level crash-consistent
snapshots work either (when you're using a *planned* snapshot you have
the option, if the higher layers support it, of orchestrating things
such that recovery is not required to use said snapshot, but if you want
to be able to take ad hoc remote snapshots at any point in time - since
you seemed to be saying that you didn't want to have to involve the
application in any explicit coordination - they'll only be
crash-consistent).

Yes you need write order coherency
in your asynchronous replication, and your app still needs to be able
to recover from partial but ordered loss of data.


As does any underlying file system: that's what 'recovery' above
referred to.

- bill
  #29  
Old October 24th 06, 03:52 PM posted to comp.arch.storage
Arne Joris
external usenet poster
 
Posts: 14
Default EMC to IBM SAN LUN replication

Nik Simpson wrote:
So the cache is synchronously mirrored between the two sites? If so, I
don't see how this is asynchronous in any conventional sense of the word
and will have the performance problems of any synchronous system when
the site-to-site latency is significant.


The cache is not mirrored, dirty data is only present in a single
appliance.
The messages between the cache components at both sites are
synchronous.
When an appliance receives an I/O request for a given block, the
distributed cache uses synchronous messages to find out if there is
cached data on any appliances, be it at the same site or a remote site.

Also, your example of having a large sequential data set being written
at the source site and a reader at the remote site can be handled so
much more easily with a TCP/IP socket that I'm not sure what you think
this approach would buy given it's undoubted cost and complexity.


Both sites need the file at the end of the day; site 1 produces it,
site 2 processes it and modifies it, site 1 might then be required to
visualize the data, etc.
If you use a TCP socket to push data back and forth over a WAN, your
apps will need to be in lockstep with each other (with perhaps a buffer
at each end), which seems a whole lot harder to implement than have
both apps simply write and read a set of blocks. You could take the
same apps that were used to produce, process and visualize the data in
a single site using their local SAN, and move them out to different
sites without having to modify them.

The cache is providing a way to transparently share the data that
hasn't been replicated yet between the two sites.


But in order for it to be seen at the remote site, there must be a copy
of the changed data at the remote site, so how does it allow for sharing
of "data that hasn't been replicated yet between the two sites."


No you don't need to copy all the cache to the other site; all you need
is a cache coherency protocol that can fetch dirty data at the remote
site on demand. And keep in mind that only the dirty cached data needs
to be fetched remotely; any clean remote caches can be ignored and you
can go straight to local disk to get the data. The distributed cache
can hide all these details from the initiators.

...
Based on what I can tell from the website, what they have appears to be
a distributed synchronous storage appliance, i.e. blocks are replciated
synchronously with users accessing a local filesystem. If so, that's
really not that special (as has been pointed out, it's nothing new)


Not according to this article :
http://www.byteandswitch.com/document.asp?doc_id=94013

Arne

  #30  
Old October 24th 06, 06:24 PM posted to comp.arch.storage
Arne Joris
external usenet poster
 
Posts: 14
Default EMC to IBM SAN LUN replication

Good grief Bill, are you one of those people that type more as they
grow angrier ? Thank you for all your replies, but I'm starting to get
a bit ticked off by your rather abrasive style.

If we can get back to what started this whole discussion, I stated that

... if your secondary site is not just a remote data vault
but an actual production site where people need access to the data, it
is pretty lame to have severs at the secondary site go over the WAN to
go read the data from the primary storage when they have a copy of the
data right there ! With a distributed block cache on top of
asynchronous data replication, you could have both sites do I/O to the
same volumes and access their local storage.


I never heard you make any successful arguments against this, your only
beef with me is that you claim it would be stupid to use a distributed
block cache appliance (yes I know you don't like that wording) to do
this, seeing how the only useful way for apps at the two sites to use
the data requires synchronous messaging between them anyway.

Then there was a whole lot of talk about VMS, I'm not sure why that was
relevant. Despite increasing levels of toxicity (unilaterally from your
part I might add), I have kept on trying to argue that for certain
solutions, the appliance solution makes sense. I think you have been
thinking about clustered applications with tight coherency requirements
and can't or won't see beyond them.

I'm fine with the fact you think I deserve a newsgroup lashing for
arguing for something you are convinced is impossible or impractible,
I'll ignore your claims about stupidity needing to be punished as I'm
sure you'll have your own personal reasons for this.

Keep on VMSing Bill !

Arne

 




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Forum Jump

Similar Threads
Thread Thread Starter Forum Replies Last Post
EMC Celerra Replication Fred Storage & Hardrives 1 February 9th 06 05:37 PM
replication and tape backup [email protected] Storage & Hardrives 4 February 22nd 05 03:21 AM
SAN replication - WAN Hal Kuff Storage & Hardrives 1 October 22nd 04 01:21 AM
SAN filesystem uses local storage for reads with synchronous replication Bill Todd Storage & Hardrives 6 October 21st 04 04:54 PM
Sync Replication over CWDM lines Andy S Storage & Hardrives 1 July 21st 04 09:06 PM


All times are GMT +1. The time now is 12:06 PM.


Powered by vBulletin® Version 3.6.4
Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 HardwareBanter.
The comments are property of their posters.