A computer components & hardware forum. HardwareBanter


comments on Panasas Object Storage



 
 
  #1  
Old October 27th 03, 05:07 PM
darren
external usenet poster
 
Posts: n/a
Default comments on Panasas Object Storage

Hi all,

Just came across this article:

http://www.infoworld.com/article/03/...rinside_1.html

http://www.panasas.com/activescaleos.html

Sounds like an interesting product. Anybody got any comments or experience
with this?


  #2  
Old October 27th 03, 11:04 PM
Bill Todd
external usenet poster
 
Posts: n/a
Default


"darren" wrote in message
...
Hi all,

Just came across this article:

http://www.infoworld.com/article/03/...rinside_1.html

http://www.panasas.com/activescaleos.html

Sounds like an interesting product. Anybody got any comments or experience
with this?


It is indeed an interesting product, but Infoworld doesn't seem to have been
able to separate hype from fact.

Multi-protocol file servers (which can serve up any individual file the way
the client wants it, whether via NFS or CIFS) are nothing new. Clustered
file servers actually aren't anything new, either (VMS has supported them
for decades, and some commercial Unixes have more recently), and while
they're relatively new to commodity hardware, Panasas' isn't the first
(Tricord, which went belly-up last year, actually shipped a product called
'Illumina' with some similar characteristics, for example - including IIRC
the ability to reshuffle data to restore redundancy around failed components
and shift load to added hardware).

Their CTO is Garth Gibson, who's been touting 'object storage devices' for
many years (starting with his work at CMU on 'network-attached secure
disks'). After several somewhat misguided shots at the problem, this one
looks worthwhile, though still somewhat sub-optimal in a few respects. And
while there may be some 'object' flavor to some of the internal mechanisms
used, it's still basically just a file system that distributes files across
an expandable set of storage servers: as I said, a worthwhile product, but
hardly unique (see, for example, IBM's 'Storage Tank' and the emerging
Lustre product on Linux).

- bill



  #3  
Old October 29th 03, 02:47 AM
VirtualSean
external usenet poster
 
Posts: n/a
Default

Hello Bill,

To your rejoinder points... a couple of questions, if I may.

...while there may be some 'object' flavor to some of the internal mechanisms
used, it's still basically just a file system that distributes files across
an expandable set of storage servers...


To realize a "true" OSD (as defined by Intel, et al), is it the case
that the physical drives themselves in such a configuration (the drive
manufacturer's firmware) and/or disk controllers must possess the
capability of "handling" the Object Storage model, along with
correspondingly aligned "filesystem" (objsystem) software systems?

I don't know the Panasas product offering(s), but I'm curious as to
how far they go toward the idealized OBS/OSD model, and to what degree
they might have [dis]advantages vis-a-vis Lustre and something like
Sistina's GFS (the latter may be a stretch, but I'm trying to
understand if an OSD model could be imposed on something like a GFS as
well).

I'll appreciate any info.

Thank you.

--
VS




"Bill Todd" wrote in message ...
"darren" wrote in message
...
Hi all,

Just came across this article:

http://www.infoworld.com/article/03/...rinside_1.html

http://www.panasas.com/activescaleos.html

Sounds like an interesting product. Anybopdy got any comments or

experience
with this?


It is indeed an interesting product, but Infoworld doesn't seem to have been
able to separate hype from fact.

Multi-protocol file servers (which can serve up any individual file the way
the client wants it, whether via NFS or CIFS) are nothing new. Clustered
file servers actually aren't anything new, either (VMS has supported them
for decades, and some commercial Unixes have more recently), and while
they're relatively new to commodity hardware Panasas' isn't the first
(Tricord, which went belly-up last year, actually shipped a product called
'Illumina' with some similar characteristics, for example - including IIRC
the ability to reshuffle data to restore redundancy around failed components
and shift load to added hardware).

Their CTO is Garth Gibson, who's been touting 'object storage devices' for
many years (starting with his work at CMU on 'network-attached secure
disks'). After several somewhat misguided shots at the problem this one
looks worthwhile, though still somewhat sub-optimal in a few respects. And
while there may be some 'object' flavor to some of the internal mechanisms
used, it's still basically just a file system that distributes files across
an expandible set of storage servers: as I said, a worthwhile product, but
hardly unique (see, for example, IBM's 'Storage Tank' and the emerging
Lustre product on Linux).

- bill

  #4  
Old October 29th 03, 09:35 AM
Bill Todd
external usenet poster
 
Posts: n/a
Default


"VirtualSean" wrote in message
om...
Hello Bill,

To your rejoinder points... a couple of questions, if I may.

...while there may be some 'object' flavor to some of the internal mechanisms
used, it's still basically just a file system that distributes files across
an expandable set of storage servers...


To realize a "true" OSD (as defined by Intel, et al),


I hadn't noticed that Intel had presumed to 'define' what an OSD was.

is it the case
that the physical drives themselves in such a configuration (the drive
manufacturer's firmware) and/or disk controllers must possess the
capability of "handling" the Object Storage model, along with
correspondingly aligned "filesystem" (objsystem) software systems?


That would, I suspect, depend upon exactly whose definition of an OSD you used.


I don't know the Panasas product offering(s), but I'm curious as to
how far they go toward the idealized OBS/OSD model,


And that would as well.

and to what degree
they might have [dis]advantages vis-a-vis Lustre and something like
Sistina's GFS (the latter may be a stretch, but I'm trying to
understand if an OSD model could be imposed on something like a GFS as
well).


The current definitions of an OSD that I'm aware of state that the 'device'
(which if undefined could be something that many people would call a
'server' of some kind rather than a disk) should encapsulate the physical
placement of data on the disk(s) such that it can be addressed externally by
object-identifier/offset/length triples. This allows the device to make
optimization decisions on its own about placement (IIRC HP's AutoRAID
product had some primitive capabilities in this area, but did not present
the result in the form of 'objects'). The 'object' may also support
(possibly extendible) 'attributes'.
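
To make the addressing difference concrete, here is a rough sketch in C.
Everything in it (the function names, the toy in-memory 'store') is made up
for illustration - it is not Panasas's interface, nor the OSD command set the
standards people have been drafting:

/* Toy illustration of block-level vs. object-level addressing.
 * Everything here (names, the in-memory 'store') is made up for
 * illustration; it is not Panasas's interface or the OSD drafts. */
#include <stdio.h>
#include <string.h>

/* Block device: the host addresses raw sectors and owns all placement. */
static char disk[16][512];                      /* 16 sectors of 512 bytes */

int block_read(unsigned lba, void *buf, size_t len)
{
    if (lba >= 16 || len > 512)
        return -1;
    memcpy(buf, disk[lba], len);
    return 0;
}

/* Object store: the host names an object, an offset and a length; the
 * device decides where those bytes actually live on its media. */
struct object { unsigned long long oid; char data[2048]; };
static struct object objs[4] = { { 42, "hello from object 42" } };

int osd_read(unsigned long long oid, size_t off, void *buf, size_t len)
{
    for (int i = 0; i < 4; i++) {
        if (objs[i].oid != oid)
            continue;
        if (off + len > sizeof objs[i].data)
            return -1;
        memcpy(buf, objs[i].data + off, len);   /* placement stays private */
        return 0;
    }
    return -1;                                  /* no such object */
}

int main(void)
{
    char buf[64] = { 0 };
    if (osd_read(42, 0, buf, 20) == 0)          /* object-id/offset/length */
        printf("%s\n", buf);
    return 0;
}

The point is simply that the block interface exposes physical addresses and
leaves placement to the host, while the object interface hides placement
behind an object-identifier/offset/length triple.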

One problem with this approach is that rather than simply moving complexity
(something like a file 'inode') from one place (the higher-level
software) to another (the OSD), it adds a layer - since files can exceed
the size of an OSD and hence must be able to span multiple OSDs. So
there's still metadata in the 'inode', plus more at the OSD - and more
stages of metadata introduce more potential for requiring additional random
accesses before you can get to the data you want (not to mention adding more
points at which any *changes* in metadata must be reliably stored before a
modification operation can be considered to be 'stable' - though suitable
introduction of NVRAM at these points can help mitigate any additional
latency that this might cause). Another problem is that a high-level view
of a file as an 'object' often includes any storage redundancy used to store
that file, whereas individual 'objects' at an OSD by definition do not
encapsulate redundancy across multiple OSDs (hence for this reason as well a
'file' cannot necessarily map 1:1 to an OSD object). A third issue is that
of locking granularity, which also can span OSDs when a file does (Garth had
some early problems with this).
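
Here is a rough sketch of that extra layer of mapping. The striping policy
(64 KB stripe units, four OSDs, one component object per OSD) is purely an
assumption for illustration - it is not Panasas's layout or anything from the
OSD drafts:

/* Toy sketch of the extra level of indirection: the file-level 'inode'
 * maps a file offset to (OSD, component object, object offset), and each
 * OSD still keeps its own per-object metadata underneath.  The striping
 * policy here (64 KB units, 4 OSDs, one component object per OSD) is an
 * assumption for illustration only. */
#include <stdio.h>

#define STRIPE_UNIT  (64 * 1024)        /* bytes per stripe unit */
#define STRIPE_WIDTH 4                  /* one component object per OSD */

struct extent {
    int                osd;             /* which OSD to talk to */
    unsigned long long oid;             /* which object on that OSD */
    size_t             obj_off;         /* offset within that object */
};

struct inode {
    unsigned long long component_oid[STRIPE_WIDTH];   /* file-level metadata */
};

struct extent map_file_offset(const struct inode *ino, size_t file_off)
{
    size_t unit = file_off / STRIPE_UNIT;
    struct extent e;
    e.osd     = (int)(unit % STRIPE_WIDTH);
    e.oid     = ino->component_oid[e.osd];
    e.obj_off = (unit / STRIPE_WIDTH) * STRIPE_UNIT + file_off % STRIPE_UNIT;
    return e;
}

int main(void)
{
    struct inode ino = { { 101, 102, 103, 104 } };
    size_t off = 5 * (size_t)STRIPE_UNIT + 100;   /* somewhere in the file */
    struct extent e = map_file_offset(&ino, off);
    printf("file offset %zu -> OSD %d, object %llu, object offset %zu\n",
           off, e.osd, e.oid, e.obj_off);
    return 0;
}

The file-level 'inode' resolves a file offset to an (OSD, object, offset)
triple, and each OSD then does its own object-to-sector translation
underneath - two levels of metadata where a conventional file system has one.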

OTOH, decentralizing this metadata can help support increased scalability in
large systems (where many clients may be able to interrogate many OSDs
directly without going through a single 'inode server' as a bottleneck -
though there are other ways to accomplish this as well, such as
client-caching some or all of the metadata when it doesn't change all that
often).
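
For instance, a client-side cache of that layout might look roughly like
this. The 'metadata server' call and the generation field are hypothetical; a
real system would use leases or callbacks to invalidate the cached layout
when it changes:

/* Toy sketch of client-side caching of the file-to-OSD layout: one trip
 * to the metadata server fetches the map, and subsequent reads go to the
 * OSDs directly.  The 'metadata server' RPC and generation number are
 * hypothetical; a real system would use leases or callbacks to invalidate
 * the cached layout when it changes. */
#include <stdio.h>
#include <stdbool.h>

struct layout {
    unsigned           gen;              /* bumped when the layout changes */
    unsigned long long component_oid[4];
};

/* Stand-in for an RPC to the metadata ('inode') server. */
static struct layout fetch_layout_from_mds(void)
{
    struct layout l = { 7, { 101, 102, 103, 104 } };
    return l;
}

static struct layout cache;
static bool cache_valid = false;

struct layout get_layout(void)
{
    if (!cache_valid) {                  /* miss: one trip to the MDS */
        cache = fetch_layout_from_mds();
        cache_valid = true;
    }
    return cache;                        /* hit: no MDS involvement at all */
}

int main(void)
{
    struct layout l = get_layout();      /* first call goes to the MDS */
    printf("layout generation %u, first component object %llu\n",
           l.gen, l.component_oid[0]);
    get_layout();                        /* second call is purely local */
    return 0;
}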

IMO, the industry is still groping around for something approaching an
'ideal' decomposition of work in distributed file systems. The SNIA
includes disk vendors who would *love* to find ways to standardize what an
'OSD' is so that they could build added value into their products and price
them accordingly, but it's just not yet clear what that standard should be -
or even that a disk is the right level for *any* such standardization (and
if it's not, whether standardization at a higher - storage server - level
makes sense, rather than just accepting that the possibilities there are
sufficiently rich that proprietary solutions should be allowed to bloom to
address varying needs).

My own current views are that 1) OSDs don't make a great deal of sense when
talking about block-level storage (i.e., they don't add much value to that
idiom), 2) OSDs don't make a great deal of sense as higher-than-block-level
devices attached to single hosts (since they don't off-load sufficient work
to be very interesting: whatever you off-load tends to be balanced by the
additional host interface complexity required to control it as closely as a
host may often wish to do), and 3) *standardized* OSDs don't make much
sense in distributed systems, since they unnecessarily constrain creativity
in the design of those systems (i.e., such systems are sufficiently complex
that no single decomposition is even close to ideal for all circumstances -
and no current decomposition that I've seen is close to ideal even for many
common ones).

- bill



  #5  
Old October 29th 03, 11:33 AM
darren
external usenet poster
 
Posts: n/a
Default

Hi Bill,

Thanx for the insight.

I was looking for a high-throughput storage solution to handle random reads
and writes of a large number of small files.

Wonder if the Panasas solution will do better than a NetApp.

We are currently facing some problems when loading a large number of small
files into a NetApp F820 at high speed... the machine actually "core-dumped"
on us!
"Bill Todd" wrote in message
...

"VirtualSean" wrote in message
om...
Hello Bill,

To your rejoinder points... a couple of questions, if I may.

...while there may be some 'object' flavor to some of the internal

mechanisms
used, it's still basically just a file system that distributes files

across
an expandible set of storage servers...


To realize a "true" OSD (as defined by Intel, et al),


I hadn't noticed that Intel had presumed to 'define' what an OSD was.

is it the case
that the physical drives themselves in such a configuration (the drive
manufacturer's firmware) and/or disk controllers must posses the
capability of "handling" the Object Storage model, along with
correspondingly aligned "filesystem" (objsystem) software systems?


That would, I suspect, depend upon exactly whose definition of what an OSD
was you used.


I don't know the Panasas product offering(s), but I'm curious as to
how far they go toward the idealized OBS/OSD model,


And that would as well.

and to what degree
they might have [dis]advantages vis-a-vis Lustre and something like
Sistina's GFS (the latter may be a stretch, but I'm trying to
understand if an OSD model could be imposed on something like a GFS as
well).


The current definitions of an OSD that I'm aware of state that the

'device'
(which if undefined could be something that many people would call a
'server' of some kind rather than a disk) should encapsulate the physical
placement of data on the disk(s) such that it can be addressed externally

by
object-identifier/offset/length triples. This allows the device to make
optimization decisions on its own about placement (IIRC HP's AutoRAID
product had some primitive capabilities in this area, but did not present
the result in the form of 'objects'). The 'object' may also support
(possibly extendible) 'attributes'.

One problem with this approach is that rather than simply moving

complexity
from something like a file 'inode' from one place (the higher-level
software) to another ( the OSD), it adds a layer - since files can exceed
the size of the OSD and hence must be able to span multiple OSDs. So
there's still metadata in the 'inode', plus more at the OSD - and more
stages of metadata introduce more potential for requiring additional

random
accesses before you can get to the data you want (not to mention adding

more
points at which any *changes* in metadata must be reliably stored before a
modification operation can be considered to be 'stable' - though suitable
introduction of NVRAM at these points can help mitigate any additional
latency that this might cause). Another problem is that a high-level view
of a file as an 'object' often includes any storage redundancy used to

store
that file, whereas individual 'objects' at an OSD by definition do not
encapsulate redundancy across multiple OSDs (hence for this reason as well

a
'file' cannot necessarily map 1:1 to an OSD object). A third issue is

that
of locking granularity, which also can span OSDs when a file does (Garth

had
some early problems with this).

OTOH, decentralizing this metadata can help support increased scalability

in
large systems (where many clients may be able to interrogate many OSDs
directly without going through a single 'inode server' as a bottleneck -
though there are other ways to accomplish this as well, such as
client-caching some or all of the metadata when it doesn't change all that
often).

IMO, the industry is still groping around for something approaching an
'ideal' decomposition of work in distributed file systems. The SNIA
includes disk vendors who would *love* to find ways to standardize what an
'OSD' is so that they could build added value into their products and

price
them accordingly, but it's just not yet clear what that standard should

be -
or even that a disk is the right level for *any* such standardization (and
if it's not, whether standardization at a higher - storage server - level
makes sense, rather than just accepting that the possibilities there are
sufficiently rich that proprietary solutions should be allowed to bloom to
address varying needs).

My own current views are that 1) OSDs don't make a great deal of sense

when
talking about block-level storage (i.e., they don't add much value to that
idiom), 2) OSDs don't make a great deal of sense as

higher-than-block-level
devices attached to single hosts (since they don't off-load sufficient

work
to be very interesting: whatever you off-load tends to be balanced by the
additional host interface complexity required to control it as closely as

a
host may often wish to do), and 3) *standardized* OSDs don't make much
sense in distributed systems, since they unnecessarily constrain

creativity
in the design of those systems (i.e., such systems are sufficiently

complex
that no single decomposition is even close to ideal for all

circumstances -
and no current decomposition that I've seen is close to ideal even for

many
common ones).

- bill





  #6  
Old October 29th 03, 04:11 PM
grey
external usenet poster
 
Posts: n/a
Default

The Panasas equipment is pretty interesting; you should also check out
the Spinnaker Networks equipment and the NAS equipment from SGI.
NetApp equipment has a write penalty due in part to its use of RAID
4. You might want to talk to the folks at www.zerowait.com; they know
a lot of the tricks for tuning NetApp equipment.
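
For what it's worth, the classic RAID 4 small-write penalty looks roughly
like this - a toy model only, not NetApp's actual code path, and real systems
mitigate it in various ways:

/* Toy model of the classic RAID 4 small-write penalty: every small write
 * is a read-modify-write involving the single dedicated parity disk, so
 * that one disk is hit by every write in the group.  Illustration only -
 * not NetApp's actual code path. */
#include <stdio.h>

#define NDATA 4                          /* data disks */
#define NSECT 8                          /* sectors per disk (toy size) */

static unsigned char data[NDATA][NSECT];
static unsigned char parity[NSECT];      /* the one dedicated parity disk */
static unsigned long parity_disk_ios = 0;

void raid4_small_write(int disk, int sect, unsigned char newval)
{
    unsigned char oldval    = data[disk][sect];    /* read old data */
    unsigned char oldparity = parity[sect];        /* read old parity */
    parity_disk_ios++;

    data[disk][sect] = newval;                     /* write new data */
    parity[sect] = oldparity ^ oldval ^ newval;    /* write new parity */
    parity_disk_ios++;
}

int main(void)
{
    /* 100 small writes spread evenly over the four data disks... */
    for (int i = 0; i < 100; i++)
        raid4_small_write(i % NDATA, i % NSECT, (unsigned char)i);

    /* ...yet the lone parity disk takes part in every one of them. */
    printf("parity-disk I/Os for 100 small writes: %lu\n", parity_disk_ios);
    return 0;
}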

Grey

"darren" wrote in message ...
Hi Bill,

Thanx for the insight.

I was looking for a high-throughput storage solution to handle random read
and writes of large number of small files.

Wonder if the panasas solution will do better then a netapp.

We are currently facing some problems when loading large number of small
files into a NetApp F820 at high speed...the machine actually "core-dumped"
on us!
"Bill Todd" wrote in message
...

"VirtualSean" wrote in message
om...
Hello Bill,

To your rejoinder points... a couple of questions, if I may.

...while there may be some 'object' flavor to some of the internal

mechanisms
used, it's still basically just a file system that distributes files

across
an expandible set of storage servers...

To realize a "true" OSD (as defined by Intel, et al),


I hadn't noticed that Intel had presumed to 'define' what an OSD was.

is it the case
that the physical drives themselves in such a configuration (the drive
manufacturer's firmware) and/or disk controllers must posses the
capability of "handling" the Object Storage model, along with
correspondingly aligned "filesystem" (objsystem) software systems?


That would, I suspect, depend upon exactly whose definition of what an OSD
was you used.


I don't know the Panasas product offering(s), but I'm curious as to
how far they go toward the idealized OBS/OSD model,


And that would as well.

and to what degree
they might have [dis]advantages vis-a-vis Lustre and something like
Sistina's GFS (the latter may be a stretch, but I'm trying to
understand if an OSD model could be imposed on something like a GFS as
well).


The current definitions of an OSD that I'm aware of state that the

'device'
(which if undefined could be something that many people would call a
'server' of some kind rather than a disk) should encapsulate the physical
placement of data on the disk(s) such that it can be addressed externally

by
object-identifier/offset/length triples. This allows the device to make
optimization decisions on its own about placement (IIRC HP's AutoRAID
product had some primitive capabilities in this area, but did not present
the result in the form of 'objects'). The 'object' may also support
(possibly extendible) 'attributes'.

One problem with this approach is that rather than simply moving

complexity
from something like a file 'inode' from one place (the higher-level
software) to another ( the OSD), it adds a layer - since files can exceed
the size of the OSD and hence must be able to span multiple OSDs. So
there's still metadata in the 'inode', plus more at the OSD - and more
stages of metadata introduce more potential for requiring additional

random
accesses before you can get to the data you want (not to mention adding

more
points at which any *changes* in metadata must be reliably stored before a
modification operation can be considered to be 'stable' - though suitable
introduction of NVRAM at these points can help mitigate any additional
latency that this might cause). Another problem is that a high-level view
of a file as an 'object' often includes any storage redundancy used to

store
that file, whereas individual 'objects' at an OSD by definition do not
encapsulate redundancy across multiple OSDs (hence for this reason as well

a
'file' cannot necessarily map 1:1 to an OSD object). A third issue is

that
of locking granularity, which also can span OSDs when a file does (Garth

had
some early problems with this).

OTOH, decentralizing this metadata can help support increased scalability

in
large systems (where many clients may be able to interrogate many OSDs
directly without going through a single 'inode server' as a bottleneck -
though there are other ways to accomplish this as well, such as
client-caching some or all of the metadata when it doesn't change all that
often).

IMO, the industry is still groping around for something approaching an
'ideal' decomposition of work in distributed file systems. The SNIA
includes disk vendors who would *love* to find ways to standardize what an
'OSD' is so that they could build added value into their products and

price
them accordingly, but it's just not yet clear what that standard should

be -
or even that a disk is the right level for *any* such standardization (and
if it's not, whether standardization at a higher - storage server - level
makes sense, rather than just accepting that the possibilities there are
sufficiently rich that proprietary solutions should be allowed to bloom to
address varying needs).

My own current views are that 1) OSDs don't make a great deal of sense

when
talking about block-level storage (i.e., they don't add much value to that
idiom), 2) OSDs don't make a great deal of sense as

higher-than-block-level
devices attached to single hosts (since they don't off-load sufficient

work
to be very interesting: whatever you off-load tends to be balanced by the
additional host interface complexity required to control it as closely as

a
host may often wish to do), and 3) *standardized* OSDs don't make much
sense in distributed systems, since they unnecessarily constrain

creativity
in the design of those systems (i.e., such systems are sufficiently

complex
that no single decomposition is even close to ideal for all

circumstances -
and no current decomposition that I've seen is close to ideal even for

many
common ones).

- bill



  #7  
Old October 29th 03, 06:26 PM
Bill Todd
external usenet poster
 
Posts: n/a
Default

[Responded privately before noticing this newsgroup post:]

Hi Bill,

Thanx for the insight.

I was looking for a high-throughput storage solution to handle random reads
and writes of a large number of small files.

Wonder if the Panasas solution will do better than a NetApp.

We are currently facing some problems when loading a large number of small
files into a NetApp F820 at high speed... the machine actually "core-dumped"
on us!


Needless to say, that should *never* happen due to simple load. Either your
hardware has a problem, or WAFL does: I would expect NetApp to take this
problem fairly seriously if you reported it.

Log-structured (or sort-of-log-structured, like NetApp) file systems should
be a good choice for the workload you describe - especially when
supplemented with NVRAM to handle the most recent portion of the 'log' (as
WAFL is). I think Sun may have one that they support, and Linux may as
well. Since I have no idea what internal algorithms Panasas uses to place
data on disk, there's no way to SWAG how well they'd do: I have a
recollection of recently seeing IOPS figures for a system that were
*clearly* far beyond the capabilities of the underlying disk storage (i.e.,
I expect they referred to accessing cached data), and it may well have been
in the Panasas literature (which is the most recent stuff I've looked at) -
so that won't necessarily shed much light on the subject.
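
To illustrate why that combination suits a stream of small random writes,
here is a toy sketch - purely illustrative, not WAFL's (or anyone else's)
actual write path: each small write is acknowledged once it sits in NVRAM,
and accumulated writes go to disk as one large sequential append.

/* Toy sketch of a log-structured write path fronted by NVRAM: a small
 * write is acknowledged as soon as it sits in NVRAM, and accumulated
 * writes are flushed to disk as one large sequential append.  Purely
 * illustrative - not WAFL's (or anyone else's) actual design. */
#include <stdio.h>
#include <string.h>

#define NVRAM_SLOTS 4
#define REC_SIZE    64

struct record { char payload[REC_SIZE]; };

static struct record nvram[NVRAM_SLOTS]; /* battery-backed staging area */
static int    nvram_used   = 0;
static size_t disk_log_off = 0;          /* append point of the on-disk log */

static void flush_nvram_to_disk(void)
{
    if (nvram_used == 0)
        return;
    /* One big sequential write instead of several small random ones. */
    printf("flush: %d records appended at log offset %zu\n",
           nvram_used, disk_log_off);
    disk_log_off += (size_t)nvram_used * REC_SIZE;
    nvram_used = 0;
}

void small_write(const char *payload)
{
    strncpy(nvram[nvram_used].payload, payload, REC_SIZE - 1);
    nvram[nvram_used].payload[REC_SIZE - 1] = '\0';
    nvram_used++;                        /* the write is now stable: ack it */
    if (nvram_used == NVRAM_SLOTS)
        flush_nvram_to_disk();
}

int main(void)
{
    char name[32];
    for (int i = 0; i < 10; i++) {       /* ten small random 'file' writes */
        snprintf(name, sizeof name, "smallfile-%d", i);
        small_write(name);
    }
    flush_nvram_to_disk();               /* drain whatever is left */
    return 0;
}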

Reiserfs on Linux is supposedly optimized to handle small files well, but
may not offer as much flexibility as you'd need (e.g., the ability to load
them at high speed without much robustness because you could just start the
load over in the unlikely event of, e.g., a power failure). I've been
working on a file system of my own for a while that should do the job
flexibly and well, but I suspect you may want something sooner than it's
likely to be available (if you're looking a year or so out, we could talk).

- bill



  #8  
Old October 29th 03, 10:31 PM
VirtualSean
external usenet poster
 
Posts: n/a
Default

Hi Bill,

To realize a "true" OSD (as defined by Intel, et al),...


I hadn't noticed that Intel had presumed to 'define' what an OSD was.


I neglected my "e.g.," in that statement. I was simply attempting to
"settle" on one definition (or some acknowledged _set_ of aligned
definitions) for reference. It was not my intention to assert that Intel has
any more clout or credibility than anyone else in defining OSD/OBS.

Thank you very much for your thoughtful and info-packed reply Bill.

Much obliged.


--
VS



"Bill Todd" wrote in message ...
"VirtualSean" wrote in message
om...
Hello Bill,

To your rejoinder points... a couple of questions, if I may.

...while there may be some 'object' flavor to some of the internal

mechanisms
used, it's still basically just a file system that distributes files

across
an expandible set of storage servers...


To realize a "true" OSD (as defined by Intel, et al),


I hadn't noticed that Intel had presumed to 'define' what an OSD was.

is it the case
that the physical drives themselves in such a configuration (the drive
manufacturer's firmware) and/or disk controllers must posses the
capability of "handling" the Object Storage model, along with
correspondingly aligned "filesystem" (objsystem) software systems?


That would, I suspect, depend upon exactly whose definition of what an OSD
was you used.


I don't know the Panasas product offering(s), but I'm curious as to
how far they go toward the idealized OBS/OSD model,


And that would as well.

and to what degree
they might have [dis]advantages vis-a-vis Lustre and something like
Sistina's GFS (the latter may be a stretch, but I'm trying to
understand if an OSD model could be imposed on something like a GFS as
well).


The current definitions of an OSD that I'm aware of state that the 'device'
(which if undefined could be something that many people would call a
'server' of some kind rather than a disk) should encapsulate the physical
placement of data on the disk(s) such that it can be addressed externally by
object-identifier/offset/length triples. This allows the device to make
optimization decisions on its own about placement (IIRC HP's AutoRAID
product had some primitive capabilities in this area, but did not present
the result in the form of 'objects'). The 'object' may also support
(possibly extendible) 'attributes'.

One problem with this approach is that rather than simply moving complexity
from something like a file 'inode' from one place (the higher-level
software) to another ( the OSD), it adds a layer - since files can exceed
the size of the OSD and hence must be able to span multiple OSDs. So
there's still metadata in the 'inode', plus more at the OSD - and more
stages of metadata introduce more potential for requiring additional random
accesses before you can get to the data you want (not to mention adding more
points at which any *changes* in metadata must be reliably stored before a
modification operation can be considered to be 'stable' - though suitable
introduction of NVRAM at these points can help mitigate any additional
latency that this might cause). Another problem is that a high-level view
of a file as an 'object' often includes any storage redundancy used to store
that file, whereas individual 'objects' at an OSD by definition do not
encapsulate redundancy across multiple OSDs (hence for this reason as well a
'file' cannot necessarily map 1:1 to an OSD object). A third issue is that
of locking granularity, which also can span OSDs when a file does (Garth had
some early problems with this).

OTOH, decentralizing this metadata can help support increased scalability in
large systems (where many clients may be able to interrogate many OSDs
directly without going through a single 'inode server' as a bottleneck -
though there are other ways to accomplish this as well, such as
client-caching some or all of the metadata when it doesn't change all that
often).

IMO, the industry is still groping around for something approaching an
'ideal' decomposition of work in distributed file systems. The SNIA
includes disk vendors who would *love* to find ways to standardize what an
'OSD' is so that they could build added value into their products and price
them accordingly, but it's just not yet clear what that standard should be -
or even that a disk is the right level for *any* such standardization (and
if it's not, whether standardization at a higher - storage server - level
makes sense, rather than just accepting that the possibilities there are
sufficiently rich that proprietary solutions should be allowed to bloom to
address varying needs).

My own current views are that 1) OSDs don't make a great deal of sense when
talking about block-level storage (i.e., they don't add much value to that
idiom), 2) OSDs don't make a great deal of sense as higher-than-block-level
devices attached to single hosts (since they don't off-load sufficient work
to be very interesting: whatever you off-load tends to be balanced by the
additional host interface complexity required to control it as closely as a
host may often wish to do), and 3) *standardized* OSDs don't make much
sense in distributed systems, since they unnecessarily constrain creativity
in the design of those systems (i.e., such systems are sufficiently complex
that no single decomposition is even close to ideal for all circumstances -
and no current decomposition that I've seen is close to ideal even for many
common ones).

- bill

 



