#61
ZFS (was photonic x86 CPU design)
Casper H.S. Dik wrote:
> Bill Todd writes:
>> Casper H.S. Dik wrote:
>>> ... This is one reason why Solaris ZFS keeps checksums separate from data; it can tell which part of the mirror returned bad data and repair it because the checksum fails.
>> Could you point me to such detailed descriptions of ZFS as that which included this juicy snippet (a feature that I've been pursuing myself at my usual glacial pace)? Unless, of course, the information came from internal-only sources.
> There's no current substantial technical information available; there's a question and answer session at: http://www.sun.com/emrkt/campaign_do...SEE-091504.pdf which covers some of this in more detail.

In it, we're directed to more information at: http://blogs.sun.com/ahrens which seems to run out over a year ago.

> But you must understand we're all paid to sprinkle ZFS teasers in newsgroups and blogs :-)
> The basic model for checksumming is fairly simple: all data is interconnected through pointers, and with each pointer a checksum of the data at the end of the pointer is stored.
> Casper

--
The e-mail address in our reply-to line is reversed in an attempt to minimize spam. Our true address is of the form .
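The pointer-plus-checksum model described above can be sketched in a few lines. This is a hypothetical illustration, not ZFS code: `BlockPointer` and `read_from_mirror` are invented names, and SHA-256 stands in for whatever checksum the filesystem actually uses. The point is that the checksum lives with the parent pointer, not with the data, so a self-consistent-but-wrong mirror copy can still be detected and repaired:

```python
import hashlib

def checksum(data: bytes) -> bytes:
    # Stand-in for the on-disk block checksum; SHA-256 here for simplicity.
    return hashlib.sha256(data).digest()

class BlockPointer:
    """A pointer that carries the checksum of the block it points to.
    The checksum is stored with the parent, separate from the data."""
    def __init__(self, data: bytes):
        self.cksum = checksum(data)

def read_from_mirror(ptr: BlockPointer, copies: list) -> bytes:
    """Check each mirror copy against the pointer's checksum; return a
    good copy and rewrite any copy that fails the check."""
    good = None
    bad = []
    for i, copy in enumerate(copies):
        if checksum(copy) == ptr.cksum:
            good = copy
        else:
            bad.append(i)
    if good is None:
        raise IOError("all mirror copies corrupt")
    for i in bad:            # self-healing: repair the bad side
        copies[i] = good
    return good

data = b"hello zfs"
ptr = BlockPointer(data)
mirror = [data, b"hello zfX"]     # second copy silently corrupted
assert read_from_mirror(ptr, mirror) == data
assert mirror[1] == data          # bad side has been repaired
```

A mirror alone cannot tell which of two disagreeing copies is right; the checksum in the parent is what breaks the tie.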
#62
ZFS (was photonic x86 CPU design)
CJT writes:
> ShuChih (Q): When will ZFS be included in Solaris 10? We were told first in late summer 2004, then early 2005, then May 2005....
> Brian Ellefritz (A): ZFS will be in Solaris 10 when we ship the product. The current projection is the end of calendar 2004.
> Interesting; well, better do it right than early :-)

Real soon now; and as usual this will coincide with the release of the source code on OpenSolaris.org.

Casper
--
Expressed in this posting are my opinions. They are in no way related to opinions held by my employer, Sun Microsystems. Statements on Sun products included here are not gospel and may be fiction rather than truth.
#63
ZFS (was photonic x86 CPU design)
Casper H.S. Dik wrote:
> CJT writes:
>> ShuChih (Q): When will ZFS be included in Solaris 10? We were told first in late summer 2004, then early 2005, then May 2005....
>> Brian Ellefritz (A): ZFS will be in Solaris 10 when we ship the product. The current projection is the end of calendar 2004.
>> Interesting; well, better do it right than early :-)

True.

> Real soon now; and as usual this will coincide with the release of the source code on OpenSolaris.org.
> Casper

How about the timing in here? ;-)

http://www.sun.com/nc/05q3/chat/tran...5Q3_091505.pdf (from: )

"Network Computing 05Q3 – Online Chat Thursday, September 15, 2005"

"Q: When will ZFS be available?
Chris Ratcliffe (A): November/December for public early access - right now we're in private beta with customers across the world, tuning and enhancing for the public release."

--
The e-mail address in our reply-to line is reversed in an attempt to minimize spam. Our true address is of the form .
#64
ZFS (was photonic x86 CPU design)
CJT writes:
> How about the timing in here? ;-)
> http://www.sun.com/nc/05q3/chat/tran...5Q3_091505.pdf
> "Network Computing 05Q3 – Online Chat Thursday, September 15, 2005"
> "Q: When will ZFS be available?
> Chris Ratcliffe (A): November/December for public early access - right now we're in private beta with customers across the world, tuning and enhancing for the public release."

Not too far off, I'd say. (Had to read carefully to check it referred to this November and not last year's :-)

Casper
--
Expressed in this posting are my opinions. They are in no way related to opinions held by my employer, Sun Microsystems. Statements on Sun products included here are not gospel and may be fiction rather than truth.
#65
ZFS (was photonic x86 CPU design)
Casper H.S. Dik wrote:
> Bill Todd writes:
>> Casper H.S. Dik wrote:
>>> ... This is one reason why Solaris ZFS keeps checksums separate from data; it can tell which part of the mirror returned bad data and repair it because the checksum fails.
>> Could you point me to such detailed descriptions of ZFS as that which included this juicy snippet (a feature that I've been pursuing myself at my usual glacial pace)? Unless, of course, the information came from internal-only sources.
> There's no current substantial technical information available; there's a question and answer session at: http://www.sun.com/emrkt/campaign_do...SEE-091504.pdf which covers some of this in more detail.

Thanks - I'll check it out.

> But you must understand we're all paid to sprinkle ZFS teasers in newsgroups and blogs :-)

There's enough interesting stuff hinted at that those who care about storage should already be interested (though I'm still not convinced that 128-bit pointers are worthwhile).

> The basic model for checksumming is fairly simple: all data is interconnected through pointers, and with each pointer a checksum of the data at the end of the pointer is stored.

That's certainly what I came up with, as a by-product of already having settled on a no-update-in-place (though not conventionally log-structured) approach: every change updates the parent, so writing the checksum is free; reading it is free as well, since you have to go through the parent to get to its child. So I've been looking for more details about the rest of ZFS to try to decide whether I've got enough original innovation left to be worth pursuing.

- bill
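The no-update-in-place scheme described above, where every change rewrites the parent and the checksum rides along for free, can be sketched as a tiny copy-on-write tree. This is a minimal illustration under assumed names (`Root`, `cow_write`, `read` are invented, and SHA-256 stands in for the real checksum), not either party's actual design:

```python
import hashlib

def cksum(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

class Root:
    """Parent block: an immutable array of (leaf, checksum) pairs.
    The checksum of each child lives here, in the parent."""
    def __init__(self, leaves):
        self.entries = tuple((leaf, cksum(leaf)) for leaf in leaves)

def cow_write(root: Root, idx: int, new_data: bytes) -> Root:
    # No update in place: modifying a leaf means writing a brand-new
    # parent, whose entry carries the new checksum "for free".
    leaves = [leaf for leaf, _ in root.entries]
    leaves[idx] = new_data
    return Root(leaves)

def read(root: Root, idx: int) -> bytes:
    leaf, expect = root.entries[idx]
    # Reading the checksum is free: we had to go through the parent anyway.
    assert cksum(leaf) == expect
    return leaf

r0 = Root([b"aaa", b"bbb"])
r1 = cow_write(r0, 1, b"BBB")
assert read(r1, 1) == b"BBB"
assert read(r0, 1) == b"bbb"   # the old root is still intact
```

A side effect worth noting: because the old root is never overwritten, it remains a consistent snapshot of the tree before the write.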
#66
ZFS (was photonic x86 CPU design)
Bill Todd writes:
> So I've been looking for more details about the rest of ZFS to try to decide whether I've got enough original innovation left to be worth pursuing.

The main other innovation which makes it different is the merging of the volume management with the filesystem. With ZFS, the filesystem has combined knowledge about the RAID groups and the filesystem, and it knows when all bits of a RAID have been written, so it doesn't need to suffer from certain "standard" RAID 5 problems. Also, from the system management point of view it's so much easier to use than LVM + fs.

Casper
--
Expressed in this posting are my opinions. They are in no way related to opinions held by my employer, Sun Microsystems. Statements on Sun products included here are not gospel and may be fiction rather than truth.
#67
ZFS (was photonic x86 CPU design)
On 2005-10-29 17:08:27 -0500, Casper H.S. Dik said:
> With ZFS, the filesystem has combined knowledge about the RAID groups and the filesystem, and it knows when all bits of a RAID have been written, so it doesn't need to suffer from certain "standard" RAID 5 problems.

Does ZFS support parity? So far I've only seen references to mirroring and striping. It would seem easy to avoid the problems of RAID 5 if you don't use parity.

--
Wes Felter - - http://felter.org/wesley/
#68
ZFS (was photonic x86 CPU design)
Casper H.S. Dik wrote:
> Bill Todd writes:
>> So I've been looking for more details about the rest of ZFS to try to decide whether I've got enough original innovation left to be worth pursuing.
> The main other innovation which makes it different is the merging of the volume management with the filesystem. With ZFS, the filesystem has combined knowledge about the RAID groups and the filesystem, and it knows when all bits of a RAID have been written, so it doesn't need to suffer from certain "standard" RAID 5 problems. Also, from the system management point of view it's so much easier to use than LVM + fs.

That's fine as long as you don't wish to combine file-level with block-level storage on the same disks, but even NetApp has moved to do so lately (and I say this as a confirmed file-system bigot who thinks that block-level storage is so rarely preferable - given reasonable file-level facilities such as direct I/O that bypasses any system caching - as to be almost irrelevant these days).

Furthermore, there's still a fairly natural division between the file layer and the block layer (which in no way limits the file system's ability to use knowledge of the block layer to its advantage, nor requires that users be concerned about the block layer unless they want to use it directly).

And finally, time-efficient user-initiated reorganization (e.g., adding/subtracting disks or moving redundancy from mirroring to a parity basis) and recovery from disk (or entire server) failures dictate that redundancy restoration on recovery proceed in at least multi-megabyte chunks, whereas file-system activity requires much finer-grained allocation. I did find a reference to batching updates into a single large write - another feature I've been working on - but even that doesn't address the problem adequately a lot of the time.

Otherwise, ZFS sounds truly impressive (e.g., I'm not used to seeing things like prioritized access outside real-time environments, let alone deadline-based scheduling - though that's coming into more general vogue with the increasing interest in isochronous data, assuming that's the level of 'deadline' you're talking about).

- bill
#69
ZFS (was photonic x86 CPU design)
Wes Felter writes:
> On 2005-10-29 17:08:27 -0500, Casper H.S. Dik said:
>> With ZFS, the filesystem has combined knowledge about the RAID groups and the filesystem, and it knows when all bits of a RAID have been written, so it doesn't need to suffer from certain "standard" RAID 5 problems.
> Does ZFS support parity? So far I've only seen references to mirroring and striping. It would seem easy to avoid the problems of RAID 5 if you don't use parity.

It has a form of RAID 5 (dubbed RAIDZ) which uses parity. If there's no outright disk failure, ZFS can reconstruct which disk is returning the bad data. Also, by not committing the data in ZFS until all parts of the RAID have been written, there's no chance of the "RAID 5 hole" occurring. (The RAID 5 hole is where you have, e.g., a power failure before you've written all data + parity; the data is corrupt with no way to recover.)

Casper
--
Expressed in this posting are my opinions. They are in no way related to opinions held by my employer, Sun Microsystems. Statements on Sun products included here are not gospel and may be fiction rather than truth.
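The single-parity reconstruction behind this can be sketched with plain XOR. This is a generic RAID-5-style illustration, not RAIDZ's actual on-disk layout; the comment about atomic commit paraphrases the post above rather than documenting the implementation:

```python
def xor(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together (single-parity arithmetic)."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# A full-stripe write: data blocks and their parity are computed together
# and committed as a unit (per the post, ZFS doesn't commit the data until
# the whole stripe is written), so there is no window in which data and
# parity disagree -- the "RAID 5 hole".
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor(d0, d1, d2)

# Lose any one disk: XOR the survivors with the parity to rebuild it.
recovered = xor(d0, d2, parity)
assert recovered == d1
```

The hole exists in conventional RAID 5 precisely because data and parity are updated in separate writes; a crash between them leaves a stripe whose parity no longer describes its data.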
#70
photonic x86 CPU design
Thomas Womack wrote:
> In article , Terje Mathisen wrote:
>> I.e. is there a code in which you can encode two blocks of data on four drives, and recover the data blocks from any pair of surviving drives?
> Unfortunately not -- you need five: encode A A B B A+B, and then you can recover from any two losses. That pattern extends: two blocks on six drives survive three failures via A A B B A+B A+B; on eight, four failures via A A A B B B A+B A+B ...
> There's probably a beautiful proof, based on some intuitively obvious property of some geometric object that's clearly the right thing to imagine, that 4-on-2-surviving-2 doesn't work; but I ran through all 2^16 possible encodings, since getting C to count to 2^16 is quicker than acquiring mathematical intuition.
> Seven discs can encode 4 discs of content and survive two losses; use the Hamming code, whose minimum distance is 3, so *removing* (rather than corrupting) two whole columns still leaves distance between the codewords. But seven discs are getting rather loud and heavy.

And expensive. People have done backup and verify to tape for decades; a RAID-5 array is far more reliable, and the most useful storage you can get and still have failure tolerance (N-1 of N drives for data). If backups are expensive or inconvenient, people don't do them.

--
bill davidsen SBC/Prodigy Yorktown Heights NY data center http://newsgroups.news.prodigy.com
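The five-drive claim above (A A B B A+B survives any two losses) is easy to brute-force, in the same spirit as the exhaustive search mentioned in the post. This sketch uses a small invented harness (`encode`, `recover`) and a tiny symbol alphabet; it checks that the surviving three drives always pin down A and B uniquely:

```python
from itertools import combinations

def encode(a: int, b: int) -> list:
    # Five "drives" holding two data blocks: A A B B A+B (XOR).
    return [a, a, b, b, a ^ b]

def recover(shares):
    """shares: (position, value) pairs for the surviving drives.
    Return (a, b) if the survivors determine them uniquely, else None."""
    candidates = set()
    for a in range(4):            # a small alphabet suffices to test the logic
        for b in range(4):
            enc = encode(a, b)
            if all(enc[p] == v for p, v in shares):
                candidates.add((a, b))
    return candidates.pop() if len(candidates) == 1 else None

# Exhaustively check every way of losing two of the five drives.
for lost in combinations(range(5), 2):
    for a in range(4):
        for b in range(4):
            enc = encode(a, b)
            surviving = [(p, enc[p]) for p in range(5) if p not in lost]
            assert recover(surviving) == (a, b)
print("A A B B A+B survives any two drive losses")
```

The intuition matches the check: losing both A copies still leaves A+B and a B copy, so A = (A+B) XOR B; every other pair of losses leaves even more information.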