SDLT wear & tear (small files vs. big files)



 
 
  #11  
Old September 28th 03, 10:30 PM
Eric Lee Green

In article , Peter da Silva ruminated:
In article ,
Eric Lee Green wrote:
As far as staging of entire backups via backup software, if we're
talking about a small office network that may be satisfactory, but I'm
having difficulty conceiving how to handle terabyte-sized backups in a
reasonable manner that way.


Instead of interleaving the streams straight to tape, write them to disk.


That part is easy enough. Tapio didn't care where it was sending its
data. The actual tape writer accepted a stream (interleaved or not)
and stored it to tape, ticking out stream ID, stream block ID, and
tape location info as it did so in order that they could be registered
in the location database so that the data could be easily restored. It
didn't care where the stream was coming from.
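
For illustration, a minimal sketch of that kind of location bookkeeping, assuming a SQLite catalog. This is not Tapio's actual schema; the table and column names are invented for the example:

    import sqlite3

    # Sketch of the location bookkeeping described above -- not the actual
    # Tapio schema; table and column names are invented for the example.
    def open_catalog(path="locations.db"):
        db = sqlite3.connect(path)
        db.execute("""CREATE TABLE IF NOT EXISTS block_locations (
                          stream_id    TEXT,
                          stream_block INTEGER,   -- block number within the stream
                          tape_label   TEXT,      -- which tape the block landed on
                          tape_block   INTEGER,   -- physical offset on that tape
                          PRIMARY KEY (stream_id, stream_block))""")
        return db

    def record_block(db, stream_id, stream_block, tape_label, tape_block):
        """Called by the tape writer after each block is committed to tape."""
        db.execute("INSERT OR REPLACE INTO block_locations VALUES (?, ?, ?, ?)",
                   (stream_id, stream_block, tape_label, tape_block))
        db.commit()

    def blocks_for_restore(db, stream_id):
        """Restore-time lookup: which tape, and where on it, holds each block."""
        return db.execute("SELECT stream_block, tape_label, tape_block "
                          "FROM block_locations WHERE stream_id = ? "
                          "ORDER BY stream_block", (stream_id,)).fetchall()

At restore time a single indexed lookup then tells you which tape to mount and where to seek for every block of the stream, regardless of where the stream originally came from.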

When a stream is full, or you have a tape's worth, write to tape.


The problem is knowing when you have a tape's worth, given the uneven
compressibility of data and an unknown compression algorithm on the
part of the tape drive. Either you end up wasting space, or you end up
having to span multiple tapes. Firmware compression algorithms
complicate things greatly. One notion I considered was to simply
disable any firmware compression algorithm, and do a block-by-block
compression at the software level. The problem there is that then we
become compute-bound on the tape server rather than hardware-bound. At
the time, server hardware really wasn't very CPU-heavy and wasn't
capable of handling the load. Bumping the compression out to the
client level was also a possibility. That actually probably would have
worked okay, but at the time (a 300MHz Pentium II with 128MB of RAM was
normal back then, and a dual PII-450 or Xeon 450 with 512MB of RAM was the
super-deluxe server hardware) client hardware just didn't have much
oomph.
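
As a rough sketch of that block-by-block software compression, assuming zlib and a made-up 64KB block size: because the software sees each block's compressed size before anything reaches the drive, remaining tape capacity is known exactly rather than guessed from the firmware's ratio.

    import zlib

    TAPE_BLOCK = 64 * 1024   # assumed fixed block size, purely for the example

    def compress_blocks(stream, level=6):
        """Compress a backup stream block by block in software, with the
        drive's firmware compression switched off.  Every block's compressed
        size is known before it reaches the drive, so the remaining tape
        capacity can be tracked exactly instead of guessed."""
        while True:
            raw = stream.read(TAPE_BLOCK)
            if not raw:
                break
            packed = zlib.compress(raw, level)
            if len(packed) >= len(raw):
                yield b"R" + raw      # incompressible block stored raw: never worse than 1:1
            else:
                yield b"Z" + packed

The catch, as noted above, is that this loop is exactly where the tape server (or the client, if you push it out that far) becomes compute-bound.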

If one disk won't keep the tape happy, slice the stream across multiple
drives, either explicitly or using RAID.


Disk throughput is not a big deal nowadays. Faster computers are making
many things feasible now that back in the day weren't really
credible. For example, something like LUFS (Linux Userland File
System) used as a framework for a time-based snapshotting filesystem
for a storage appliance would have been utterly ludicrous even three
years ago. CPUs were so slow back then that the only way to get
acceptable performance from a filesystem was to run it in kernel-land,
where you had direct access to the unified buffer cache and driver
layer without any kernel/userland transitions. When I benchmarked LUFS
on modern hardware (a 2.4GHz P4 with 512MB of RAM) back in May, with
some minor optimizations I obtained well over 150MB/sec raw
throughput, which would have been utterly ludicrous with that 300MHz
Pentium II a few years ago.

--
Eric Lee Green
Linux/Unix Software Engineer seeks employment
see http://badtux.org for resume


  #12  
Old September 29th 03, 04:39 PM
Peter da Silva

In article ,
Eric Lee Green wrote:
When a stream is full, or you have a tape's worth, write to tape.


The problem is knowing when you have a tape's worth, given the uneven
compressibility of data and an unknown compression algorithm on the
part of the tape drive.


Before I go into compression, let's clarify that point. When I say
"a tape's worth" here, I mean "enough that it's worthwhile to start
dumping to tape". If it's more than the tape can hold then you handle
the end of tape and leave the rest of the stream on disk until you
switch tapes. Whether you pick up the next tape with the same stream
or not is a policy decision, really.
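
A rough sketch of that spool-then-dump policy, with invented names for the staging directory, the threshold, and the tape-writer callbacks:

    import os

    SPOOL_DIR = "/var/spool/backup"   # hypothetical staging area
    WORTH_DUMPING = 20 * 10**9        # "a tape's worth": only start the drive once this much is spooled

    def spooled_bytes():
        return sum(os.path.getsize(os.path.join(SPOOL_DIR, f))
                   for f in os.listdir(SPOOL_DIR))

    def maybe_dump(write_to_tape, request_tape_change):
        # write_to_tape(path, offset) is a hypothetical callback that writes from
        # `offset` and returns how many bytes made it to tape before end-of-tape.
        if spooled_bytes() < WORTH_DUMPING:
            return                            # not enough spooled yet to bother starting the drive
        for name in sorted(os.listdir(SPOOL_DIR)):
            path = os.path.join(SPOOL_DIR, name)
            offset, size = 0, os.path.getsize(path)
            while offset < size:
                offset += write_to_tape(path, offset)
                if offset < size:
                    # Hit end of tape: the unwritten tail stays on disk until the
                    # next tape is loaded.  Whether that tape picks up the same
                    # stream or starts another is the policy decision noted above.
                    request_tape_change()     # assumed to block until a fresh tape is ready
            os.remove(path)                   # fully on tape; free the spool space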

But...

Don't compress on the tape drive: make sure it's compressed by the time it
spools to disk, and run the drive without compression. Tape compression
gets you 2:1 at most, while with the better algorithms you can use on the server
I've got 10:1 for some partitions.

At the very worst, you will never do *worse* than tape compression.

One notion I considered was to simply
disable any firmware compression algorithm, and do a block-by-block
compression at the software level.


That's exactly what I do with Amanda, except I'm using streaming
compression (gzip -9) rather than block compression.
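
As a minimal sketch of that server-side streaming compression into the spool, with Python's gzip module standing in for the gzip -9 pipeline and an invented spool path; the point is that the spool file is already compressed, so its on-disk size is exactly what will land on tape:

    import gzip
    import shutil

    def spool_compressed(dump_stream, spool_path):
        """Stream a client's dump through gzip on its way to the staging disk.
        The data is compressed before it ever nears the drive, so the spool
        file's size is the size that will actually land on tape."""
        with gzip.open(spool_path, "wb", compresslevel=9) as out:
            shutil.copyfileobj(dump_stream, out, length=256 * 1024)

    # e.g. spool_compressed(client_pipe, "/var/spool/backup/hostA.dump.gz")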

The problem there is that then we
become compute-bound on the tape server rather than hardware-bound.


CPU is even cheaper than disk. We use Alphas so we've been "CPU rich" for
years, but at home I've been using a K6-3/400 for my Amanda server and it's
not breathing hard doing server-side compression for a couple of bitty
boxes. Most of them compress on the client.

But that's an implementation detail... the results are similar.

--
I've seen things you people can't imagine. Chimneysweeps on fire over the roofs
of London. I've watched kite-strings glitter in the sun at Hyde Park Gate. All
these things will be lost in time, like chalk-paintings in the rain. `-_-'
Time for your nap. | Peter da Silva | Har du kramat din varg, idag? 'U`
  #13  
Old September 29th 03, 11:07 PM
George Sarlas

Thanks to everyone for their input. I have thought about staging the
data to some cheap IDE drives first. I'll play around with it some
more.

-george


"Scott" wrote in message ...
"Peter da Silva" wrote in message
...
I asked a question: can Netbackup use a local disk as cache to buffer tape
writes and prevent shoeshining?


Only as a two-step process.
1) Backup from servers to disk
2) Disk-to-tape copy

So, it works, but does increase the amount of time it takes to complete
backups (though it may decrease the amount of time that the servers being
backed up are busy)

Scott
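
For illustration only (this is not how NetBackup itself implements its disk-to-tape copy), a sketch of step 2 of that two-step process, assuming a Linux no-rewind tape device: the staged image is read from fast local disk and fed to the drive in large fixed-size writes so the drive can stream instead of shoeshining while waiting on slow clients.

    def copy_spool_to_tape(spool_path, tape_device="/dev/nst0", block=256 * 1024):
        """Generic sketch of the disk-to-tape step (not NetBackup's own
        mechanism): feed the staged image to the drive in large fixed-size
        writes from fast local disk so the drive streams instead of
        stop-starting while waiting on slow clients."""
        with open(spool_path, "rb") as src, open(tape_device, "wb") as tape:
            while True:
                buf = src.read(block)
                if not buf:
                    break
                tape.write(buf)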

 



