March 29th 05, 06:49 AM
Paul Rubin

(dgm) writes:
I have around 20 TB of data that I (a) want to store for a very long
time (50 years) and also have available for search and download...
(a) the preservation masters which is the data we want to keep and is
in tiff and bwf formats among others
(b) the viewing copies which are in derived formats such as png and mp3.


You have a bunch of images and audio recordings. If the images are
scanned paper docs, the png files won't be substantially smaller than
the tiff files. If they're photographs, you probably want to view jpg
rather than png. In this case the mp3 and jpg files will probably be
less than 1/10th the size of the originals, or about 2 TB total, not
much at all. A small RAID disk system can hold that much easily, on
say six 400GB drives.
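
The arithmetic behind that estimate, as a rough sketch. The 20 TB figure and the six 400GB drives come from the posts above; the 10:1 ratio is treated as an upper bound, and the RAID-5 layout is purely an assumption for illustration, since no RAID level is named.

```python
# Rough sizing sketch. Decimal units (1 TB = 1000 GB), as drive vendors count.
GB_PER_TB = 1000

masters_gb = 20 * GB_PER_TB        # preservation masters, from the original post
derived_divisor = 10               # viewing copies assumed <= 1/10th of masters
viewing_gb = masters_gb // derived_divisor

drives = 6
drive_gb = 400
raw_gb = drives * drive_gb             # capacity with no redundancy
raid5_gb = (drives - 1) * drive_gb     # ASSUMED RAID-5: one drive's worth of parity

print(f"viewing copies: ~{viewing_gb} GB")
print(f"raw capacity:    {raw_gb} GB")
print(f"RAID-5 usable:   {raid5_gb} GB")
```

Under that assumed RAID-5 layout the usable space just matches the 2 TB estimate, so it only works comfortably if the derived files really do come in under 1/10th of the masters.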

I am coming down to an HSM type of solution with a large enough front
end cache to allow us to keep the viewing copies online at all times
but which allows the archival copies to disappear off to tape to be
cloned and duplicated etc.


Sounds kind of complicated. Where's this data now, how is it stored,
and how fast are you adding to it and through what kind of system? 20
TB isn't really big storage these days. You could have a small tape
library online and move incoming raw data to tape immediately while
also making the online viewing copies on disk. HSM systems with
automatic migration and retrieval are probably overkill.
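
A minimal sketch of that simpler flow, under stated assumptions: `tape_spool` is a hypothetical staging directory that some separate tape job drains, `online` is the disk area for viewing copies, and `derive` is a caller-supplied converter (tiff to jpg, bwf to mp3, or whatever you use). None of these names come from a real HSM or tape product.

```python
import shutil
from pathlib import Path
from typing import Callable, Tuple


def ingest(master: Path, tape_spool: Path, online: Path,
           derive: Callable[[Path, Path], Path]) -> Tuple[Path, Path]:
    """Queue one incoming master for tape and build its online viewing copy.

    tape_spool and online are hypothetical directories; derive(master, online)
    must write the viewing copy into `online` and return its path.
    """
    tape_spool.mkdir(parents=True, exist_ok=True)
    online.mkdir(parents=True, exist_ok=True)

    # Copy the untouched master to the staging area for the tape job.
    queued = Path(shutil.copy2(master, tape_spool / master.name))

    # Make the derived viewing copy on disk at ingest time.
    viewing = derive(master, online)
    return queued, viewing
```

Each file is handled once on the way in, so nothing has to migrate or be recalled automatically later; fetching a master back from tape becomes an occasional manual request.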