A computer components & hardware forum. HardwareBanter


fileserver backup: ndmp or network backup?



 
 
#1
December 8th 06, 03:22 PM, posted to comp.arch.storage
Christoph Peus

Hi all,

At the moment our 1.5 TB fileserver holds about 3.5 million files, which
are backed up via the network. Now I have to plan an upgrade to at least 3 TB
and am considering investing in an NDMP-capable filer. But is NDMP really
faster in this environment (assuming the underlying RAID systems are equally
fast and the network connection between fileserver and backup server is
not a bottleneck)? I have no experience with NDMP so far and would
really appreciate help from some experts.
Thanks in advance!

Christoph

--
Christoph Peus
Universität Witten/Herdecke
Bereich Informationstechnologie
Stockumer Str. 10
58453 Witten, Germany
Tel: +49-2302 926212
http://www.uni-wh.de
#2
December 10th 06, 01:38 AM, posted to comp.arch.storage
Faeandar


NDMP is almost always faster, primarily because of the difference in
overhead but also because of dump's I/O pattern vs. network reads.

One of the greatest things about NDMP, imo, is that it takes advantage
of any snapshots your system may be capable of performing. This means
a guaranteed consistent backup, unlike a network backup of a live
filesystem. In a network backup one part of a file may change right before
or right after the read happens, which will give you an inconsistent file
on tape. The same holds true for a directory structure or data set.
Of course, if your file server cannot take snapshots it won't be any
more consistent with NDMP, just faster.

NDMP is not a transport protocol, it is a command protocol. What that
means is that it relies on a transport protocol, usually TCP/IP, to
move the data, but the commands used to initiate and control that
transfer are issued through NDMP.
For most Unix-based filers NDMP simply calls dump for backup and uses
TCP/IP for transport. I can't say what is used for Windows-based filers but
I would imagine a similar setup, maybe Windows Backup?

NDMP stands for Network Data Management Protocol. The management part is
key; remember it is not a transport protocol and you will avoid several
pitfalls.

~F
#3
December 11th 06, 05:14 PM, posted to comp.arch.storage
Christoph Peus


Thanks for your comment. We already use Linux LVM snapshotting for our
network backup, so I don't see a special advantage of NDMP here, or did
I get something wrong in your explanation?

Regarding performance, it's most important for me to know whether the
difference is typically "only" 20% or an order of magnitude.
When a network backup does a full backup of a filesystem with millions
of files, it typically reads every single file separately, which forces a
lot of seeks on the hard disks involved.

Idea: If an NDMP-enabled filer had access to the filesystem's
snapshot volume at block level, it could be clever enough to do a full
scan of the filesystem, recording which blocks are in use by
which file, and then do a full backup by *sequentially* reading the
volume at block level from first to last cylinder, skipping unused
blocks, which would need significantly fewer seeks. Are there systems
that work this way?
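
To make the idea concrete, here is a minimal Python sketch. It is not how any particular filer works. It assumes a hypothetical mapping from file names to block numbers (what the metadata scan would produce) and compares total head movement for file-by-file order against ascending block order. The names `sequential_read_plan` and `total_seek_distance` are made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input: for each file, the block numbers it occupies on the
# snapshot volume, in file order (the result of a full metadata scan).
FileMap = Dict[str, List[int]]


def sequential_read_plan(file_blocks: FileMap) -> List[Tuple[int, str]]:
    """Return (block, file) pairs in ascending block order.

    Unused blocks never appear in the map, so they are skipped implicitly.
    """
    plan = [(block, name)
            for name, blocks in file_blocks.items()
            for block in blocks]
    plan.sort()
    return plan


def total_seek_distance(block_order: List[int]) -> int:
    """Sum of head movements (in blocks) between consecutive reads."""
    return sum(abs(b - a) for a, b in zip(block_order, block_order[1:]))


if __name__ == "__main__":
    # Toy volume: three fragmented files; blocks 3 and 7 are unused.
    files = {
        "a.txt": [0, 5, 9],
        "b.txt": [4, 1],
        "c.txt": [8, 2, 6],
    }
    per_file_order = [b for blocks in files.values() for b in blocks]
    block_order = [b for b, _ in sequential_read_plan(files)]
    print("file-by-file seek distance:", total_seek_distance(per_file_order))
    print("block-order seek distance: ", total_seek_distance(block_order))
```

The plan keeps the owning file next to each block, because the backup stream still has to record which file every block belongs to in order to allow single-file restores.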

Christoph
#4
December 11th 06, 06:05 PM, posted to comp.arch.storage
Bill Todd

Christoph Peus wrote:

....

> When a network backup does a full backup of a filesystem with millions
> of files, it typically reads every single file separately, which forces a
> lot of seeks on the hard disks involved.


That's true, but in your particular case your average file size appears
to be over 400 KB, which means that (unless the files suffer from
significant internal fragmentation - which you might find it worthwhile
to eliminate for other reasons) a file-by-file backup (with reasonable
caching of the directory and inode structure) should achieve about 1/3
of the best possible transfer bandwidth anyway (and won't have to
transfer anything but live data, meaning that it could approach half the
ideal performance).
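
As a rough check of that estimate, here is a small Python calculation. The average file size comes from the figures in the original post; the streaming rate (60 MB/s) and the per-file overhead (15 ms for seek, rotation and metadata access) are assumptions picked for illustration, not measurements of any real system.

```python
def average_file_size_kb(total_bytes: float, file_count: int) -> float:
    """Average file size in KB (1 KB = 1000 bytes)."""
    return total_bytes / file_count / 1000


def bandwidth_fraction(file_bytes: float,
                       streaming_bytes_per_s: float,
                       per_file_overhead_s: float) -> float:
    """Fraction of streaming bandwidth achieved when each file costs a
    fixed positioning overhead before it is read sequentially."""
    transfer_s = file_bytes / streaming_bytes_per_s
    return transfer_s / (transfer_s + per_file_overhead_s)


if __name__ == "__main__":
    total_bytes = 1.5e12      # 1.5 TB, from the original post
    file_count = 3_500_000    # 3.5 million files, from the original post
    avg_kb = average_file_size_kb(total_bytes, file_count)

    # Assumed values, for illustration only.
    streaming_rate = 60e6     # bytes per second
    overhead = 0.015          # seconds per file

    fraction = bandwidth_fraction(avg_kb * 1000, streaming_rate, overhead)
    print(f"average file size: {avg_kb:.0f} KB")
    print(f"fraction of streaming bandwidth: {fraction:.2f}")
```

With these assumed numbers the fraction lands close to one third. A faster disk or a slower seek pushes it down, and larger files push it up.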


> Idea: If an NDMP-enabled filer had access to the filesystem's
> snapshot volume at block level, it could be clever enough to do a full
> scan of the filesystem, recording which blocks are in use by
> which file, and then do a full backup by *sequentially* reading the
> volume at block level from first to last cylinder, skipping unused
> blocks, which would need significantly fewer seeks.


That could leave the individual files equally fragmented on the backup
medium. A better approach might be for the file system itself to keep
its data better-consolidated (not just for backup purposes, but for
other common situations where many smallish files within a single
directory may be accessed together) - if not at run-time, then via use
of a suitably-intelligent defragmenter.

- bill
#5
December 12th 06, 05:52 AM, posted to comp.arch.storage
Maxim S. Shatskih

> snapshot-volume on block-level it could be clever enough to do a full
> scan of the filesystem, saving information which blocks are in use by
> which file


No need for this, at least on Windows. Windows has FSCTL_GET_VOLUME_BITMAP, so
excluding the free blocks from the disk image is trivial and does not
require a walk of the file/directory tree.

I hope the UNIXen also have the same kind of IOCTL. At least most UNIX
filesystems keep their free-space bitmaps at fixed, well-known locations
scattered across the volume, so supporting such a call would be trivial.
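
Once the bitmap is in hand, turning it into a list of regions to copy is simple. The Python sketch below does not call FSCTL_GET_VOLUME_BITMAP or any other real API. It takes the bitmap as plain bytes and assumes that bit i is 1 when cluster i is in use, with the least significant bit of each byte coming first. The function name `used_extents` is made up for illustration.

```python
from typing import Iterator, Tuple


def used_extents(bitmap: bytes, cluster_count: int) -> Iterator[Tuple[int, int]]:
    """Yield (first_cluster, length) runs of allocated clusters.

    Assumption: bit i of the bitmap (least significant bit first within
    each byte) is 1 when cluster i is in use.
    """
    start = None
    for cluster in range(cluster_count):
        in_use = (bitmap[cluster // 8] >> (cluster % 8)) & 1
        if in_use and start is None:
            start = cluster                 # a new run of used clusters begins
        elif not in_use and start is not None:
            yield start, cluster - start    # the run ended at a free cluster
            start = None
    if start is not None:
        yield start, cluster_count - start  # run extends to the end of the volume


if __name__ == "__main__":
    # Toy 16-cluster volume: clusters 0-1, 3-5 and 9-10 are in use.
    bitmap = bytes([0b00111011, 0b00000110])
    for first, length in used_extents(bitmap, 16):
        print(f"copy clusters {first}..{first + length - 1}")
```

An imaging tool would read each run with one large sequential request, which is the "skip the free blocks" behaviour without any tree walk.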

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation

http://www.storagecraft.com

#6
December 13th 06, 03:55 AM, posted to comp.arch.storage
Faeandar

On Mon, 11 Dec 2006 17:14:16 +0100, Christoph Peus
wrote:


> Thanks for your comment. We already use Linux LVM snapshotting for our
> network backup, so I don't see a special advantage of NDMP here, or did
> I get something wrong in your explanation?


Not sure where Linux is involved. Is that the current file server
platform? I'll assume it is.

Snapshotting in such a case will get you file consistency but it does
not help with performance if you're still doing a network backup. If
you were doing a dump the performance would be improved. How much it
would improve I can't say.


> Regarding performance, it's most important for me to know whether the
> difference is typically "only" 20% or an order of magnitude.


It should not be an order-of-magnitude difference. Something under 50% I
would guess offhand.

> When a network backup does a full backup of a filesystem with millions
> of files, it typically reads every single file separately, which forces a
> lot of seeks on the hard disks involved.


Right, which is why an FS dump or equivalent is faster. No special
magic involved, just removing the overhead of network traffic and CPU
switching. IIRC, dumps do a near-sequential disk read of data during
their mapping phase (or at least enough parallel reads to seem so),
which would be significantly faster than random reads to service
single-file requests.


> Idea: If an NDMP-enabled filer had access to the filesystem's
> snapshot volume at block level, it could be clever enough to do a full
> scan of the filesystem, recording which blocks are in use by
> which file, and then do a full backup by *sequentially* reading the
> volume at block level from first to last cylinder, skipping unused
> blocks, which would need significantly fewer seeks. Are there systems
> that work this way?
>
> Christoph



Can you give us a list of potential vendors you are looking at? We
may be able to give more specifics for each of those.

~F
#7
January 30th 07, 09:15 AM, posted to comp.arch.storage
Big Al

One thing you need to check is whether your NDMP backup job is restartable
or not.

We had configured NDMP on an 8 TB filesystem, but belatedly realised the
filer's NDMP did not support restartable jobs, so if anything happened to the
job (tape error, drive error, library got rebooted, etc.) the backup
job would restart from scratch.

Needless to say, it proved to be a big headache. We should have configured a
filesystem backup.
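
For readers unfamiliar with the term, the Python sketch below shows the general idea behind a restartable job: record each completed unit in a checkpoint so that a rerun skips what is already done. It is a generic illustration, not how any NDMP implementation or backup product works. The function `run_restartable`, the checkpoint file format and the unit names are all made up.

```python
import json
import os
import tempfile
from typing import Callable, Iterable, List


def run_restartable(units: Iterable[str],
                    backup_one: Callable[[str], None],
                    checkpoint_path: str) -> List[str]:
    """Back up each unit once, recording completions so a rerun resumes.

    Returns the units that were backed up during this call.
    """
    done = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            done = set(json.load(fh))

    completed_now = []
    for unit in units:
        if unit in done:
            continue                  # finished in an earlier attempt
        backup_one(unit)              # may raise on a tape or drive error
        done.add(unit)
        completed_now.append(unit)
        with open(checkpoint_path, "w") as fh:
            json.dump(sorted(done), fh)   # persist progress after every unit
    return completed_now


if __name__ == "__main__":
    checkpoint = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    failures = {"vol3"}               # simulate one tape error on vol3

    def flaky_backup(unit: str) -> None:
        if unit in failures:
            failures.discard(unit)
            raise IOError(f"tape error while writing {unit}")

    volumes = ["vol1", "vol2", "vol3", "vol4"]
    try:
        run_restartable(volumes, flaky_backup, checkpoint)
    except IOError as err:
        print("first attempt aborted:", err)
    print("second attempt backed up:",
          run_restartable(volumes, flaky_backup, checkpoint))
```

Without the checkpoint, the second attempt would begin again at vol1, which is the behaviour described above for the 8 TB job.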


 




All times are GMT +1.

