#11
NDMP considerations
On Apr 5, 5:25 pm, Faeandar wrote:
> On 5 Apr 2007 16:58:47 -0700, wrote:
>
>>> Snapvault is quite nice because you can schedule the vaults to occur
>>> much more simply than snapmirror (though technically they both use
>>> the same underlying technology). Why are you not backing up the
>>> filers directly with NDMP over FC or even Ethernet?
>
>> This is exactly what I am trying to set up here... an NDMP over FC
>> environment... However, some people had warned about long backup times
>> for large amounts of data, and that is what prompted my post...
>
> That is completely dependent on your data set. Post a quota report with
> file listing and I can give you a ballpark. Also post what type of
> infrastructure you are looking to build: 2Gb or 4Gb FC? Direct to tape
> or through an FC switch? What type of tape drives? Are all filers going
> to back up direct to tape, or will you do 3-way backups in some cases?
> (That is where a filer without a tape drive does an NDMP dump over
> Ethernet to a filer that has a tape drive.) What backup software are
> you going to use?
>
>> Which is, if I had say 1.2TB of data, would I also face the 14 hr
>> backup cycles like Raju had? Because in our environment, we will
>> likely start approaching over 500TB of data within the next few months
>> as we consolidate, and within a year be up to a PB of online data. So
>> while I now understand how efficient SnapMirror is for replication, I
>> would like to understand what the NDMP backup considerations should
>> be... will that also work at wire speed? Can I do multiple NDMP
>> backups from various filers in parallel, and so on?
>
> I think we're getting somewhere now. There are lots of restrictions
> depending on the backup software. For the most part you can't share
> tape drives between filers except via 3-way backups (as mentioned
> above). That can cause performance problems depending on your data,
> network, infrastructure, etc., but it does work quite nicely in many
> cases and is almost as fast as FC; you just have a filer doing both net
> in and tape out for something other than its own data.
>
> Is the time to back up really a problem? What's driving it? I ask
> because the filer backs up from a snapshot, so the need to back up fast
> is not driven by coherency. Post a file count or quota report; that
> will help you understand backup implications better than any
> theoretical discussion of speeds and feeds.
>
> ~F

The time to back up may actually not be a problem now that I understand
what you say... although I'd expect the backup to happen in a somewhat
reasonable amount of time. For example, responding to the post by Raju:
if we had a 14 hour window to back up just 1.2 TB of data, that wouldn't
fly with 500 TB of data. It's not that it needs to be done in 24 hours or
even a few days; it's just that we can't have operations that span more
than a few weeks. Is Raju's environment an anomaly, or is that common?
I'm asking because I've heard some caution about long backup windows for
large amounts of data.

Probably more in particular, restore at the volume level also seems to be
something some people in my old organization had warned me about... some
volumes are very large, and just to recover a single file, would I need
to restore an entire volume?
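[A rough back-of-the-envelope sketch of the "few weeks" concern raised
above. The numbers are assumptions, not measurements: ~200 MB/s effective
for a single 2Gb FC stream, and the thread's 500 TB figure. It only shows
why the parallel-stream question matters at this scale.]

# Backup window vs. number of parallel NDMP streams (all assumptions).
TB = 1e12                      # bytes, decimal terabyte
data_bytes = 500 * TB          # consolidation target from the thread
stream_mb_s = 200              # assumed effective rate per 2Gb FC stream

def hours(nbytes, mb_per_s, streams=1):
    """Time to stream nbytes at mb_per_s per stream, perfectly parallel."""
    return nbytes / (mb_per_s * 1e6 * streams) / 3600

for streams in (1, 4, 10, 20):
    h = hours(data_bytes, stream_mb_s, streams)
    print(f"{streams:2d} stream(s): {h:7.1f} h (~{h/24:.1f} days)")

[One stream works out to roughly 29 days; ten parallel streams bring it
under three, which is why tape-drive sharing and 3-way topologies come up
repeatedly in this thread.]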
#12
NDMP considerations
On Apr 6, 12:19 am, wrote:
> On Apr 5, 10:39 am, "Raju Mahala" wrote:
>> On Apr 5, 12:19 pm, wrote:
>>> Hi Friends, I have a task of consolidating some of our datacenters.
>>> When done, I will have accumulated about 1 petabyte of data on our
>>> brand new NetApp filers. While I am in our scoping-out phase, I would
>>> like to know what considerations I should take for backing up (full
>>> and incremental backups) of 1 petabyte of data on an ongoing basis.
>>> What issues will I run into? What are best practices? How long will
>>> this backup take? Any other things to watch out for? Just a few notes
>>> on our architecture... we have a few high-end NetApp filers
>>> (combinations of 3020s, 960cs, etc.) as tier 1, followed by a tier of
>>> 270cs for older data. I don't have anything specced for VTL and
>>> backup yet. Thank you, Ludlas.
>>
>> In fact I don't have experience with ndmpcopy backup, but I feel that
>> during ndmpcopy, filer performance may be affected, so you have to
>> find out whether you have sufficient off hours for backup. Another
>> point I would like to make: check the average file size; if it is very
>> small, then ndmpcopy may take more time. A large number of small files
>> on a volume affects ndmpcopy performance very badly. In our
>> environment, a 1.2TB volume with around 30M files takes more than 14
>> hours to complete an ndmpcopy backup, so we take the backup from
>> another Linux box where the storage volumes are mounted.
>>
>> - Raju
>
> Thanks Raju, that somewhat helps... I can't understand though why a
> mere 1.2 TB volume is taking 14 hours to read. That is way slower than
> wire speed... We are able to read 1.2TB of data much faster than that
> from even our slow 270s just over NFS, in fact. (One of our modules
> copies all snapshot directories via NFS instead of NDMP right now.) Do
> you know where the bottleneck is? Anyone else out there with
> experiences in this area for such a large number of files and data?

Yes, we know the bottleneck. A lot of RCA was done by the NetApp guys as
well. There are two major reasons:

1) a large number of small files (average file size: ~40KB)
2) heavy turnaround of file creation/deletion

Due to these factors the volume gets fragmented at the file level, and
because ndmpcopy works at the file level it takes more time to read
fragmented files. If we go for SnapMirror it doesn't take much time,
because that is a block-level copy. It even takes more than the expected
time during SnapVault, which is a block-level copy but works at the qtree
level, so it needs to find the corresponding blocks for the qtree.

- Raju
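[A hedged sketch of where those 14 hours could be going, using Raju's own
data point (1.2 TB, ~30M files, 14 h). The 100 MB/s streaming rate is an
assumption, and the per-file overhead it backs out is an estimate, not a
measured ONTAP figure.]

# Back out an implied per-file cost from Raju's numbers.
data_bytes = 1.2e12        # 1.2 TB volume
n_files    = 30e6          # ~30M files (avg ~40KB each)
total_s    = 14 * 3600     # observed ndmpcopy wall time
stream_b_s = 100e6         # assumed pure-streaming read rate

streaming_s = data_bytes / stream_b_s     # ~3.3 h just moving the bytes
overhead_s  = total_s - streaming_s       # everything that isn't streaming
per_file_ms = overhead_s / n_files * 1000

print(f"streaming time  : {streaming_s/3600:.1f} h")
print(f"implied overhead: {overhead_s/3600:.1f} h total, "
      f"~{per_file_ms:.1f} ms per file")

[Under these assumptions only about 3.3 of the 14 hours are spent moving
data; the rest corresponds to roughly 1.3 ms of per-file work, which is
consistent with a file-level dump being bound by file count rather than
bandwidth.]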
#13
NDMP considerations
> Post a file count or quota report; that will help you understand backup
> implications better than any theoretical discussion of speeds and
> feeds.
>
> ~F

I feel that file count and the daily turnaround of file deletion/creation
(which leads to file-level fragmentation) will play the major role in
total backup time through ndmpcopy, whether it is over FC or Ethernet. I
have experience with NDMP over Ethernet, and in that case volumes with
low turnaround and a low file count were getting done at close to wire
speed, but volumes with a higher file count and higher turnaround had
problems. I can't comment on NDMP over FC, but I believe the bottleneck
is the file count, which will remain the same, so it should be the same
scenario.

> The time to back up may actually not be a problem now that I understand
> what you say... although I'd expect the backup to happen in a somewhat
> reasonable amount of time. For example, responding to the post by Raju:
> if we had a 14 hour window to back up just 1.2 TB of data, that
> wouldn't fly with 500 TB of data. It's not that it needs to be done in
> 24 hours or even a few days; it's just that we can't have operations
> that span more than a few weeks. Is Raju's environment an anomaly, or
> is that common? I'm asking because I've heard some caution about long
> backup windows for large amounts of data.

This is not the case with all of our NetApp volumes, so I can't say it is
common. It is only with some NetApp volumes belonging to specific user
groups where heavy turnaround of file creation/deletion occurs with very
small files. The rest of the volumes are OK and don't have these issues.

> Probably more in particular, restore at the volume level also seems to
> be something some people in my old organization had warned me about...
> some volumes are very large, and just to recover a single file, would I
> need to restore an entire volume?

I am not sure, but I hope that shouldn't be the case. The restore
procedure must depend on the backup software's policy. If it were the
case, it would be useless, so I am almost sure it shouldn't be.

- Raju
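[On the single-file restore question: NDMP-aware backup software commonly
supports Direct Access Recovery (DAR), where per-file tape offsets are
recorded at backup time so one file can be pulled without reading the
whole dump. The sketch below compares the two approaches; the tape rate
and positioning time are illustrative assumptions, not vendor figures.]

# Hedged comparison: full-volume restore vs. one file via DAR-style
# direct access. All rates/latencies below are assumptions.
volume_bytes = 1.2e12     # the 1.2 TB volume from the thread
file_bytes   = 40e3       # one ~40KB file
tape_b_s     = 80e6       # assumed sustained tape read rate
position_s   = 90         # assumed load + seek to a recorded tape offset

full_restore_h = volume_bytes / tape_b_s / 3600
single_file_s  = position_s + file_bytes / tape_b_s

print(f"full-volume restore : ~{full_restore_h:.1f} h")
print(f"single file via DAR : ~{single_file_s:.0f} s")

[Whether you get the ~90-second path or the ~4-hour path depends entirely
on whether the backup software indexes files for direct access, which is
in line with Raju's point that this is a software-policy question.]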
#15
NDMP considerations
On 6 Apr 2007 02:01:35 -0700, "Raju Mahala" wrote:

> Yes, we know the bottleneck. A lot of RCA was done by the NetApp guys
> as well. There are two major reasons:
>
> 1) a large number of small files (average file size: ~40KB)
> 2) heavy turnaround of file creation/deletion
>
> Due to these factors the volume gets fragmented at the file level, and
> because ndmpcopy works at the file level it takes more time to read
> fragmented files. If we go for SnapMirror it doesn't take much time,
> because that is a block-level copy. It even takes more than the
> expected time during SnapVault, which is a block-level copy but works
> at the qtree level, so it needs to find the corresponding blocks for
> the qtree.
>
> - Raju

There is really no fragmentation in WAFL, at least not what most people
understand it to be. Because ALL writes on a filer are sequential (think
NVRAM) and because of the way WAFL functions, it's almost zero in a
general-purpose environment. However, the more you fill up a volume, the
harder the file system has to work to figure out where to put stuff.
This is what causes performance degradation at 85% and above.

Data churn is also not an issue for backups unless you run out of
snapshot space. Performance may be impacted because of the general
resource usage of other processes, but the data churn itself is not an
issue.

~F
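[Tying this back to the earlier numbers: if the slowdown is per-file work
rather than on-disk fragmentation, there is a crossover file size below
which a file-level dump is overhead-bound instead of bandwidth-bound. A
minimal sketch, assuming the ~1.3 ms/file backed out earlier and 100 MB/s
streaming; both figures are estimates.]

# Crossover file size where per-file overhead equals raw transfer time.
per_file_s = 1.3e-3       # assumed fixed cost per file
stream_b_s = 100e6        # assumed streaming rate

crossover_bytes = per_file_s * stream_b_s   # overhead == transfer time
print(f"crossover: ~{crossover_bytes/1e3:.0f} KB")  # ~130 KB

# A ~40KB average file sits well below this, so the dump time is
# dominated by per-file work (inode/directory reads, protocol chatter),
# not by how the blocks are laid out on disk.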