#11
NDMP considerations
On Apr 5, 5:25 pm, Faeandar wrote:
> On 5 Apr 2007 16:58:47 -0700, wrote:
>
>>> Snapvault is quite nice because you can schedule the vaults to occur
>>> much more simply than snapmirror (though technically they both use
>>> the same underlying technology). Why are you not backing up the
>>> filers directly with NDMP over FC or even Ethernet?
>
>> This is exactly what I am trying to set up here... an NDMP over FC
>> environment... However, some people had warned about long backup times
>> for large amounts of data, and that is what prompted my post...
>
> That is completely dependent on your data set. Post a quota report with
> file listing and I can give you a ballpark. Also post what type of
> infrastructure you are looking to build: 2Gb or 4Gb FC? Direct to tape
> or through an FC switch? What type of tape drives? Are all filers going
> to back up direct to tape, or will you do 3-way backups in some cases?
> (That is where a filer without a tape drive does an NDMP dump over
> Ethernet to a filer that has a tape drive.) What backup software are
> you going to use?
>
>> Which is, if I had say 1.2TB of data, would I also face the 14 hr
>> backup cycles like Raju had? Because in our environment, we will
>> likely start approaching over 500TB of data within the next few months
>> as we consolidate, and within a year be up to a PB of online data. So
>> while I now understand how efficient SnapMirror is for replication, I
>> would like to understand what the NDMP backup considerations should
>> be... will that also work at wire speed? Can I do multiple NDMP
>> backups from various filers in parallel, and so on?
>
> I think we're getting somewhere now. There are lots of restrictions
> depending on the backup software. For the most part you can't share
> tape drives between filers except via 3-way backups (as mentioned
> above). That can cause performance problems depending on your data,
> network, infrastructure, etc., but it does work quite nicely in many
> cases and is almost as fast as FC; you just have a filer doing both net
> in and tape out for something other than its own data.
>
> Is the time to back up really a problem? What's driving it? I ask
> because the filer backs up from a snapshot, so the need to back up fast
> is not driven by coherency. Post a file count or quota report; that
> will help you understand backup implications better than any
> theoretical discussion of speeds and feeds.
>
> ~F

The time to back up may actually not be a problem now that I understand
what you say... although I'd expect the backup to happen in a somewhat
reasonable amount of time. For example, responding to the post by Raju:
if we had a 14 hour window to back up just 1.2 TB of data, that wouldn't
fly with 500 TB of data. It's not that it needs to be done in 24 hours or
even a few days; it's just that we can't have operations that span more
than a few weeks. Is Raju's environment an anomaly, or is that common?
I'm asking because I've heard some caution about long backup windows for
large amounts of data.

Probably more in particular, restore at the volume level also seems to be
something some people in my old organization had warned me about... some
volumes are very large, and just to recover a single file, would I need
to restore an entire volume?
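[A rough back-of-the-envelope sketch of the "few weeks" concern raised
above. The numbers are assumptions, not measurements: ~200 MB/s effective
for a single 2Gb FC stream, and the thread's 500 TB figure. It only shows
why the parallel-stream question matters at this scale.]

# Backup window vs. number of parallel NDMP streams (all assumptions).
TB = 1e12                      # bytes, decimal terabyte
data_bytes = 500 * TB          # consolidation target from the thread
stream_mb_s = 200              # assumed effective rate per 2Gb FC stream

def hours(nbytes, mb_per_s, streams=1):
    """Time to stream nbytes at mb_per_s per stream, perfectly parallel."""
    return nbytes / (mb_per_s * 1e6 * streams) / 3600

for streams in (1, 4, 10, 20):
    h = hours(data_bytes, stream_mb_s, streams)
    print(f"{streams:2d} stream(s): {h:7.1f} h (~{h/24:.1f} days)")

[One stream works out to roughly 29 days; ten parallel streams bring it
under three, which is why tape-drive sharing and 3-way topologies come up
repeatedly in this thread.]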
#12
NDMP considerations
On Apr 6, 12:19 am, wrote:
> On Apr 5, 10:39 am, "Raju Mahala" wrote:
>> On Apr 5, 12:19 pm, wrote:
>>> Hi Friends, I have a task of consolidating some of our datacenters.
>>> When done, I will have accumulated about 1 petabyte of data on our
>>> brand new NetApp filers. While I am in our scoping-out phase, I would
>>> like to know what considerations I should take for backing up (full
>>> and incremental backups) of 1 petabyte of data on an ongoing basis.
>>> What issues will I run into? What are best practices? How long will
>>> this backup take? Any other things to watch out for? Just a few notes
>>> on our architecture... we have a few high-end NetApp filers
>>> (combinations of 3020s, 960cs, etc.) as tier 1, followed by a tier of
>>> 270cs for older data. I don't have anything specced for VTL and
>>> backup yet. Thank you, Ludlas.
>>
>> In fact I don't have experience with ndmpcopy backup, but I feel that
>> during ndmpcopy, filer performance may be affected, so you have to
>> find out whether you have sufficient off hours for backup. Another
>> point I would like to make: check the average file size; if it is very
>> small, then ndmpcopy may take more time. A large number of small files
>> on a volume affects ndmpcopy performance very badly. In our
>> environment, a 1.2TB volume with around 30M files takes more than 14
>> hours to complete an ndmpcopy backup, so we take the backup from
>> another Linux box where the storage volumes are mounted.
>>
>> - Raju
>
> Thanks Raju, that somewhat helps... I can't understand though why a
> mere 1.2 TB volume is taking 14 hours to read. That is way slower than
> wire speed... We are able to read 1.2TB of data much faster than that
> from even our slow 270s just over NFS, in fact. (One of our modules
> copies all snapshot directories via NFS instead of NDMP right now.) Do
> you know where the bottleneck is? Anyone else out there with
> experiences in this area for such a large number of files and data?

Yes, we know the bottleneck. A lot of RCA was done by the NetApp guys as
well. There are two major reasons:

1) a large number of small files (average file size: ~40KB)
2) heavy turnaround of file creation/deletion

Due to these factors the volume gets fragmented at the file level, and
because ndmpcopy works at the file level it takes more time to read
fragmented files. If we go for SnapMirror it doesn't take much time,
because that is a block-level copy. It even takes more than the expected
time during SnapVault, which is a block-level copy but works at the qtree
level, so it needs to find the corresponding blocks for the qtree.

- Raju
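[A hedged sketch of where those 14 hours could be going, using Raju's own
data point (1.2 TB, ~30M files, 14 h). The 100 MB/s streaming rate is an
assumption, and the per-file overhead it backs out is an estimate, not a
measured ONTAP figure.]

# Back out an implied per-file cost from Raju's numbers.
data_bytes = 1.2e12        # 1.2 TB volume
n_files    = 30e6          # ~30M files (avg ~40KB each)
total_s    = 14 * 3600     # observed ndmpcopy wall time
stream_b_s = 100e6         # assumed pure-streaming read rate

streaming_s = data_bytes / stream_b_s     # ~3.3 h just moving the bytes
overhead_s  = total_s - streaming_s       # everything that isn't streaming
per_file_ms = overhead_s / n_files * 1000

print(f"streaming time  : {streaming_s/3600:.1f} h")
print(f"implied overhead: {overhead_s/3600:.1f} h total, "
      f"~{per_file_ms:.1f} ms per file")

[Under these assumptions only about 3.3 of the 14 hours are spent moving
data; the rest corresponds to roughly 1.3 ms of per-file work, which is
consistent with a file-level dump being bound by file count rather than
bandwidth.]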
#13
NDMP considerations
> Post a file count or quota report; that will help you understand backup
> implications better than any theoretical discussion of speeds and
> feeds.
>
> ~F

I feel that file count and the daily turnaround of file deletion/creation
(which leads to file-level fragmentation) will play the major role in
total backup time through ndmpcopy, whether it is over FC or Ethernet. I
have experience with NDMP over Ethernet, and in that case volumes with
low turnaround and a low file count were getting done at close to wire
speed, but volumes with a higher file count and higher turnaround had
problems. I can't comment on NDMP over FC, but I believe the bottleneck
is the file count, which will remain the same, so it should be the same
scenario.

> The time to back up may actually not be a problem now that I understand
> what you say... although I'd expect the backup to happen in a somewhat
> reasonable amount of time. For example, responding to the post by Raju:
> if we had a 14 hour window to back up just 1.2 TB of data, that
> wouldn't fly with 500 TB of data. It's not that it needs to be done in
> 24 hours or even a few days; it's just that we can't have operations
> that span more than a few weeks. Is Raju's environment an anomaly, or
> is that common? I'm asking because I've heard some caution about long
> backup windows for large amounts of data.

This is not the case with all of our NetApp volumes, so I can't say it is
common. It is only with some NetApp volumes belonging to specific user
groups where heavy turnaround of file creation/deletion occurs with very
small files. The rest of the volumes are OK and don't have these issues.

> Probably more in particular, restore at the volume level also seems to
> be something some people in my old organization had warned me about...
> some volumes are very large, and just to recover a single file, would I
> need to restore an entire volume?

I am not sure, but I hope that shouldn't be the case. The restore
procedure must depend on the backup software's policy. If it were the
case, it would be useless, so I am almost sure it shouldn't be.

- Raju
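[On the single-file restore question: NDMP-aware backup software commonly
supports Direct Access Recovery (DAR), where per-file tape offsets are
recorded at backup time so one file can be pulled without reading the
whole dump. The sketch below compares the two approaches; the tape rate
and positioning time are illustrative assumptions, not vendor figures.]

# Hedged comparison: full-volume restore vs. one file via DAR-style
# direct access. All rates/latencies below are assumptions.
volume_bytes = 1.2e12     # the 1.2 TB volume from the thread
file_bytes   = 40e3       # one ~40KB file
tape_b_s     = 80e6       # assumed sustained tape read rate
position_s   = 90         # assumed load + seek to a recorded tape offset

full_restore_h = volume_bytes / tape_b_s / 3600
single_file_s  = position_s + file_bytes / tape_b_s

print(f"full-volume restore : ~{full_restore_h:.1f} h")
print(f"single file via DAR : ~{single_file_s:.0f} s")

[Whether you get the ~90-second path or the ~4-hour path depends entirely
on whether the backup software indexes files for direct access, which is
in line with Raju's point that this is a software-policy question.]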
#15
NDMP considerations
On 6 Apr 2007 02:01:35 -0700, "Raju Mahala" wrote:

> Yes, we know the bottleneck. A lot of RCA was done by the NetApp guys
> as well. There are two major reasons:
>
> 1) a large number of small files (average file size: ~40KB)
> 2) heavy turnaround of file creation/deletion
>
> Due to these factors the volume gets fragmented at the file level, and
> because ndmpcopy works at the file level it takes more time to read
> fragmented files. If we go for SnapMirror it doesn't take much time,
> because that is a block-level copy. It even takes more than the
> expected time during SnapVault, which is a block-level copy but works
> at the qtree level, so it needs to find the corresponding blocks for
> the qtree.
>
> - Raju

There is really no fragmentation in WAFL, at least not what most people
understand it to be. Because ALL writes on a filer are sequential (think
NVRAM) and because of the way WAFL functions, it's almost zero in a
general-purpose environment. However, the more you fill up a volume, the
harder the file system has to work to figure out where to put stuff.
This is what causes performance degradation at 85% and above.

Data churn is also not an issue for backups unless you run out of
snapshot space. Performance may be impacted because of the general
resource usage of other processes, but the data churn itself is not an
issue.

~F
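[Tying this back to the earlier numbers: if the slowdown is per-file work
rather than on-disk fragmentation, there is a crossover file size below
which a file-level dump is overhead-bound instead of bandwidth-bound. A
minimal sketch, assuming the ~1.3 ms/file backed out earlier and 100 MB/s
streaming; both figures are estimates.]

# Crossover file size where per-file overhead equals raw transfer time.
per_file_s = 1.3e-3       # assumed fixed cost per file
stream_b_s = 100e6        # assumed streaming rate

crossover_bytes = per_file_s * stream_b_s   # overhead == transfer time
print(f"crossover: ~{crossover_bytes/1e3:.0f} KB")  # ~130 KB

# A ~40KB average file sits well below this, so the dump time is
# dominated by per-file work (inode/directory reads, protocol chatter),
# not by how the blocks are laid out on disk.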