#11
I too am a netapp fan, and didn't want to make it sound otherwise. The types of upgrades or errors which would cause an outage (from our experience) are:

- Upgrades of the heads (adding cards, swapping cards, replacing cards, or swapping the head itself).

- Replacing almost anything on the disk shelf other than the disk itself (an LRC, shelf faults with temp sensors since they're built into the physical shelf, problems with the cables, or recabling to allow expansion). The reason is the potential for disk corruption, and hot LRC replacement is not supported by netapp. You can probably try a failover, replace it, then fail back, but netapp won't certify it and you could possibly panic the box.

- Any diagnostic which requires a coredump (we've had to do it 4 times). To do that you basically push the power button and it should automatically generate one.

- Certain upgrades of code (6.4 to 6.5), or going to 7.0, as it will involve completely restructuring the disk array to take advantage of flexvols (unless you try using something like a Rainfinity or NeoPath in-band replication mechanism).

Again, I don't want to appear a netapp basher, because it's sooooooo much better than Windows 2000 cluster servers sitting on a SAN, and many of these types of outages could be avoided by playing with the disk timeouts or nfs timeouts, but that might not be the best solution depending on the criticality of the database and the possibility of data corruption. I think we've been treated too carefully by netapp because we are a very large customer who've encountered lots of bugs, and they were worried. Now we need to do things that I'm sure you've already done, such as extending the nfs timeouts so we can do a routine operating system upgrade without having to create an outage window.

Lots of companies such as Sprint use netapp as a SAN, because their ability to use snapshots with sql is vastly superior. The integration of exchange using a filer as a san, with SnapManager on the exchange server, is also very powerful and industry leading. So even for certain SAN uses, NetApp might be a good choice. Certainly as a regular SAN for ordinary disks, though, it's just too complicated with netapp vs. a very simple to configure and robust EMC or Hitachi array. Also, netapp is integrating with Spinnaker, which should make the units themselves virtualized behind a namespace, which will be great.

I love netapp's flexibility, and its ability to make a simple DR plan (which we actually had to use when we lost a building), retrieve brick-level data from exchange, or do instant oracle hot backups far outweighs, in the vast majority of cases, the small amount of time you would need to take for an outage in cases like the ones above. Specifically, we are looking at products like Rainfinity to help virtualize the storage systems under NIS and DFS and do in-band replication to move critical data off of systems which might need hardware replaced, to avoid the outage.

-Doug
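For anyone wondering what "extending the nfs timeouts" looks like in practice, here is a minimal sketch of a hard-mount /etc/fstab entry on a Linux client. The filer name and export path are made-up examples; check the host attach guides for the options NetApp actually recommends on your platform:

    # Hard mount: clients block and retry through a filer reboot or
    # failover instead of returning I/O errors to the application.
    # timeo is in tenths of a second, so timeo=600 = 60 seconds per try.
    filer1:/vol/oradata  /u02  nfs  rw,bg,hard,nointr,tcp,vers=3,timeo=600,retrans=2,rsize=32768,wsize=32768  0 0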
#12
On 28 Jul 2005 14:48:15 -0700, "boatgeek" wrote:

> I too am a netapp fan, and didn't want to make it sound otherwise. The
> types of upgrades or errors which would cause an outage (from our
> experience) are: Upgrades of the heads (adding cards, swapping cards,
> replacing cards, or swapping the head itself).

Able to do almost all of this with cluster failover, minus a full head swap (I think that's possible too but requires some serious prep work).

> Replacing almost anything on the disk shelf other than the disk itself
> (an LRC, shelf faults with temp sensors since they're built into the
> physical shelf, problems with the cables, or recabling to allow
> expansion). The reason is the potential for disk corruption, and hot
> LRC replacement is not supported by netapp. You can probably try a
> failover, replace it, then fail back, but netapp won't certify it and
> you could possibly panic the box.

Temp sensors require downtime; just had to do one recently. It sucked. LRC replacement is cold too, but ESH modules are hot-swappable, so it depends on the vintage of your hardware, I guess. Recabling can be done on the fly (cluster failover) and expansions are hot, assuming newer hardware.
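In case the "on the fly (cluster failover)" part is unfamiliar, here is a rough sketch of the sequence from the partner node's console, using the standard Data ONTAP cf commands (node names are made up):

    filer2> cf status      # confirm the pair is healthy before touching anything
    filer2> cf takeover    # filer2 serves filer1's data while you work on filer1
    ... swap the card / recable / reboot filer1 ...
    filer2> cf giveback    # return filer1's disks and identity when it's back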
> Any diagnostic which requires a coredump (we've had to do it 4 times).
> To do that you basically push the power button and it should
> automatically generate one. Certain upgrades of code (6.4 to 6.5), or
> going to 7.0, as it will involve completely restructuring the disk
> array to take advantage of flexvols (unless you try using something
> like a Rainfinity or NeoPath in-band replication mechanism).

I count both of the above as reboots, well within NFS timeouts. In 6 years I've only seen one application that didn't deal with NFS timeout retries well. Meaning a normal upgrade, or reboot, or manual core dump are all non-disruptive to the clients. All this is NFS-based, mind you; CIFS gets dorked in all cases except cluster failover. And fail-back only survives if you have newer hardware with CompactFlash cards.

> Again, I don't want to appear a netapp basher, because it's sooooooo
> much better than Windows 2000 cluster servers sitting on a SAN, and
> many of these types of outages could be avoided by playing with the
> disk timeouts or nfs timeouts, but that might not be the best solution
> depending on the criticality of the database and the possibility of
> data corruption. Lots of companies such as Sprint use netapp as a SAN,
> because their ability to use snapshots with sql is vastly superior.
> The integration of exchange using a filer as a san, with SnapManager
> on the exchange server, is also very powerful and industry leading. So
> even for certain SAN uses, NetApp might be a good choice. Certainly as
> a regular SAN for ordinary disks, though, it's just too complicated
> with netapp vs. a very simple to configure and robust EMC or Hitachi
> array.

I actually counsel against NetApp as block based storage. It's designed as a file server and excels in that arena; it is not designed as block based storage (for clients, that is). That being said, there's almost no reason Oracle (or most DBs) requires block access these days. Plenty of companies, including mine, run Oracle over NFS and it works fantastically.

Oracle over NFS on NetApp is, as unbiased as possible, far superior to running it on any other block based storage:

- Snapshots: true block-delta changes, per file
- SnapMirror/SnapVault: stupidly easy replication for DR, test, dev, etc.
- SnapRestore: near-instant recovery to any point-in-time snapshot
- NFS: a built-in multi-writer (and therefore clusterable) file system

We're running Oracle 10g RAC over NFS on a cluster pair of filers. It's truly awesome. The 5% of databases that truly need the ultimate in performance should run on something like HDS, completely optimized for performance. The other 95% could easily run over NFS with all the features, functions, and ease of management and recovery that the NetApp allows. Of course if you need CIFS that's a different story. You can probably tell I don't use CIFS much...

> Also, netapp is integrating with Spinnaker, which should make the
> units themselves virtualized behind a namespace, which will be great.
> I love netapp's flexibility, and its ability to make a simple DR plan
> (which we actually had to use when we lost a building), retrieve
> brick-level data from exchange, or do instant oracle hot backups far
> outweighs, in the vast majority of cases, the small amount of time you
> would need to take for an outage in cases like the ones above.
> Specifically, we are looking at products like Rainfinity to help
> virtualize the storage systems under NIS and DFS and do in-band
> replication to move critical data off of systems which might need
> hardware replaced, to avoid the outage. -Doug

We've got RainStorage in house, have had it for a while now. I've used them at 2 different jobs also. Nice product, great for what I'm asking of it. Works almost all the time. ;-] For CIFS it's not that good, since you have to break the connections to go in-band. If you plan to stay in-band then no problem, but I have a hard time leaving a single-gig linux box in front of my trunked gig filer.

~F
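The "instant oracle hot backups" Doug mentions are just a quiesce-snapshot-resume sequence. A minimal sketch, assuming Oracle 10g, rsh access to the filer, and made-up filer/volume names (SnapManager and real production scripts wrap error handling around this):

    #!/bin/sh
    # Put the database in hot backup mode, snapshot the volume, resume.
    sqlplus -s "/ as sysdba" <<EOF
    alter database begin backup;
    EOF
    # WAFL snapshots are near-instant regardless of volume size.
    rsh filer1 snap create oradata nightly.`date +%Y%m%d`
    sqlplus -s "/ as sysdba" <<EOF
    alter database end backup;
    EOF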
#13
What type of Oracle database is it? If you do a lot of random reads you will need a SAN based solution. I have implemented a ton of raq clusters and nothing in the end is faster than SAN. I would look at the Clariion line or the HDS Lightning line. You can configure a lot of RAID 1/0, which is the best thing you can do for an Oracle DB. I hate to say this... any high-transaction DB needs fibre. Trying to do it on NAS is a wasted effort. It has been proven that even with 10-gig Ethernet you cannot outperform a solid fibre channel solution. Now, with 4-gig fibre channel, I would not even burn the cycles on NAS.
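Before taking "a lot of random reads" on faith, it's worth measuring. A sketch for a Linux DB host, with illustrative numbers (the thresholds are rough assumptions, not vendor figures):

    # Extended disk stats in kB, sampled every 5 seconds:
    iostat -xk 5
    # Watch r/s (reads per second; effectively random-read IOPS when
    # avgrq-sz is small) and rkB/s. For example, a steady 2,000 r/s of
    # 8 KB reads is only ~16 MB/s -- nowhere near the limits of either
    # a NAS head or a midrange array.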
#14
The Clariion's speed is roughly the speed of the netapp FAS960 (don't put anything on them asking for more than 100 MBs/sec total, or around 50-60k IOs/sec). Clariions are midtier storage in my mind vs. the DMX. We have netapps, Clariions, Hitachis and Symms. We have around 250 oracle databases on 6 pairs of netapp filers. It's much easier to give out space on the netapp, much easier.

As to speed, the EMC DMX 2000 and 3000 are FAR faster than netapp, but frankly two of our filers are pushing around 120 MBs/sec between them, while the rest of our entire switch infrastructure, with 4 EMC SANs and 4 Hitachi SANs, is doing around 20 MBs/sec. The reason is the DBAs never know what the throughput is going to be.

The speed debate is often silly. DBAs, or baby-faced consultants with no real-world experience, ask for the fastest thing possible and run everyone's budget into the ground. It's like saying you want to commute from DC to NY, and therefore if you had a Ferrari you could make the trip as fast as possible. When in fact the guy in the diesel Jetta will be right behind you every step of the way, at a tenth of the cost, because traffic lights and speed limits and all sorts of real-world things stop performance from almost ever touching the sky-high limits of a DMX. So if you know you have something that really, really cranks out data, then definitely go to a high-end SAN like a DMX, but be prepared to pay 4 times the cost of the netapp and spend 5 times as long trying to get it going.
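Those per-filer throughput figures are easy to reproduce from the filer console; a sketch with a made-up prompt, using ONTAP's standard sysstat command:

    filer1> sysstat -x 1
    # Prints one line per second: CPU, NFS/CIFS ops/sec, and network and
    # disk kB/s in and out. Summing the network kB/s columns across your
    # filers gives the kind of aggregate number quoted above.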
#15
On 29 Jul 2005 08:25:51 -0700, "boatgeek" wrote:

> The Clariion's speed is roughly the speed of the netapp FAS960 (don't
> put anything on them asking for more than 100 MBs/sec total, or around
> 50-60k IOs/sec). Clariions are midtier storage in my mind vs. the DMX.
> We have netapps, Clariions, Hitachis and Symms. We have around 250
> oracle databases on 6 pairs of netapp filers. It's much easier to give
> out space on the netapp, much easier. As to speed, the EMC DMX 2000
> and 3000 are FAR faster than netapp, but frankly two of our filers are
> pushing around 120 MBs/sec between them, while the rest of our entire
> switch infrastructure, with 4 EMC SANs and 4 Hitachi SANs, is doing
> around 20 MBs/sec. The reason is the DBAs never know what the
> throughput is going to be. The speed debate is often silly. DBAs, or
> baby-faced consultants with no real-world experience, ask for the
> fastest thing possible and run everyone's budget into the ground. It's
> like saying you want to commute from DC to NY, and therefore if you
> had a Ferrari you could make the trip as fast as possible. When in
> fact the guy in the diesel Jetta will be right behind you every step
> of the way, at a tenth of the cost, because traffic lights and speed
> limits and all sorts of real-world things stop performance from almost
> ever touching the sky-high limits of a DMX. So if you know you have
> something that really, really cranks out data, then definitely go to a
> high-end SAN like a DMX, but be prepared to pay 4 times the cost of
> the netapp and spend 5 times as long trying to get it going.

I think that was a great summation; I'm gonna have to remember the NY trip explanation.

For carmellomcc I can only say this: as I mentioned previously, only about 5% of *ALL* databases need transaction performance beyond what NAS can do. So if you run in that 5%, then by all means get block access with an HDS or EMC or IBM. But if you're like the other 95%, NAS has all the power you'll need, *plus* features, functions, and manageability you will not find with current block access arrays.

"What type of Oracle database is it? If you do a lot of random reads you will need a SAN based solution. I have implemented a ton of raq clusters and nothing in the end is faster than SAN."

Several items to address there:

1) 10g RAC (I assume "raq" was a typo on your part) is what we've implemented. We do random reads, but the definition of "a lot" is variable. The simple fact is most people who think they do a lot of random access really don't.

2) I never said NAS was faster than SAN, merely that the vast majority of DBs out there don't need the blazing performance they think they do.

3) Implementing RAC on SAN also adds the overhead of a CFS. With NAS you don't have any of that overhead or complexity; it's built into the protocol.

A side note: if the db server is under heavy memory and cpu load, NAS can actually be faster than SAN or DAS. The reason is that the server only makes a single call to the driver for data; the actual IO functions are offloaded to the NAS host. And when dealing with DBs under 6GB it's essentially a memory-to-memory transfer. Several publicly available tests have shown that NFS, as a protocol, is not slower than fibre channel; it's mostly the overhead of ethernet, roughly 70% usable bandwidth vs. 90% for fibre channel.

~F
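A footnote on "it's built into the protocol": for 10g RAC over NFS, the mounts differ from a single-instance setup mainly by disabling attribute caching, so all nodes see consistent file state. A sketch for a Linux RAC node with made-up filer and export names; check the certified mount options for your platform before using:

    # /etc/fstab on each RAC node. actimeo=0 turns off attribute caching
    # so the instances agree on datafile state (illustrative names):
    filer1:/vol/oradata  /u02  nfs  rw,bg,hard,nointr,tcp,vers=3,timeo=600,rsize=32768,wsize=32768,actimeo=0  0 0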