#1
Complex EMC CX600 Storage problem. Anyone using Win2K3Ent on EMC CX600?
Situation: Hourly MSSQL backups to disk containers are consuming all free system PTEs, ultimately leading to an unstable system.

Background: I have a Windows 2003 Enterprise server (Dell PE6650, 4x 2.0 GHz 2MB Xeon, 16GB RAM) running SQL2000 sp3. This server is attached to an EMC CX600 via dual QLogic 2200 HBAs. The SQL server has about 2TB of storage assigned to it across 7 logical drives. Each drive is its own LUN. We are running PowerPath v3.0.5 for fibre channel path failover.

The problem manifested itself as the server running for about 2 days and then just hanging. No BSOD, just a hang. We enabled the CrashOnCtrlScroll registry setting and dumped the kernel at the next hang. Debugging the kernel dump showed that the free system PTEs were running out. We enabled the perfmon counter Memory|Free System Page Table Entries and began watching it. Every hour at the top of the hour we would lose about 20,000 PTEs and eventually end up with about 2,000 free. 2,000 free PTEs = an unstable system, especially on a large SQL server.

So we enabled the TrackPtes registry key, which enables stack traces and forces a BSOD 0xDA (SYSTEM_PTE_MISUSE) if the server's PTE pool is misused. Sure enough, at the top of the next hour the server blue-screened and dumped the kernel. A debug of that dump showed that the following 2 drivers were probably misusing the PTE pool: CLASSPNP.sys and emcpbase.sys. Both are storage drivers. CLASSPNP.sys is a Microsoft SCSI driver and emcpbase.sys is the EMC PowerPath base driver.

We back up our SQL databases to local disk containers. These backups are scheduled every hour at the top of the hour. As a test I rebooted the server and disabled all backups. So far the PTE pool is fine except for normal use. I was able to duplicate the PTE misuse by running the following in both QA and Enterprise Manager:

    BACKUP DATABASE [network] TO DISK = N'x:\network.bak' WITH NOINIT , NOUNLOAD , NAME = N'Network Backup', SKIP , STATS = 10, NOFORMAT

This is an approximately 500MB database, and running this command results in the loss of approximately 10,000 free PTEs. If I point this backup to a local drive instead of a SAN drive, the free PTE pool is not misused. This eliminates the CLASSPNP.sys driver.

I have not been able to cause the PTE misuse in any other way. I've tried copying very large files from local to SAN disks and from SAN disks to SAN disks, I've tried generating extremely large files on the SAN disks with IOMeter, and I've tried exporting 10GB worth of data from a DTS job to SAN disks. So far nothing but a SQL backup causes the PTE misuse.

Is anyone else in this group using Win2K3 Enterprise with any sort of EMC storage?

UPDATE: We've got /PAE in the boot.ini but no /3GB because of PTE problems above 12GB when /3GB is used. We determined today that performing an incremental backup of the transaction log (.ldf) does not consume PTEs, but a full or differential backup of the database (.mdf) does. There must be something about the SQL BACKUP command that does a non-standard I/O call. Maybe something to do with the MS tape format or something. At this point the case has been escalated all the way through Dell to EMC engineering and corresponding MS engineering. Still looking for someone with a similar setup to try and duplicate the problem.

--T
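For anyone who wants to watch for the same hourly drop: the counter can be logged to CSV with something like typeperf "\Memory\Free System Page Table Entries" -si 60 -o ptes.csv and then scanned for sudden losses. The Python below is only a rough sketch of that scan, not anything used in this thread. The host name SQL01, the timestamps, and the sample values are made up for illustration, and it assumes the usual layout of one header row followed by quoted timestamp/value rows.

```python
import csv
import io

# Illustrative data only: host name, dates and values are hypothetical.
SAMPLE = r'''"(PDH-CSV 4.0)","\\SQL01\Memory\Free System Page Table Entries"
"10/01/2003 09:59:00.000","142000.000000"
"10/01/2003 10:00:00.000","141950.000000"
"10/01/2003 10:01:00.000","121900.000000"
"10/01/2003 10:02:00.000","121880.000000"
'''


def read_counter_csv(text):
    """Return (timestamp, value) pairs from CSV text holding one counter column."""
    rows = csv.reader(io.StringIO(text))
    next(rows, None)  # skip the header row with the counter path
    samples = []
    for row in rows:
        if len(row) < 2:
            continue
        try:
            samples.append((row[0], float(row[1])))
        except ValueError:
            continue  # blank or non-numeric sample, ignore it
    return samples


def find_drops(samples, min_drop):
    """Return (timestamp, lost) for each step where the counter fell by min_drop or more."""
    drops = []
    for (_, prev), (stamp, cur) in zip(samples, samples[1:]):
        if prev - cur >= min_drop:
            drops.append((stamp, int(prev - cur)))
    return drops


if __name__ == "__main__":
    for stamp, lost in find_drops(read_counter_csv(SAMPLE), min_drop=5000):
        print(stamp, "lost", lost, "free PTEs")
```

The min_drop threshold is a guess; it only needs to sit well above normal minute-to-minute noise and below the roughly 10,000 to 20,000 PTEs lost per backup described above.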
#2
Definitely a bug in EMC's emcpbase.sys driver. Try to obtain a newer build of it.

CLASSPNP is the common half of the disk and CD-ROM drivers. The disk driver is the DISK+CLASSPNP combo, while the CD-ROM driver is the CDROM+CLASSPNP combo. CLASSPNP is loaded on any NT machine, so it is unlikely that the SysPTE leaks are caused by it.

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
http://www.storagecraft.com
#3
We have just started deploying W2003 against a CX-600 and a CX-600 with PowerPath 3.05. I have not seen that issue; however, our SQL servers are still W2k running ATF against FC-4700s. Have you tried running SQL backups without PowerPath? What is EMC telling you? We are moving to PowerPath for SQL server. I will try it out in our test lab and tell you what I find.

Good luck,
Greg G.
#4
EMC is just passing the case around. Personally I'm really getting frustrated. I'd love to hear if you have the same trouble under 2K3. I think the problem specifically involves the SQL BACKUP command and PowerPath under 2K3. I have a test SQL server on 2K and the problem does not occur. I'm very hesitant to remove PPath because the server is 24x7 and I can't risk any extended downtime. Love to hear your experiences.

--T
#5
For anyone following this thread, and for the benefit of Google searches:

UPDATE: We have determined that PowerPath 3.0.5 is the culprit misusing the PTE pool. We were able to duplicate the problem on a Windows 2003 Standard server, and the problem did not occur when PowerPath was removed from the system. The problem returned when PowerPath was re-installed.

So, if you run any flavor of Windows 2003 Server with MS SQL 2000 sp3a and use EMC PowerPath 3.0.5, be aware that running a SQL BACKUP command with a SAN disk as the destination will misuse the PTE pool, and the server will eventually hang once it runs out of PTEs.

--T
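To make the elimination reasoning in this thread easier to follow, here is a small Python sketch that writes the reported tests down as a table and looks for pairs of runs that differ in exactly one factor, one leaking and one clean. The encoding (field names, labels such as "san" and "full_backup") is a hand-made assumption for illustration, not output from any tool. The PowerPath-free retest is assumed to have used the same SAN target, which the post does not state, and the Windows 2000 test server is left out because its PowerPath status is not given.

```python
# Hand-made encoding of the tests reported in the thread (an assumption,
# not tool output). "leak" means free system PTEs were lost.
RUNS = [
    {"os": "2003", "powerpath": True, "target": "san", "op": "full_backup", "leak": True},
    {"os": "2003", "powerpath": True, "target": "san", "op": "diff_backup", "leak": True},
    {"os": "2003", "powerpath": True, "target": "local", "op": "full_backup", "leak": False},
    {"os": "2003", "powerpath": True, "target": "san", "op": "log_backup", "leak": False},
    {"os": "2003", "powerpath": True, "target": "san", "op": "file_copy", "leak": False},
    # Assumed: the retest without PowerPath used the same SAN target.
    {"os": "2003", "powerpath": False, "target": "san", "op": "full_backup", "leak": False},
]

FACTORS = ("os", "powerpath", "target", "op")


def isolating_factors(runs, factors=FACTORS):
    """Factors where changing only that factor turned a leaking run into a clean one."""
    leaking = [r for r in runs if r["leak"]]
    clean = [r for r in runs if not r["leak"]]
    found = set()
    for bad in leaking:
        for good in clean:
            differing = [f for f in factors if bad[f] != good[f]]
            if len(differing) == 1:
                found.add(differing[0])
    return sorted(found)


if __name__ == "__main__":
    print(isolating_factors(RUNS))
```

Under these assumptions the factors it reports are the operation, PowerPath, and the target, which matches the conclusion above: the loss needs a full or differential SQL backup, written to a SAN disk, with PowerPath 3.0.5 installed.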