A computer components & hardware forum. HardwareBanter


Complex EMC CX600 Storage problem. Anyone using Win2K3Ent on EMC CX600?



 
 
  #1  
September 17th 03, 01:48 AM
TeKn0

Situation: Hourly MSSQL Backups to disk containers are consuming all Free
System PTEs ultimately leading to an unstable system.

Background: I have a Windows 2003 Enterprise server (Dell PE6650, 4x 2.0 GHz
2MB Xeon, 16GB RAM) running SQL 2000 SP3. This server is attached to an EMC
CX600 via dual QLogic 2200 HBAs. The SQL server has about 2TB of storage
assigned to it across 7 logical drives. Each drive is its own LUN. We are
running PowerPath v3.0.5 for Fibre Channel path failover.

The problem manifested itself as follows: the server would run for about 2
days and then just hang. No BSOD, just a hang. We enabled the
CrashOnCtrlScroll registry setting and dumped the kernel at the next hang.
Debugging the kernel dump showed that the free system PTEs were running out.
We enabled the perfmon counter Memory|Free System Page Table Entries and
began watching it. Every hour, at the top of the hour, we would lose about
20,000 PTEs and eventually end up with about 2,000 free. 2,000 free PTEs
means an unstable system, especially on a large SQL server. So we enabled
the TrackPtes registry value, which enables stack traces and forces a
BSOD 0xDA (SYSTEM_PTE_MISUSE) if the server's PTE pool is misused. Sure
enough, at the top of the next hour the server blue-screened and dumped the
kernel. A debug of that dump showed that the following 2 drivers were
probably misusing the PTE pool: CLASSPNP.sys and emcpbase.sys. Both are
storage drivers. CLASSPNP.sys is a Microsoft SCSI driver and emcpbase.sys
is an EMC PowerPath base driver.
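
For anyone who wants to spot the same hourly pattern in a logged counter,
here is a minimal Python sketch. It assumes the Free System Page Table
Entries counter has been exported to a two-column CSV (timestamp, value) with
a header row. The function names, the 5,000-PTE threshold, the timestamp
format, and the sample rows are all assumptions made up for illustration, not
data from the server described above.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample log: header row, then (timestamp, free PTE count) rows.
SAMPLE = '''"timestamp","free_system_ptes"
"09/16/2003 12:55:00.000","150000.000000"
"09/16/2003 13:05:00.000","130000.000000"
"09/16/2003 13:55:00.000","129900.000000"
"09/16/2003 14:05:00.000","110000.000000"
'''


def load_samples(csv_text):
    """Parse (timestamp, free_ptes) pairs from a two-column CSV export."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows, None)  # skip the header row
    samples = []
    for row in rows:
        if len(row) < 2 or not row[1].strip():
            continue  # ignore blank or incomplete rows
        stamp = datetime.strptime(row[0], "%m/%d/%Y %H:%M:%S.%f")
        samples.append((stamp, float(row[1])))
    return samples


def find_drops(samples, min_drop=5000):
    """Return (timestamp, drop) wherever the counter fell by at least min_drop
    between two consecutive samples."""
    drops = []
    for (_, previous), (stamp, current) in zip(samples, samples[1:]):
        if previous - current >= min_drop:
            drops.append((stamp, previous - current))
    return drops


for stamp, drop in find_drops(load_samples(SAMPLE)):
    print(stamp, drop)
```

Drops that line up with the backup schedule (here, just after the top of each
hour) are the ones worth correlating with whatever job runs at that time.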

We back up our SQL databases to local disk containers. These backups are
scheduled every hour at the top of the hour. As a test I rebooted the
server and disabled all backups. So far the PTE pool is fine apart from
normal use. I was able to duplicate the PTE misuse by running the following
in both Query Analyzer and Enterprise Manager:

BACKUP DATABASE [network] TO DISK = N'x:\network.bak'
WITH NOINIT, NOUNLOAD, NAME = N'Network Backup', SKIP, STATS = 10, NOFORMAT

This is an approximately 500MB database, and running this command results in
the loss of approximately 10,000 free PTEs.
If I point this backup at a local drive instead of a SAN drive, the free PTE
pool is not misused. This eliminates the CLASSPNP.sys driver.
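
To put those numbers in perspective, here is a rough Python sketch of the
arithmetic. It assumes standard 4 KB x86 pages (one page mapped per PTE), and
the starting figure of 150,000 free PTEs is purely hypothetical; only the
10,000-per-backup loss and the roughly 2,000-PTE danger level come from the
observations above.

```python
PAGE_SIZE_BYTES = 4096  # assumption: standard 4 KB x86 pages, one page per PTE


def mapped_bytes(pte_count, page_size=PAGE_SIZE_BYTES):
    """Address space covered by pte_count system PTEs, in bytes."""
    return pte_count * page_size


def backups_until_floor(free_ptes, loss_per_backup, floor=2000):
    """Whole backups that can run before the free PTE count would fall
    below the floor, assuming each backup leaks the same amount."""
    if loss_per_backup <= 0:
        raise ValueError("loss_per_backup must be positive")
    return max(0, (free_ptes - floor) // loss_per_backup)


# 10,000 PTEs lost per backup, as observed for the 500MB database.
print(mapped_bytes(10_000) / (1024 * 1024), "MiB of mappings per backup")

# 150_000 is a hypothetical starting count, not a measured value.
print(backups_until_floor(150_000, 10_000), "backups until about 2,000 free")
```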

I have not been able to cause the PTE misuse in any other way. I've tried
copying very large files from local to SAN disks and from SAN disks to SAN
disks. I've tried generating extremely large files on the SAN disks with
IOMeter. I've tried exporting 10GB worth of data from a DTS job to SAN
disks. So far nothing but a SQL backup causes the PTE misuse.

Is anyone else in this group using Win2K3 Enterprise with any sort of EMC
storage?

UPDATE:
We've got /PAE in the boot.ini but no /3GB, because of PTE problems above
12GB when /3GB is used.

We determined today that performing an incremental backup of the transaction
log (.ldf) does not consume PTEs, but a full or differential backup of the
database (.mdf) does. There must be something about the SQL BACKUP command
that does a non-standard I/O call. Maybe something to do with the MS tape
format or something.

At this point the case has escalated all the way through Dell to EMC
engineering and the corresponding MS engineering team.

Still looking for someone with a similar setup to try and duplicate the
problem.

--T




  #2  
September 17th 03, 11:09 AM
Maxim S. Shatskih

Definitely a bug in EMC's emcpbase.sys driver. Try to obtain a newer build
of it.

CLASSPNP is the common half of the disk and CD-ROM drivers. The disk driver is
the DISK+CLASSPNP combo, while the CD-ROM driver is the CDROM+CLASSPNP combo.

CLASSPNP is loaded on any NT machine, so it is unlikely that the SysPTE leaks
are caused by it.

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation

http://www.storagecraft.com


  #3  
September 21st 03, 10:43 PM
Greg Guenther

We have just started deploying W2003 against a CX-600 and a CX-600 with
PowerPath 3.05.

I have not seen that issue; however, our SQL servers are still W2k running
ATF against FC-4700s.

Have you tried running SQL backups without PowerPath?

What is EMC telling you?

We are moving to PowerPath for SQL Server. I will try it out in our test lab
and tell you what I find.

Good Luck,

Greg G.


  #4  
September 27th 03, 07:00 PM
TeKn0

EMC is just twiddling its thumbs and passing the case around. Personally,
I'm getting really frustrated.

I'd love to hear if you have the same trouble under 2K3. I think the
problem specifically involves the SQL BACKUP command and PowerPath under
2K3. I have a test SQL server on 2K and the problem does not occur there.

I'm very hesitant to remove PPath because the server is 24x7 and I can't
risk any extended downtime.

I'd love to hear about your experiences.
--T

  #5  
October 2nd 03, 10:31 PM
Tekn0

For anyone following this thread, and for the benefit of Google searches:

UPDATE: We have determined that PowerPath 3.0.5 is the culprit misusing the
PTE pool. We were able to duplicate the problem on a Windows 2003 Standard
server, and the problem did not occur when PowerPath was removed from the
system. The problem returned when PowerPath was reinstalled. So, if you
run any flavor of Windows 2003 Server with MS SQL 2000 SP3a and use EMC
PowerPath 3.0.5, be aware that running a SQL BACKUP command with a SAN disk
as the destination will result in a misused PTE pool, and the server will
eventually hang once it runs out of PTEs.

--T


 





All times are GMT +1.

