A computer components & hardware forum. HardwareBanter


Drives that drop off line every couple of days



 
 
  #1  
September 27th 03, 08:51 AM
CJT

I've got a couple of WD 1200AB drives in a server as master and slave
on the same cable (not that I think it matters, but for completeness).
I'm running Solaris 9. The machine is not heavily loaded. At random
intervals of 2-3 days, one or the other (usually the slave) will audibly
click a couple of times and drop off line. The log file will reflect a
series of timeouts followed by an indication that an error has occurred,
described as --

Sense key: aborted command
error code 0x3

Once it happens, I haven't found a way to get the disk to respond again
except hard rebooting, after which everything will appear fine for a
few days. When the slave goes down, the master stays up, and vice
versa.

They're not hot. The cables are good quality. There are two other
(Maxtor) drives on the other channel that don't exhibit the problem.

The motherboard is a Shuttle AK12A with an underclocked Athlon running
at 900 MHz. It seems pretty solid otherwise.

The disks in question are not used by the system (which has its own
SCSI disk) -- they're loaded with user files being served via Samba
and NFS shares.

Any thoughts?

--
After being targeted with gigabytes of trash by the
"SWEN" worm, I have concluded we must conceal our
e-mail address. Our true address is the mirror image
of what you see before the "@" symbol. It's a shame
such steps are necessary.

Charlie

  #2  
September 27th 03, 10:08 AM
Rod Speed




What size power supply?


  #3  
September 27th 03, 05:53 PM
CJT

Rod Speed wrote:

> What size power supply?



350 W

Interesting thought.


  #4  
September 27th 03, 07:57 PM
Rod Speed


"CJT" wrote in message ...
> 350 W


That should be OK. It would be more likely to be the problem
if it were only, say, a 200 W supply with an Athlon and 4 hard
drives, especially with those older Athlon CPUs, which could
be a bit power hungry and behave oddly if the
power supply was marginal.





  #5  
September 27th 03, 08:09 PM
Folkert Rienstra


"CJT" wrote in message ...
> The log file will reflect a
> series of timeouts followed by an indication that an error has occurred,
> described as --
>
> Sense key: aborted command
> error code 0x3


That sounds like a SCSI error except there isn't a sense key of
aborted command: http://www.t10.org/lists/asc-num.htm#ASC_03
03/00 PERIPHERAL DEVICE WRITE FAULT


> After being targeted with gigabytes of trash by the
> "SWEN" worm, I have concluded we must conceal our
> e-mail address. Our true address is the mirror image
> of what you see before the "@" symbol. It's a shame
> such steps are necessary.

That doesn't help. You have to get a new email address.

  #6  
September 27th 03, 11:01 PM
CJT

Folkert Rienstra wrote:

> That sounds like a SCSI error except there isn't a sense key of
> aborted command: http://www.t10.org/lists/asc-num.htm#ASC_03
> 03/00 PERIPHERAL DEVICE WRITE FAULT


I should have made clear that these are ATA drives, not SCSI.

I think the Solaris drivers use similar terminology to SCSI, though.




  #7  
September 28th 03, 01:56 AM
Mike Tomlinson

CJT writes:

> 350 W


How many amps on the 12v rail? You're running five drives (4 IDE and
one SCSI), judging by your description.

Checked power management isn't spinning them down?

--
A. Top posters.
Q. What's the most annoying thing on Usenet?

  #8  
September 28th 03, 07:40 AM
CJT

Mike Tomlinson wrote:

> How many amps on the 12v rail? You're running five drives (4 IDE and
> one SCSI), judging by your description.
>
> Checked power management isn't spinning them down?


That is an interesting line of inquiry, because I also have some hefty
12 volt fans. I'll have to check again that I'm ok on that score. When
I first put it together, I monitored the 12V line and it appeared to be
actually slightly overvoltage, if anything, so I thought I was ok. But
it is possible that it sags occasionally, which is all it would take to
generate my symptoms. Unfortunately, I don't really know how much 12V
the motherboard takes for things like serial ports, so it's hard to do
a precise accounting.

I've always assumed that if the 12V were weak, the most likely time to
see it would be at startup rather than after a day or two of uptime, but
that could be not entirely true. I do know my 120VAC is good, because
it's regulated.

Power management shouldn't be spinning them down according to its
settings, and it would be weird if for some reason it were spinning down
only the WD drives, which are considerably less power hungry than their
Maxtor neighbors.

Thanks for your comments.


  #9  
September 28th 03, 10:19 AM
Mike Tomlinson

CJT writes:

> That is an interesting line of inquiry, because I also have some hefty
> 12 volt fans. I'll have to check again that I'm ok on that score. When
> I first put it together, I monitored the 12V line and it appeared to be
> actually slightly overvoltage, if anything, so I thought I was ok.


It's the ability of the supply to deal with sudden surges in demand that
is the issue (read on).

> But it is possible that it sags occasionally, which is all it would take to
> generate my symptoms. Unfortunately, I don't really know how much 12V
> the motherboard takes for things like serial ports


Very little for serial ports, but some Athlon motherboards generate the
processor Vcore from the 12v line. This can cause sudden current
demands from the 12v line when the CPU starts working hard. On the
other hand, you said earlier you were running an Athlon 900, so the
current draw from this will not be as great as that for an XP. Does the
PC ever crash inexplicably?

> I've always assumed that if the 12V were weak, the most likely time to
> see it would be at startup rather than after a day or two of uptime, but
> that could be not entirely true.


It can result in intermittent symptoms. You could try jury-rigging a
second power supply - connect it up to your 4 IDE drives and leave the
original PSU connected to the system SCSI drive, fans and motherboard.
Run the system for a while and see if the symptoms change.


  #10  
September 28th 03, 10:24 AM
Rod Speed


CJT wrote in message ...
> Mike Tomlinson wrote:
>> How many amps on the 12v rail? You're running five
>> drives (4 IDE and one SCSI), judging by your description.
>>
>> Checked power management isn't spinning them down?
>
> That is an interesting line of inquiry, because I also have some
> hefty 12 volt fans. I'll have to check again that I'm ok on that
> score. When I first put it together, I monitored the 12V line
> and it appeared to be actually slightly overvoltage, if anything,
> so I thought I was ok. But it is possible that it sags occasionally,
> which is all it would take to generate my symptoms.

Yep.

> Unfortunately, I don't really know how much 12V
> the motherboard takes for things like serial ports,

That stuff is completely trivial. The only things that
matter are the drives, the fans, and the motherboard
if it uses that rail. It likely doesn't, given that it's an Athlon.

> so it's hard to do a precise accounting.


That's not necessary, just count the major loads.
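
A rough illustration of counting the major 12 V loads, as a minimal Python sketch. Every current figure and the rail rating below are hypothetical placeholders, not values from this thread; the real numbers would come from the drive and fan labels and the PSU sticker. The helper name `rail_budget` is made up for the example.

```python
# Rough 12 V rail budget: sum the major loads and compare with the rail rating.
# All amp values here are assumed placeholders, not measurements.

def rail_budget(loads, rail_rating_amps):
    """Return (total_amps, headroom_amps) for loads given as (name, count, amps_each)."""
    total = sum(count * amps for _name, count, amps in loads)
    return total, rail_rating_amps - total

# Worst case is usually spin-up, when every drive draws its peak current at once.
spinup_loads = [
    ("IDE drive (spin-up)", 4, 2.0),   # assumed peak per drive
    ("SCSI drive (spin-up)", 1, 2.5),  # assumed peak
    ("12 V fan", 3, 0.5),              # assumed draw per fan
]

total, headroom = rail_budget(spinup_loads, rail_rating_amps=15.0)  # assumed rating
print(f"total {total:.1f} A, headroom {headroom:.1f} A")
```

If the motherboard turns out to derive CPU Vcore from the 12 V rail, it would simply be one more entry in the list. A small or negative headroom would point toward the supply; a comfortable one would point elsewhere.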

> I've always assumed that if the 12V were weak, the most likely time
> to see it would be at startup rather than after a day or two of uptime,

Correct, particularly with the IDE drives all spinning up at once.

> but that could be not entirely true.

It likely still is, particularly with an Athlon that doesn't
use the 12V rail like a P4 and recent Celeron do.

> I do know my 120VAC is good, because it's regulated.
>
> Power management shouldn't be spinning them down
> according to its settings, and it would be weird if for
> some reason it were spinning down only the WD drives,

Correct.

> which are considerably less power
> hungry than their Maxtor neighbors.




 



