A computer components & hardware forum. HardwareBanter



Do SSD drives really fail a lot ?



 
 
  #1  
Old May 3rd 11, 04:30 PM posted to comp.sys.ibm.pc.hardware.storage
Lynn McGuire[_2_]
external usenet poster
 
Posts: 149

Do SSD drives really fail a lot ?
http://www.codinghorror.com/blog/201...ive-scale.html

"… I feel ethically and morally obligated to let you in on a
dirty little secret I've discovered in the last two years of
full time SSD ownership. Solid state hard drives fail. A lot.
And not just any fail. I'm talking about catastrophic,
oh-my-God-what-just-happened-to-all-my-data instant gigafail.
It's not pretty. "

Lynn

  #2  
Old May 3rd 11, 11:11 PM posted to comp.sys.ibm.pc.hardware.storage
Don Phillipson[_4_]
external usenet poster
 
Posts: 320

"Lynn McGuire" wrote in message
...

Do SSD drives really fail a lot ?

http://www.codinghorror.com/blog/201...ive-scale.html

"… I feel ethically and morally obligated to let you in on a
dirty little secret I've discovered in the last two years of
full time SSD ownership. Solid state hard drives fail. A lot.
And not just any fail. I'm talking about catastrophic,
oh-my-God-what-just-happened-to-all-my-data instant gigafail.
It's not pretty. "


LM omitted from the next page:
"Solid state hard drives are so freaking amazing performance wise, and the
experience you will have with them is so transformative, that I don't even
care if they fail every 12 months on average! I can't imagine using a
computer without a SSD any more; it'd be like going back to dial-up internet ..."


--
Don Phillipson
Carlsbad Springs
(Ottawa, Canada)


  #3  
Old May 4th 11, 03:14 AM posted to comp.sys.ibm.pc.hardware.storage
Arno[_3_]
external usenet poster
 
Posts: 1,425

Lynn McGuire wrote:
Do SSD drives really fail a lot ?
http://www.codinghorror.com/blog/201...ive-scale.html


"… I feel ethically and morally obligated to let you in on a
dirty little secret I've discovered in the last two years of
full time SSD ownership. Solid state hard drives fail. A lot.
And not just any fail. I'm talking about catastrophic,
oh-my-God-what-just-happened-to-all-my-data instant gigafail.
It's not pretty. "
Lynn


It depends on your usage pattern and the SSD. Failure rate is
a designed feature with SSDs, i.e. the manufacturers know pretty
well how much writing an SSD can take. By choosing the wear-leveling
scheme and the spare capacity, they can design for a specific write
load that kills a drive. In the beginning this process is shaky,
though, and whole drive series can have worse reliability.

The typical reliability design goal is a 5% failure rate
per year for an average usage pattern. Consumers are willing
to tolerate that. That is a real failure rate, but it is
not "all the time". There are people who think that because SSDs
are not susceptible to mechanical damage, they could do without
backup. Those people will lose their data, no matter what
storage medium it is on, until some day no money can be saved
by aiming for that 5% and reliability slowly goes up.
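
To put rough numbers on that 5% figure, here is a minimal Python sketch. It assumes a constant annual failure rate and drives that fail independently of each other, which is a simplification; the function names are made up for illustration.

```python
def cumulative_failure_probability(annual_failure_rate: float, years: float) -> float:
    """Chance that one drive has failed within `years`.

    Assumes a constant annual failure rate (a simplification).
    """
    return 1.0 - (1.0 - annual_failure_rate) ** years


def expected_failures(fleet_size: int, annual_failure_rate: float, years: float) -> float:
    """Expected number of failed drives in a fleet, assuming independent failures."""
    return fleet_size * cumulative_failure_probability(annual_failure_rate, years)


if __name__ == "__main__":
    for years in (1, 2, 3, 5):
        p = cumulative_failure_probability(0.05, years)
        print(f"after {years} year(s): {p:.1%} chance a given drive has failed")
```

Under these assumptions roughly one drive in ten is gone after two years, so a backup is needed either way.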

That said, I think the Coding Horror person (who has some
pretty nice things about coding in his blog) has a sample of
mostly early models. These, like any new technology, have
increased failure rates, as the manufacturers try to aim
for that 5%/year but make mistakes in the process. It could
also just be a statistical anomaly.

There is one additional thing: SSDs are susceptible to
heat, just like any other electronics, and to bad power.
It is possible that the guy with the 8 of 8 dead drives
just killed them by overheating or by voltage spikes
from a cheap/bad PSU. For heat, the rule of thumb is half
the lifetime for every 10C increase for semiconductors, and
this works pretty well. I have seen it several times now, once
on a 22-unit network card sample. As SSDs contain power circuitry,
some parts of them run much hotter (step-up regulators for
converting 5V to the write voltage needed), and a lifetime
of 5 years is typically calculated at 40C environmental
temperature. Run them at 60C and you get 1.25 years average
lifetime. Other example: Memory and logic chips have something
like 30 years at 25C (figure from a very old Intel databook).
Run them at 65C and you get around 2 years lifetime.
That means you get the first failures (depending on
sample size) after 1-1.5 years and after 3 years most are
dead. This incidentally was my initial measurement and
prediction for the 22 network cards, and it is what happened
then. Note that high-performance CPUs are different, as
they are designed more like power semiconductors. But chipsets
are not. I have seen several fail from inadequate cooling
in 1-3 years.
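
The halving rule above is easy to turn into a small calculator. This is only a sketch of that rule of thumb in Python, not a proper reliability model; the function name and the default 10C step parameter are chosen here for illustration.

```python
def derated_lifetime_years(rated_years: float,
                           rated_temp_c: float,
                           actual_temp_c: float,
                           halving_step_c: float = 10.0) -> float:
    """Rule-of-thumb lifetime: halve the rated lifetime for every
    `halving_step_c` degrees above the rated temperature."""
    halvings = (actual_temp_c - rated_temp_c) / halving_step_c
    return rated_years / (2.0 ** halvings)


if __name__ == "__main__":
    # SSD example from the post: 5 years at 40C, run at 60C
    print(derated_lifetime_years(5, 40, 60))   # 1.25
    # Memory/logic example from the post: 30 years at 25C, run at 65C
    print(derated_lifetime_years(30, 25, 65))  # 1.875, i.e. "around 2 years"
```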

There is one other effect at work here: A lot of people
expected SSDs to be much more reliable than HDDs.
They are not in general, see above. This can lead
to disappointment, causing overstatement of the problem.

Altogether, I don't believe we are seeing more than
early-adopter problems, and they are always the same.
Also, there are certainly cheap SSDs and better
SSDs, just like always, and it is possible to treat SSDs
well or badly.

Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email:
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
  #4  
Old May 17th 11, 01:43 AM posted to comp.sys.ibm.pc.hardware.storage
Franc Zabkar
external usenet poster
 
Posts: 1,118

On Tue, 03 May 2011 10:30:46 -0500, Lynn McGuire put
finger to keyboard and composed:

Do SSD drives really fail a lot ?
http://www.codinghorror.com/blog/201...ive-scale.html


The most common reason for failure (90%) in flash drives appears to be
translator corruption (damaged lookup tables), especially if the power
fails while the translator is being updated. Afterwards the drive
powers up in safe mode with a very small capacity.

What are the Flash drives' typical failures [Public Forum]:
http://www.salvationdata.com/forum/topic1873.html

I suspect that SSDs may be similarly affected. Perhaps that's why some
newer models have large super capacitors for power backup.

- Franc Zabkar
--
Please remove one 'i' from my address when replying by email.
  #5  
Old May 17th 11, 05:06 AM posted to comp.sys.ibm.pc.hardware.storage
Arno[_3_]
external usenet poster
 
Posts: 1,425

Franc Zabkar wrote:
On Tue, 03 May 2011 10:30:46 -0500, Lynn McGuire put
finger to keyboard and composed:


Do SSD drives really fail a lot ?
http://www.codinghorror.com/blog/201...ive-scale.html


The most common reason for failure (90%) in flash drives appears to be
translator corruption (damaged lookup tables), especially if the power
fails while the translator is being updated. Afterwards the drive
powers up in safe mode with a very small capacity.


That should not happen if the firmware designers know how
to do this. The trick is to use a log structure for the
translator updates. In addition, enough stored power to complete
one write is also a good idea, but not strictly needed.
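
A toy illustration of the log-structure idea in Python: mapping updates are only ever appended, each record carries a checksum, and on power-up the log is replayed until the first incomplete or corrupt record. The record layout and names are invented for this sketch and are not taken from any real controller firmware.

```python
import struct
import zlib

# One record: logical block, physical block, CRC32 over those two fields.
RECORD = struct.Struct("<III")
PAYLOAD = struct.Struct("<II")


def append_mapping(log: bytearray, logical: int, physical: int) -> None:
    """Append one mapping update; existing records are never overwritten."""
    payload = PAYLOAD.pack(logical, physical)
    log.extend(payload + struct.pack("<I", zlib.crc32(payload)))


def replay(log: bytes) -> dict:
    """Rebuild the lookup table, stopping at the first torn or corrupt record."""
    table = {}
    for offset in range(0, len(log) - RECORD.size + 1, RECORD.size):
        logical, physical, crc = RECORD.unpack_from(log, offset)
        if zlib.crc32(bytes(log[offset:offset + PAYLOAD.size])) != crc:
            break  # power failed mid-write: drop this record and the rest
        table[logical] = physical
    return table


if __name__ == "__main__":
    log = bytearray()
    append_mapping(log, 7, 100)
    append_mapping(log, 7, 200)   # later update for the same logical block
    append_mapping(log, 3, 50)
    torn = bytes(log[:-5])        # simulate power loss during the last append
    print(replay(torn))           # only the last update is lost
```

The point is that a power failure during an update can only cost the update in flight; the table as it stood before that update is still recoverable.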

I did have USB flash drives lose all data and return different data
on each read. That would be an explanation. The problem went away
after a full overwrite. I guess the developers of these devices
are still learning how to do this right. Note that the relevant
algorithms have been around for several decades. This possibly
is an education problem.

What are the Flash drives' typical failures [Public Forum]:
http://www.salvationdata.com/forum/topic1873.html


I suspect that SSDs may be similarly affected. Perhaps that's why some
newer models have large super capacitors for power backup.


With a supercap you can always complete the write.

It is possible to deal with this issue in the filesystem case
by accepting that writes some time before the power failure
(seconds) may get lost. The filesystem needs to be aware of the
SSD blocksize though. Otherwise you can get corruption in data
that was not actually requested to be written, which is really
bad. I guess how to do this in practice is still being hashed
out at this time.
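
To show why the block size matters, here is a small Python sketch that counts how many bytes outside a write share erase blocks with it. It assumes the simplest possible model, where an interrupted write puts every erase block it touches at risk; real controllers remap data and may behave differently. The 512 KiB erase block size in the example is an assumed value, not a figure for any particular drive.

```python
def erase_blocks_touched(offset: int, length: int, erase_block_size: int) -> range:
    """Indices of the erase blocks that a write of `length` bytes at `offset` touches."""
    if length <= 0 or erase_block_size <= 0:
        raise ValueError("length and erase_block_size must be positive")
    first = offset // erase_block_size
    last = (offset + length - 1) // erase_block_size
    return range(first, last + 1)


def collateral_bytes(offset: int, length: int, erase_block_size: int) -> int:
    """Bytes that were NOT written but live in the same erase blocks as the write."""
    blocks = erase_blocks_touched(offset, length, erase_block_size)
    span_start = blocks[0] * erase_block_size
    span_end = (blocks[-1] + 1) * erase_block_size
    return (span_end - span_start) - length


if __name__ == "__main__":
    ERASE_BLOCK = 512 * 1024  # assumed value for illustration
    # A 4 KiB filesystem write in the middle of an erase block:
    print(collateral_bytes(4096, 4096, ERASE_BLOCK))
    # A write aligned to, and exactly the size of, one erase block:
    print(collateral_bytes(0, ERASE_BLOCK, ERASE_BLOCK))
```

A filesystem that aligns and sizes its writes to the erase block keeps the second number at zero, which is the awareness meant above.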

Personally, I do not trust SSDs at the moment, because of
this error amplification property and for other reasons.
The one SSD I have with critical data is in a RAID1 with
normal disks. Reads are done from the SSD, unless there
is an error, which gives me SSD speeds for my application.

Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email:
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
  #6  
Old May 17th 11, 11:32 AM posted to comp.sys.ibm.pc.hardware.storage
JW
external usenet poster
 
Posts: 82

On Tue, 17 May 2011 10:43:49 +1000 Franc Zabkar wrote:

On Tue, 03 May 2011 10:30:46 -0500, Lynn McGuire put
finger to keyboard and composed:

Do SSD drives really fail a lot ?
http://www.codinghorror.com/blog/201...ive-scale.html


The most common reason for failure (90%) in flash drives appears to be
translator corruption (damaged lookup tables), especially if the power
fails while the translator is being updated. Afterwards the drive
powers up in safe mode with a very small capacity.

What are the Flash drives' typical failures [Public Forum]:
http://www.salvationdata.com/forum/topic1873.html

I suspect that SSDs may be similarly affected. Perhaps that's why some
newer models have large super capacitors for power backup.


Be wary of the new Intel SSD 320 series. Currently, there's a bug in the
controller that can cause the device to revert to 8MB during a power
failure. AFAIK they have not yet publicly announced it, and won't have a
firmware fix ready for release until the end of July.

We had an SSD 320 600GB 2.5" SATA drive in for evaluation from our Intel
rep. I was able to kill it in two or three hours by power cycling it.
Apparently (according to the Intel rep) when the power failure is
happening, the SSD device tries to reconnect with the SATA port instead of
initiating a proper shutdown. Something to do with interrupt priority
being higher for reconnection rather than a proper shutdown.

I was able to kill their 80GB device as well. We've sent both drives back
to Intel and they're going to give us their pre-release firmware for
testing.
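
A power-cycle test of this kind can be sketched in Python as a simple loop. The actual test platform is not described in this thread, so everything here is hypothetical: `cycle_power` and `read_capacity_bytes` stand for whatever hardware control and capacity query a rig provides, and are passed in as plain callables.

```python
from typing import Callable, Optional


def power_cycle_until_failure(cycle_power: Callable[[], None],
                              read_capacity_bytes: Callable[[], int],
                              expected_capacity_bytes: int,
                              max_cycles: int) -> Optional[int]:
    """Power-cycle the drive repeatedly and return the cycle number at which
    the reported capacity first differs from the expected one, or None if the
    drive survives `max_cycles` cycles."""
    for cycle in range(1, max_cycles + 1):
        cycle_power()  # hypothetical: cut and restore power to the drive
        if read_capacity_bytes() != expected_capacity_bytes:
            return cycle
    return None


if __name__ == "__main__":
    # Stand-in for real hardware: a fake drive that reverts to 8 MB on cycle 3.
    state = {"cycles": 0}

    def fake_cycle_power() -> None:
        state["cycles"] += 1

    def fake_read_capacity() -> int:
        return 8 * 1024 * 1024 if state["cycles"] >= 3 else 600 * 10**9

    print(power_cycle_until_failure(fake_cycle_power, fake_read_capacity,
                                    600 * 10**9, max_cycles=10))
```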
  #7  
Old May 17th 11, 07:32 PM posted to comp.sys.ibm.pc.hardware.storage
Arno[_3_]
external usenet poster
 
Posts: 1,425

JW wrote:

Be wary of the new Intel SSD 320 series. Currently, there's a bug in the
controller that can cause the device to revert to 8MB during a power
failure. AFAIK they have not yet publicly announced it, and won't have a
firmware fix ready for release until the end of July.


We had an SSD 320 600GB 2.5" SATA drive in for evaluation from our Intel
rep. I was able to kill it in two or three hours by power cycling it.
Apparently (according to the Intel rep) when the power failure is
happening, the SSD device tries to reconnect with the SATA port instead of
initiating a proper shutdown. Something to do with interrupt priority
being higher for reconnection rather than a proper shutdown.


I was able to kill their 80GB device as well. We've sent both drives back
to Intel and they're going to give us their pre-release firmware for
testing.


Interesting. Goes to show that firmware development is apparently
not done any better than other software development. I am tempted
to run my next SSD through similar tests before using it.

Arno

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email:
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
  #8  
Old August 16th 11, 01:21 PM posted to comp.sys.ibm.pc.hardware.storage
JW
external usenet poster
 
Posts: 82


The Pre-release firmware also had the problem. I ended up supplying Intel
SSD engineering with my test platform and they reproduced the problem and
have a fix pending. See:
http://communities.intel.com/thread/24121?tstart=0

The firmware is not yet released however.

Looks like this Usenet thread caused quite a bit of commotion on their
forum:
http://communities.intel.com/thread/22227?tstart=0


  #9  
Old August 16th 11, 05:22 PM posted to comp.sys.ibm.pc.hardware.storage
Arno[_3_]
external usenet poster
 
Posts: 1,425

JW wrote:

The Pre-release firmware also had the problem. I ended up supplying Intel
SSD engineering with my test platform and they reproduced the problem and
have a fix pending. See:
http://communities.intel.com/thread/24121?tstart=0


This is rather pathetic on their side (not so at all on your side,
obviously).

The firmware is not yet released however.


Looks like this Usenet thread caused quite a bit of commotion on their
forum:
http://communities.intel.com/thread/22227?tstart=0




Understandable. The conclusion can only be to stay away from
Intel SSDs for the next few years, until they have
demonstrated that they have their QA under control and have started
to take the data safety of their customers seriously.

It also underlines something I have been saying for a while,
namely that SSDs should be regarded as less reliable than HDDs at
this time, because of engineering screw-ups like this one.

My SSDs are either in a RAID with non-SSDs (with "write mostly"
that gives SSD read-speeds under Linux software RAID) or do
not have critical data on them.

Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email:
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
 





All times are GMT +1. The time now is 01:06 PM.

