[ale] SSD info you need to know ( was OT SSD remaining lifetime indicator)

Ron Frazier (ALE) atllinuxenthinfo at techstarship.com
Sat Sep 8 01:56:35 EDT 2012


Hi guys,

I've been doing more research into SSDs since Rich posted the info about that special that Tiger Direct had.  This topic is complex and confusing.  It's late at night and I'm tired, but I wanted to post several things I've found out.  I hope to post more in-depth data later, maybe.  Finding data on SSD reliability is hard and there's lots of FUD (fear, uncertainty, and doubt) out there.  Here is a synopsis of what I believe to be reliable data.  I don't have time tonight to distill and quote all the sources.

Intel seems to be the king of the hill in terms of reliability.

Much of this probably also applies to memory sticks and memory cards.

1) NAND endurance shouldn't be a problem with GOOD QUALITY consumer drives in consumer applications when you're writing less than 10 GB / day to the drive.  See the specs and warranty for details.  The drive should last its warranted time, usually 3 or 5 years.  A good measure to look at is the TBW (terabytes written) spec, which gives you the maximum amount of data you can write before degradation.  Use the worst case number.  If the drive provides it, the media wear indicator should give a reasonable estimate of remaining life as a percentage, but not necessarily as time.

2) Some warranties end when the media wear indicator reaches 1 (starting from 100).  So, if you use it within the specified limits, you get the 5 year (for example) warranty.  If you exceed those limits, in effect, it wears out when it wears out, and your warranty is over.

3) What is more alarming is that SSDs are, in many cases, more susceptible to glitches caused by power failures, system lockups, firmware problems, controller failures, etc.  Failures of this type can sometimes trash all your data.  Some of this you cannot control.  However, you probably want to review published data on fault protection for your existing or proposed drive, and make sure it has power fail protection or add some of your own.  Upgrade to the latest firmware, and upgrade again if you hear about problems that have been corrected.

4) NAND chips made on a smaller fabrication process, 25 nm vs 34 nm, etc., have LESS endurance but have a LOWER price.  Typical endurance for 25 nm MLC NAND is 3000 program / erase cycles.

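5) Look for a controller architecture with a low typical WRITE AMPLIFICATION (the ratio of data actually written to the NAND to data written by the host).  Without compression this ratio cannot go below 1; controllers that compress data can get below 1 on compressible data.  Higher numbers wear out the chips faster.  The Sandforce 2xxx controllers are highly regarded here.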

6) I find this one very disturbing.  Many of the drives have a POWER OFF DATA RETENTION spec.  For CONSUMER drives, this is 1 year.  For ENTERPRISE drives, this is 3 MONTHS!  If you want your drive to keep your data, keep it turned on.  Don't put data on the drive and put the drive on a shelf, otherwise you could come back and find the data has vanished.  Supposedly, if the drive is on, the controller is periodically scrubbing the NAND for weak cells and will regenerate or relocate the data.

7) If the drive glitches out, for whatever reason, data recovery may be difficult or impossible.  It will likely be expensive.  You have to find a data recovery company with intimate experience with your particular NAND / controller setup.  Encryption makes things even worse.

8) Although endurance may not be a problem for most users, STUDIES SHOW THAT SSDs ARE NOT SUBSTANTIALLY MORE RELIABLE THAN HDDs OVERALL.  Failure rates for SSDs in the first few years of life are in the 1% - 2% range, fairly comparable to hard drives.  In later years of life, HDDs have a failure rate of up to 7%.  Little long term data is available for SSDs.  Half of all HDD failures occur for non-mechanical reasons.  Those non-mechanical causes are similar in both HDDs and SSDs.  Because of the vulnerability of SSDs to glitches and the difficulty of data recovery, your data may be less safe on an SSD, endurance and physical shock resistance notwithstanding.

(I believe that rock solid power protection, both surge protection and power failure protection, is one of the best ways to improve computer reliability, including that of SSDs.  The Smart UPS line from APC can boost sagging line voltage or cut excessive line voltage, to a point.  This type of equipment is especially important in GA, where we have the 2nd highest incidence of lightning in the USA.)

9) I have been able to find very little about the possible failure modes of the drive when it wears out.  However, Intel's documentation recommends that you replace the drive when the media wear indicator reaches 1 to avoid data loss.
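
To put rough numbers on points 1, 4 and 5, here is a minimal Python sketch of the arithmetic.  It is only a back-of-the-envelope model: the function names are made up for illustration, the 120 GB capacity, the write amplification of 1.5 and the 36.5 TBW rating are hypothetical, and only the 3000 P/E cycle and 10 GB / day figures come from the list above.  It assumes perfectly even wear and ignores over-provisioning, so treat the results as an upper bound rather than a prediction.

```python
def drive_lifetime_years(tbw_terabytes, host_writes_gb_per_day):
    """Years until a rated TBW figure is used up at a steady daily write rate."""
    total_gb = tbw_terabytes * 1000  # decimal units, as drive vendors use
    return total_gb / host_writes_gb_per_day / 365


def nand_limited_tbw(capacity_gb, pe_cycles, write_amplification):
    """Rough host-write budget in TB implied by the NAND's P/E cycle rating.

    write_amplification = (data written to NAND) / (data written by host),
    so a higher value shrinks the budget available to the host.
    """
    nand_writes_gb = capacity_gb * pe_cycles
    return nand_writes_gb / write_amplification / 1000


if __name__ == "__main__":
    # Hypothetical drive: 120 GB of 25 nm MLC NAND rated for 3000 P/E cycles,
    # with an assumed write amplification of 1.5.
    budget_tb = nand_limited_tbw(120, 3000, 1.5)
    print("NAND-limited budget (TB):", budget_tb)
    print("Years at 10 GB/day:", drive_lifetime_years(budget_tb, 10))

    # Hypothetical vendor rating of 36.5 TBW, again at 10 GB/day.
    print("Years from rated TBW:", drive_lifetime_years(36.5, 10))
```

If the vendor publishes a TBW figure, use that one rather than the NAND-limited estimate, since it is the number the warranty is tied to.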

If you are considering or are using SSDs, I hope you find this useful.  Please feel free to add to the discussion from your own experience.  I certainly found these facts eye-opening and somewhat disturbing.

Sincerely,

Ron


Lawrence Hamblin <infinity.d2 at gmail.com> wrote:

>I tried this out earlier, and it wasn't able to give me an estimate of my
>drive's lifetime because, apparently, it doesn't report that statistic.  It
>did, however, say that my drive was healthy.  I'm running Windows 7 Pro
>64-bit on a 128GB Samsung 830 SSD, which has been rock solid for the past
>nine months since I bought it.  I'm upgrading to a 512GB model soon.
>
>On Fri, Sep 7, 2012 at 4:32 PM, David Tomaschik
><david at systemoverlord.com> wrote:
>
>> On Thu, Sep 6, 2012 at 7:02 PM, Ron Frazier (ALE)
>> <atllinuxenthinfo at techstarship.com> wrote:
>> > Hi guys,
>> >
>> > I was doing research on SSD's and ran across this. It's an SSD remaining
>> > lifetime indicator. It monitors your ssd's and indicates their health as a
>> > percentage from 0 - 100%. It also gives an estimated end of life date which
>> > continually updates based on your usage of the drive. It does all this by
>> > monitoring the SMART data from the drive and the number of write cycles. I
>> > don't know anything about it other than what's on the website, but it looks
>> > pretty cool. Unfortunately for this group, it's a Windows program. However,
>> > it might be possible to run it under Wine or find something that does the
>> > same thing in Linux. According to the website, when the drive's life has
>> > expired, it becomes a read only device, like a giant dvd rom. The data
>> > SHOULD still be there and remain readable. How long it remains readable, I
>> > have no idea.
>> >
>> > http://www.ssd-life.com/
>> >
>> > Sincerely,
>> >
>> > Ron
>>
>>
>> It most likely won't work under wine as getting that data from the
>> drive requires sending raw device commands (ioctls on Linux) and I
>> don't think wine provides an emulation layer for this.  (It might, but
>> the use cases would be pretty limited.)
>>
>> My understanding is that smartmontools (at least as of 5.40) supports
>> SSD wear level indication.  Basically, the drive can report what % of
>> its spare blocks are still available.  Read (and record) that over
>> time, extrapolate, and there's your wear leveling limit.
>>
>> That being said, most SSDs that die don't die because they've hit
>> their write cycle limit.  I've personally seen a couple die due to
>> controller failures, and those drives *completely disappear* from the
>> system.
>>
>> I operate as if my drive will fail any second.  I have an extensive
>> set of backups, and that's how I plan to preserve my data, not by
>> trying to guess when a drive will fail.  (Or be stolen.)
>>
>>
>> --
>> David Tomaschik
>> OpenPGP: 0x5DEA789B
>> http://systemoverlord.com
>> david at systemoverlord.com
>> _______________________________________________
>> Ale mailing list
>> Ale at ale.org
>> http://mail.ale.org/mailman/listinfo/ale
>> See JOBS, ANNOUNCE and SCHOOLS lists at
>> http://mail.ale.org/mailman/listinfo
>>
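
David's suggestion above (record the wear figure over time and extrapolate) could look something like the following Python sketch.  It fits a straight line through dated readings and projects when the value would reach 1, the replacement threshold Intel's documentation mentions.  The function name and the sample readings are invented for illustration, and actually getting the value out of the drive (for example with smartmontools) is not shown, since which attribute carries it varies by vendor.  Wear is not guaranteed to be linear, so this is a rough planning aid, not a failure predictor.

```python
from datetime import date, timedelta


def projected_wearout_date(readings, threshold=1.0):
    """Fit a least-squares line through (date, indicator) readings.

    Returns the date on which the fitted line reaches `threshold`,
    or None if the indicator is not decreasing.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    origin = readings[0][0]
    xs = [(day - origin).days for day, _ in readings]
    ys = [float(value) for _, value in readings]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("readings must span more than one day")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    if slope >= 0:
        return None  # no measurable wear yet, nothing to extrapolate
    intercept = mean_y - slope * mean_x
    days_to_threshold = (threshold - intercept) / slope
    return origin + timedelta(days=round(days_to_threshold))


if __name__ == "__main__":
    # Hypothetical readings: the indicator drops one point every 100 days.
    sample = [
        (date(2012, 1, 1), 100),
        (date(2012, 4, 10), 99),
        (date(2012, 7, 19), 98),
    ]
    print("Projected date to reach 1:", projected_wearout_date(sample))
```

None of this replaces backups, since as David says most drives that die do so for reasons other than wear.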


--

Sent from my Android Acer A500 tablet with bluetooth keyboard and K-9 Mail.
Please excuse my potential brevity.

(To whom it may concern.  My email address has changed.  Replying to former
messages prior to 03/31/12 with my personal address will go to the wrong
address.  Please send all personal correspondence to the new address.)

(PS - If you email me and don't get a quick response, you might want to
call on the phone.  I get about 300 emails per day from alternate energy
mailing lists and such.  I don't always see new email messages very quickly.)

Ron Frazier
770-205-9422 (O)   Leave a message.
linuxdude AT techstarship.com