savannah-hackers

Re: [savannah-help-public] GNU Emacs repository corrupted on savannah


From: Bob Proulx
Subject: Re: [savannah-help-public] GNU Emacs repository corrupted on savannah
Date: Sat, 21 May 2016 14:55:19 -0600
User-agent: Mutt/1.5.24 (2015-08-30)

Paul Eggert wrote:
> Bob Proulx wrote:
> >The FSF admins have isolated and corrected the problem.  The report is
> >that the problem was a faulty SSD in a RAID10 set of four.  It was
> >returning corrupted data and reporting it as good data.  That's
> >exceptionally bad hardware.
> 
> Ouch! What type of SSD was it, exactly?

They didn't say.  They only said that they had three Intel SSDs and
one non-Intel SSD.  The Intel ones were good.  (I have had great
personal experience with Intel SSDs.)  The bad SSD was a non-Intel
model and I don't know the type.  It was extremely bad that it
returned corrupted data silently with no error indication.  That's
BAD!

Since the bad drive has been removed everything seems to be working
correctly and I haven't heard any more reports of data corruption.
Let's hope that good fortune continues.

> I don't know of any published measurements for this type of error, though

SSDs definitely have different failure modes from spinning media.  RAID
continues to be very important, as this incident shows.  It is likely
that the different failure modes of SSDs will require more active
checksumming than traditional spinning media does.
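
To illustrate what I mean by active checksumming, here is a minimal
sketch in Python.  It is not what the FSF admins run; the function
names and the manifest layout are my own assumptions for illustration.
The idea is to record a SHA-256 digest for every file while the data is
known to be good, then re-read the files later and compare.  A drive
that silently returns bad data will show up as a digest mismatch even
though it reported no I/O error.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest.

    Hypothetical helper: run this while the data is known to be good.
    """
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or changed."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_digest(full) != expected:
            bad.append(rel)
    return sorted(bad)
```

Typical use would be to save the result of build_manifest() somewhere
off the suspect storage and run verify_manifest() from a periodic job.
Note that a re-read may be served from the page cache rather than the
drive, so a check like this is most meaningful on cold data.  For git
repositories in particular, "git fsck" already performs a comparable
check, because git objects are named by the hash of their contents.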

> Google has published measures for many other types. In their experience,
> flash drives have significantly more uncorrectable errors than hard disk
> drives, with different failure characteristics for MLC vs SLC vs eMLC in the
> field. See:
> 
> Schroeder B, Lagisetty R, Merchant A. Flash reliability in production: the
> expected and the unexpected. FAST'16, 2016-02-22, 67-80. 
> https://usenix.org/conference/fast16/technical-sessions/presentation/schroeder
> 
> 
> For more details about flash drives problems after cycling power, see:
> 
> Zheng M, Tucek J, Qin F, Lillibridge M. Understanding the robustness of SSDs
> under power fault. FAST'13, 2013-02-12, 271-84. 
> https://www.usenix.org/conference/fast13/technical-sessions/presentation/zheng

I will pass that information along to the FSF admins.

Bob


