Old 02-19-2007, 08:15 AM   #1
TruckStuff
Commander
 
Join Date: Jun 2005
Posts: 1,335
Scientific study of why hard drives fail

From Google, one of the largest users of cheap, off-the-shelf hard drives:
Quote:
Massive Google hard drive survey turns up very interesting things

Posted Feb 18th 2007 9:47PM by Ryan Block

When your server farm is in the hundreds of thousands and you're using cheap, off-the-shelf hard drives as your primary means of storage, you've probably got a pretty damned good data set for looking at the health and failure patterns of hard drives. Google studied a hundred thousand SATA and PATA drives with between 80 and 400GB storage and 5400 to 7200rpm, and while unfortunately they didn't call out specific brands or models that had high failure rates, they did find a few interesting patterns in failing hard drives. One of those we thought was most intriguing was that drives often needed replacement for issues that SMART drive status polling didn't or couldn't determine, and 56% of failed drives did not raise any significant SMART flags (and that's interesting, of course, because SMART exists solely to survey hard drive health); other notable patterns showed that failure rates are indeed definitely correlated to drive manufacturer, model, and age; failure rates did not correspond to drive usage except in very young and old drives (i.e. heavy data "grinding" is not a significant factor in failure); and there is less correlation between drive temperature and failure rates than might have been expected, and drives that are cooled excessively actually fail more often than those running a little hot. Normally we'd recommend you go on ahead and read the document, but be ready for a seriously academic and scientific analysis.
http://www.engadget.com/2007/02/18/m...resting-thing/

Link to study: http://labs.google.com/papers/disk_failures.pdf
__________________
DISCLAIMER
The preceding statements are meant to be taken as a whole, in their entirety. They may not be quoted in part and then used to flame me. They also do not imply that I believe the exact opposite of their meaning. They do not make any implication about any group, race, ethnicity, age group, or other cohort beyond what is stated above. They do not make any implications at all. They have no "tone" or "attitude." They are words. Nothing more.
Old 02-19-2007, 08:25 AM   #2
renovation
Admiral
 
Join Date: Jan 2003
Location: You could pick up Lindsay Lohan for less than a intel 990x, and still have money left over to bail her outta jail
Posts: 5,029
Good find TruckStuff. So we should save our money and buy better ide drives! So from this study I have to agree keeping a drive cool may be your best hope for long life!

I try to always put a cooling fan in line with a harddrive.
__________________
You could pick up Lindsay Lohan for less than a intel 990x, and still have money left over to bail her outta jail

Last edited by renovation : 02-19-2007 at 08:32 AM.
Old 02-19-2007, 10:13 AM   #3
johnnymk
Chief of Naval Operations
 
Join Date: May 2000
Location: LEVITTOWN< PA> USA
Posts: 13,621
Quote:
Originally Posted by renovation
Good find TruckStuff. So we should save our money and buy better ide drives! So from this study I have to agree keeping a drive cool may be your best hope for long life!

I try to always put a cooling fan in line with a harddrive.
If I read it correctly, the study found little correlation between drive temperature and failure: a cool drive fares no better than a moderately hot one.
Old 02-19-2007, 10:16 AM   #4
InfiniteNothing
Chief of Naval Operations
 
Join Date: Aug 2002
Location: San Diego
Posts: 10,086
I'm betting the actual temperature is less important than how many cold starts the drive has.
Old 02-19-2007, 10:20 AM   #5
johnnymk
Chief of Naval Operations
 
Join Date: May 2000
Location: LEVITTOWN< PA> USA
Posts: 13,621
Quote:
Originally Posted by InfiniteNothing
I'm betting the actual temperature is less important than how many cold starts the drive has.

Wouldn't you think that at Google the drives are being accessed continuously, 24 hours a day?
Old 02-19-2007, 12:41 PM   #6
zippyjuan
Picture of the Day Guru
 
Join Date: Oct 2002
Location: Sunny San Diego
Posts: 8,756
BBC's report on it:
Quote:
Hard disk test 'surprises' Google

The impact of heavy use and high temperatures on hard disk drive failure may be overstated, says a report by three Google engineers.
The report examined 100,000 commercial hard drives, ranging from 80GB to 400GB in capacity, used at Google since 2001.

The firm uses "off-the-shelf" drives to store cached web pages and services.

"Our data indicate a much weaker correlation between utilisation levels and failures than previous work has suggested," the authors noted.

A wide variety of manufacturers and models were included in the report, but a breakdown was not provided.

Widely-held belief

There is a widely held belief that hard disks which are subject to heavy use are more likely to fail than those used intermittently. It was also thought that hard drives preferred cool temperatures to hotter environments.

The authors wrote: "We expected to notice a very strong and consistent correlation between high utilisation and higher failure rates.

"However our results appear to paint a more complex picture. First, only very young and very old age groups appear to show the expected behaviour."

A hard disk was described as having "failed" if it needed to be replaced.

The report was compiled by Eduardo Pinheiro, Wolf-Dietrich Weber and Luiz Andre Barroso, and was presented to a storage conference in California last week.

In the report the authors said Google had developed an infrastructure which collected "vital information" about all of the firm's systems every few minutes.

'Essentially forever'

The firm then stores that information "essentially forever".

Google employs its own file system to organise the storage of data, using inexpensive commercially available hard drives rather than bespoke systems.


Hard drives less than three years old and used a lot are less likely to fail than similarly aged hard drives that are used infrequently, according to the report.

"One possible explanation for this behaviour is the survival of the fittest theory," said the authors, speculating that drives which failed early on in their lifetime had been removed from the overall sample leaving only the older, more robust units.

The report said that there was a clear trend showing "that lower temperatures are associated with higher failure rates".

"Only at very high temperatures is there a slight reversal of this trend."

But hard drives which are three years old and older were more likely to suffer a failure when used in warmer environments.

"This is a surprising result, which could indicate that data centre or server designers have more freedom than previously thought when setting operating temperatures for equipment containing disk drives," said the authors.

The report also looked at the impact of scan errors - problems found on the surface of a disc - on hard drive failure.

"We find that the group of drives with scan errors are 10 times more likely to fail than the group with no errors," said the authors.

They added: "After the first scan error, drives are 39 times more likely to fail within 60 days than drives without scan errors."
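
For anyone wondering what a figure like "10 times more likely to fail" means in practice, here is a minimal Python sketch of the arithmetic: the failure rate of one group of drives divided by the failure rate of another. The function names and drive counts are made up for illustration; they are not taken from the Google paper, which may well compute its figures differently.

```python
def failure_rate(failed: int, total: int) -> float:
    """Fraction of drives in a group that failed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return failed / total


def relative_failure_rate(failed_a: int, total_a: int,
                          failed_b: int, total_b: int) -> float:
    """How many times more likely group A is to fail than group B."""
    rate_b = failure_rate(failed_b, total_b)
    if rate_b == 0:
        raise ValueError("baseline group has no failures; ratio is undefined")
    return failure_rate(failed_a, total_a) / rate_b


# Hypothetical counts, NOT from the study:
with_scan_errors = (30, 100)      # 30 of 100 drives with scan errors failed
without_scan_errors = (30, 1000)  # 30 of 1000 drives without scan errors failed

ratio = relative_failure_rate(*with_scan_errors, *without_scan_errors)
print(f"Drives with scan errors failed {ratio:.1f}x as often")
```

Note that a ratio like this only compares the two groups; it says nothing on its own about how likely any single drive is to fail.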


__________________
I add new pictures to my photo gallery pretty regularly. You can see them here if you are interested: http://www.pbase.com/jeffryz
Old 02-20-2007, 03:50 AM   #7
johnnymk
Chief of Naval Operations
 
Join Date: May 2000
Location: LEVITTOWN< PA> USA
Posts: 13,621
It would be nice if Google would note which brands failed more frequently. But being Google, they would probably offend many companies and lose advertising revenue.
Old 02-20-2007, 08:22 PM   #8
redcolours
in living colour
 
Join Date: Sep 2001
Location: above a raging c
Posts: 1,739
Quote:
Originally Posted by johnnymk
It would be nice if Google would note which brands failed more frequently. But being Google, they would probably offend many companies and lose advertising revenue.

[cough]maxtor[/cough]

__________________
there are pictures, but no,nothing happens on my site.
Old 02-20-2007, 08:24 PM   #9
redcolours
in living colour
 
Join Date: Sep 2001
Location: above a raging c
Posts: 1,739
Quote:
Originally Posted by InfiniteNothing
I'm betting the actual temperature is less important than how many cold starts the drive has.

unless the drive is glowing red hot already...

lesson of the story: keep your PCs on 24/7 (i know i do...)
Old 02-20-2007, 08:39 PM   #10
MikeD
President, Cowboys Nation
 
Join Date: Dec 2004
Location: In the 'burbs, west of D.C.
Posts: 5,139
Quote:
Originally Posted by redcolours
[cough]maxtor[/cough]

You got a cold? Me too... I've had the SAME THING for a while now.

Just got over the WD bug, too...
Old 02-20-2007, 09:30 PM   #11
stufine
Lieutenant Junior Grade
 
Join Date: Nov 2006
Posts: 143
[cough]maxtor[/cough]

What? Maxtor a bad drive? I've only had 2 go bad in the past 2 years... hehe, well, 1 (120GB) was from the wife pushing my PC off the table. 1 WD died recently in an external USB box; I think it overheated, and now it likes to click. Been thinking about the WD with the 5-year warranty for a RAID box.
Old 02-21-2007, 07:39 AM   #12
DarkFury
Secretary of the Navy
 
Join Date: Feb 2001
Location: Chillin' N Da 'Hood
Posts: 34,997
Quote:
Originally Posted by redcolours
[cough]maxtor[/cough]

I must be kinda lucky...

My Maxtors have been fairly bulletproof.... however, I've sent in at least 2 of my Western Digitals over the past 10 years.
__________________


DarkFury's Pimptopia - Don't Hate the Playa, Hate the Game!
Home of the Original OG Pimp (accept NO imitations)
All times are GMT -7.