Another DOA WD20EARS


I picked up another pair of 2TB WD20EARS drives on the weekend (planning to use one to replace an older 1TB drive in my unRAID array and keep the other as a spare). One is preclearing nicely and is just about done with its second pass with no issues, but the other appears to have failed in the first preclear pass, towards the end of the zero-writing phase, I think.

 

Here is its SMART report:

 

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WCAZA4335011
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Mon Apr 18 06:32:20 2011 MDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
				was suspended by an interrupting command from host.
				Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121)	The previous self-test completed having
				the read element of the test failed.
Total time to complete Offline 
data collection: 		 (36660) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
				Auto Offline data collection on/off support.
				Suspend Offline collection upon new
				command.
				Offline surface scan supported.
				Self-test supported.
				Conveyance Self-test supported.
				Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
				power-saving mode.
				Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
				General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 255) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
				SCT Feature Control supported.
				SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always       -       958
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       11
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       13
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       9
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       8
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       26
194 Temperature_Celsius     0x0022   118   114   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   196   196   000    Old_age   Always       -       1374
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%        13         1590963

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

 

I tried to do a short SMART test before the above report was recorded.

 

Also I have tried to start a long SMART test twice on this drive and it appears to just give up right away.

 

If I understand the report correctly the drive thinks that there are 1374 sectors that it is having read problems with and is waiting for them to be written to before it will actually remap them?
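For anyone who wants to check that reading mechanically, here is a minimal Python sketch that pulls the attribute table out of a report like the one above. The function names (`parse_attributes`, `failed_attributes`) and the `SAMPLE` string are made up for illustration, and the regular expression assumes the column layout shown in this particular report. It also relies on the usual rule that an attribute only counts as failed when its normalized VALUE is at or below THRESH, which would explain why the overall assessment still says PASSED: Current_Pending_Sector has a threshold of 000 here.

```python
import re

# Three rows copied from the report above, plus the header line.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   196   196   000    Old_age   Always       -       1374
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
"""

# ID, name, flag, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE.
# Assumption: the raw value is a plain integer, as it is in this report.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def parse_attributes(report):
    """Return {attribute_name: {...}} for every attribute row in the text."""
    attrs = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            attrs[m.group(2)] = {
                "id": int(m.group(1)),
                "value": int(m.group(3)),
                "worst": int(m.group(4)),
                "thresh": int(m.group(5)),
                "raw": int(m.group(9)),
            }
    return attrs


def failed_attributes(attrs):
    """Names of attributes whose normalized value is at or below threshold."""
    return [name for name, a in attrs.items() if a["value"] <= a["thresh"]]


if __name__ == "__main__":
    attrs = parse_attributes(SAMPLE)
    print("pending sectors:    ", attrs["Current_Pending_Sector"]["raw"])
    print("reallocated sectors:", attrs["Reallocated_Sector_Ct"]["raw"])
    print("attributes at/below threshold:", failed_attributes(attrs))
```

On the rows above, no attribute is at or below its threshold, even though the raw pending-sector count is 1374 and nothing has been reallocated yet.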

 

Since I can't get the drive to run a long SMART test, I don't trust it and am planning to just return it for replacement. Any other suggestions? Perhaps I should attempt another preclear session?

 

Regards,

 

Stephen

Link to comment

It doesn't look good. You can either try to return it now or run more preclears until the overall assessment says failed and then it'll be easy to return with no questions about it being good or not.

 

 

I think I'll give it one more kick with preclear. I'm curious to see if another attempt to write will cause those pending sectors to be reallocated.

 

Stephen

Link to comment

I tried to run preclear on it a second time, but it never got past 0% of the first reading phase. When I looked at the syslog there were a ton of errors after a few minutes, so I just killed the preclear.

 

I then downloaded Western Digital's "Data Lifeguard Diagnostics for Windows" and moved the drive over to my Windows 7 desktop. I ran the WD diagnostics on it, just using their "quick test". It got about halfway through the scan in a minute or so, then ground to a halt. About 10 minutes later the diagnostic terminated with an error saying there were too many bad sectors (note that the SMART report was still saying the drive was OK).

 

So I took it back to the MemoryExpress store (http://www.memoryexpress.com/) I use all the time. They did a quick test on it and then gave me a new one.

 

I've now purchased a total of 9 of these drives and I've had to get replacements for 2 due to being exposed as defective by preclear (so I have actually used 11 of them).  So a failure rate of 2/11 or 18.2%.

 

So far I've not had any issues with any of these drives once they have passed the preclear burn in.

 

Perhaps they should come with a label: "Advanced format drive, use only with adult supervision" or something...

 

Regards,

 

Stephen

 

 

 

Link to comment

Seems to be a sad fact of hard-drive economics these days... cheaper to push them out without thoroughly testing for defects, and then pay for RMAs rather than ensure that a very low % are DOA. 

 

I mean, before UnRaid I would throw a drive into a computer without another thought... only a very small % of the population bothers to test drives at all before using them, let alone with anything as extensive as a preclear. And maybe your drive would have survived long enough to get through its warranty period (under normal use patterns... not preclear's), and then it would be on you to pay to replace it.

 

I'm sure that someone much smarter (and better paid!) has run the numbers and it works out cheaper for them.

Link to comment
I've now purchased a total of 9 of these drives and I've had to get replacements for 2 due to being exposed as defective by preclear (so I have actually used 11 of them).  So a failure rate of 2/11 or 18.2%.

 

Stats on 9 drives are meaningless. If you tested 1,000 drives from 10 vendors over a 12 month period, maybe it'd start to become statistically relevant.

 

If WD have more than a 1% failure rate, they are in trouble because margins are low and it is very expensive to replace drives once they are in the field.

 

See this PDF and search for "quality assurance": lyle.smu.edu/~mhd/8331f03/36516_1000.pdf

Link to comment

I had 4 of 10 DOA - makes 40%... far away from 1%  ::)

 

Sure, 10 is not a big count, but I'm quite fed up with WD20EARS drives, knowing full well that the same $*?* could have happened with any other drive manufacturer. I'll need 5 drives in the near future, and I still have no idea which drive I'm going to use. Maybe a 5K3000, but I might as well stay with the EARS drives... we'll see.

Link to comment

10 is also not statistically relevant! WD ship out hundreds of thousands of drives a day. Some will be DOA. But not many.
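To put rough numbers on the sample-size argument, here is a minimal Python sketch using the two counts reported in this thread (2 failures out of 11 drives, and 4 out of 10). It computes a 95% Wilson score interval for the underlying failure rate, and the binomial probability of seeing at least that many failures if the true rate really were 1%. The function names are made up for illustration, and the whole thing assumes each drive fails independently with the same probability, which drives bought from the same store or batch may well not satisfy.

```python
import math


def wilson_interval(failures, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    p = failures / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed directly from the pmf."""
    below = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1 - below


if __name__ == "__main__":
    # (failures, drives) as reported by two posters in this thread.
    for failures, n in [(2, 11), (4, 10)]:
        lo, hi = wilson_interval(failures, n)
        tail = prob_at_least(failures, n, 0.01)
        print(f"{failures}/{n}: observed {failures / n:.1%}, "
              f"95% interval {lo:.1%} to {hi:.1%}, "
              f"P(>= {failures} failures | 1% rate) = {tail:.4%}")
```

The intervals come out very wide, which is the point about small samples: a handful of drives cannot pin down the true rate. The tail probability is a separate question, namely how surprising the observed count would be under a 1% rate, and it is only as good as the independence assumption.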

 

I've bought hundreds (maybe thousands) of hard drives for personal use and in my business, mostly Seagate and WD.

 

DOAs? Zero. None. Does that mean no hard drives are ever DOA? Nope.

 

You can bet if there was a general production problem with the WD20EARS, we'd hear about it -- like there was with the ST31500341AS -- I have two of those and both were "DOA" (unreliable) -- but fixed after a firmware update.

Link to comment

Of course it is statistically irrelevant, but it is definitely a shame for the producer.

 

It is also true that almost 99% of all consumer hard disks are not tested as rigorously as the disks used in unRAID systems. Who else does preclearing? But since the trouble I've had with my EARS drives, I run a preclear cycle on all my disks before installing them in my customers' PCs. And yes, there have been some Hitachi and some Seagate drives too that had problems during preclearing. I RMAed them and the replacement drives worked without trouble. I suppose Windows would never have noticed problems with those disks until it was too late and data had been lost.

 

Most defective drives I've had problems with died after only a short time in use. I have a bunch of hard disk drives more than 10 years old that are still working (slow, but reliable). And I have had disks that were in use for 2 weeks and died. I suppose those would never have passed the preclearing cycles. Just my humble opinion, though.

Link to comment

2 of 4 of my EARS drives also failed preclear.

 

I understand the sample is way too small to be considered relevant, and I'm quite sure that one reason we're seeing a fair number of reported failures is that it is one of the most popular 2TB drives being purchased. But the thing that makes me think there may be more to it is that we never saw this with the EARS's predecessor, the EADS drive...

 

I'm confident that if one does a search in this forum on # of threads about failed EADS drives vs # of threads about failed EARS drives, the overwhelming majority of failed drive threads would be in reference to the EARS drives....

 

Who feels like sifting through the forums and providing us with a count???  ;D

 

Personally, I am no longer buying any EARS drives. I'm just not comfortable with them. I'll stick with the Hitachi 5K3000 for now... :)

 

 

Link to comment

How often does a drive show failure after more than 1 pre-clear pass?  Obviously, pre-clearing takes at least 20 hours for most drives, so I've only done one pass before loading into my array (all of which passed).

 

How often have you guys passed on the first pass, but have shown errors/failures on subsequent passes?

Link to comment
