Re: preclear_disk.sh - a new utility to burn-in and pre-clear disks for quick add

Biggy2872 · December 9, 2008

Note from Joe L:

I reached the character limit for a single post in my original preclear_disk thread and wanted more room to add release notes. I split the threads so that I could add new material without getting rid of all of the following posts by users of the utility. I would love to merge the threads back into a single one, but the forum will only merge threads in date order, which would put my new release notes near the end of a very long set of pages. For that reason, the original thread with only the release notes is locked, and this one is split from it.

The original preclear_disk.sh thread with the release notes is here:

http://lime-technology.com/forum/index.php?topic=2817.0

What is the possibility of including this in the unmenu awk interface? Maybe in a future version.

My thoughts (provided it is possible) would be to include a button beside each disk on the "Disk Mgmt" page in the "Drive Partitions - Not In Protected Array" section, along with radio buttons for the options of "no read/write", "1 cycle", "5 cycles", "20 cycles" and "test"...

Cheers, and keep up the good work!

Matt

Joe L. · December 9, 2008

What is the possibility of including this in the unmenu awk interface? Maybe in a future version.

My thoughts (provided it is possible) would be to include a button beside each disk on the "Disk Mgmt" page in the "Drive Partitions - Not In Protected Array" section, along with radio buttons for the options of "no read/write", "1 cycle", "5 cycles", "20 cycles" and "test"...

Cheers, and keep up the good work!

Matt

That is a long-term goal, although I would put it on its own plug-in page. The biggest issue is the display of progress as it performs the clear. I am taking advantage of a feature of the "dd" command to get its status when writing to the drive. I would need to re-write that section.
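For readers unfamiliar with the "dd" feature mentioned here: GNU dd prints its I/O statistics to stderr when it receives a USR1 signal, and then carries on copying. The sketch below assumes that this is the mechanism in question (the post does not say so explicitly). The function name dd_report and the status-file argument are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Assumes GNU dd, which reports statistics on SIGUSR1 and keeps
# running. Other dd implementations may terminate on that signal instead.

# dd_report PID STATUS_FILE
#   PID         - process id of a dd that was started in the background
#   STATUS_FILE - file that dd's stderr was redirected to
# Prints the most recent statistics block (records in/out, bytes copied).
dd_report() {
    kill -USR1 "$1" 2>/dev/null || return 1   # dd is no longer running
    sleep 1                                   # give dd time to write its report
    tail -n 3 "$2"
}

# Example use (not run here; /dev/null keeps the example harmless):
#   dd if=/dev/zero of=/dev/null bs=1M count=100000 2>/tmp/dd_status &
#   dd_pid=$!
#   sleep 1                        # dd must be up before it is signalled
#   dd_report "$dd_pid" /tmp/dd_status
```

A front end would call dd_report periodically and display the last line, rather than waiting for dd to finish.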

Then, I would need a way to start, and stop the process from the web-browser, and a way to get the periodic update of status as it progresses. I know I don't want to submit a task and wait 200+ hours for the browser to return...

First I just want to make sure it is doing the correct "pre-cleared" signature, and not creating a black-hole somewhere in the universe.

Before I tackle anything else, I need to get the next version of unmenu.awk published. bjp999 recently added a bit of logic and figured out how to get it to POST to a plug-in as well as GET. I've added that to the next version, along with a few fixes.

(non-geek translation... unmenu.awk can now handle more complicated data entry forms)

Glad you like unmenu.awk. It certainly is turning out to be interesting.

Joe L.

WeeboTech · December 9, 2008

Then, I would need a way to start, and stop the process from the web-browser, and a way to get the periodic update of status as it progresses. I know I don't want to submit a task and wait 200+ hours for the browser to return...

Perhaps submit it to the background via batch.

The process can write its pid in /var/run and its log file in /var/log.

The browser interface section can refresh on the log and, if need be, use the pid file to send a kill to the process and its children.
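A rough sketch of that idea follows. It uses a plain background job started with nohup rather than the batch command, and the function names, the job-name argument, and the RUN_DIR/LOG_DIR overrides are all hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch only: run a long job in the background, record its pid and log,
# and let a front end poll the log or stop the job. All names are hypothetical.

pid_file() { echo "${RUN_DIR:-/var/run}/$1.pid"; }
log_file() { echo "${LOG_DIR:-/var/log}/$1.log"; }

# start_job NAME COMMAND [ARGS...]: launch COMMAND detached from the caller.
start_job() {
    job=$1; shift
    nohup "$@" </dev/null >"$(log_file "$job")" 2>&1 &
    echo $! >"$(pid_file "$job")"
}

# job_running NAME: succeeds while the recorded pid is still alive.
job_running() {
    [ -f "$(pid_file "$1")" ] && kill -0 "$(cat "$(pid_file "$1")")" 2>/dev/null
}

# job_status NAME [LINES]: show the tail of the log (what a web page would refresh).
job_status() {
    tail -n "${2:-5}" "$(log_file "$1")"
}

# stop_job NAME: signal the job's direct children (if pkill exists), then the job.
stop_job() {
    [ -f "$(pid_file "$1")" ] || return 1
    job_pid=$(cat "$(pid_file "$1")")
    if command -v pkill >/dev/null 2>&1; then
        pkill -TERM -P "$job_pid" 2>/dev/null
    fi
    kill -TERM "$job_pid" 2>/dev/null
    rm -f "$(pid_file "$1")"
}
```

With this split, the web page never waits on the job itself: one request calls start_job, later requests call job_status, and a "stop" button maps to stop_job.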

Joe L. · December 9, 2008

Then, I would need a way to start, and stop the process from the web-browser, and a way to get the periodic update of status as it progresses. I know I don't want to submit a task and wait 200+ hours for the browser to return...

Perhaps submit it to the background via batch.

The process can write its pid in /var/run and its log file in /var/log.

The browser interface section can refresh on the log and, if need be, use the pid file to send a kill to the process and its children.

Sounds like it could work... I'll put it on the list if somebody does not get to it first...

I need to get my own array hardware working properly first. I've determined that one drive tray slot locks up the server when it has a disk installed. Not sure if it is the cable, or the controller, or the drive tray connector in the case... Time will tell. (I can't experiment in the evenings as much, as we are using the server to watch movies...)

Joe L.

SSD · December 10, 2008

This looks very impressive! Can't wait to try it when I get another disk!

Joe L. · December 11, 2008

I re-seated my existing disk controller cables and interface card in an attempt to diagnose the DMA errors I've been experiencing since attempting to expand my array into the last 4 empty slots. As I stated earlier, I was getting a DMA error that just locked up the server.

Since these 4 slots are not yet assigned to anything in my array, my data is safe... I do need to test these slots by reading and writing to disks in them, and this preclear_disk.sh script is perfect for this. I can keep a disk far more active than otherwise, and at no risk to the overall parity protection of my array.

Last night I re-ran a pre-clear cycle on my tiny 8Gig test drive. It is at the end connector of the first cable of the disk controller. It ran successfully in about 25 minutes. I then tried a pre-clear of a much larger 750Gig drive. It is on the end of the second cable off of the same Promise IDE controller.

As you might have guessed, the 750Gig drive took quite a bit longer to pre-read/clear/post-read than my 8Gig drive. It took just under 10 hours for 1 cycle. It also experienced some changes to the SMART data.

The preclear_disk.sh script is designed to take a SMART status report when it starts, and another at its end, and to show you any differences between them if they exist. In my example screen-shot, the Raw_Read_Error_Rate and Seek_Error_Rate normalized values are unchanged, but their "raw value" changed (the last value on the line). These are not likely to be problems. The Airflow_Temperature_Cel changed... also not likely to be a problem. There was an increase in the Hardware_ECC_Recovered counter. I'll need to keep an eye on that. It indicates that the hardware in the disk corrected an error it detected while reading the disk. The unRAID OS never even knew anything had happened, as the error-correction code in the drive's firmware handled the error.

Makes you kind of wonder if all this is also happening on disks in our Windows PCs, and we are not notified by it until it fails to boot...

Here is a screen shot of how it looked when it was done:

I'm going to run this 750Gig drive through a few more pre-read/clear/post-read cycles to see if it changes any more, or if I get any more DMA errors. First, I'm going to save a copy of my syslog, as the SMART reports are logged there. That way, if I do have another DMA error lockup, the SMART report in the saved syslog will be available next time for comparison.

Note: the SMART difference output is in "diff" format. The lines with a leading "<" are from the before SMART report, the lines with a leading ">" are from the after SMART report. Lines that are unchanged are not shown at all.
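As a small illustration of that format, the sketch below compares two saved SMART reports with diff. The file names and the function name smart_changes are hypothetical. On a real system the two reports would come from something like "smartctl -a" run against the drive before and after the pre-clear.

```shell
#!/bin/sh
# Sketch only: show what changed between two saved SMART reports.
# Lines starting with "<" come from BEFORE, lines starting with ">" from AFTER.

# smart_changes BEFORE_FILE AFTER_FILE
smart_changes() {
    if diff "$1" "$2"; then
        echo "no SMART differences detected"
    fi
}

# Example capture (not run here; /dev/sdX is a placeholder):
#   smartctl -a /dev/sdX >/tmp/smart_before
#   ... run the pre-clear ...
#   smartctl -a /dev/sdX >/tmp/smart_after
#   smart_changes /tmp/smart_before /tmp/smart_after
```

A header such as "2c2" in the output means line 2 of the first file was changed into line 2 of the second file.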

Joe L.

Joe L. · December 11, 2008

I just loaded the new 4.4 version of unRAID. Apparently, Tom has not included the "ncurses" package.

This will break the display of the preclear_disk.sh script.

To fix it, you can either install the "ncurses" package, or, a lot easier, change these few lines in the script

from:

clearscreen=`tput clear`
goto_top=`tput cup 0 1`
screen_line_three=`tput cup 3 1`
bold=`tput smso`
norm=`tput rmso`
ul=`tput smul`
noul=`tput rmul`

to:

if [ -x /usr/bin/tput ]
then
    clearscreen=`tput clear`
    goto_top=`tput cup 0 1`
    screen_line_three=`tput cup 3 1`
    bold=`tput smso`
    norm=`tput rmso`
    ul=`tput smul`
    noul=`tput rmul`
else
    clearscreen=`echo -n -e "\033[H\033[2J"`
    goto_top=`echo -n -e "\033[1;2H"`
    screen_line_three=`echo -n -e "\033[4;2H"`
    bold=`echo -n -e "\033[7m"`
    norm=`echo -n -e "\033[27m"`
    ul=`echo -n -e "\033[4m"`
    noul=`echo -n -e "\033[24m"`
fi

I'll post a new version of the preclear script shortly with these changes.

Edit: updated version now attached to first post in this thread.

Joe L.

j4ck4l · December 14, 2008

This is an excellent utility. I'm using it now to clear my new 1,5TB disk. This is gonna take a while

Joe L. · December 15, 2008

This is an excellent utility. I'm using it now to clear my new 1,5TB disk. This is gonna take a while

I'll bet it will take a while...

Please let us know how long it does take.... (I hope you only did one cycle, at least at first)

I figure your disk is twice the capacity of mine, but probably twice as fast, so 10 hours or so for one cycle is my guess....

Joe L.

Joe L. · December 15, 2008

I noticed that the 4.4final and 4.5beta releases do not have a working "smartctl" command. This will not prevent the preclear script from running, but you will be unable to learn if the disk SMART attributes change during the preclear process.

To fix the smartctl program all you need to do is install the missing library it needs. It can be downloaded from:

http://slackware.cs.utah.edu/pub/slackware/slackware-12.0/slackware/a/cxxlibs-6.0.8-i486-4.tgz

Then, using file explorer on your Windows PC, you can open up

\\tower\flash

and create a packages folder. You can copy or move the downloaded file there. From Windows it will be at

\\tower\flash\packages\cxxlibs-6.0.8-i486-4.tgz

If you log in via the system console, or via telnet, the flash drive is mounted at /boot. The new directory you created is therefore at

/boot/packages

Your file will then be at

/boot/packages/cxxlibs-6.0.8-i486-4.tgz

Once the file is saved as cxxlibs-6.0.8-i486-4.tgz, you can install it by changing to the directory where you saved it. (I keep all my downloaded packages in the /boot/packages directory.)

After logging in on the system console or via telnet, to change directory I type

cd /boot/packages

and install it by typing :

installpkg cxxlibs-6.0.8-i486-4.tgz

As an alternative, if I did not want to change directory to where I put the file, I could just give the full path to the downloaded file like this:

installpkg /boot/packages/cxxlibs-6.0.8-i486-4.tgz

Once it is installed, the smartctl program will work until you reboot, at which time you will need to re-install it once more.
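To avoid retyping the command after every reboot, the install step could be wrapped in a small loop and called from a startup script. This is only a sketch: the function name is made up, and which startup script to call it from (on unRAID this is commonly the "go" script on the flash drive) is an assumption not covered in this post.

```shell
#!/bin/sh
# Sketch only: reinstall every saved Slackware package found in a directory.

# install_saved_packages [DIR] [INSTALLER]
#   DIR       - where the .tgz files were saved (default /boot/packages)
#   INSTALLER - command used to install one package (default installpkg)
install_saved_packages() {
    pkg_dir=${1:-/boot/packages}
    installer=${2:-installpkg}
    for pkg in "$pkg_dir"/*.tgz; do
        [ -f "$pkg" ] || continue     # directory empty: the glob did not match
        "$installer" "$pkg"
    done
}
```

Passing echo as the second argument gives a dry run that only lists the packages that would be installed.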

Joe L.

j4ck4l · December 15, 2008

This is an excellent utility. I'm using it now to clear my new 1,5TB disk. This is gonna take a while

I'll bet it will take a while...
Please let us know how long it does take.... (I hope you only did one cycle, at least at first)

I figure your disk is twice the capacity as mine, but probably twice as fast, so 10 hours or so for one cycle is my guess....

Joe L.

Running for 13:20 hours now. Post-read @ 30%

approx. 3 minutes per percent so i guess still 3,5 hours to go.

Joe L. · December 15, 2008

This is an excellent utility. I'm using it now to clear my new 1,5TB disk. This is gonna take a while

I'll bet it will take a while...
Please let us know how long it does take.... (I hope you only did one cycle, at least at first)

I figure your disk is twice the capacity as mine, but probably twice as fast, so 10 hours or so for one cycle is my guess....

Joe L.

Running for 13:20 hours now. Post-read @ 30%

approx. 3 minutes per percent so i guess still 3,5 hours to go.

I'm running a preclear_disk cycle on a 750Gig SATA drive I have plugged into a new 2-port PCI-Bus SATA controller. It is a very inexpensive controller card, and only rated at SATA 1.0 speeds, but I figure I am very limited by the PCI bus, so it really does not matter.

I'm 9 hours, 20 minutes into the process, and 85% complete in the post-read process.

Joe L.

Joe L. · December 15, 2008

It took 10 hours, 2 minutes to pre-read/clear/post-read my 750Gig SATA drive. This is interesting in that it is almost exactly the same time it took for an IDE based drive of the same size. Read speeds averaged in the 75-80MB/s range. Write speeds averaged in the mid 60MB/s through mid 70MB/s. It shows the PCI bus can keep up with a single SATA drive and we are still mostly limited by the drive itself.

I stopped my unRAID array, assigned the newly cleared 750 Gig SATA drive on the devices page to an empty slot, then went back to the main page. I was presented with a display showing a "blue" indicator for the new drive.

The text alongside the "Start" button indicated it would be cleared if it was not already pre-cleared when the array was started.

I checked the "checkbox" under the "Start" button (to enable it) and then clicked on the "Start" button to start the array.

The screen indicated that it was starting. After I refreshed it, it showed the new disk as "Unformatted" and the array was up and running. My array was off-line for perhaps a minute as I assigned the new drive.

Only a minute of down-time is a HUGE accomplishment, as in the past it took about 4 hours of down-time to add a 750 Gig drive while it was being cleared. In addition, I had some confidence in the drive as any marginal sectors would have been identified.

I clicked on the "Format" button, and in a minute or two more I saw the new disk was available for data.

Joe L.

WeeboTech · December 15, 2008

I clicked on the "Format" button, and in a minute or two more I saw the new disk was available for data.

Does it pay to have a FORMAT option in the pre-clear script?

Joe L. · December 15, 2008

I clicked on the "Format" button, and in a minute or two more I saw the new disk was available for data.

Does it pay to have a FORMAT option in the pre-clear script?

No, it would not pay at all to have a FORMAT option.

If the drive was formatted it would have a file-system type of "83" set in the 16 bytes that define the first partition in the MBR. It would then not have a valid pre-clear signature, and the unRAID software would go about clearing it when you assign it to the array. After it cleared the drive, you would still have to format it. For a large drive, you are facing 4 or more hours of down-time as the drive is cleared.
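For reference, the byte in question can be inspected directly. In a classic MBR the partition table starts at byte offset 446, each entry is 16 bytes long, and the type code is the fifth byte of an entry, so the type of partition 1 sits at offset 450. The sketch below reads it with dd and od. The function name is hypothetical, and it makes no attempt to check the pre-clear signature itself, whose exact layout is not described in this thread.

```shell
#!/bin/sh
# Sketch only: print the partition-type byte of MBR partition 1 as two hex digits.
# Works on a disk device or on an image file; it only reads, never writes.

# mbr_part1_type DEVICE_OR_IMAGE
mbr_part1_type() {
    # 446 (start of partition table) + 4 (type field within the entry) = 450
    dd if="$1" bs=1 skip=450 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n'
    echo
}

# Example (not run here; /dev/sdX is a placeholder):
#   mbr_part1_type /dev/sdX
```

A result of 83 would correspond to the formatted case described above.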

The only other way to add a "formatted" drive to the array would be to use the button labeled "Restore". The array would be on-line quickly, but you would lose parity protection for many hours while it performed a full parity calculation. On my array, that takes over 12 hours... I'd rather keep the array protected, so for me this is not the best way to add a new drive.

Joe L.

j4ck4l · December 15, 2008

It took 16 hours 47 minutes on my computer to pre-read/clear/post read a 1,5 TB disc on an onboard sil3114 interface.

I have to wait on a copy action to finish before I can add this disc. I'll keep you informed on how that went.

j4ck4l · December 15, 2008

Worked like a charm

Added disk in just a couple of minutes fully functional.

Tnx for this great utility.

Joe L. · December 15, 2008

Worked like a charm

Added disk in just a couple of minutes fully functional.

Tnx for this great utility.

Well... "a couple of minutes" plus 16 hours, 47 minutes pre-clearing time.

;D

Glad to hear it all went smoothly. I guess you were in too much of a hurry to try 20 cycles... It would have only taken 14 days....
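The 14-day figure is just the single-cycle time multiplied out: 16 hours 47 minutes is 1,007 minutes, and 20 cycles of that is 20,140 minutes. A small sketch of the arithmetic (the function name is made up, and it assumes every cycle takes as long as the first):

```shell
#!/bin/sh
# Sketch only: scale one measured pre-clear cycle up to N cycles.

# cycles_eta HOURS MINUTES CYCLES
cycles_eta() {
    total=$(( ($1 * 60 + $2) * $3 ))          # total minutes for all cycles
    days=$(( total / 1440 ))                  # 1440 minutes per day
    hours=$(( (total % 1440) / 60 ))
    mins=$(( total % 60 ))
    echo "${days}d ${hours}h ${mins}m"
}

# cycles_eta 16 47 20   -> 13d 23h 40m  (just under 14 days)
```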

Joe L.

RobJ · December 16, 2008

Added to UnRAID Add Ons, here. Feel free to edit.

WeeboTech · December 16, 2008

No, it would not pay at all to have a FORMAT option.

Duah.. You're right, I did not think it through all the way.

jimwhite · December 16, 2008

Then unRaid wouldn't see it as a cleared disk...

abq-pete · December 18, 2008

I just received another 1.5TB drive from Amazon. Sadly this one had the older firmware with the issues. I updated the firmware and proceeded to use SpinRite 6 to check the drive. After about an hour, SpinRite reported that an additional 400+ hours (more than 16 days) were needed to complete the very thorough check! So I decided to choose a middle ground for testing. I just loaded the drive into my array and am running this excellent utility. Hopefully in about 17 hours, I will get some good news.

Thanks Joe L. !

Regards, Peter

jkm9000 · December 18, 2008

Thanks a lot for doing this, will make adding my new disks a lot less painful!

abq-pete · December 18, 2008

Good news and bad news. First the good. The entire process took only 6:34:14 to do the 1.5TB Seagate. The bad news is that the post-read portion ended after the 40% mark. Following that, the S.M.A.R.T. reports listed quite a few errors. Can anyone help me understand if I should send this drive back?

First the last Post-Read progress information:

===========================================================================
= unRAID server Pre-Clear disk /dev/sda
= cycle 1 of 1
= Disk Pre-Clear-Read completed DONE
= Step 1 of 10 - Copying zeros to first 2048k bytes DONE
= Step 2 of 10 - Copying zeros to remainder of disk to clear it DONE
= Step 3 of 10 - Disk is now cleared from MBR onward. DONE
= Step 4 of 10 - Clearing MBR bytes for partition 2,3 & 4 DONE
= Step 5 of 10 - Clearing MBR code area DONE
= Step 6 of 10 - Setting MBR signature bytes DONE
= Step 7 of 10 - Setting partition 1 to precleared state DONE
= Step 8 of 10 - Notifying kernel we changed the partitioning DONE
= Step 9 of 10 - Creating the /dev/disk/by* entries DONE
= Step 10 of 10 - Testing if the clear has been successful. DONE
= Post-Read in progress: 40% complete.
( 608,670,720,000 of 1,500,301,910,016 bytes read )
Elapsed Time: 6:31:49

Next, the Post-Read summary:

===========================================================================
= unRAID server Pre-Clear disk /dev/sda
= cycle 1 of 1
= Disk Pre-Clear-Read completed DONE
= Step 1 of 10 - Copying zeros to first 2048k bytes DONE
= Step 2 of 10 - Copying zeros to remainder of disk to clear it DONE
= Step 3 of 10 - Disk is now cleared from MBR onward. DONE
= Step 4 of 10 - Clearing MBR bytes for partition 2,3 & 4 DONE
= Step 5 of 10 - Clearing MBR code area DONE
= Step 6 of 10 - Setting MBR signature bytes DONE
= Step 7 of 10 - Setting partition 1 to precleared state DONE
= Step 8 of 10 - Notifying kernel we changed the partitioning DONE
= Step 9 of 10 - Creating the /dev/disk/by* entries DONE
= Step 10 of 10 - Testing if the clear has been successful. DONE
= Disk Post-Clear-Read completed DONE
Elapsed Time: 6:34:14
============================================================================
==
== Disk /dev/sda has been successfully precleared
==
============================================================================

Now the S.M.A.R.T. error count:

S.M.A.R.T. error count differences detected after pre-clear
note, some 'raw' values may change, but not be an indication of a problem
54c54
< 1 Raw_Read_Error_Rate 0x000f 100 100 006 Pre-fail Always - 2194400
---
> 1 Raw_Read_Error_Rate 0x000f 103 099 006 Pre-fail Always - 42336756
57,58c57,58
< 5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
< 7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 63626
---
> 5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 15
> 7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 261274
62c62
< 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
---
> 187 Reported_Uncorrect 0x0032 076 076 000 Old_age Always - 24
64c64
< 189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
---
> 189 High_Fly_Writes 0x003a 075 075 000 Old_age Always - 25
66,68c66,68
< 195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always
< 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
< 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
---
> 195 Hardware_ECC_Recovered 0x001a 052 049 000 Old_age Always
> 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 1
> 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 1
72c72,170
< No Errors Logged
---

Finally, the last 5 errors (out of 24 apparently):

> ATA Error Count: 24 (device log contains only the most recent five errors)

> CR = Command Register [HEX]

> FR = Features Register [HEX]

> SC = Sector Count Register [HEX]

> SN = Sector Number Register [HEX]

> CL = Cylinder Low Register [HEX]

> CH = Cylinder High Register [HEX]

> DH = Device/Head Register [HEX]

> DC = Device Command Register [HEX]

> ER = Error register [HEX]

> ST = Status register [HEX]

> Powered_Up_Time is measured from power on, and printed as

> DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,

> SS=sec, and sss=millisec. It "wraps" after 49.710 days.

>

> Error 24 occurred at disk power-on lifetime: 8 hours (0 days + 8 hours)

> When the command that caused the error occurred, the device was active or idle.

>

> After command completion occurred, registers were:

> ER ST SC SN CL CH DH

> -- -- -- -- -- -- --

> 40 51 00 ff ff ff 0f

>

> Commands leading to the command that caused the error were:

> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

> -- -- -- -- -- -- -- -- ---------------- --------------------

> 60 00 00 ff ff ff 4f 00 06:55:09.851 READ FPDMA QUEUED

> 27 00 00 00 00 00 e0 02 06:55:09.831 READ NATIVE MAX ADDRESS EXT

> ec 00 00 00 00 00 a0 02 06:55:09.811 IDENTIFY DEVICE

> ef 03 46 00 00 00 a0 02 06:55:09.791 SET FEATURES [set transfer mode]

> 27 00 00 00 00 00 e0 02 06:55:09.771 READ NATIVE MAX ADDRESS EXT

>

> Error 23 occurred at disk power-on lifetime: 8 hours (0 days + 8 hours)

> When the command that caused the error occurred, the device was active or idle.

>

> After command completion occurred, registers were:

> ER ST SC SN CL CH DH

> -- -- -- -- -- -- --

> 40 51 00 ff ff ff 0f

>

> Commands leading to the command that caused the error were:

> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

> -- -- -- -- -- -- -- -- ---------------- --------------------

> 60 00 00 ff ff ff 4f 00 06:55:06.474 READ FPDMA QUEUED

> 27 00 00 00 00 00 e0 02 06:55:06.454 READ NATIVE MAX ADDRESS EXT

> ec 00 00 00 00 00 a0 02 06:55:06.434 IDENTIFY DEVICE

> ef 03 46 00 00 00 a0 02 06:55:06.414 SET FEATURES [set transfer mode]

> 27 00 00 00 00 00 e0 02 06:55:06.394 READ NATIVE MAX ADDRESS EXT

>

> Error 22 occurred at disk power-on lifetime: 8 hours (0 days + 8 hours)

> When the command that caused the error occurred, the device was active or idle.

>

> After command completion occurred, registers were:

> ER ST SC SN CL CH DH

> -- -- -- -- -- -- --

> 40 51 00 ff ff ff 0f

>

> Commands leading to the command that caused the error were:

> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

> -- -- -- -- -- -- -- -- ---------------- --------------------

> 60 00 00 ff ff ff 4f 00 06:55:02.987 READ FPDMA QUEUED

> 27 00 00 00 00 00 e0 02 06:55:02.967 READ NATIVE MAX ADDRESS EXT

> ec 00 00 00 00 00 a0 02 06:55:02.947 IDENTIFY DEVICE

> ef 03 46 00 00 00 a0 02 06:55:02.927 SET FEATURES [set transfer mode]

> 27 00 00 00 00 00 e0 02 06:55:02.907 READ NATIVE MAX ADDRESS EXT

>

> Error 21 occurred at disk power-on lifetime: 8 hours (0 days + 8 hours)

> When the command that caused the error occurred, the device was active or idle.

>

> After command completion occurred, registers were:

> ER ST SC SN CL CH DH

> -- -- -- -- -- -- --

> 40 51 00 ff ff ff 0f

>

> Commands leading to the command that caused the error were:

> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

> -- -- -- -- -- -- -- -- ---------------- --------------------

> 60 00 00 ff ff ff 4f 00 06:54:59.692 READ FPDMA QUEUED

> 60 00 00 ff ff ff 4f 00 06:54:59.690 READ FPDMA QUEUED

> 27 00 00 00 00 00 e0 02 06:54:59.670 READ NATIVE MAX ADDRESS EXT

> ec 00 00 00 00 00 a0 02 06:54:59.650 IDENTIFY DEVICE

> ef 03 46 00 00 00 a0 02 06:54:59.630 SET FEATURES [set transfer mode]

>

> Error 20 occurred at disk power-on lifetime: 8 hours (0 days + 8 hours)

> When the command that caused the error occurred, the device was active or idle.

>

> After command completion occurred, registers were:

> ER ST SC SN CL CH DH

> -- -- -- -- -- -- --

> 40 51 00 ff ff ff 0f

>

> Commands leading to the command that caused the error were:

> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

> -- -- -- -- -- -- -- -- ---------------- --------------------

> 60 00 00 ff ff ff 4f 00 06:54:56.314 READ FPDMA QUEUED

> 60 00 00 ff ff ff 4f 00 06:54:56.313 READ FPDMA QUEUED

> 27 00 00 00 00 00 e0 02 06:54:56.293 READ NATIVE MAX ADDRESS EXT

> ec 00 00 00 00 00 a0 02 06:54:56.273 IDENTIFY DEVICE

> ef 03 46 00 00 00 a0 02 06:54:56.253 SET FEATURES [set transfer mode]

============================================================================

Of course running the Seagate tools test indicates no problems. Had this tool not existed, I would not have known any of this. Was ignorance bliss? I will re-run this as well as try some other tests. Luckily I do not have an immediate need for this drive yet.

Thanks and regards, Peter

Joe L. · December 18, 2008

Good news and bad news. First the good. The entire process took only 6:34:14, to do the 1.5TB Seagate. The bad news is that post-read portion ended after the 40% mark.

It aborted the "post-read" when a read of 2000 blocks of data returned after reading fewer than 2000 blocks... So, the remaining 60% of the blocks were not post-read. We do not know about the pre-read... It could have aborted early too. (I don't currently track whether it got to the end, but clearly I need to, as the display is overwritten by the next phase.) Odds are as good as any that the pre-read aborted too, especially given the short total elapsed time.
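The check described here can be sketched roughly as follows. This is not the code from preclear_disk.sh: the function names are invented, and it assumes a dd whose statistics use the usual "N+M records in" wording. Each chunk is read with dd, the number of full blocks actually read is parsed from dd's statistics, and a short count is reported together with the block offset where it happened, so a caller can tell whether a pass reached the end of the disk.

```shell
#!/bin/sh
# Sketch only, not the script's actual code.

# blocks_read FILE BLOCK_SIZE SKIP COUNT
#   Reads COUNT blocks starting at block SKIP and prints how many full
#   blocks dd actually read.
blocks_read() {
    LC_ALL=C dd if="$1" of=/dev/null bs="$2" skip="$3" count="$4" 2>&1 |
        sed -n 's/^\([0-9][0-9]*\)+[0-9][0-9]* records in$/\1/p'
}

# read_pass FILE BLOCK_SIZE TOTAL_BLOCKS CHUNK
#   Reads the whole file chunk by chunk. Reports success if every block was
#   read, or prints the block offset of the first short read and returns 1.
read_pass() {
    offset=0
    while [ "$offset" -lt "$3" ]; do
        want=$4
        [ $(( offset + want )) -le "$3" ] || want=$(( $3 - offset ))
        got=$(blocks_read "$1" "$2" "$offset" "$want")
        if [ "${got:-0}" -lt "$want" ]; then
            echo "short read at block $(( offset + ${got:-0} ))"
            return 1
        fi
        offset=$(( offset + want ))
    done
    echo "read all $3 blocks"
}
```

Recording the result of read_pass for the pre-read as well as the post-read would answer the open question of whether the pre-read reached the end.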

Following that, the S.M.A.R.T. reports listed quite some errors. Can anyone help me understand if I should send this drive back?


It appears to me as if the drive has already reallocated 15 sectors, and has 1 more pending re-allocation. The "High-Fly Writes" are not too good either.

Hard to say what to do... I'd run a few more cycles of preclear before deciding. It sure might be a candidate for return, but they might not take it if their utility does not indicate it is over their failure "threshold".

In the meantime, I'll see about modifying the script to force it to continue reading past early "read" aborts of the drive. In the unRAID server, the read failure would have resulted in the same data block being reconstructed from parity and then re-written to the drive. That would have forced the sector reallocation. That this "read failure" happened during the disk post-read is troubling, as all reallocation should have taken place in the zeroing phase (assuming the pre-read completed, that is).

At this point I'm guessing it did not complete either.
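One way to sanity-check that guess is to work out the average throughput the reported elapsed time would imply. The sketch below assumes a full cycle makes three complete passes over the disk (pre-read, clear, post-read); the function name is made up. With the figures from this thread, 1,500,301,910,016 bytes in 6:34:14 (23,654 seconds) would need roughly 190 MB/s sustained, while the other 1.5TB run reported earlier, at 16:47 (60,420 seconds), works out to about 74 MB/s.

```shell
#!/bin/sh
# Sketch only: average MB/s implied by a cycle, assuming PASSES full passes.
# Uses integer arithmetic (needs a shell with 64-bit integers); rounds down.

# implied_mb_per_s DISK_BYTES PASSES ELAPSED_SECONDS
implied_mb_per_s() {
    echo $(( ($1 * $2) / ($3 * 1000000) ))
}

# implied_mb_per_s 1500301910016 3 23654   -> 190
# implied_mb_per_s 1500301910016 3 60420   -> 74
```

Even counting only the 40% of the post-read that completed (2.4 passes in total), the implied rate stays around 150 MB/s, which is still well above the speeds reported elsewhere in this thread.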

I'm learning how drives and SMART firmware react to this script just as you are... glad to have helped, at least to identify a possibly flaky drive.

Joe L.
