Lt-Chewie

  1. I'm currently in the process of upgrading my 2 racks to v6. Since the SAS cards are known to have issues with v6 (for some people), I feel it prudent to look into purchasing new v6-compatible cards, given the likely chance that I too will have issues with them. I see Dell PERC H310s are popular on the forum, but I also see a lot of problems with them, mainly in flashing and bricking them. If I were to purchase these, is there a specific version of the H310 I need, or are they all flashable? Also, is there a process to flash them if I don't have UEFI? If there were a chance to buy LSI SAS 2008s already flashed (9211-8i), but more expensive and more used, should I go for those, or is it better to go for newer, less-used H310s and flash those myself? Thank you in advance. Setup: 2x 24-bay 4U racks, each with 8 GB RAM, a Supermicro X9SCM-F-O motherboard and 3x Supermicro AOC-SAS2LP-MV8 cards.
  2. Thanks RobJ, much appreciated. Saves me from RMA'ing it and getting a refurbished one back.
  3. The full story of my drive problem is here. But the short of it is the drive had REISERFS errors, I used the Western Digital Diagnostic tool and it said too many errors. So I precleared it twice (picture 1), ran the WD diagnostic tool again and it said the drive was fine. Then I precleared it one more time (picture 2) so I could add it to the array if needed.

     1) 2)

     Start:

       ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
         1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       547
         3 Spin_Up_Time            0x0027   176   174   021    Pre-fail  Always       -       4191
         4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       592
         5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
         7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
         9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1812
        10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
        11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
        12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       93
       192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       12
       193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       582
       194 Temperature_Celsius     0x0022   112   108   000    Old_age   Always       -       35
       196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
       197 Current_Pending_Sector  0x0032   200   194   000    Old_age   Always       -       0
       198 Offline_Uncorrectable   0x0030   194   194   000    Old_age   Offline      -       2172
       199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
       200 Multi_Zone_Error_Rate   0x0008   182   182   000    Old_age   Offline      -       7276

       SMART Error Log Version: 1
       No Errors Logged

       SMART Self-test log structure revision number 1
       Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
       # 1  Short offline       Completed without error  00%        1812             -
       # 2  Conveyance offline  Completed without error  00%        1806             -
       # 3  Conveyance offline  Completed: read failure  90%        1760             1137246032
       # 4  Short offline       Completed without error  00%        1717             -
       # 5  Short offline       Completed without error  00%        44               -

     Finish:

       ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
         1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       547
         3 Spin_Up_Time            0x0027   176   174   021    Pre-fail  Always       -       4191
         4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       592
         5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
         7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
         9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1835
        10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
        11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
        12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       93
       192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       12
       193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       582
       194 Temperature_Celsius     0x0022   111   108   000    Old_age   Always       -       36
       196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
       197 Current_Pending_Sector  0x0032   200   194   000    Old_age   Always       -       0
       198 Offline_Uncorrectable   0x0030   194   194   000    Old_age   Offline      -       2172
       199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
       200 Multi_Zone_Error_Rate   0x0008   182   182   000    Old_age   Offline      -       7276

       SMART Error Log Version: 1
       No Errors Logged

       SMART Self-test log structure revision number 1
       Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
       # 1  Short offline       Completed without error  00%        1812             -
       # 2  Conveyance offline  Completed without error  00%        1806             -
       # 3  Conveyance offline  Completed: read failure  90%        1760             1137246032
       # 4  Short offline       Completed without error  00%        1717             -
       # 5  Short offline       Completed without error  00%        44               -

     Looking at this image and the report, it seems the drive is okay, right? Or am I missing something?
  4. Thanks for the advice and help itimpi. I will download the WD diagnostic tool (which from their instruction page looks like a SMART test tool) and test it, then give it a 2-cycle preclear. I'll report back the details when that's done.

     **UPDATE** Ran the WD Diagnostic tool, which in all honesty wasn't very helpful (picture below). It's going to be RMA'd; I don't trust putting it back into the server. At the moment I'm running a 2-cycle preclear on the drive just to see what it will say for future reference. Will update again when that's done.

     **UPDATE 2** Any ideas what the following results mean would be helpful. I've also run the WD Diagnostic tool again on the drive, and it now says there are no errors.
  5. I have a spare 2 TB drive of the same WD Green series precleared and ready in case of this happening. The files that are 'read locked' should be on one of the backup FireWire drives, something I will have to check when I get back home. From the report it doesn't look like too many files are affected by these bad blocks, so if I am missing a few files on the backup drives it shouldn't be too much of a hit. I'm assuming this bad-block problem is an issue that the drive won't recover from? So the next action for me would be to have it RMA'd to WD, right? Or is this something that isn't covered by RMAs? Also, would mounting/unmounting the drive during the starting and stopping of the array damage it further or not?
  6. Power-cycle done, same REISERFS error reappears. SMART report indicates no issues. unRAID dashboard shows no redball or any other indication of problems. Syslog provided below: syslog.txt
  7. Just saw this on the IPMI-View screen after I finished backing up data to another server:

       REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1522 1523 0x0 SD]
       REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1522 1553 0x0 SD]
       REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1576 1577 0x0 SD]

     This repeats with different numbers for about 9 page-downs. After searching and then reading the "Check Disk Filesystems" page, I ran the "reiserfsck --check /dev/md1" command and it gave me this:

       ########### reiserfsck --check started at Tue Sep 2 22:53:39 2014 ###########
       Replaying journal: Done.
       Reiserfs journal '/dev/md1' in blocks [18..8211]: 0 transactions replayed
       The problem has occurred looks like a hardware problem. If you have bad blocks, we advise you to get a new hard drive, because once you get one bad block that the disk drive internals cannot hide from your sight,the chances of getting more are generally said to become much higher (precise statistics are unknown to us), and this disk drive is probably not expensive enough for you to you to risk your time and data on it. If you don't want to follow that follow that advice then if you have just a few bad blocks, try writing to the bad blocks and see if the drive remaps the bad blocks (that means it takes a block it has in reserve and allocates it for use for of that block number). If it cannot remap the block, use badblock option (-B) with reiserfs utils to handle this block correctly.
       bread: Cannot read the block (268206080): (Input/output error).
       Aborted (core dumped)

     What I get from this is that the drive /md1 (disk1) has bad blocks, but when I check the dashboard on the unRAID webpage it gives me no indication that there is an issue with the drive, and neither did the extended SMART test I ran on it. There was also no indication of any problems when I first ran a preclear 6 months or so ago, nor has it been under heavy usage. So am I to trust this REISERFS check report, or the SMART test, or the dashboard that shows no redball or other indication of fault? Any advice as to what I should do next: RMA the drive, preclear it again, any other suggestions?
  8. Thank you itimpi and Joe, RMA won't happen since it's past warranty - purchased about 3 or so years ago. I don't think it's the power-supply as I have precleared other drives with no issues. I'll just stick this into something unimportant like the NAS for the CCTV at work. Thanks again for taking the time to look at it
  9. Just precleared one of my Samsungs that I had lying about. I think from the numbers in the start and finish files it looks okay, but the "pending re-allocation" part of the report is what concerns me. I've also attached the start and finish files from the preclears in case they are needed. Any insight as to what I should do with this drive, and perhaps a little understanding of what the last paragraph means for future preclears, would be appreciated, thanks.

     First 2-cycle preclear:

       ========================================================================1.15
       == invoked as: ./preclear_disk.sh -A -c 2 /dev/sdb
       == SAMSUNGHD204UI S2HGJ1BZ828875
       == Disk /dev/sdb has been successfully precleared
       == with a starting sector of 64
       == Ran 2 cycles
       ==
       == Using :Read block size = 8388608 Bytes
       == Last Cycle's Pre Read Time : 6:34:59 (84 MB/s)
       == Last Cycle's Zeroing time : 5:33:53 (99 MB/s)
       == Last Cycle's Post Read Time : 12:17:30 (45 MB/s)
       == Last Cycle's Total Time : 17:52:27
       ==
       == Total Elapsed Time 42:21:33
       ==
       == Disk Start Temperature: 29C
       ==
       == Current Disk Temperature: 32C,
       ==
       ============================================================================
       ** Changed attributes in files: /tmp/smart_start_sdb /tmp/smart_finish_sdb
       ATTRIBUTE                NEW_VAL  OLD_VAL  FAILURE_THRESHOLD  STATUS  RAW_VALUE
       Current_Pending_Sector = 99       100      0                  ok      149
       No SMART attributes are FAILING_NOW

       11 sectors were pending re-allocation before the start of the preclear.
       160 sectors were pending re-allocation after pre-read in cycle 1 of 2.
       0 sectors were pending re-allocation after zero of disk in cycle 1 of 2.
       151 sectors were pending re-allocation after post-read in cycle 1 of 2.
       0 sectors were pending re-allocation after zero of disk in cycle 2 of 2.
       149 sectors are pending re-allocation at the end of the preclear,
       a change of 138 in the number of sectors pending re-allocation.
       0 sectors had been re-allocated before the start of the preclear.
       0 sectors are re-allocated at the end of the preclear,
       the number of sectors re-allocated did not change.
       ============================================================================

     Second 2-cycle preclear:

       ========================================================================1.15
       == invoked as: ./preclear_disk.sh -A -c 2 /dev/sdb
       == SAMSUNGHD204UI S2HGJ1BZ828875
       == Disk /dev/sdb has been successfully precleared
       == with a starting sector of 64
       == Ran 2 cycles
       ==
       == Using :Read block size = 8388608 Bytes
       == Last Cycle's Pre Read Time : 6:29:41 (85 MB/s)
       == Last Cycle's Zeroing time : 5:34:01 (99 MB/s)
       == Last Cycle's Post Read Time : 12:07:12 (45 MB/s)
       == Last Cycle's Total Time : 17:42:18
       ==
       == Total Elapsed Time 42:05:16
       ==
       == Disk Start Temperature: 27C
       ==
       == Current Disk Temperature: 32C,
       ==
       ============================================================================
       ** Changed attributes in files: /tmp/smart_start_sdb /tmp/smart_finish_sdb
       ATTRIBUTE                NEW_VAL  OLD_VAL  FAILURE_THRESHOLD  STATUS  RAW_VALUE
       Current_Pending_Sector = 100      99       0                  ok      100
       No SMART attributes are FAILING_NOW

       149 sectors were pending re-allocation before the start of the preclear.
       150 sectors were pending re-allocation after pre-read in cycle 1 of 2.
       0 sectors were pending re-allocation after zero of disk in cycle 1 of 2.
       149 sectors were pending re-allocation after post-read in cycle 1 of 2.
       0 sectors were pending re-allocation after zero of disk in cycle 2 of 2.
       100 sectors are pending re-allocation at the end of the preclear,
       a change of -49 in the number of sectors pending re-allocation.
       0 sectors had been re-allocated before the start of the preclear.
       0 sectors are re-allocated at the end of the preclear,
       the number of sectors re-allocated did not change.
       ============================================================================

     Attachments:
       preclear_start_S2HGJ1BZ828875_2014-07-17.txt
       preclear_finish_S2HGJ1BZ828875_2014-07-17.txt
       preclear_start_S2HGJ1BZ828875_2014-07-19.txt
       preclear_finish_S2HGJ1BZ828875_2014-07-19.txt
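
For anyone wanting to follow the "pending re-allocation" numbers in the preclear reports above without reading them line by line, here is a minimal Python sketch. It is not part of preclear_disk.sh; the function names (`pending_counts`, `net_change`, `returns_after_zeroing`) are made up for illustration, and it assumes the report lines keep the exact wording shown in the first report. It only extracts the counts and checks two things that can be read straight off the report: the net change (149 - 11 = 138) and whether a count of 0 after a zeroing pass is followed by a non-zero count on the next read.

```python
import re

# Pending-sector lines copied from the first preclear report in the post above.
REPORT = """\
11 sectors were pending re-allocation before the start of the preclear.
160 sectors were pending re-allocation after pre-read in cycle 1 of 2.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 2.
151 sectors were pending re-allocation after post-read in cycle 1 of 2.
0 sectors were pending re-allocation after zero of disk in cycle 2 of 2.
149 sectors are pending re-allocation at the end of the preclear,
"""

# Assumed line shape: "<N> sectors were|are pending re-allocation <phase>[.,]"
PATTERN = re.compile(r"^(\d+) sectors (?:were|are) pending re-allocation (.+?)[.,]?$")


def pending_counts(report):
    """Return (count, phase) pairs, one per pending-sector line, in report order."""
    pairs = []
    for line in report.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            pairs.append((int(match.group(1)), match.group(2)))
    return pairs


def net_change(pairs):
    """Difference between the last and the first pending-sector count."""
    return pairs[-1][0] - pairs[0][0]


def returns_after_zeroing(pairs):
    """True if a count of 0 right after a zeroing pass is followed by a non-zero count."""
    for (count, phase), (next_count, _) in zip(pairs, pairs[1:]):
        if "after zero" in phase and count == 0 and next_count > 0:
            return True
    return False


if __name__ == "__main__":
    pairs = pending_counts(REPORT)
    for count, phase in pairs:
        print(f"{count:>4}  {phase}")
    print("net change:", net_change(pairs))
    print("pending sectors return after zeroing:", returns_after_zeroing(pairs))
```

Pasting the corresponding lines from the second report into `REPORT` gives the -49 change that report states. The sketch only summarises the numbers; what they say about the health of the drive is still a question for the forum.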