BloombergBuff

Members
  • Posts

    11

  1. So the new HBA card and cables came and after installing them, all seems to be alright. Even though I switched out both, I suspect the issue was the cables. The card is running 20.00.02.00 firmware with no issues.
  2. I flashed the 20.00.07.00 firmware to the card but the error was still present. I then rolled the firmware back to P19 and the error is still there. Now I am truly out of ideas and believe it's a hardware issue. I really will have to wait for the new HBA and cables to arrive to test.
  3. I reinstalled the 120GB drive when I got home from work. There were no errors after starting the array, so I powered down and added the 960GB back in. Still no errors. Around 9PM EST, I started getting the errors again, so I stopped the array and pulled diagnostics. I shut down, removed the 120GB SSD, rebooted, and started the array. Still got errors. I stopped the array and pulled another set of diagnostics just in case. All the diagnostics files are attached. I guess I will have to wait for the HBA card and cables to come.
     EDIT: I found some people on these forums suggesting that it may be an issue with the P20 firmware on the card. Here are the links:
     http://lime-technology.com/forum/index.php?topic=12767.msg339192#msg339192
     http://lime-technology.com/forum/index.php?topic=12767.msg259006#msg259006
     You can see from my syslog that I am running:
     mpt2sas_cm0: LSISAS2008: FWVersion(20.00.02.00)
     Here is a thread from a ZFS forum also referencing the P20 firmware issue:
     http://list.zfsonlinux.org/pipermail/zfs-discuss/2015-January/020778.html
     EDIT2: Here is another thread referencing very similar issues to mine:
     http://www.overclock.net/t/1528012/ibm-m1015-dell-perc-h200-to-lsi-9211-8i-it-ir-mode
     Let me know what you think Johnnie! We're so close to figuring this out!
     tower-diagnostics-20170111-2201.zip tower-diagnostics-20170111-2212.zip
  4. When I first did the move to the new server, the HDD activity light of the 120GB SSD was solid/unblinking, which was odd. After removing it, the errors seem to have gone away. Do you think I should try reinstalling the 960GB SSD back into its bay to test? I just finished an appdata backup without error, so worst-case scenario, I just remove it again, wipe, and rebuild.
  5. Connecting the 960GB SSD via SATA2 worked and I was able to rebuild my cache drive and docker without the filesystem becoming corrupted or the drive/image becoming read-only. However, the server was still having the "mpt2sas" errors and disk3 disabled itself during an appdata backup. The server froze on "sync filesystems..." and I had to do a hard shutdown. I rebooted and was able to mount, unmount, and use xfs_repair to check the drive. My cache pool in the previous system was made of the SSD drive above, a 1TB WD Black, and a 120GB Kingston SSD. Once I removed the 120GB SSD from its bay, the errors no longer appeared. I also reseated the HBA and the cables. I was able to use the "Trust My Array" procedure to re-enable disk3 and so far so good: still no errors, and a parity check is almost finished. Fortunately, the eBay seller is sending me more cables and another HBA to test, but I suspect the issue may be with the backplane/expander and not the HBA/cables. Let me know if this makes sense to you Johnnie, and thanks for the suggestion.
  6. So I updated ca.backup. Also, in my general support thread johnnie.black told me to try connecting the SSD cache drive via SATA to see if it would work, and it DID! I can now add the docker containers without getting the "/var/lib/docker/tmp read-only filesystem" error while trying to download a container. It seems to be working okay now; I wonder what was happening. Once I configure my docker image and containers back, should I try using the expander/backplane again? I just hate having to use a SATA2 port on the mobo for an SSD, and I also have nowhere to actually mount it physically.
  7. Ran an extended SMART test on the SSD. It came back all clear, but I still attached the log. I connected the SSD via SATA2 directly to the motherboard and all seems to be okay. It mounted fine, I added the docker image and tested one container, and it worked. When it was connected to the hot swap bay I kept getting the error "/var/lib/docker... read-only filesystem" while trying to download a docker container. It seems to be working okay now; I wonder what was happening. Once I configure my docker image and containers back, should I try using the expander/backplane again? I just hate having to use a SATA2 port on the mobo for an SSD, and I also have nowhere to actually mount it physically.
     tower-smart-20170110-0818.zip
  8. I was running through the syslog and also found:
     blk_update_request: I/O error, dev sdb, sector 9342280
     sdb is my only cache drive at the moment. I had a 3 drive pool before moving to the new server. I recently updated to the following server, and after transplanting the drives all these errors started to happen.
     - Server Chassis/Case: SuperMicro SC826E16-R800LPB Chassis (High Performance SAS2/6Gb/s Expander)
     - Back Plane: BPN-SAS2-826EL1 826 backplane with single LSI SAS2X28 expander chip
     - Motherboard: X8DTN+
     - CPU Processor: 2x Intel Xeon Six Core E5645 2.4GHz
     - RAM Memory: 64GB RAM DDR3 ECC REG DIMM (8x 8GB)
     - Hard Drives/Caddies: 12x 3.5" Drive Caddies
     - RAID Controller: LSI 9211-8i JBOD IT Mode HBA Card connected to Expander backplane
     - NIC Ports: On board Intel® 82576 Dual-Port Gigabit Ethernet Controller
     Here are some of the things I've tried so far:
     - btrfs scrub on the cache drive and docker image
     - btrfs check --repair on the cache drive
     - redoing the drive (changing the file system to XFS and then back as per the Check Disk Filesystems wiki)
     - ran a memtest overnight, result: 1 pass with no errors
     The cache seems to be okay when I restore the appdata and create the docker image, but once I try to add my old template containers back, the docker gives me the error in my post above and the drive filesystem seems to become corrupted. I think I'm going to try moving the drive to a different slot in the hot swap bay. A lot of posts on here suggest that an I/O error is hardware rather than software. The drive is fairly new but I'll run a SMART test on it later today. I've attached a new diagnostics file.
     tower-diagnostics-20170109-1131.zip
  9. Will do. I was running through the syslog and also found:
     blk_update_request: I/O error, dev sdb, sector 9342280
     sdb is my only cache drive at the moment. I had a 3 drive pool before moving to the new server. I recently updated to the following server, and after transplanting the drives all these errors started to happen.
     - Server Chassis/Case: SuperMicro SC826E16-R800LPB Chassis (High Performance SAS2/6Gb/s Expander)
     - Back Plane: BPN-SAS2-826EL1 826 backplane with single LSI SAS2X28 expander chip
     - Motherboard: X8DTN+
     - CPU Processor: 2x Intel Xeon Six Core E5645 2.4GHz
     - RAM Memory: 64GB RAM DDR3 ECC REG DIMM (8x 8GB)
     - Hard Drives/Caddies: 12x 3.5" Drive Caddies
     - RAID Controller: LSI 9211-8i JBOD IT Mode HBA Card connected to Expander backplane
     - NIC Ports: On board Intel® 82576 Dual-Port Gigabit Ethernet Controller
     Here are some of the things I've tried so far:
     - btrfs scrub on the cache drive and docker image
     - btrfs check --repair on the cache drive
     - redoing the drive (changing the file system to XFS and then back as per the Check Disk Filesystems wiki)
     - ran a memtest overnight, result: 1 pass with no errors
     The cache seems to be okay when I restore the appdata and create the docker image, but once I try to add my old template containers back, the docker gives me the error in my post above and the drive filesystem seems to become corrupted. I think I'm going to try moving the drive to a different slot in the hot swap bay. A lot of posts on here suggest that an I/O error is hardware rather than software. The drive is fairly new but I'll run a SMART test on it later today. I've been trying to fix this for a few days with no success, so let me know what you think or if I should try something else!
  10. This is the error I get when I try to add a container:
      Error: open /var/lib/docker/tmp/GetImageBlob443829641: read-only file system
      I've tried redoing BTRFS on the cache and deleting/rebuilding the docker from scratch. I attached the diagnostics from right after I tried to add a new container.
      tower-diagnostics-20170109-1131.zip
  11. I recently upgraded to a Supermicro SC826E16 with a SAS2 backplane (w/ LSI 9211-8i JBOD IT Mode). When I try to mount the drives I am receiving the error in the syslog attached. It also seems like the webGUI is hung: I cannot refresh and it will not load in another tab (Firefox, if it matters). The webGUI reads "Mounting disks...". In addition, I am receiving the error "ALL DATA ON THIS DISK WILL BE ERASED WHEN ARRAY IS STARTED" on my first parity drive (second parity is fine). It is showing the red "x" and the message "Parity drive is disabled". SMART tests are showing everything is okay for the drive. I believe this is an issue with the backplane, although it is finding all the drives, just not mounting them. I've reseated the cables and drives and it is still getting the error. Any advice would be greatly appreciated!
      EDIT: I rebooted and tried to run the system in maintenance mode. I've attached the diagnostics file.
      EDIT2: Checked out the syslog and found my Disk 3 had a corrupt filesystem (XFS). Ran xfs_repair -vL to fix it. No data loss and the drives are mounting. Although now the cache and docker are stuck in read-only and the same parity drive is still disabled.
      syslogtail.txt tower-diagnostics-20170108-1044.zip
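
For anyone following along with a similar problem: the two syslog lines quoted in the posts above (the blk_update_request I/O error and the mpt2sas FWVersion line) can be picked out of a log with a small script. This is only a rough sketch, assuming Python 3. The function name summarize_syslog and the regular expressions are made up for illustration and are modeled solely on those two quoted lines, so real syslog output may need looser patterns.

```python
import re
from collections import Counter

# Patterns modeled only on the two syslog lines quoted in the posts;
# they are illustrative assumptions, not an official log format.
IO_ERROR = re.compile(r"blk_update_request: I/O error, dev (\w+), sector (\d+)")
FW_VERSION = re.compile(r"(mpt2sas_cm\d+): (\w+): FWVersion\(([\d.]+)\)")


def summarize_syslog(lines):
    """Return (I/O error count per device, firmware version per controller)."""
    io_errors = Counter()
    firmware = {}
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            io_errors[match.group(1)] += 1
            continue
        match = FW_VERSION.search(line)
        if match:
            firmware[match.group(1)] = match.group(3)
    return io_errors, firmware


# The sample lines are copied from the posts (timestamps omitted).
sample = [
    "blk_update_request: I/O error, dev sdb, sector 9342280",
    "mpt2sas_cm0: LSISAS2008: FWVersion(20.00.02.00)",
]
io_errors, firmware = summarize_syslog(sample)
print(dict(io_errors))  # {'sdb': 1}
print(firmware)         # {'mpt2sas_cm0': '20.00.02.00'}
```

To run it against a real log, pass an open file instead of the sample list, since the function only needs an iterable of lines. A device that keeps showing up in the error count is a hint about where to look, nothing more; as the thread shows, the drive itself can be fine while the cables or HBA are at fault.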