Vynce

Members
  • Posts: 48
  • Gender: Undisclosed
  • Location: Midwest US


  1. I'm also running into this issue with HAProxy on pfSense forwarding to Unraid and I think I've found a solution. The weird thing is this issue seems to be intermittent for me. I also sometimes get HTTP 503 errors when doing docker container updates: the live progress UI will stop updating and I can see 503 errors in the browser inspector.

     TL;DR: set the HAProxy client and server timeouts to 60000 (ms). Seems to work for me so far.

     The HAProxy logs show that the client connection (cD--) is timing out after 30s (0/0/24/1/30028). See this page for details on the HAProxy log format and the timers, and this page for the session termination state descriptions.

     Internal~ Unraid_ipvANY/Unraid 0/0/24/1/30028 101 369 - - cD-- 4/4/1/1/0 0/0 "GET /sub/dockerload?last_event_id=1696795636%3A0 HTTP/1.1"

     So I bumped up the client timeout in the HAProxy frontend config from 30s to 60s. But then I started getting server connection (sD--) timeouts after 30s:

     Internal~ Unraid_ipvANY/Unraid 0/0/36/1/30039 101 369 - - sD-- 5/5/1/1/0 0/0 "GET /sub/dockerload?last_event_id=1696795636%3A0 HTTP/1.1"

     So I bumped up the server timeout in the HAProxy Unraid backend config from 30s to 60s. That seems to work.

     My guess is that the Unraid console sends a keepalive packet every 30s. With the HAProxy default timeout also at 30s, HAProxy will often end up killing these connections. I have no idea if a 60s timeout is optimal or if Unraid is doing something unusual that doesn't play well with HAProxy.

     Client timeout: pfSense -> Services -> HAProxy -> Frontend -> Edit Unraid frontend -> Client timeout -> 60000
     Server timeout: pfSense -> Services -> HAProxy -> Backend -> Edit Unraid backend -> Server timeout -> 60000

     https://serverfault.com/questions/504308/by-what-criteria-do-you-tune-timeouts-in-ha-proxy-config
     https://delta.blue/blog/haproxy-timeouts/
     https://www.papertrail.com/solution/tips/haproxy-logging-how-to-tune-timeouts-for-performance/
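     For reference, the two pfSense GUI settings above correspond roughly to the timeout client / timeout server directives in a raw haproxy.cfg. This is a hand-written sketch with placeholder section names and addresses, not the config pfSense actually generates:

         frontend Unraid_frontend              # placeholder name
             bind :443
             timeout client 60000              # ms; raised from the 30s value behind the cD-- timeouts
             default_backend Unraid_backend

         backend Unraid_backend                # placeholder name
             timeout server 60000              # ms; raised from the 30s value behind the sD-- timeouts
             server unraid 192.0.2.10:443      # placeholder address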
  2. That's what this thread/bug report is about 🙂. In 6.8.1 with hardlink support disabled it takes ~43s to list 200K files on my server. Looking through the comments above, for @bonienl the same task only took ~9.5s, for @ljm42 it took ~13s, and for you it took ~10.5s.
  3. Following @limetech's suggestion of setting "case sensitive = yes" in SMB extras has resolved the sparsebundle slowness issue for me (TimeMachine and Carbon Copy Cloner). Listing large directories through SHFS is still unusually slow on my server. I'm still using a disk share for Minio as a workaround.
  4. In this comment I mentioned that backing up to sparsebundles on macOS had become incredibly slow. Setting "case sensitive = yes" in SMB extras has resolved that issue for me (including TimeMachine backups).
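     For anyone looking for the exact syntax, the Samba option above goes into the SMB extras box on the Unraid SMB settings page (which typically ends up in /boot/config/smb-extra.conf). A minimal sketch, assuming you want it applied globally rather than per share:

         [global]
             case sensitive = yes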
  5. I rolled back to 6.7.2 to gather some numbers.

                        100K files (Disk | SHFS)    200K files (Disk | SHFS)
     6.7.2              0.43s | 16.27s              0.85s | 34.70s
     6.8.1 HL Off       0.42s | 18.94s              0.82s | 43.03s
     6.8.1 HL On        0.47s | 26.97s              0.93s | 57.31s

     I noticed something very weird. If I disable docker on 6.7.2 and the server is totally idle, doing an ls -l on a directory with 200K files takes ~51s (so it takes longer than if there's some minimal activity). If I open two terminals and run the same command in both at the same time, they both complete in ~11s. There are no disk reads -- this is all cached in RAM. The same thing happens if I kick off two background instances in the same terminal:

     for x in {1..2}; do { time -p /bin/ls -l --color=never /mnt/user/Download/benchmark &>/dev/null; } 2>&1 & done

     The same pattern happens in 6.8.1, but it's a bit slower overall.

     I'll also note that backing up to a sparsebundle file using TimeMachine and Carbon Copy Cloner over SMB became substantially slower in Unraid 6.8.0 and 6.8.1. The sparsebundle format creates 8MB band files in a single directory. With a large backup, that gets into the range of 50K-100K files in one directory. I'm not sure if that's part of the problem in this case, but it's another common instance where a lot of files end up in a single directory. It might be more closely related to the SMB slowness reported in these two threads.

     With 6.7.2 I'm getting a fairly consistent transfer rate of 13MB/s to a sparsebundle over SMB to a user share (wired gigabit ethernet). With 6.8.1 (with hardlink support disabled) I get very intermittent transfer spikes of 2~12MB followed by nothing for a while. Over a long period that's averaging out to ~0.8MB/s so far 🙁.
  6. I've copied the benchmark script to a gist at @ljm42's suggestion: https://gist.github.com/Vynce/44f224c2846de5fa4cf1d5b1dcad2dc4. Anyone is welcome to hack on it as they like 😊.
  7. Thanks to both of you for testing. It's interesting that you're seeing much better performance than I am. Both of you got results in the 10~13s range for 200K files with hardlink support disabled; my result is up at 46s! I do see a lot of variance between runs with this benchmark, but all my results come from running it several times and recording the minimum values. My server isn't under much load, but it's also not a fresh setup with no content. None of the user shares currently span multiple disks.

     I don't think it's worth it for the Unraid team to spend a ton of time squeezing every ounce of performance out of SHFS for extremely large directories, but I think it would be worth figuring out why my system is so much slower than your "reference" systems. I'll try rolling back to 6.7.2 this weekend to see what the performance is like there. I'll also test with docker disabled, etc. Let me know if there are other settings worth playing with to see if they make any difference, and if you have any suggestions for more precise/relevant benchmarks.

     I contacted the author of Arq about this issue and he said the "next version" would use a different file organization strategy (the current version of Arq on macOS is 5.17.2). Arq currently stores all the backup chunks in a single folder, each using a SHA-1 hash as the filename. I suspect he'll split those up into subdirectories based on the first character or two of the hash, similar to git -- a throwaway sketch of that layout is at the end of this post. As a side note, Minio is just providing an S3-compatible interface to the storage; it doesn't play much of a role in defining the directory structure -- that's mostly up to Arq in this case.

     These are some interesting charts. It would be nice if they published the benchmark source.
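     Here's that throwaway sketch of the git-style fan-out (echo only, nothing is moved; "objects" is a placeholder for wherever the hash-named chunk files live):

         # Print the fanned-out path for each chunk in a flat objects/ directory,
         # using the first two characters of the hash as a subdirectory name.
         for f in objects/*; do
             name=$(basename "$f")
             echo "objects/${name:0:2}/$name"    # e.g. objects/95/95ab12...
         done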
  8. Disabling hard link support seems to be slightly faster, but not substantially.

                        100K files (Disk | SHFS)    200K files (Disk | SHFS)
     6.8.0              0.46s | 29.34s              0.95s | 59.69s
     6.8.1              0.47s | 26.97s              0.93s | 57.31s
     6.8.1 NoHL         0.42s | 20.77s              0.82s | 45.88s
  9. I did a quick spot check with 6.8.1 and SHFS performance is about the same as 6.8.0. Based on the changelog, I wasn't expecting any difference.

                        100K files (Disk | SHFS)    200K files (Disk | SHFS)
     6.8.0              0.46s | 29.34s              0.95s | 59.69s
     6.8.1              0.47s | 26.97s              0.93s | 57.31s
  10. I wrote a quick script (benchmark_shfs.sh) to benchmark the difference in performance of running "ls -l" on the disk mount points vs. the SHFS "user share" mount points. Don't run this script unless you've read and understood what it's doing: a couple of the variables at the top need to be modified for your system, and there's a risk of data loss if any of the paths it uses conflict with existing paths -- there's no error checking or warnings. A minimal sketch of the comparison it performs is below.

     Both curves look fairly linear, but my problem is with the slope of the SHFS line. On my system, user shares start to become pretty unusable around 100K-200K files per folder (30s~60s for "ls -l" to enumerate the files). I have no idea how this compares on other hardware or whether it was better on Unraid 6.7.2 -- I suspect the performance did decrease somewhat in 6.8.0, since Arq wasn't timing out before I upgraded.
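     Here's that minimal sketch. It isn't the actual script -- just the core comparison it performs, with placeholder paths for a directory that exists on a single disk and is also exported as a user share:

         DISK_PATH=/mnt/disk1/benchmark    # direct disk mount point (placeholder)
         USER_PATH=/mnt/user/benchmark     # the same directory through SHFS (placeholder)
         for p in "$DISK_PATH" "$USER_PATH"; do
             echo "== $p =="
             { time -p /bin/ls -l --color=never "$p" > /dev/null; } 2>&1 | grep '^real'
         done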
  11. Yeah, I think it’s probably a similar (the same?) issue. I decided to post a new report since I see the poor performance right on Unraid itself — no SMB or other network protocols in the loop.
  12. I use Arq to back up to a Minio docker container running on Unraid. It was working well with Unraid 6.7.2, but larger backups are failing with Unraid 6.8.0. The versions of Arq and Minio in use haven't changed recently.

     Arq is failing due to a GET request timeout:

     2019/12/19 00:04:17:767 DETAIL [thread 307] retrying GET /foo/?prefix=713EC506-32A1-4454-A885-19334B4FB242/objects/95&delimiter=/&max-keys=500: Error Domain=NSURLErrorDomain Code=-1001 "The request timed out."

     I reproduced the same request using aws-cli. 3m10s seems excessive for getting a listing of ~1000 files.

     [REQUEST s3.ListObjectsV1] 05:58:00.385 GET /foo?delimiter=%2F&prefix=713EC506-32A1-4454-A885-19334B4FB242%2Fobjects%2F91&encoding-type=url
     [RESPONSE] [06:01:10.817] [ Duration 3m10.432524s Dn 93 B Up 388 KiB ] 200 OK Server: MinIO/RELEASE.2019-10-12T01-39-57Z

     The Minio container has a user share mapping for backend storage. If I perform essentially the same file listing from an Unraid terminal, it's also pretty slow:

     time ls /mnt/user/minio/foo/713EC506-32A1-4454-A885-19334B4FB242/objects/91* | wc -l
     1140
     real 0m24.676s
     user 0m0.242s
     sys 0m0.310s

     If I do the same thing using the disk mount point instead, it's several orders of magnitude faster:

     time ls /mnt/disk3/minio/foo/713EC506-32A1-4454-A885-19334B4FB242/objects/91* | wc -l
     1140
     real 0m0.090s
     user 0m0.069s
     sys 0m0.026s

     There are a lot of files in these folders, but I don't think it's an unreasonable amount (?):

     ls /mnt/disk3/minio/foo/713EC506-32A1-4454-A885-19334B4FB242/objects/ | wc -l
     278844

     Changing the Minio container path mapping to use the disk share instead of the user share works around the issue, but I'll need user shares to span across disks at some point. I'd prefer not to downgrade to 6.7.2 to gather comparable metrics there, but I can if that would be helpful.

     unraid-diagnostics-20191227-1342.zip
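     For completeness, the aws-cli reproduction above was essentially a prefix listing along these lines (the endpoint and credentials are placeholders, and this may not match the exact flags used at the time):

         export AWS_ACCESS_KEY_ID=minio-access-key AWS_SECRET_ACCESS_KEY=minio-secret-key
         time aws --endpoint-url http://unraid-host:9000 s3api list-objects \
             --bucket foo \
             --prefix '713EC506-32A1-4454-A885-19334B4FB242/objects/91' \
             --delimiter '/' > /dev/null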
  13. That sounds exactly like the issue I saw. Try moving all albums that include jpg artwork to another folder temporarily and see if that fixes the Remote app issue.
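     If it helps track them down, something along these lines (with /music swapped for your actual media path) will list the album folders that contain a separate artwork.jpg -- it won't catch artwork that's only embedded in the audio files:

         find /music -type f -iname 'artwork.jpg' -printf '%h\n' | sort -u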
  14. The default username/password for forked-daapd seems to be admin/unused, but there's no actual web interface there in 25.0. A lot of the commits since 25.0 look like they're related to adding a web interface, so the web interface instructions must be for that, or maybe they're left over from a previous web interface that was removed from this fork (?).
  15. Just as a heads up for anyone else trying to get this working: forked-daapd 25.0 seems to crash repeatedly when the iOS Remote app requests album artwork and the artwork is stored in jpg format -- artwork in png format works fine. It doesn't matter whether the artwork is embedded in the audio files or stored as a separate artwork.jpg file in each album folder. I don't see any obvious issues in the logs after turning on debug logging, and I haven't been able to find any crash logs anywhere in the docker container. I tried rolling back to forked-daapd 24.2 by pulling the 115 tag, but I couldn't get remote pairing to work there.

     The workaround I eventually came up with was to export each album's artwork and save it as artwork.png in each album folder (roughly scripted below). This works because forked-daapd looks for an artwork.png/jpg file in the folder before trying to extract any embedded artwork from the audio files, so the audio files can still contain embedded jpg artwork.

     There have been a few artwork-related changes in forked-daapd since 25.0, so I was wondering if it's possible to build the latest source inside the container to test it out? I didn't want to open a new issue in the forked-daapd project without first checking whether the issue is still present at the tip of master.
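     In case it saves someone some clicking, the workaround can be scripted roughly like this, assuming ImageMagick's convert is available and the album folders already have an artwork.jpg (albums with only embedded artwork would still need the artwork exported first; /music is a placeholder path):

         # Create artwork.png alongside each existing artwork.jpg; skip folders that already have one.
         find /music -type f -iname 'artwork.jpg' | while read -r jpg; do
             png="${jpg%.*}.png"
             [ -e "$png" ] || convert "$jpg" "$png"
         done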