jdog09 Posted June 4, 2012
unRAID 5.0rc3 can't even get through a parity check before the NIC connection becomes unstable. This is a trace from rolling back to Beta 14. Last one still works.
Jun 3 18:35:31 stadiumNAS kernel: ------------[ cut here ]------------
Jun 3 18:35:31 stadiumNAS kernel: WARNING: at net/core/dev.c:3827 net_rx_action+0x78/0x12a()
Jun 3 18:35:31 stadiumNAS kernel: Hardware name: EP45-UD3P
Jun 3 18:35:31 stadiumNAS kernel: Modules linked in: md_mod xor pata_jmicron jmicron r8169 ahci libahci i2c_i801 i2c_core [last unloaded: md_mod]
Jun 3 18:35:31 stadiumNAS kernel: Pid: 1645, comm: nfsd Tainted: G W 3.1.1-unRAID #1
Jun 3 18:35:31 stadiumNAS kernel: Call Trace:
Jun 3 18:35:31 stadiumNAS kernel: [] warn_slowpath_common+0x65/0x7a
Jun 3 18:35:31 stadiumNAS kernel: [] ? net_rx_action+0x78/0x12a
Jun 3 18:35:31 stadiumNAS kernel: [] warn_slowpath_null+0xf/0x13
Jun 3 18:35:31 stadiumNAS kernel: [] net_rx_action+0x78/0x12a
Jun 3 18:35:31 stadiumNAS kernel: [] __do_softirq+0x6b/0xe5
Jun 3 18:35:31 stadiumNAS kernel: [] ? irq_enter+0x3c/0x3c
Jun 3 18:35:31 stadiumNAS kernel: [] ? irq_exit+0x32/0x53
Jun 3 18:35:31 stadiumNAS kernel: [] ? do_IRQ+0x7c/0x90
Jun 3 18:35:31 stadiumNAS kernel: [] ? common_interrupt+0x29/0x30
Jun 3 18:35:31 stadiumNAS kernel: [] ? csum_partial_copy_generic+0x6c/0x100
Jun 3 18:35:31 stadiumNAS kernel: [] ? tcp_sendmsg+0x2a3/0x915
Jun 3 18:35:31 stadiumNAS kernel: [] ? inet_sendmsg+0x6f/0x78
Jun 3 18:35:31 stadiumNAS kernel: [] ? sock_sendmsg+0xa5/0xbb
Jun 3 18:35:31 stadiumNAS kernel: [] ? blk_finish_plug+0xd/0x28
Jun 3 18:35:31 stadiumNAS kernel: [] ? do_sync_readv_writev+0x84/0xb7
Jun 3 18:35:31 stadiumNAS kernel: [] ? fsnotify+0x1ad/0x1c7
Jun 3 18:35:31 stadiumNAS kernel: [] ? kernel_sendmsg+0x28/0x37
Jun 3 18:35:31 stadiumNAS kernel: [] ? sock_no_sendpage+0x45/0x58
Jun 3 18:35:31 stadiumNAS kernel: [] ? tcp_sendpage+0x32/0x66
Jun 3 18:35:31 stadiumNAS kernel: [] ? do_tcp_sendpages+0x490/0x490
Jun 3 18:35:31 stadiumNAS kernel: [] ? inet_sendpage+0x82/0x9c
Jun 3 18:35:31 stadiumNAS kernel: [] ? inet_dgram_connect+0x5e/0x5e
Jun 3 18:35:31 stadiumNAS kernel: [] ? kernel_sendpage+0x1a/0x2d
Jun 3 18:35:31 stadiumNAS kernel: [] ? svc_send_common+0x41/0xdf
Jun 3 18:35:31 stadiumNAS kernel: [] ? svc_sendto+0x112/0x15f
Jun 3 18:35:31 stadiumNAS kernel: [] ? nfsd_cache_update+0x88/0x10f
Jun 3 18:35:31 stadiumNAS kernel: [] ? nfs3svc_encode_renameres+0x3b/0x3b
Jun 3 18:35:31 stadiumNAS kernel: [] ? nfsd_cache_update+0xb3/0x10f
Jun 3 18:35:31 stadiumNAS kernel: [] ? auth_domain_put+0x10/0x42
Jun 3 18:35:31 stadiumNAS kernel: [] ? svcauth_unix_release+0x15/0x4d
Jun 3 18:35:31 stadiumNAS kernel: [] ? svc_authorise+0x28/0x2e
Jun 3 18:35:31 stadiumNAS kernel: [] ? svc_tcp_sendto+0x32/0x83
Jun 3 18:35:31 stadiumNAS kernel: [] ? svc_send+0x53/0x88
Jun 3 18:35:31 stadiumNAS kernel: [] ? svc_process+0xff/0x112
Jun 3 18:35:31 stadiumNAS kernel: [] ? nfsd+0xd1/0x108
Jun 3 18:35:31 stadiumNAS kernel: [] ? nfsd_svc+0x131/0x131
Jun 3 18:35:31 stadiumNAS kernel: [] ? kthread+0x62/0x67
Jun 3 18:35:31 stadiumNAS kernel: [] ? kthread_worker_fn+0x10a/0x10a
Jun 3 18:35:31 stadiumNAS kernel: [] ? kernel_thread_helper+0x6/0xd
Jun 3 18:35:31 stadiumNAS kernel: ---[ end trace 85ce947744be7903 ]---
Jun 3 18:35:31 stadiumNAS kernel: r8169 0000:04:00.0: eth0: link up
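
For anyone comparing their own syslog against the trace in the post above, here is a minimal Python sketch that pulls out the two signals visible in it: the kernel `WARNING: at ...` header and the r8169 link-state message. The function name `summarize`, the two regular expressions, and the `SAMPLE` list are assumptions made for illustration (the sample lines are copied from the post); other kernels or drivers may word these messages differently.

```python
import re

# Kernel WARNING header, e.g.
# "kernel: WARNING: at net/core/dev.c:3827 net_rx_action+0x78/0x12a()"
WARNING_RE = re.compile(
    r"kernel: WARNING: at (?P<file>\S+):(?P<line>\d+) (?P<func>\w+)\+"
)
# r8169 link-state message, e.g. "kernel: r8169 0000:04:00.0: eth0: link up"
LINK_RE = re.compile(
    r"kernel: r8169 \S+ (?P<iface>\w+): link (?P<state>up|down)"
)


def summarize(lines):
    """Return (warnings, link_events) found in an iterable of syslog lines."""
    warnings, link_events = [], []
    for text in lines:
        match = WARNING_RE.search(text)
        if match:
            warnings.append((match["file"], int(match["line"]), match["func"]))
            continue
        match = LINK_RE.search(text)
        if match:
            link_events.append((match["iface"], match["state"]))
    return warnings, link_events


# Sample lines copied from the trace posted above.
SAMPLE = [
    "Jun 3 18:35:31 stadiumNAS kernel: WARNING: at net/core/dev.c:3827 net_rx_action+0x78/0x12a()",
    "Jun 3 18:35:31 stadiumNAS kernel: Hardware name: EP45-UD3P",
    "Jun 3 18:35:31 stadiumNAS kernel: r8169 0000:04:00.0: eth0: link up",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

To run it on a real log, pass an open file object instead of `SAMPLE`; a "link up" event directly after a warning, as in the trace above, suggests the link was reset at that moment.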
limetech Posted June 4, 2012
This is a prime example of the problems in Linux development now, IMHO. Refer to this ugly thread: http://thread.gmane.org/gmane.linux.debian.devel.bugs.general/871123/focus=213090 Apparently there is a fix in the 3.2 kernel done by a guy named "Francois", and later a guy named "Jonathan Nieder" is asking about back-porting the patch to the 3.0.x series, whereupon Francois says, "go for it", but looking at 3.0.31, the patch is not applied (and I don't see it in the change log of 3.0.32 or 3.0.33). So I can't move beyond 3.0.x at the moment because of issues with LSI controllers, and the issue you are reporting will eventually hit more unRAID users, so I'm kinda stuck. Probably what I will do is generate the patch myself, but this will have to wait until after 5.0-final, sorry.
cyrnel Posted June 4, 2012
Might not the alternate preemptive kernel alleviate some of these contention issues?
darkside40 Posted June 11, 2012
Yep, that Realtek NIC is problematic again, as you can also see here: http://lime-technology.com/forum/index.php?topic=20759.0 Would it be possible to supply a build of the official Realtek drivers for that NIC, like it was done several betas ago, or does this solution also have downsides (besides needing to rebuild the driver every time you update to a new kernel version)?
Interstellar Posted June 11, 2012
limetech wrote: "This is a prime example of the problems in linux development now, IMHO. [...] Probably what I will do is generate the patch myself, but this will have to wait until after 5.0-final, sorry."
If someone knocks together a 3.2 kernel with this patch applied, I would be happy to test whether it solves the problem. No point in you doing the work and then finding it doesn't do anything! You are rather stuck between a rock and a hard place, though.
jumperalex Posted June 11, 2012
I don't know if I "should" be seeing this problem, but I don't seem to be. Note the motherboard in my sig: it has an 8112L NIC. So far all seems well according to my syslog. Yes, I'm running with Simple Features, but for a non-bug report I don't think you can hold this against me. Anyway, the server is connected to a Netgear GS605v2 switch over a 6 ft CAT5e cable. Not sure if that helps with differential analysis, but I thought I'd try to contribute.
syslog.zip
darkside40 Posted June 12, 2012
On my server, too, the problem only appears consistently under high network load, for example when writing from my fast desktop to my cache drive (50 MB/s or more). So far I have not seen it when writing more slowly to my protected array (approx. 20 MB/s).
Zaxxan Posted June 12, 2012
I have an M4A78LT-M motherboard that uses the Realtek 8112L. I don't use a cache drive and I don't seem to have a problem; transfer speeds are approx. 18 MB/s. If I understand correctly, if I were to use a cache drive the transfer speed would increase and put the 8112L under load, causing it to stop/crash. Is that correct? If a new network card from a different manufacturer is installed in the server, would it correct the problem? If so, what is the best card to install?
darkside40 Posted June 12, 2012
If I can manage to build myself an unRAID development system, I will try to compile the original Realtek driver for my system. Hopefully that solves the problem. I just have to gather all the information from the forum and wiki on how to set up such a system, and of course find the time to do it.
mr-hexen Posted June 12, 2012
Someone on this forum replaced the r8169 driver with r8168 and it solved the problems...
mr-hexen Posted June 12, 2012
Found it: http://lime-technology.com/forum/index.php?topic=19547.msg173757#msg173757
darkside40 Posted June 13, 2012
I know that thread, but I will not try using a driver compiled against a 2.6 kernel from unRAID 4.x on unRAID 5.0-RC3, which uses a 3.0.x kernel. I don't know what problems that would cause, and I don't want to carry a monitor etc. to my server because I lost the network by using a faulty driver. The approach he describes is the answer, though, but as I explained, I just don't have a running unRAID dev system at hand to compile my own driver.
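
The kernel mismatch described in the post above can be checked before loading anything. A kernel module records the kernel it was built for in its "vermagic" string (readable with `modinfo -F vermagic` on the module file), and `uname -r` reports the release of the running kernel. Below is a minimal Python sketch of that comparison. The function names and both sample strings are hypothetical and not taken from the thread, and the kernel's own check is stricter than this release-only comparison.

```python
def kernel_release_of(vermagic: str) -> str:
    """Return the kernel release, the first whitespace-separated field of a vermagic string."""
    fields = vermagic.split()
    return fields[0] if fields else ""


def built_for(vermagic: str, running_release: str) -> bool:
    """True if the module's vermagic names the same release as the running kernel."""
    return kernel_release_of(vermagic) == running_release


# Hypothetical example values, made up for illustration only.
OLD_MODULE_VERMAGIC = "2.6.32.9-unRAID SMP mod_unload"  # as from `modinfo -F vermagic`
RUNNING_RELEASE = "3.0.33-unRAID"                       # as from `uname -r`

if __name__ == "__main__":
    if built_for(OLD_MODULE_VERMAGIC, RUNNING_RELEASE):
        print("module matches the running kernel release")
    else:
        print("module was built for", kernel_release_of(OLD_MODULE_VERMAGIC),
              "but the running kernel is", RUNNING_RELEASE)
```

A mismatch here means the module should be rebuilt against the running kernel's source rather than copied over from an older release.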
rick.p Posted June 13, 2012
That wasn't the question (I was unclear). Has anyone else tried this on 4.7?
JonathanM Posted June 13, 2012
rick.p wrote: "that wasn't the question (I was unclear) anyone else tried this on 4.7?"
This particular section of the forum is for 5.0 release candidate discussion, so I would suggest starting a new topic in general support to ask this question.
darkside40 Posted June 14, 2012
@limetech: Would it please be possible for you to compile the original Realtek driver (for RC3 or RC4) and give it to us as an extra package? I have been messing around here for two days trying to compile the driver on various Slackware distros and even on a separate unRAID stick, but it is always reverted by unRAID when I try to load it. You did something similar some betas ago.
jaybee Posted June 15, 2012
Which Realtek NICs are affected? The one on my board is apparently: "Gigabit LAN controller Realtek® 8111E Gigabit LAN controller featuring AI NET2". Is that affected?
RoninTech Posted June 17, 2012
limetech wrote: "This is a prime example of the problems in linux development now, IMHO. [...] Probably what I will do is generate the patch myself, but this will have to wait until after 5.0-final, sorry."
Doh! I was starting to get excited after the betas appeared to work with this r8169 driver (which causes my 4.7 box to occasionally kernel panic under heavy network load) and also support my AOC-SASLP-MV8. Am I reading the various threads correctly in that now neither of these is supported (i.e. working without issues) in the current RC?
Interstellar Posted June 17, 2012
RoninTech wrote: "Am I reading the various threads correctly in that now neither of these are supported (i.e. working without issues) in the current RC?"
Supported, but not working to full speeds/ability.
WingmanNZ Posted June 23, 2012
I would like to add that replacing the r8169.ko module with r8168.ko (same version as in b12a, compiled for the RC4 kernel) has solved my dropouts under high load. My NIC is a Realtek RTL8111D running at 1 Gb/s.
# ethtool -i eth0
driver: r8168
version: 8.025.00-NAPI
firmware-version:
bus-info: 0000:03:00.0
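
The `ethtool -i eth0` output in the post above is the quickest way to confirm which driver is actually bound to the interface after swapping modules. As a minimal sketch, the Python below parses that `key: value` output into a dictionary. The names `parse_ethtool_info` and `read_driver` are hypothetical helpers, and `SAMPLE` is copied from the post; `read_driver` assumes `ethtool` is installed and is not exercised here.

```python
import subprocess


def parse_ethtool_info(output: str) -> dict:
    """Parse the 'key: value' lines printed by `ethtool -i <iface>` into a dict."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, so values such as
        # "0000:03:00.0" keep their own colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def read_driver(iface: str) -> str:
    """Run `ethtool -i` for iface and return the driver name ('' if missing)."""
    result = subprocess.run(
        ["ethtool", "-i", iface], capture_output=True, text=True, check=True
    )
    return parse_ethtool_info(result.stdout).get("driver", "")


# Output copied from the post above.
SAMPLE = """driver: r8168
version: 8.025.00-NAPI
firmware-version:
bus-info: 0000:03:00.0
"""

if __name__ == "__main__":
    info = parse_ethtool_info(SAMPLE)
    print(info["driver"], info["version"])
```

If the `driver` field still reads `r8169` after a reboot, the replacement module was not the one loaded.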
RoninTech Posted June 23, 2012
Any chance you could make this available? Here's what 4.7 says I have and I get regular kernel panics that require a reboot:
Jun 23 10:42:40 unraid kernel: r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded (System)
Jun 23 10:42:40 unraid kernel: r8169 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 (Network)
Jun 23 10:42:40 unraid kernel: r8169 0000:01:00.0: setting latency timer to 64 (Network)
Jun 23 10:42:40 unraid kernel: r8169 0000:01:00.0: unknown MAC, using family default (Network)
Jun 23 10:42:40 unraid kernel: r8169 0000:01:00.0: irq 30 for MSI/MSI-X (Network)
Jun 23 10:42:40 unraid kernel: eth0: RTL8168b/8111b at 0xf84fa000, 20:cf:30:e2:06:33, XID 0c200000 IRQ 30 (Network)
WingmanNZ Posted June 24, 2012
RoninTech wrote: "Any chance you could make this available? Here's what 4.7 says I have and I get regular kernel panics that require a reboot: [...]"
It looks like LimeTech will be adding the r8168 driver to a test build of unRAID 5.0 RC5, which should solve this for other users. Sorry, the version I have compiled does not work with 4.7 releases. I recommend waiting until the r8168 driver is added to the 5.0 RC builds.
If you don't wish to wait, you can upgrade to 5.0 b12a (which is the latest 5.0 beta that has the r8168 driver). DO, HOWEVER, READ THE UPGRADE INSTRUCTIONS FROM THE RELEASE NOTES. ASK HERE IF YOU RUN INTO ANY ISSUES. If ANY drive shows as unformatted, DO NOT FORMAT IT. It typically indicates the MBR has been incorrectly identified by unRAID. In fact, I'd not start the array until seeking more guidance from lime-tech.
(Sorry for the caps, just need to stress the importance of reading the 4.7 -> 5.0 upgrade info.)
Cheers,
- WingmanNZ
chickensoup Posted June 24, 2012
5.0-rc5-8168 is out. http://lime-technology.com/forum/index.php?topic=21019.0
RoninTech Posted June 24, 2012
chickensoup wrote: "5.0-rc5-8168 is out. http://lime-technology.com/forum/index.php?topic=21019.0"
Cool! Now I just need the AOC-SASLP-MV8 issues sorted and I can try 5. In the meantime, does anyone know if there's an 8168 driver available for 4.7?
RoninTech Posted June 24, 2012
Nevermind, found one.
r8168.zip