
r8169 Still a problem..


unRAID 5.0-rc3 can't even get through a parity check before the NIC connection becomes unstable.

 

This is a trace from rolling back to Beta 14. The last one still works.

 

 

 

Jun  3 18:35:31 stadiumNAS kernel: ------------[ cut here ]------------

Jun  3 18:35:31 stadiumNAS kernel: WARNING: at net/core/dev.c:3827 net_rx_action+0x78/0x12a()

Jun  3 18:35:31 stadiumNAS kernel: Hardware name: EP45-UD3P

Jun  3 18:35:31 stadiumNAS kernel: Modules linked in: md_mod xor pata_jmicron jmicron r8169 ahci libahci i2c_i801 i2c_core [last unloaded: md_mod]

Jun  3 18:35:31 stadiumNAS kernel: Pid: 1645, comm: nfsd Tainted: G        W  3.1.1-unRAID #1

Jun  3 18:35:31 stadiumNAS kernel: Call Trace:

Jun  3 18:35:31 stadiumNAS kernel:  [] warn_slowpath_common+0x65/0x7a

Jun  3 18:35:31 stadiumNAS kernel:  [] ? net_rx_action+0x78/0x12a

Jun  3 18:35:31 stadiumNAS kernel:  [] warn_slowpath_null+0xf/0x13

Jun  3 18:35:31 stadiumNAS kernel:  [] net_rx_action+0x78/0x12a

Jun  3 18:35:31 stadiumNAS kernel:  [] __do_softirq+0x6b/0xe5

Jun  3 18:35:31 stadiumNAS kernel:  [] ? irq_enter+0x3c/0x3c

Jun  3 18:35:31 stadiumNAS kernel:    [] ? irq_exit+0x32/0x53

Jun  3 18:35:31 stadiumNAS kernel:  [] ? do_IRQ+0x7c/0x90

Jun  3 18:35:31 stadiumNAS kernel:  [] ? common_interrupt+0x29/0x30

Jun  3 18:35:31 stadiumNAS kernel:  [] ? csum_partial_copy_generic+0x6c/0x100

Jun  3 18:35:31 stadiumNAS kernel:  [] ? tcp_sendmsg+0x2a3/0x915

Jun  3 18:35:31 stadiumNAS kernel:  [] ? inet_sendmsg+0x6f/0x78

Jun  3 18:35:31 stadiumNAS kernel:  [] ? sock_sendmsg+0xa5/0xbb

Jun  3 18:35:31 stadiumNAS kernel:  [] ? blk_finish_plug+0xd/0x28

Jun  3 18:35:31 stadiumNAS kernel:  [] ? do_sync_readv_writev+0x84/0xb7

Jun  3 18:35:31 stadiumNAS kernel:  [] ? fsnotify+0x1ad/0x1c7

Jun  3 18:35:31 stadiumNAS kernel:  [] ? kernel_sendmsg+0x28/0x37

Jun  3 18:35:31 stadiumNAS kernel:  [] ? sock_no_sendpage+0x45/0x58

Jun  3 18:35:31 stadiumNAS kernel:  [] ? tcp_sendpage+0x32/0x66

Jun  3 18:35:31 stadiumNAS kernel:  [] ? do_tcp_sendpages+0x490/0x490

Jun  3 18:35:31 stadiumNAS kernel:  [] ? inet_sendpage+0x82/0x9c

Jun  3 18:35:31 stadiumNAS kernel:  [] ? inet_dgram_connect+0x5e/0x5e

Jun  3 18:35:31 stadiumNAS kernel:  [] ? kernel_sendpage+0x1a/0x2d

Jun  3 18:35:31 stadiumNAS kernel:  [] ? svc_send_common+0x41/0xdf

Jun  3 18:35:31 stadiumNAS kernel:  [] ? svc_sendto+0x112/0x15f

Jun  3 18:35:31 stadiumNAS kernel:  [] ? nfsd_cache_update+0x88/0x10f

Jun  3 18:35:31 stadiumNAS kernel:  [] ? nfs3svc_encode_renameres+0x3b/0x3b

Jun  3 18:35:31 stadiumNAS kernel:  [] ? nfsd_cache_update+0xb3/0x10f

Jun  3 18:35:31 stadiumNAS kernel:  [] ? auth_domain_put+0x10/0x42

Jun  3 18:35:31 stadiumNAS kernel:  [] ? svcauth_unix_release+0x15/0x4d

Jun  3 18:35:31 stadiumNAS kernel:  [] ? svc_authorise+0x28/0x2e

Jun  3 18:35:31 stadiumNAS kernel:  [] ? svc_tcp_sendto+0x32/0x83

Jun  3 18:35:31 stadiumNAS kernel:  [] ? svc_send+0x53/0x88

Jun  3 18:35:31 stadiumNAS kernel:  [] ? svc_process+0xff/0x112

Jun  3 18:35:31 stadiumNAS kernel:  [] ? nfsd+0xd1/0x108

Jun  3 18:35:31 stadiumNAS kernel:  [] ? nfsd_svc+0x131/0x131

Jun  3 18:35:31 stadiumNAS kernel:  [] ? kthread+0x62/0x67

Jun  3 18:35:31 stadiumNAS kernel:  [] ? kthread_worker_fn+0x10a/0x10a

Jun  3 18:35:31 stadiumNAS kernel:  [] ? kernel_thread_helper+0x6/0xd

Jun  3 18:35:31 stadiumNAS kernel: ---[ end trace 85ce947744be7903 ]---

Jun  3 18:35:31 stadiumNAS kernel: r8169 0000:04:00.0: eth0: link up
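For anyone who wants to check their own syslog for the same symptoms, here is a rough Python sketch that counts r8169 link events and kernel WARNING lines. It assumes the syslog line layout shown in the trace above; the function name `summarize` and the regexes are my own illustration, not part of unRAID or the driver.

```python
import re
from collections import Counter

# Assumed line layout, taken from the trace above, e.g.:
#   Jun  3 18:35:31 stadiumNAS kernel: r8169 0000:04:00.0: eth0: link up
LINK_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"r8169\s+\S+\s+(?P<iface>\w+):\s+link\s+(?P<state>up|down)\s*$"
)
# Matches the "WARNING: at <file>:<line>" header of a kernel warning.
WARN_RE = re.compile(r"kernel:\s+WARNING: at (?P<where>\S+)")


def summarize(lines):
    """Count r8169 link events and kernel WARNINGs in syslog lines."""
    counts = Counter()
    for line in lines:
        link = LINK_RE.match(line)
        if link:
            counts[f"{link['iface']} link {link['state']}"] += 1
            continue
        warn = WARN_RE.search(line)
        if warn:
            counts[f"WARNING at {warn['where']}"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Jun  3 18:35:31 stadiumNAS kernel: WARNING: at net/core/dev.c:3827 net_rx_action+0x78/0x12a()",
        "Jun  3 18:35:31 stadiumNAS kernel: r8169 0000:04:00.0: eth0: link up",
    ]
    for event, n in summarize(sample).items():
        print(n, event)
```

To run it against a real log, pass an open file instead of the sample list, e.g. `summarize(open("/var/log/syslog"))`. Many "link up" lines during a single parity check would suggest the link is flapping.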

Link to comment

This is a prime example of the problems in linux development now, IMHO.

 

Refer to this ugly thread:

http://thread.gmane.org/gmane.linux.debian.devel.bugs.general/871123/focus=213090

 

Apparently there is a fix in the 3.2 kernel done by a guy named "Francois", and later a guy named "Jonathan Nieder" asks about back-porting the patch to the 3.0.x series, whereupon Francois says, "go for it". But looking at 3.0.31, the patch is not applied (and I don't see it in the changelog of 3.0.32 or 3.0.33).

 

So, I can't move beyond 3.0.x at the moment because of issues with LSI controllers, and the issue you are reporting will eventually hit more unRaid users, so I'm kinda stuck.

 

Probably what I will do is generate the patch myself, but this will have to wait until after 5.0-final, sorry.

Link to comment

If someone knocks together a 3.2 kernel with this patch applied I would be happy to test if it solves the problem.

 

No point in you doing the work and then finding it doesn't do anything!

 

You are rather stuck between a rock and a hard place tho :/

Link to comment

I don't know if I "should" be seeing this problem, but I don't seem to be. Note the motherboard in my sig: it has an 8112L NIC. So far all seems well according to my syslog. Yes, I'm running with Simple Features, but for a non-bug report I don't think you can hold this against me. Anyway, the server is connected to a Netgear GS605v2 switch over a 6 ft CAT5e cable.

 

Not sure if that helps with differential analysis but I thought I'd try to contribute.

syslog.zip

Link to comment

I have an M4A78LT-M motherboard that uses the Realtek 8112L. I don't use a cache drive and I don't seem to have a problem; transfer speeds are approx. 18 MB/s. If I understand correctly, if I were to use a cache drive, the transfer speed would increase and put the 8112L under load, causing it to stop/crash. Is that correct? If a new network card from a different manufacturer were installed in the server, would it correct the problem? If so, what is the best card to install?

Link to comment

If I can manage to build myself an unRaid development system, I will try to compile the original Realtek driver for my system.

 

Hope that could solve the problem.

 

I just have to gather all the information from the forum and wiki on how to set up such a system, and of course find the time to do that.

Link to comment

I know that thread, but I will not try using a driver compiled against a 2.6 kernel from unRaid 4.x on unRaid 5.0-RC3, which uses a 3.0.x kernel.

I don't know what problems that would cause, and I don't want to have to carry a monitor etc. to my server because I lost the network by using a faulty driver.

 

The approach he describes is the answer, though, but as I explained, I just don't have a running unRaid dev system at hand to compile my own driver.

Link to comment

@limetech:

Would it please be possible for you to compile the original Realtek driver (for RC3 or RC4) and give it to us as an extra package?

 

I have been messing around for two days trying to compile the driver on various Slackware distros and even on a separate unRaid stick, but it is always reverted by unRaid when I try to load it.

 

You did something similar a few betas ago.

Link to comment

Doh! I was starting to get excited after the betas appeared to work with this r8169 driver (which causes my 4.7 box to occasionally kernel panic under heavy network load) and also support my AOC-SASLP-MV8.

 

Am I reading the various threads correctly in that now neither of these are supported (i.e. working without issues) in the current RC?

Link to comment

Supported, but not working to full speeds/ability.

Link to comment

I would like to add that replacing the r8169.ko module with r8168.ko (same version as in b12a, compiled for the RC4 kernel) has solved my dropouts under high load. My NIC is a Realtek RTL8111D running at 1 Gb/s.

 

# ethtool -i eth0

 

driver: r8168

version: 8.025.00-NAPI

firmware-version:

bus-info: 0000:03:00.0
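If you want to confirm which driver your own NIC ended up with, a small Python sketch like the one below can parse the `ethtool -i` output shown above. The helper names `parse_ethtool_info` and `uses_vendor_driver` are made up for illustration, and the check assumes the Realtek vendor driver reports itself as `r8168`, as it does in my output.

```python
def parse_ethtool_info(text):
    """Turn `ethtool -i` output ("key: value" lines) into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as the PCI
        # address "0000:03:00.0" keep their own colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def uses_vendor_driver(info):
    """True if the interface is bound to r8168 rather than r8169."""
    return info.get("driver") == "r8168"


if __name__ == "__main__":
    sample = """driver: r8168
version: 8.025.00-NAPI
firmware-version:
bus-info: 0000:03:00.0
"""
    info = parse_ethtool_info(sample)
    print(info["driver"], info["version"], uses_vendor_driver(info))
```

On a live box the text could come from `subprocess.run(["ethtool", "-i", "eth0"], capture_output=True, text=True).stdout` instead of the sample string.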

Link to comment

Any chance you could make this available?  :)

 

Here's what 4.7 says I have and I get regular kernel panics that require a reboot:

 

Jun 23 10:42:40 unraid kernel: r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded (System)

Jun 23 10:42:40 unraid kernel: r8169 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 (Network)

Jun 23 10:42:40 unraid kernel: r8169 0000:01:00.0: setting latency timer to 64 (Network)

Jun 23 10:42:40 unraid kernel: r8169 0000:01:00.0: unknown MAC, using family default (Network)

Jun 23 10:42:40 unraid kernel: r8169 0000:01:00.0: irq 30 for MSI/MSI-X (Network)

Jun 23 10:42:40 unraid kernel: eth0: RTL8168b/8111b at 0xf84fa000, 20:cf:30:e2:06:33, XID 0c200000 IRQ 30 (Network)

Link to comment

It looks like LimeTech will be adding the r8168 driver to a test build of unRAID 5.0 RC5, which should solve this for other users. Sorry, the version I compiled does not work with 4.7 releases. I recommend waiting until the r8168 driver is added to the 5.0 RC builds. If you don't wish to wait, you can upgrade to 5.0 b12a (which is the latest 5.0 beta that has the r8168 driver).

 

DO HOWEVER, READ THE UPGRADE INSTRUCTIONS FROM THE RELEASE NOTES. ASK HERE IF YOU RUN INTO ANY ISSUES.

 

If ANY drive shows as unformatted, DO NOT FORMAT IT.  It typically indicates the MBR has been incorrectly identified by unRAID. In fact, I'd not start the array until seeking more guidance from lime-tech.

 

(sorry for the caps, just need to stress the importance of reading the 4.7 -> 5.0 upgrade info)

 

cheers,

 

- WingmanNZ

Link to comment
