lwip-users

Re: [lwip-users] How to root-cause duplicate ACKs?


From: address@hidden
Subject: Re: [lwip-users] How to root-cause duplicate ACKs?
Date: Thu, 30 Apr 2020 02:31:29 +0000 (UTC)

I've turned off UART debugging, but it was heavily buffered anyway, and turning it off hasn't made a difference.


I did, however, discover a problem with my low-level Rx driver. This micro (PIC32MZ) uses Ethernet descriptors, and on each Ethernet Rx interrupt the driver I wrote scanned all descriptors from first to last and loaded any that contained packets into lwIP. The problem was that this did not necessarily read them in the order they arrived. Things are slightly better now, but my main issue, which I haven't described yet, still exists.
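For illustration, here is a minimal sketch of one common way to keep descriptor reads in arrival order: remember where the previous drain stopped and continue from there, instead of always scanning from descriptor 0. It is a host-side simulation, not the actual driver. The `rx_desc_t` layout, `deliver_frame()` and `sim_hw_receive()` are hypothetical stand-ins; a real PIC32MZ descriptor and the hand-off to lwIP look different.

```c
#include <stdint.h>

#define RX_DESC_COUNT 8u

/* Hypothetical software view of one Rx descriptor (not the PIC32MZ layout). */
typedef struct {
    int      owned_by_sw;  /* 1 = hardware filled it and handed it to software */
    uint32_t frame_id;     /* stand-in for the received frame */
} rx_desc_t;

rx_desc_t rx_ring[RX_DESC_COUNT];
unsigned  rx_next;         /* next descriptor to read; persists between interrupts */
unsigned  hw_next;         /* simulated hardware write position */

/* Hypothetical stand-in for passing a frame to lwIP: just record the order. */
uint32_t delivered[RX_DESC_COUNT];
unsigned delivered_count;

void deliver_frame(uint32_t frame_id)
{
    delivered[delivered_count++] = frame_id;
}

/* Simulated hardware: store a frame in the next descriptor of the ring. */
void sim_hw_receive(uint32_t frame_id)
{
    rx_ring[hw_next].frame_id = frame_id;
    rx_ring[hw_next].owned_by_sw = 1;
    hw_next = (hw_next + 1u) % RX_DESC_COUNT;
}

/* Drain in arrival order: start where the last call stopped and stop at the
 * first descriptor that hardware has not filled yet. */
void rx_drain(void)
{
    while (rx_ring[rx_next].owned_by_sw) {
        deliver_frame(rx_ring[rx_next].frame_id);
        rx_ring[rx_next].owned_by_sw = 0;  /* hand the descriptor back */
        rx_next = (rx_next + 1u) % RX_DESC_COUNT;
    }
}

/* Demo: frames 1..4 land in descriptors 6, 7, 0, 1 (the ring wraps). */
void rx_demo(void)
{
    rx_next = hw_next = 6u;
    delivered_count = 0;
    for (uint32_t id = 1; id <= 4; id++) {
        sim_hw_receive(id);
    }
    rx_drain();
}
```

In the demo, a scan from descriptor 0 to the last one would hand over frames 3 and 4 before frames 1 and 2, while `rx_drain()` delivers them as 1, 2, 3, 4.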

I'd like to eliminate nearly all retransmissions and dup ACKs, as they aren't normal, but since I use a 10 Mb hub to monitor traffic between lwIP and the Internet with Wireshark, the hub could be causing some of them. The real issue is something else, though. When lwIP connects to one of these streaming Internet radio stations, the station disconnects after around 1350 seconds, regardless of whether a hub or a switch is in the path. It's repeatable, and the station actually sends a FIN, too.

The kicker is that my old design of the same idea, using lwIP 1.3.1, is very reliable. The application code is essentially identical. That micro is slower and has far less RAM, so I configured fewer pbufs etc., yet it doesn't have this issue. I'm baffled. I don't see anything obvious in the Wireshark traces. I copied the old lwipopts.h into the newer project and it behaves the same.

I'm able to set up a basic server on my computer of the same protocol and it will stream from that all night long. I don't know. Ughhh.


On Wednesday, April 29, 2020, 09:46:39 AM CDT, Indan Zupancic <address@hidden> wrote:


Hello,

Your application is slow in handling the data, which seems to cause a backlog.
Debug printing via UART is very slow; try not to use it while debugging this.

I can think of three explanations:

- Because of the extreme backlog between packet reception and lwIP handling,
  or because of the 4K packet size, which unfortunately gets split into
  multiple pbufs, some bug in lwIP gets triggered.
- You have special Ethernet hardware which splits that 4K packet into multiple
  smaller ones, and one of them got dropped or corrupted.
- Somehow packets/pbufs are given to lwIP in re-ordered sequence (e.g. in
  reverse order, in which case it would only happen when there is a backlog).

Best regards,

Indan Zupancic



TT Vasumweg 150  |  1033 SH Amsterdam  |  The Netherlands
Phone: + 31 [0]20 482 56 32  |  Fax: + 31 [0]20 482 00 77  |  Email: address@hidden

-----Original Message-----
From: lwip-users <lwip-users-bounces+indan.zupancic=address@hidden> On Behalf Of hondgm--- via lwip-users
Sent: Wednesday, 29 April 2020 14:39
To: address@hidden
Cc: address@hidden
Subject: [lwip-users] How to root-cause duplicate ACKs?

I'm using lwIP 2.1.2, no RTOS, and seeing frequent duplicate ACKs in Wireshark while lwIP is receiving a continuous audio stream. I've seen up to 6 duplicate ACKs, but usually it's one or two.

I enabled:

#define TCP_OUTPUT_DEBUG                LWIP_DBG_ON

and here's a section of the output. The second number in each "sending ACK for" line is my own addition: it is the micro's core timer value, to give an idea of how quickly these are being sent. Each timer count is 13.3 ns.

tcp_output: sending ACK for 1450257509  3252152102
tcp_output: nothing to send (0)
tcp_output: nothing to send (0)
tcp_output: sending ACK for 1450260429  3252196160
tcp_output: nothing to send (0)
tcp_output: nothing to send (0)
tcp_output: sending ACK for 1450261889  3269439860
tcp_output: nothing to send (0)
tcp_output: sending ACK for 1450261889  3288290782
tcp_output: nothing to send (0)
tcp_output: sending ACK for 1450261889  3288310973
tcp_output: nothing to send (0)
tcp_output: nothing to send (0)
tcp_output: sending ACK for 1450261889  3325967548
tcp_output: nothing to send (0)
tcp_output: nothing to send (0)
tcp_output: sending ACK for 1450262782  3333899283
tcp_output: nothing to send (0)
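
The timer values in the trace above can be turned into elapsed time with a small helper. This is a minimal sketch that assumes the core timer is a free-running 32-bit counter at the stated 13.3 ns per count; `ticks_to_us()` and `NS_PER_TICK` are names made up for this example.

```c
#include <stdint.h>

#define NS_PER_TICK 13.3  /* from the post: each core timer count is 13.3 ns */

/* Elapsed microseconds between two timer readings. Unsigned subtraction
 * gives the right delta even if the counter wrapped once in between. */
double ticks_to_us(uint32_t earlier, uint32_t later)
{
    uint32_t delta = later - earlier;
    return (double)delta * NS_PER_TICK / 1000.0;
}
```

For example, `ticks_to_us(3269439860u, 3325967548u)` covers the first to the last ACK for 1450261889 and comes to roughly 751.8 ms, while the gap between the ACKs for 1450260429 and 1450261889 is roughly 229.3 ms.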


According to this, the ACK for 1450261889 was sent 4 times, and there are actually quite a lot of these in the debug output. Is this normal? I've also attached a small section of a Wireshark trace. It does not necessarily match the debug trace above, as I can't determine which packets correspond. 192.168.0.120 is lwIP.
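
Counting repeats by eye gets tedious with a long log, so here is a minimal sketch of a helper that scans captured debug text for "sending ACK for" lines and reports the longest run of the same ACK number. `max_ack_repeats()` is a hypothetical name, not part of lwIP, and it assumes the line format shown above. Note that a repeated ACK number in this log is not necessarily a duplicate ACK in the TCP sense; it could also be, for instance, a window update.

```c
#include <stdint.h>
#include <stdio.h>

/* Scan log text line by line. Returns the largest number of times one ACK
 * number was sent back to back and stores that ACK number in *worst_ack.
 * Lines that are not "sending ACK for" lines are ignored. */
unsigned max_ack_repeats(const char *log, uint32_t *worst_ack)
{
    unsigned best = 0, run = 0;
    unsigned long prev = 0, ack = 0;
    const char *p = log;

    while (*p) {
        if (sscanf(p, "tcp_output: sending ACK for %lu", &ack) == 1) {
            run = (run > 0 && ack == prev) ? run + 1 : 1;
            prev = ack;
            if (run > best) {
                best = run;
                *worst_ack = (uint32_t)ack;
            }
        }
        /* Advance to the start of the next line. */
        while (*p && *p != '\n') {
            p++;
        }
        if (*p == '\n') {
            p++;
        }
    }
    return best;
}
```

Run on the excerpt above, this would report a run of 4 for ACK number 1450261889.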

How does one go about determining what is causing this? The system running lwIP is otherwise quite reliable and will stream the same TCP connection for hours, despite the duplicate ACKs and retransmissions even over my LAN.




_______________________________________________
lwip-users mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/lwip-users
