Re: [lwip-users] Performance drop due port numbers reused too fast


From: Jochen Strohbeck
Subject: Re: [lwip-users] Performance drop due port numbers reused too fast
Date: Thu, 20 Oct 2022 14:44:04 +0200

Hello Indan,


>> from time to time I experience a significant network performance drop.
> 
> Is this during normal use or during load testing/benchmarking?

It happens sooner if I send requests without any delay in between, but
it can also happen with requests sent e.g. every 100 ms.


>> In the wireshark trace I see ACK+RST after a client SYN in case a port
>> number is used in the request which has been already used a few
>> milliseconds before.
> 
> lwIp does the right thing here, and one extra RST to force Windows to
> use another port number should not cause a huge performance drop.

It depends on the OS, the reserved ephemeral port range and the port
selection algorithm.

On Linux, the port number is incremented by a random amount for each new
connection. Most of the time the full ephemeral port range (32k to 64k)
is used, but from time to time only part of that range is used.

A typical port number sequence when the failure occurs is, e.g.:

..., 50094, 50096, 50108, 50118, 50130, 50144, 50152, ..., 50870, 50880,
50894.

Then suddenly a new port number range is chosen and the following
numbers occur:

50110, 50112, 50118, 50130, 50146, ...

The server will send an ACK instead of a SYN+ACK for 50118 and 50130
because these port numbers were used less than a second earlier.

This means some requests fail and some do not; the resulting performance
drop in this example is ca. 50%.
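
To make the overlap concrete, here is a small sketch in Python that
checks which ports of the new sequence already appeared in the earlier
one. It uses only the port numbers quoted above (the elided entries are
left out); the function name `reused_ports` is made up for this
illustration.

```python
# Port numbers copied from the two sequences quoted above.
# Entries hidden behind "..." in the mail are deliberately not filled in.
previous_run = [50094, 50096, 50108, 50118, 50130,
                50144, 50152, 50870, 50880, 50894]
new_run = [50110, 50112, 50118, 50130, 50146]


def reused_ports(earlier, later):
    """Return the ports in `later` that already appeared in `earlier`."""
    seen = set(earlier)
    return [port for port in later if port in seen]


collisions = reused_ports(previous_run, new_run)
print(collisions)
```

Every port in that result is a connection attempt that hits a server
side connection which has not yet expired.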

If the client runs on Windows the problem is more severe because the
port number increment is close to 1, which means that all requests fail
if the new range fully overlaps the previously used range.

I have graphics which show the port reuse algorithm better than I can
explain in words.


> I think the main issue is that Windows propagates the RST as an error
> to the program doing a connect, instead of silently retrying with a
> different port number. So the huge performance drop may be caused by
> the program waiting before reconnecting.

It depends. At first I used a single timeout parameter, e.g. 10 s, for
the HTTP request. This results in multiple retransmits (with backoff)
until the 10 s are over. Because each retransmit uses the same port
number, every retry fails.

I found that most request APIs have a second parameter, e.g. called
connect timeout. When it is set to e.g. 50 ms, the function returns
after 50 ms, so the next request can be sent after 50 ms. This is a huge
improvement, but the root problem remains.
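
The short connect timeout plus retry can be sketched as follows in
Python, using only the standard `socket` module. This is an
illustration, not the client code I actually use: the name
`connect_with_retry`, the attempt count and the injectable `connector`
argument are assumptions. The point is that every attempt opens a new
socket, so the OS can pick a different source port each time.

```python
import socket


def connect_with_retry(host, port, connect_timeout=0.05, attempts=5,
                       connector=socket.create_connection):
    """Connect with a short timeout, using a fresh socket per attempt.

    A fresh socket lets the OS choose a new ephemeral port, unlike TCP
    level retransmits, which keep the same source port.
    """
    last_error = None
    for _ in range(attempts):
        try:
            # The returned socket keeps connect_timeout as its timeout;
            # the caller may want sock.settimeout(...) before reading.
            return connector((host, port), timeout=connect_timeout)
        except OSError as exc:  # timeouts and resets are OSError subclasses
            last_error = exc
    raise ConnectionError(
        f"no connection after {attempts} attempts") from last_error
```

This only shortens the stall per failed attempt; it does not stop the
OS from picking a port that the server still remembers.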

> You can reduce this problem by increasing the ephemeral port range on
> the client PC.

Yes, but I guess there is something else I can do to make the
performance drop less painful. Maybe some other Windows parameter?

What about implementing a custom ephemeral port number generator which
avoids reusing a port number before the MSL time has elapsed?
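
A rough sketch of what I have in mind, in Python. Everything here is
hypothetical: the class name `QuarantinedPortAllocator`, the port range
and the 120 s default are placeholders, and the quarantine would have to
match whatever the server actually uses. The timer starts when a port is
handed out, not when the connection closes, which is a simplification.

```python
import time


class QuarantinedPortAllocator:
    """Hand out local ports, skipping ports used too recently."""

    def __init__(self, low=49152, high=65535, quarantine=120.0,
                 clock=time.monotonic):
        self.low = low
        self.high = high
        self.quarantine = quarantine   # seconds a port stays blocked
        self.clock = clock             # injectable for testing
        self.last_used = {}            # port -> time it was handed out
        self.next_candidate = low

    def allocate(self):
        """Return the next port whose quarantine has expired."""
        now = self.clock()
        span = self.high - self.low + 1
        for _ in range(span):
            port = self.next_candidate
            # Advance round robin, wrapping at the end of the range.
            self.next_candidate = self.low + (port - self.low + 1) % span
            used_at = self.last_used.get(port)
            if used_at is None or now - used_at >= self.quarantine:
                self.last_used[port] = now
                return port
        raise RuntimeError("every port in the range is still quarantined")
```

The chosen port could then be passed to the connect call, e.g.
`socket.create_connection((host, port), timeout=0.05,
source_address=("", allocator.allocate()))`. The bind can still fail if
another process already holds that port, so the caller would need to
retry with the next one.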

Best regards,
Jochen



