The time consuming of tcp waving hands(close(2)) will be affected
by network jitter, especially the wireless device cannot receive
the last-ack under worst environment, in this change we move the
tcp close callback into background and invoke the resource free
from workqueue, which will avoid the user application from being
blocked for a long time and unable to return in the call of close
Signed-off-by: chao.an <anchao@xiaomi.com>
1. remove the unnecessary interfaces tcp_close_monitor()
socket flags(s_flags) is a global state for net connection
remove the incorrect update for stop monitor
2. do not start the tcp monitor from duplicated psock
the tcp monitor has already registered in connect callback
------------------------------------------------------------
This patch also fix the telnet issue reported by:
https://github.com/apache/incubator-nuttx/pull/5434#issuecomment-1035600651
the orignal session fd is closed after dup, the connect state
has incorrectly migrated to close:
drivers/net/telnet.c:
977 static int telnet_session(FAR struct telnet_session_s *session)
...
1031 ret = psock_dup2(psock, &priv->td_psock);
...
1082 nx_close(session->ts_sd);
Signed-off-by: chao.an <anchao@xiaomi.com>
According to RFC 5681 (3.2) the TCP Fast Retransmit algorithm should start
if the threshold of 3 duplicate ACKs is reached.
Thus the threshold should be a constant, not an integer option.
tcp_sendfile() reads data directly from a file and does not use NET_TCP_WRITE_BUFFERS data flow
even if CONFIG_NET_TCP_WRITE_BUFFERS option is enabled.
Despite this, tcp_sendfile relied on NET_TCP_WRITE_BUFFERS specific flow control variables that
were idle during sendfile operation. Thus it was a total inconsistency.
E.g. because of the issue, TCP socket used by sendfile() operation never issued
FIN packet on close() command, and the TCP connection hung up.
As a result of the fix, simultaneously enabled CONFIG_NET_TCP_WRITE_BUFFERS and
CONFIG_NET_SENDFILE options can coexist.
If the remote TCP receiver advertised TCP window size greater than 64 KB
and TCP ACK packets returned to the NuttX TCP sender with a significant delay,
tx_unacked variable overflowed and further TCP send stalled forever
(until TCP re-connection).
In case of enabled packet forwarding mode, packets were forwarded in a reverse order
because of LIFO behavior of the connection event list.
The issue exposed only during high network traffic. Thus the event list started to grow
that resulted in changing the order of packets inside of groups of several packets
like the following: 3, 2, 1, 6, 5, 4, 8, 7 etc.
Remarks concerning the connection event list implementation:
* Now the queue (list) is FIFO as it should be.
* The list is singly linked.
* The list has a head pointer (inside of outer net_driver_s structure),
and a tail pointer is added into outer net_driver_s structure.
* The list item is devif_callback_s structure.
It still has two pointers to two different list chains (*nxtconn and *nxtdev).
* As before the first argument (*dev) of the list functions can be NULL,
while the other argument (*list) is effective (not NULL).
* An extra (*tail) argument is added to devif_callback_alloc()
and devif_conn_callback_free() functions.
* devif_callback_alloc() time complexity is O(1) (i.e. O(n) to fill the whole list).
* devif_callback_free() time complexity is O(n) (i.e. O(n^2) to empty the whole list).
* devif_conn_event() time complexity is O(n).
* Do not accept the window in old segments.
Implement SND.WL1/WL2 things in the RFC.
* Do not accept the window in the segment w/o ACK bit set.
The window is an offset from the ack seq.
(maybe it's simpler to just drop segments w/o ACK though)
* Subtract snd_wnd by the amount of the ack advancement.
Consider a bi-directional TCP connection:
1. we use all IOBs for tx queue
2. we advertize zero recv window because we have no free IOBs
3. if the peer tcp does the same thing,
both sides advertize zero window and can not drain the tx queue.
For a similar stall to happen, the peer doesn't need to be
a naive tcp implementation like nuttx. A naive application blocking
on send() without draining its read buffer is enough.
(Probably such an application should be fixed to drain rx even
when tx is full. However, it's another story.)
This commit avoids the situation by prevent tx from grabbing
the all IOBs in the first place. (assuming CONFIG_IOB_THROTTLE > 0)
1. change all window relative value type to uint32_t
2. move window range validity check(UINT16_MAX) before assembling TCP header
Signed-off-by: chao.an <anchao@xiaomi.com>
Do not bother to preserve segment boundaries in the tcp
readahead queues.
* Avoid wasting the tail IOB space for each segments.
Instead, pack the newly received data into the tail space
of the last IOB. Also, advertise the tail space as
a part of the window.
* Use IOB chain directly. Eliminate IOB queue overhead.
* Allow to accept only a part of a segment.
* This change improves the memory efficiency.
And probably more importantly, allows less-confusing
recv window advertisement behavior.
Previously, even when we advertise N bytes window,
we often couldn't actually accept N bytes. Depending on
the segment sizes and IOB configurations, it was causing
segment drops.
Also, the previous code was moving the right edge of the
window back and forth too often, even when nothing in
the system was competing on the IOBs. Shrinking the
window that way is a kinda well known recipe to confuse
the peer stack.
* Fixes the case where the window was small but not zero.
* tcp_recvfrom: Remove tcp_ackhandler. Instead, simply schedule TX for
a possible window update and make tcp_appsend decide.
* Replace rcv_wnd (the last advertized window size value) with
rcv_adv. (the window edge sequence number advertized to the peer)
rcv_wnd was complicated to deal with because its base (rcvseq) is
also moving.
* tcp_appsend: Send a window update even if there are no other reasons
to send an ack.
Namely, send an update if it increases the window by
* 2 * mss
* or the half of the max possible window size
Request the TCP ACK to estimate the receive window after handle
any data already buffered in a read-ahead buffer.
Change-Id: Id998a1125dd2991d73ba4bef081ddcb7adea4f0d
Signed-off-by: chao.an <anchao@xiaomi.com>
RFC2001: TCP Slow Start, Congestion Avoidance, Fast Retransmit,
and Fast Recovery Algorithms
...
3. Fast Retransmit
Modifications to the congestion avoidance algorithm were proposed in
1990 [3]. Before describing the change, realize that TCP may
generate an immediate acknowledgment (a duplicate ACK) when an out-
of-order segment is received (Section 4.2.2.21 of [1], with a note
that one reason for doing so was for the experimental fast-
retransmit algorithm). This duplicate ACK should not be delayed.
The purpose of this duplicate ACK is to let the other end know that a
segment was received out of order, and to tell it what sequence
number is expected.
Since TCP does not know whether a duplicate ACK is caused by a lost
segment or just a reordering of segments, it waits for a small number
of duplicate ACKs to be received. It is assumed that if there is
just a reordering of the segments, there will be only one or two
duplicate ACKs before the reordered segment is processed, which will
then generate a new ACK. If three or more duplicate ACKs are
received in a row, it is a strong indication that a segment has been
lost. TCP then performs a retransmission of what appears to be the
missing segment, without waiting for a retransmission timer to
expire.
Change-Id: Ie2cbcecab507c3d831f74390a6a85e0c5c8e0652
Signed-off-by: chao.an <anchao@xiaomi.com>
Add a fallback mechanism to ensure that there are still available
iobs for an free connection, Guarantees all connections will have
a minimum threshold iob to keep the connection not be hanged.
Change-Id: I59bed98d135ccd1f16264b9ccacdd1b0d91261de
Signed-off-by: chao.an <anchao@xiaomi.com>
Fixes an issue where tcp sockets with activated keepalives stalled and
were not properly closed. Poll would not indicate a POLLHUP and therefore
locks down the application.
* tcp_conn_s.tcp_conn_s & tcp_conn_s.keepintvl changed to uint32_t
According RFC1122 keepidle MUST have a default of 2 hours.
It's enough to check the buffer available in the net event handler
Change-Id: I2d7c7a03675cf6eff6ffb42a81b7c7245253e92c
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
MSG_DONTWAIT (since Linux 2.2)
Enables nonblocking operation; if the operation would block, the
call fails with the error EAGAIN or EWOULDBLOCK. This provides
similar behavior to setting the O_NONBLOCK flag (via the fcntl(2)
F_SETFL operation), but differs in that MSG_DONTWAIT is a per-call
option, whereas O_NONBLOCK is a setting on the open file description
(see open(2)), which will affect all threads in the calling process
and as well as other processes that hold file descriptors referring
to the same open file description.