Issue:
With the current design, the TCP rx buffer is freed only after the 4-way handshake.
The rx buffers of 3 sockets might be consumed during the ffmpeg music switching procedure,
which might exhaust the IOBs.
Solution:
Free the TCP rx buffer immediately in tcp_close to make sure the IOBs won't be
exhausted.
Signed-off-by: 梁超众 <liangchaozhong@xiaomi.com>
Signed-off-by: chao an <anchao@xiaomi.com>
1. Remove tcp_txdrain() from close() to avoid blocking indefinitely
2. Send TCP_RST immediately if the linger timeout expires
Signed-off-by: chao an <anchao@xiaomi.com>
devif/ipv6_input.c: In function 'ipv6_input':
devif/ipv6_input.c:59:33: error: 'TCPIPv6BUF' undeclared (first use in this function); did you mean 'IPv6BUF'?
59 | #define PAYLOAD ((FAR uint8_t *)TCPIPv6BUF)
| ^~~~~~~~~~
devif/ipv6_input.c:302:14: note: in expansion of macro 'PAYLOAD'
302 | payload = PAYLOAD; /* Assume payload starts right after IPv6 header */
|
Signed-off-by: chao an <anchao@xiaomi.com>
Fixed by commit:
| net: remove pvconn reference from all devif callback
|
| Do not use 'pvconn' argument to get the connection pointer since
| pvconn is normally NULL for some events like NETDEV_DOWN.
| Instead, the connection pointer can be reliably obtained from the
| corresponding private pointer.
Signed-off-by: chao an <anchao@xiaomi.com>
I noticed that the conn instance leaks during stress tests.
The close work queued from tcp_close_eventhandler() is canceled
by tcp_timer() immediately:
Breakpoint 1, tcp_close_eventhandler (dev=0x565cd338 <up_irq_restore+108>, pvpriv=0x5655e6ff <getpid+12>, flags=0) at tcp/tcp_close.c:71
(gdb) bt
| #0 tcp_close_eventhandler (dev=0x565cd338 <up_irq_restore+108>, pvpriv=0x5655e6ff <getpid+12>, flags=0) at tcp/tcp_close.c:71
| #1 0x5658bf1e in devif_conn_event (dev=0x5660bd80 <g_sim_dev>, flags=512, list=0x5660d558 <g_cbprealloc+312>) at devif/devif_callback.c:508
| #2 0x5658a219 in tcp_callback (dev=0x5660bd80 <g_sim_dev>, conn=0x5660c4a0 <g_tcp_connections>, flags=512) at tcp/tcp_callback.c:167
| #3 0x56589253 in tcp_timer (dev=0x5660bd80 <g_sim_dev>, conn=0x5660c4a0 <g_tcp_connections>) at tcp/tcp_timer.c:378
| #4 0x5658dd47 in tcp_poll (dev=0x5660bd80 <g_sim_dev>, conn=0x5660c4a0 <g_tcp_connections>) at tcp/tcp_devpoll.c:95
| #5 0x5658b95f in devif_poll_tcp_connections (dev=0x5660bd80 <g_sim_dev>, callback=0x565770f2 <netdriver_txpoll>) at devif/devif_poll.c:601
| #6 0x5658b9ea in devif_poll (dev=0x5660bd80 <g_sim_dev>, callback=0x565770f2 <netdriver_txpoll>) at devif/devif_poll.c:722
| #7 0x56577230 in netdriver_txavail_work (arg=0x5660bd80 <g_sim_dev>) at sim/up_netdriver.c:308
| #8 0x5655999e in work_thread (argc=2, argv=0xf3db5dd0) at wqueue/kwork_thread.c:178
| #9 0x5655983f in nxtask_start () at task/task_start.c:129
(gdb) c
Continuing.
Breakpoint 2, tcp_update_timer (conn=0x5660c4a0 <g_tcp_connections>) at tcp/tcp_timer.c:178
(gdb) bt
| #0 tcp_update_timer (conn=0x5660c4a0 <g_tcp_connections>) at tcp/tcp_timer.c:178
| #1 0x5658952a in tcp_timer (dev=0x5660bd80 <g_sim_dev>, conn=0x5660c4a0 <g_tcp_connections>) at tcp/tcp_timer.c:708
| #2 0x5658dd47 in tcp_poll (dev=0x5660bd80 <g_sim_dev>, conn=0x5660c4a0 <g_tcp_connections>) at tcp/tcp_devpoll.c:95
| #3 0x5658b95f in devif_poll_tcp_connections (dev=0x5660bd80 <g_sim_dev>, callback=0x565770f2 <netdriver_txpoll>) at devif/devif_poll.c:601
| #4 0x5658b9ea in devif_poll (dev=0x5660bd80 <g_sim_dev>, callback=0x565770f2 <netdriver_txpoll>) at devif/devif_poll.c:722
| #5 0x56577230 in netdriver_txavail_work (arg=0x5660bd80 <g_sim_dev>) at sim/up_netdriver.c:308
| #6 0x5655999e in work_thread (argc=2, argv=0xf3db5dd0) at wqueue/kwork_thread.c:178
| #7 0x5655983f in nxtask_start () at task/task_start.c:129
A separate work structure adds 24 bytes to each conn instance,
but I cannot find a better way to support asynchronous close()
than adding a separate work.
For resource-constrained targets, I recommend that developers enable
CONFIG_NET_ALLOC_CONNS, which reduces the RAM usage.
Signed-off-by: chao an <anchao@xiaomi.com>
Do not use 'pvconn' argument to get the connection pointer since
pvconn is normally NULL for some events like NETDEV_DOWN.
Instead, the connection pointer can be reliably obtained from the
corresponding private pointer.
Signed-off-by: chao.an <anchao@xiaomi.com>
since it is impossible to track the producer and consumer
correctly if the TCP/IP stack passes IOBs directly to the netdev
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
Fix a regression that incorrectly updated rexmit_seq in buffered mode.
rexmit_seq should not be used instead of sndseq in fast retransmission;
the sndseq in the retransmitted packet does not need to be updated again.
Signed-off-by: chao.an <anchao@xiaomi.com>
Retransmit only the earliest unacknowledged segment
(according to RFC 6298 (5.4)). The issue is the same as it was
in tcp_send_unbuffered.c and tcp_sendfile.c.
Signed-off-by: chao.an <anchao@xiaomi.com>
The time taken by the TCP close handshake (close(2)) is affected
by network jitter. In particular, a wireless device may fail to receive
the last ACK under poor network conditions. This change moves the
TCP close callback into the background and frees the resources
from the work queue, so that the user application is not
blocked for a long time in the call to close().
Signed-off-by: chao.an <anchao@xiaomi.com>
1. Remove the unnecessary interface tcp_close_monitor().
The socket flags (s_flags) are a global state of the net connection;
remove the incorrect update that stops the monitor.
2. Do not start the TCP monitor from a duplicated psock.
The TCP monitor has already been registered in the connect callback.
------------------------------------------------------------
This patch also fixes the telnet issue reported in:
https://github.com/apache/incubator-nuttx/pull/5434#issuecomment-1035600651
The original session fd is closed after dup, and the connection state
incorrectly migrated to closed:
drivers/net/telnet.c:
977 static int telnet_session(FAR struct telnet_session_s *session)
...
1031 ret = psock_dup2(psock, &priv->td_psock);
...
1082 nx_close(session->ts_sd);
Signed-off-by: chao.an <anchao@xiaomi.com>
According to RFC 5681 (3.2) the TCP Fast Retransmit algorithm should start
if the threshold of 3 duplicate ACKs is reached.
Thus the threshold should be a constant, not an integer option.
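
A minimal sketch of the duplicate ACK counting that this change relies on, with the threshold as a compile-time constant. The names TCP_FAST_RETRANSMIT_THRESHOLD, struct dupack_state and dupack_on_ack() are hypothetical and are not the NuttX identifiers. The sketch also simplifies the RFC 5681 definition of a duplicate ACK (it only compares ACK numbers):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant name: RFC 5681 (3.2) fixes the threshold at 3 */

#define TCP_FAST_RETRANSMIT_THRESHOLD 3

struct dupack_state
{
  uint32_t last_ack;   /* Last ACK number seen */
  uint8_t  dupacks;    /* Consecutive duplicate ACKs for last_ack */
};

/* Returns true exactly when the third duplicate ACK arrives.
 * Simplification: any ACK number that differs from last_ack is treated
 * as a new ACK and resets the counter.
 */

static bool dupack_on_ack(struct dupack_state *s, uint32_t ackno)
{
  if (ackno != s->last_ack)
    {
      s->last_ack = ackno;
      s->dupacks  = 0;
      return false;
    }

  if (s->dupacks < UINT8_MAX)
    {
      s->dupacks++;
    }

  return s->dupacks == TCP_FAST_RETRANSMIT_THRESHOLD;
}
```

Because the comparison uses equality, the fourth and later duplicates do not trigger another fast retransmit for the same ACK number.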
tcp_sendfile() reads data directly from a file and does not use the NET_TCP_WRITE_BUFFERS data flow
even if the CONFIG_NET_TCP_WRITE_BUFFERS option is enabled.
Despite this, tcp_sendfile() relied on NET_TCP_WRITE_BUFFERS-specific flow control variables that
were idle during the sendfile operation, which was inconsistent.
For example, because of this issue, a TCP socket used by sendfile() never issued
a FIN packet on close(), and the TCP connection hung.
With the fix, the CONFIG_NET_TCP_WRITE_BUFFERS and
CONFIG_NET_SENDFILE options can be enabled simultaneously.
If the remote TCP receiver advertised a TCP window size greater than 64 KB
and TCP ACK packets returned to the NuttX TCP sender with a significant delay,
the tx_unacked variable overflowed and further TCP sends stalled forever
(until the TCP connection was re-established).
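
The overflow can be illustrated with a small sketch. It assumes, for illustration only, that the unacknowledged byte counter was 16 bits wide; the helper names unacked_add16() and unacked_add32() are hypothetical:

```c
#include <stdint.h>

/* Assumption for illustration: a 16-bit counter of unacknowledged bytes.
 * Once more than 65535 bytes are in flight, the sum wraps modulo 65536.
 */

static uint16_t unacked_add16(uint16_t unacked, uint16_t sent)
{
  return (uint16_t)(unacked + sent);
}

/* A 32-bit counter holds the same sum without wrapping */

static uint32_t unacked_add32(uint32_t unacked, uint16_t sent)
{
  return unacked + sent;
}
```

With 65000 bytes already unacknowledged, sending one more 1460-byte segment gives 924 in the 16-bit version and 66460 in the 32-bit version.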
With packet forwarding enabled, packets were forwarded in reverse order
because of the LIFO behavior of the connection event list.
The issue appeared only under high network traffic, when the event list started to grow.
This reordered the packets within groups of several packets,
for example: 3, 2, 1, 6, 5, 4, 8, 7 etc.
Remarks concerning the connection event list implementation:
* Now the queue (list) is FIFO as it should be.
* The list is singly linked.
* The list has a head pointer (inside of outer net_driver_s structure),
and a tail pointer is added into outer net_driver_s structure.
* The list item is devif_callback_s structure.
It still has two pointers to two different list chains (*nxtconn and *nxtdev).
* As before the first argument (*dev) of the list functions can be NULL,
while the other argument (*list) is effective (not NULL).
* An extra (*tail) argument is added to devif_callback_alloc()
and devif_conn_callback_free() functions.
* devif_callback_alloc() time complexity is O(1) (i.e. O(n) to fill the whole list).
* devif_callback_free() time complexity is O(n) (i.e. O(n^2) to empty the whole list).
* devif_conn_event() time complexity is O(n).
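
The head/tail arrangement described in the remarks above can be sketched as follows. This is a simplified model, not the NuttX code: struct cb_node stands in for devif_callback_s with a single chain, and cb_append() and cb_remove() are hypothetical names:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for devif_callback_s (one chain only) */

struct cb_node
{
  struct cb_node *next;
  int id;
};

struct cb_list
{
  struct cb_node *head;
  struct cb_node *tail;
};

/* O(1) append at the tail keeps the list in FIFO order */

static void cb_append(struct cb_list *l, struct cb_node *n)
{
  n->next = NULL;
  if (l->tail != NULL)
    {
      l->tail->next = n;
    }
  else
    {
      l->head = n;
    }

  l->tail = n;
}

/* O(n) removal: a singly linked list must be walked to find the
 * predecessor, and the tail pointer must be fixed up if the last
 * node is removed.
 */

static void cb_remove(struct cb_list *l, struct cb_node *n)
{
  struct cb_node *prev = NULL;
  struct cb_node *curr;

  for (curr = l->head; curr != NULL; prev = curr, curr = curr->next)
    {
      if (curr == n)
        {
          if (prev != NULL)
            {
              prev->next = curr->next;
            }
          else
            {
              l->head = curr->next;
            }

          if (l->tail == curr)
            {
              l->tail = prev;
            }

          curr->next = NULL;
          return;
        }
    }
}
```

Keeping the list singly linked saves one pointer per node at the cost of the O(n) walk on removal.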
* Do not accept the window in old segments.
Implement SND.WL1/WL2 things in the RFC.
* Do not accept the window in the segment w/o ACK bit set.
The window is an offset from the ack seq.
(maybe it's simpler to just drop segments w/o ACK though)
* Subtract snd_wnd by the amount of the ack advancement.
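
A minimal sketch of the first two rules above, following the SND.WL1/SND.WL2 check from the TCP specification. The struct and function names are hypothetical, the validation of the ACK number against SND.UNA/SND.NXT is omitted, and the third rule (shrinking snd_wnd as the ACK advances) is not shown:

```c
#include <stdbool.h>
#include <stdint.h>

struct snd_state
{
  uint32_t snd_wnd;   /* Send window */
  uint32_t snd_wl1;   /* Segment seq number used for last window update */
  uint32_t snd_wl2;   /* Segment ack number used for last window update */
};

/* Sequence number comparisons modulo 2^32 */

static bool seq_lt(uint32_t a, uint32_t b)
{
  return (int32_t)(a - b) < 0;
}

static bool seq_leq(uint32_t a, uint32_t b)
{
  return (int32_t)(a - b) <= 0;
}

/* Returns true if the window was taken from this segment */

static bool snd_wnd_update(struct snd_state *s, bool has_ack,
                           uint32_t seg_seq, uint32_t seg_ack,
                           uint32_t seg_wnd)
{
  if (!has_ack)
    {
      return false;   /* The window is an offset from the ack seq */
    }

  if (seq_lt(s->snd_wl1, seg_seq) ||
      (s->snd_wl1 == seg_seq && seq_leq(s->snd_wl2, seg_ack)))
    {
      s->snd_wnd = seg_wnd;
      s->snd_wl1 = seg_seq;
      s->snd_wl2 = seg_ack;
      return true;
    }

  return false;       /* Old segment: keep the current window */
}
```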
Consider a bi-directional TCP connection:
1. we use all IOBs for tx queue
2. we advertise a zero recv window because we have no free IOBs
3. if the peer tcp does the same thing,
both sides advertise a zero window and can not drain the tx queue.
For a similar stall to happen, the peer doesn't need to be
a naive tcp implementation like nuttx. A naive application blocking
on send() without draining its read buffer is enough.
(Probably such an application should be fixed to drain rx even
when tx is full. However, it's another story.)
This commit avoids the situation by preventing tx from grabbing
all the IOBs in the first place. (assuming CONFIG_IOB_THROTTLE > 0)
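
The idea of a throttled allocation can be modeled with a simple counter. This is a toy model under stated assumptions, not the NuttX IOB allocator: struct iob_pool and pool_tryalloc() are hypothetical, and the throttle field plays the role of CONFIG_IOB_THROTTLE:

```c
#include <stdbool.h>

/* Toy model of a buffer pool with a reserve for non-throttled users */

struct iob_pool
{
  int nfree;      /* Buffers currently free */
  int throttle;   /* Buffers that throttled callers may not take */
};

/* A throttled caller (tx in this model) fails as soon as only the
 * reserve is left, so a non-throttled caller (rx) can still allocate.
 */

static bool pool_tryalloc(struct iob_pool *p, bool throttled)
{
  int reserve = throttled ? p->throttle : 0;

  if (p->nfree <= reserve)
    {
      return false;
    }

  p->nfree--;
  return true;
}
```

With 4 free buffers and a throttle of 2, tx can take 2 buffers and rx can still take the remaining 2, so the receive window does not have to drop to zero just because the tx queue is full.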
1. change the type of all window-related values to uint32_t
2. move the window range validity check (UINT16_MAX) before assembling the TCP header
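
A minimal sketch of the second point: the window is carried as a uint32_t internally and is clamped to the 16-bit header field only at the moment the header is assembled. The function name tcp_wnd_to_header() is hypothetical, and window scaling is ignored here:

```c
#include <stdint.h>

/* Clamp a 32-bit internal window value to the 16-bit TCP header field.
 * Hypothetical helper; window scaling is not considered in this sketch.
 */

static uint16_t tcp_wnd_to_header(uint32_t recvwnd)
{
  if (recvwnd > UINT16_MAX)
    {
      recvwnd = UINT16_MAX;
    }

  return (uint16_t)recvwnd;
}
```

Doing the range check at this single point means intermediate arithmetic on the window cannot silently truncate.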
Signed-off-by: chao.an <anchao@xiaomi.com>