460 Commits

Author SHA1 Message Date
chao.an
aab03ef86d net/tcp: add window scale support
Reference here:
https://tools.ietf.org/html/rfc1323

Signed-off-by: chao.an <anchao@xiaomi.com>
2021-07-07 03:55:41 -05:00
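A minimal sketch of the window scale arithmetic from RFC 1323, for readers unfamiliar with the option. The function names are hypothetical and this is not the NuttX implementation; it only assumes the RFC's rules that the window field is 16 bits wide and the shift count is at most 14.

```c
#include <stdint.h>

#define TCP_MAX_WSCALE 14  /* RFC 1323 caps the shift count at 14 */

/* Pick the smallest shift count that lets a 16-bit window field
 * cover a receive buffer of `bufsize` bytes (hypothetical helper).
 */

static uint8_t wscale_for_buffer(uint32_t bufsize)
{
  uint8_t shift = 0;

  while (shift < TCP_MAX_WSCALE && (UINT32_C(65535) << shift) < bufsize)
    {
      shift++;
    }

  return shift;
}

/* Value to place in the 16-bit window field of a non-SYN segment. */

static uint16_t wscale_encode(uint32_t window, uint8_t shift)
{
  uint32_t field = window >> shift;

  return field > 65535 ? 65535 : (uint16_t)field;
}

/* Window that the receiver of that segment reconstructs. */

static uint32_t wscale_decode(uint16_t field, uint8_t shift)
{
  return (uint32_t)field << shift;
}
```

Note that the encoded window loses its low `shift` bits, so the peer may see a slightly smaller window than the one passed to `wscale_encode`.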
chao.an
a5cdc4e69b net/tcp: change the tcp optdata to dynamic arrays
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-07-07 03:55:41 -05:00
chao.an
87bffc190c net/tcp: remove the invalid break during tcp option loop
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-07-07 03:55:41 -05:00
chao.an
b901f22c27 net/socket: add SO_RCVBUF support
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-07-06 01:44:55 -05:00
chao.an
eabe535de7 net/inet: add support of FIONREAD
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-07-05 06:20:52 -05:00
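For context, a sketch of how an application typically uses FIONREAD on a POSIX-like host. The helper name `bytes_readable` is hypothetical; only `ioctl()` and the `FIONREAD` request come from the standard headers, and the exact header that defines `FIONREAD` can vary by platform.

```c
#include <sys/ioctl.h>

/* Return the number of bytes that can be read from fd without
 * blocking, or -1 if the ioctl fails (hypothetical helper).
 */

static int bytes_readable(int fd)
{
  int nbytes = 0;

  if (ioctl(fd, FIONREAD, &nbytes) < 0)
    {
      return -1;
    }

  return nbytes;
}
```

A caller could use the result to size a receive buffer before calling `recv()`.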
Anthony Merlino
a885b24cc1 Attempt to fix race condition reported in issue #3647 2021-07-04 08:54:15 -05:00
YAMAMOTO Takashi
669619a06a tcp_close: Fix a race with passive close
tcp_close disposes of the connection immediately if it is called in
TCP_LAST_ACK. When that happens, we end up responding to the
last ACK with a RST.

This commit fixes it by making tcp_close wait for the completion
of the passive close.
2021-07-02 13:54:15 +09:00
YAMAMOTO Takashi
c7ba75697c tcp_recvwindow.c: Use iob_tailroom to replace the home grown one 2021-06-30 06:40:13 -05:00
YAMAMOTO Takashi
52c237cb5f net/tcp/tcp.h: Update a comment about readahead 2021-06-30 06:40:13 -05:00
YAMAMOTO Takashi
08e9dff0e9 tcp_close: disable send callback before sending FIN
This fixes connection closing issues with CONFIG_NET_TCP_WRITE_BUFFERS.

Because TCP_CLOSE is used for both input and output of tcp_callback,
the close callback and the send callback confuse each other as
follows. Since this effectively disposes of the connection immediately,
we end up responding to the subsequent ACK and FIN/ACK from the peer
with RSTs.

tcp_timer
    -> tcp_close_eventhandler
        returns TCP_CLOSE (meaning an active close)
    -> psock_send_eventhandler
        called with TCP_CLOSE from tcp_close_eventhandler, misinterpreted as
        a passive close.
        -> tcp_lost_connection
            -> tcp_shutdown_monitor
                -> tcp_callback
                    -> tcp_close_eventhandler
                        misinterpret TCP_CLOSE from itself as
                        a passive close
2021-06-30 06:39:13 -05:00
YAMAMOTO Takashi
326a8ef0a2 tcp_close_disconnect: don't nullify sndcb
It isn't necessary and I plan to use the value later in
the close processing.
2021-06-30 06:39:13 -05:00
YAMAMOTO Takashi
8472430f22 tcp_close: replace scaring comments 2021-06-30 06:39:13 -05:00
YAMAMOTO Takashi
1ce13ee731 tcp_reset: Don't copy the peer window
The current code just leaves the window value from the peer's
segment in place, which doesn't make sense.

Instead, always use 0.
This matches what NetBSD and Linux do.
(As far as I can tell from reading their code.)
2021-06-29 22:23:48 -05:00
YAMAMOTO Takashi
98e7c6924d tcp: always responds to keep-alive segments
* It doesn't make sense to make this conditional on our own
  SO_KEEPALIVE support (CONFIG_NET_TCP_KEEPALIVE).
  We have no control over the peer TCP stack,
  which decides whether to send us keep-alive probes.

* We should respond to them in non-ESTABLISHED states as well, e.g. FIN_WAIT_2.
  See also:
  https://github.com/apache/incubator-nuttx/pull/3919#issuecomment-868248576
2021-06-30 11:52:08 +09:00
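A sketch of how a keep-alive probe can be recognized, assuming the convention described in RFC 1122 (sequence number one below the next expected byte, carrying zero or one octet of data). The function name is hypothetical and the check NuttX actually performs may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if a segment looks like a keep-alive probe:
 * it starts one byte before the next expected sequence number
 * and carries at most one (garbage) byte.  Assumed convention,
 * not taken from the NuttX sources.
 */

static bool looks_like_keepalive(uint32_t seg_seq, uint16_t seg_len,
                                 uint32_t rcv_nxt)
{
  return seg_len <= 1 && seg_seq == (uint32_t)(rcv_nxt - 1);
}
```

Nothing in this test depends on the connection state or on the local SO_KEEPALIVE setting, which is consistent with answering probes unconditionally.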
YAMAMOTO Takashi
4878b7729c tcp: simplify readahead
Do not bother to preserve segment boundaries in the tcp
readahead queues.

* Avoid wasting the tail IOB space for each segment.
  Instead, pack the newly received data into the tail space
  of the last IOB. Also, advertise the tail space as
  a part of the window.

* Use IOB chain directly. Eliminate IOB queue overhead.

* Allow accepting only a part of a segment.

* This change improves the memory efficiency.
  And probably more importantly, allows less-confusing
  recv window advertisement behavior.
  Previously, even when we advertise N bytes window,
  we often couldn't actually accept N bytes. Depending on
  the segment sizes and IOB configurations, it was causing
  segment drops.
  Also, the previous code was moving the right edge of the
  window back and forth too often, even when nothing in
  the system was competing for the IOBs. Shrinking the
  window that way is a kinda well known recipe to confuse
  the peer stack.
2021-06-30 06:22:14 +09:00
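The packing idea described above can be sketched with a plain buffer chain. `struct buf`, `BUF_SIZE`, and the helper names are hypothetical stand-ins, not the NuttX IOB API; the sketch only illustrates filling the tail of the last buffer, accepting part of a segment, and counting the tail space as part of the window.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 196  /* illustrative buffer size */

struct buf
{
  struct buf *next;
  size_t len;                    /* bytes used in data[] */
  unsigned char data[BUF_SIZE];
};

/* Unused bytes at the end of one buffer. */

static size_t buf_tailroom(const struct buf *b)
{
  return BUF_SIZE - b->len;
}

/* Copy as much of src as fits into the tail of the last buffer.
 * Returns the number of bytes consumed, which may be less than n;
 * a caller would allocate new buffers for the remainder.
 */

static size_t chain_pack_tail(struct buf *head, const unsigned char *src,
                              size_t n)
{
  struct buf *last = head;
  size_t room;

  if (head == NULL)
    {
      return 0;
    }

  while (last->next != NULL)
    {
      last = last->next;
    }

  room = buf_tailroom(last);
  if (n > room)
    {
      n = room;
    }

  memcpy(&last->data[last->len], src, n);
  last->len += n;
  return n;
}

/* Window we can really accept: free buffers plus the tail space
 * of the last buffer already in the chain.
 */

static size_t recv_window(const struct buf *head, size_t nfree)
{
  size_t win = nfree * BUF_SIZE;

  if (head != NULL)
    {
      while (head->next != NULL)
        {
          head = head->next;
        }

      win += buf_tailroom(head);
    }

  return win;
}
```

Because the window is derived from space that can actually be filled, an advertised N bytes can be accepted in full regardless of how the peer splits them into segments.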
YAMAMOTO Takashi
0886257eb4 tcp_input: Accept segments spanning over rcvseq 2021-06-30 06:22:14 +09:00
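A sketch of what accepting a segment that spans rcvseq involves: the part that was already received is skipped and only the new tail is kept. The function and parameter names are hypothetical; the subtraction relies on modulo-2^32 sequence arithmetic.

```c
#include <stdint.h>

/* Number of leading payload bytes that lie before rcv_nxt and have
 * therefore already been received.  Returns 0 if the segment starts
 * at or after rcv_nxt, and seg_len if the whole segment is old.
 */

static uint32_t seg_trim_front(uint32_t seg_seq, uint32_t seg_len,
                               uint32_t rcv_nxt)
{
  int32_t overlap = (int32_t)(rcv_nxt - seg_seq);

  if (overlap <= 0)
    {
      return 0;
    }

  return (uint32_t)overlap < seg_len ? (uint32_t)overlap : seg_len;
}
```

A receiver would drop the returned number of bytes from the front of the payload and process the rest as if the segment started at rcv_nxt.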
YAMAMOTO Takashi
eeafe070ec tcp.h: Add TCP_SEQ_ADD 2021-06-30 06:22:14 +09:00
YAMAMOTO Takashi
022a2490d1 tcp: Change the way to advance rcvseq
* Move the code to advance rcvseq for user data from tcp_input
  to receive handlers.
  Motivation: allow partial ack.

* If we drop a segment, ignore FIN as well. Note that the TCP FIN bit is
  logically after the user data in the same segment.
2021-06-30 06:22:14 +09:00
YAMAMOTO Takashi
1448de4451 tcp_should_send_recvwindow: Remove function name from ninfo()
ninfo() itself usually prefixes the function name
automatically for us.
2021-06-18 00:47:47 -05:00
YAMAMOTO Takashi
ca0f2bdb95 tcp_get_recvwindow: Make this match the reality (tcp_datahandler)
It doesn't make much sense to advertize more than we can actually
accept, especially when we don't queue partial segments.
2021-06-15 01:17:38 -05:00
YAMAMOTO Takashi
af64912833 tcp_get_recvwindow: use tcp_rx_mss
Just to reduce code duplication.
No functional changes are intended.
2021-06-15 01:17:38 -05:00
YAMAMOTO Takashi
64676641cb tcp_datahandler: try throttled=false on iob_trycopyin failure as well
I assume this was just an oversight because I couldn't
find any obvious reason to special-case only the first IOB.

The commit message of the original commit is cited below.

```
commit bf21056001
Author: chao.an <anchao@xiaomi.com>
Date:   Fri Nov 27 09:50:38 2020 +0800

    net/tcp: fallback to unthrottle pool to avoid deadlock

    Add a fallback mechanism to ensure that there are still available
    iobs for an free connection, Guarantees all connections will have
    a minimum threshold iob to keep the connection not be hanged.

    Change-Id: I59bed98d135ccd1f16264b9ccacdd1b0d91261de
    Signed-off-by: chao.an <anchao@xiaomi.com>
```
2021-06-15 01:17:38 -05:00
YAMAMOTO Takashi
0347cd3f09 tcp_should_send_recvwindow: Add a few ninfo() 2021-06-13 21:20:24 -05:00
YAMAMOTO Takashi
14ec75e7fc tcp: window update improvements
* Fixes the case where the window was small but not zero.

* tcp_recvfrom: Remove tcp_ackhandler. Instead, simply schedule TX for
  a possible window update and make tcp_appsend decide.

* Replace rcv_wnd (the last advertized window size value) with
  rcv_adv. (the window edge sequence number advertized to the peer)
  rcv_wnd was complicated to deal with because its base (rcvseq) is
  also moving.

* tcp_appsend: Send a window update even if there are no other reasons
  to send an ack.
  Namely, send an update if it increases the window by
    * 2 * mss
    * or the half of the max possible window size
2021-06-13 21:20:24 -05:00
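The window update rule listed above can be sketched as follows. The function signature is hypothetical; `rcv_adv` is treated as the right window edge last advertised to the peer, as the commit message describes, and the use of `>=` for the thresholds is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a pure window update is worth sending.
 *
 *   rcvseq  - next sequence number we expect to receive
 *   rcv_adv - right window edge last advertised to the peer
 *   win     - window we could advertise right now
 *   mss     - receive MSS
 *   maxwin  - largest window we could ever advertise
 */

static bool should_send_window_update(uint32_t rcvseq, uint32_t rcv_adv,
                                      uint32_t win, uint32_t mss,
                                      uint32_t maxwin)
{
  uint32_t new_edge = rcvseq + win;            /* wraps modulo 2^32 */
  int32_t gain = (int32_t)(new_edge - rcv_adv);

  if (gain <= 0)
    {
      return false;                            /* edge would not move right */
    }

  return (uint32_t)gain >= 2 * mss || (uint32_t)gain >= maxwin / 2;
}
```

Tracking the edge rather than the window size means the comparison stays valid while rcvseq advances, which is the difficulty with rcv_wnd that the commit message points out.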
YAMAMOTO Takashi
1f6fdf04b7 tcp: Extract MSS calculation from tcp_synack
I plan to use it for recv window update decision logic.
2021-06-13 21:20:24 -05:00
YAMAMOTO Takashi
7d82e7a7c4 tcp_input: fix a confusing variable name and a comment
It looks like a copy-and-paste mistake.
2021-06-10 22:47:04 -05:00
YAMAMOTO Takashi
eb00e00e48 tcp: Use the tcp seq macros in some obvious places 2021-06-10 22:47:04 -05:00
YAMAMOTO Takashi
433a2b27d9 tcp: add macros to deal with sequence number wraparound 2021-06-10 22:47:04 -05:00
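A sketch of the usual technique for comparing TCP sequence numbers across wraparound: subtract in 32-bit unsigned arithmetic and inspect the sign of the result. The inline functions below are hypothetical illustrations; the names and exact form of the macros added by this commit may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* a is "before" b if the 32-bit difference is negative when viewed
 * as a signed value.  Valid while the two values are less than 2^31
 * apart.
 */

static inline bool seq_lt(uint32_t a, uint32_t b)
{
  return (int32_t)(a - b) < 0;
}

static inline bool seq_gt(uint32_t a, uint32_t b)
{
  return seq_lt(b, a);
}

/* Addition and subtraction simply wrap modulo 2^32. */

static inline uint32_t seq_add(uint32_t a, uint32_t b)
{
  return a + b;
}

static inline uint32_t seq_sub(uint32_t a, uint32_t b)
{
  return a - b;
}
```

A plain `a < b` would treat 0xfffffff0 as later than 0x10, even though 0x10 is the newer sequence number once the counter has wrapped.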
Xiang Xiao
5b2a17b892 Include assert.h in necessary place
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
2021-06-08 13:06:08 -07:00
YAMAMOTO Takashi
69b3f034a4 tcp: Move buffered/unbuffered common code to tcp_send.c 2021-06-03 21:33:10 -05:00
YAMAMOTO Takashi
7d33c01a0a tcp_get_recvwindow: Add a revisit comment 2021-06-03 21:32:34 -05:00
YAMAMOTO Takashi
7ac6c0a8de tcp_data_event: Add a comment 2021-05-31 01:37:51 -05:00
YAMAMOTO Takashi
92328792fd tcp_data_event: Fix an indent 2021-05-31 01:37:51 -05:00
YAMAMOTO Takashi
2ce0457edb tcp_get_recvwindow: Add a comment 2021-05-31 01:37:51 -05:00
YAMAMOTO Takashi
0c606ecb8e psock_tcp_recvfrom: Add a comment about window updates 2021-05-31 01:37:51 -05:00
Alin Jerpelea
b3ad98c89a net: update licenses to Apache
Gregory Nutt is the copyright holder for those files and he has submitted the
SGA. As a result, we can migrate the licenses to Apache.

Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
2021-05-27 08:07:25 +09:00
Xiang Xiao
001e7c3e76 sched: Don't include nuttx/sched.h inside sched.h
Instead, let nuttx/sched.h include sched.h to
avoid exposing the NuttX kernel API to userspace.

Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
2021-05-24 12:11:53 +09:00
chao.an
48b0e48cd4 net/tcp: set/get TCP_KEEPINTVL/IDLE value as BSD style
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-05-22 09:01:18 -05:00
chao.an
a876f0253a net/tcp: recounter the ack counter during obtain newdata
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-05-21 18:02:53 -03:00
YAMAMOTO Takashi
acc3596adc tcp_netpoll.c: Fix a performance issue with CONFIG_NET_TCP_WRITE_BUFFERS
Tested with a modified version of webclient, which uses non-blocking i/o.
The packet dumps look more reasonable with this change.
2021-04-05 06:16:46 -05:00
chao.an
621242e890 net/tcp: support bind the same port with different domain
Reference here:
https://man7.org/linux/man-pages/man7/ipv6.7.html

IPV6_V6ONLY slice

Signed-off-by: chao.an <anchao@xiaomi.com>
2021-04-01 20:05:14 -04:00
YAMAMOTO Takashi
09869e5d41 net/tcp/tcp.h: Remove unused extern g_netdevices 2021-03-30 12:27:50 -05:00
YAMAMOTO Takashi
1c29a2e8e8 net/tcp/tcp_send_buffered.c: Fix non-blocking I/O
My recent changes to buffered tcp send broke this. [1]

One of my local apps using non-blocking tcp is working
again with this fix.

[1]
```
commit 837e1a72a4
Author: YAMAMOTO Takashi <yamamoto@midokura.com>
Date:   Mon Mar 15 16:19:42 2021 +0900

    tcp_send_buffered.c: improve tcp write buffering
```
2021-03-30 01:12:55 -05:00
YAMAMOTO Takashi
271e748ba5 tcp_send_buffered.c: Add a bit more info to an ninfo() 2021-03-30 01:12:55 -05:00
YAMAMOTO Takashi
a2840b6354 tcp_send_buffered.c: Add an assertion 2021-03-30 01:12:55 -05:00
YAMAMOTO Takashi
ef9adcf399 tcp_send_buffered.c: Remove dead code 2021-03-30 01:12:55 -05:00
YAMAMOTO Takashi
837e1a72a4 tcp_send_buffered.c: improve tcp write buffering
* Send data chunk-by-chunk
  Note: A stream socket doesn't have atomicity requirement.

* Increase the chance to use full-sized segments

Benchmark numbers in my environment:

* Over ESP32 wifi
* The peer is NetBSD, which has traditional delayed ack TCP
* iperf uses 16384 bytes buffer

---

without this patch,
CONFIG_IOB_NBUFFERS=36
CONFIG_IOB_BUFSIZE=196

does not work.
see https://github.com/apache/incubator-nuttx/pull/2772#discussion_r592820639

---

without this patch,
CONFIG_IOB_NBUFFERS=128
CONFIG_IOB_BUFSIZE=196
```
nsh> iperf -c 192.168.8.1
       IP: 192.168.8.103

 mode=tcp-client sip=192.168.8.103:5001,dip=192.168.8.1:5001, interval=3, time=30

        Interval Bandwidth

   0-   3 sec,  4.11 Mbits/sec
   3-   6 sec,  4.63 Mbits/sec
   6-   9 sec,  4.89 Mbits/sec
   9-  12 sec,  4.63 Mbits/sec
  12-  15 sec,  4.85 Mbits/sec
  15-  18 sec,  4.85 Mbits/sec
  18-  21 sec,  5.02 Mbits/sec
  21-  24 sec,  3.67 Mbits/sec
  24-  27 sec,  4.94 Mbits/sec
  27-  30 sec,  4.81 Mbits/sec
   0-  30 sec,  4.64 Mbits/sec
nsh>
```

---

with this patch,
CONFIG_IOB_NBUFFERS=36
CONFIG_IOB_BUFSIZE=196
```
nsh> iperf -c 192.168.8.1
       IP: 192.168.8.103

 mode=tcp-client sip=192.168.8.103:5001,dip=192.168.8.1:5001, interval=3, time=30

        Interval Bandwidth

   0-   3 sec,  5.33 Mbits/sec
   3-   6 sec,  5.59 Mbits/sec
   6-   9 sec,  5.55 Mbits/sec
   9-  12 sec,  5.59 Mbits/sec
  12-  15 sec,  5.59 Mbits/sec
  15-  18 sec,  5.72 Mbits/sec
  18-  21 sec,  5.68 Mbits/sec
  21-  24 sec,  5.29 Mbits/sec
  24-  27 sec,  4.67 Mbits/sec
  27-  30 sec,  4.50 Mbits/sec
   0-  30 sec,  5.35 Mbits/sec
nsh>
```

---

with this patch,
CONFIG_IOB_NBUFFERS=128
CONFIG_IOB_BUFSIZE=196
```
nsh> iperf -c 192.168.8.1
       IP: 192.168.8.103

 mode=tcp-client sip=192.168.8.103:5001,dip=192.168.8.1:5001, interval=3, time=30

        Interval Bandwidth

   0-   3 sec,  5.51 Mbits/sec
   3-   6 sec,  4.67 Mbits/sec
   6-   9 sec,  4.54 Mbits/sec
   9-  12 sec,  5.42 Mbits/sec
  12-  15 sec,  5.37 Mbits/sec
  15-  18 sec,  5.11 Mbits/sec
  18-  21 sec,  5.07 Mbits/sec
  21-  24 sec,  5.29 Mbits/sec
  24-  27 sec,  5.77 Mbits/sec
  27-  30 sec,  4.63 Mbits/sec
   0-  30 sec,  5.14 Mbits/sec
nsh>
```
2021-03-22 01:12:59 -07:00
chao.an
e03218ab71 net/tcp: reset the connection ref count before tcp_free()
Reset the connection refcount once the SYN retry count has been exhausted.

Assertion:

up_assert: Assertion failed at file:tcp/tcp_conn.c line: 764 task: netdev_wq

N/A

Signed-off-by: chao.an <anchao@xiaomi.com>
2021-03-22 10:55:30 +09:00
chao.an
a5613e6008 net/tcp: correct the port byte order
1. Unify the port byte order to network byte order.
2. Do not re-select the port if the local port has already been bound with bind().

Signed-off-by: chao.an <anchao@xiaomi.com>
2021-03-20 09:13:18 -07:00
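A small sketch of the byte order discipline this implies: keep ports in network byte order and convert host-order values once, at the point of comparison or assignment. The helper name is hypothetical; `htons()` is the standard conversion from `<arpa/inet.h>`.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Compare a port stored in network byte order against a port
 * number given in host byte order (hypothetical helper).
 */

static bool port_matches(uint16_t stored_net, uint16_t wanted_host)
{
  return stored_net == htons(wanted_host);
}
```

Keeping every stored port in a single byte order avoids mismatches on little-endian targets, where the host and network representations differ.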
YAMAMOTO Takashi
45098769e7 tcp_sendfile.c: Remove an unused copy of CONFIG_NET_TCP_SPLIT_SIZE 2021-03-15 04:52:58 -07:00