Commit Graph

178 Commits

Author SHA1 Message Date
wangyingdong
2ce31c442f net/tcp:Added tcp zero window probe timer support
https://www.rfc-editor.org/rfc/rfc1122#page-92

Signed-off-by: wangyingdong <wangyingdong@xiaomi.com>
2023-08-20 19:47:11 -03:00
liqinhui
8631e0c6b5 net/tcp: Fix the sack byte aligment error.
The error log is as follows:
  tcp/tcp_send_buffered.c:376:57: runtime error: member access within misaligned
  address 0x10075942 for type 'struct tcp_sack_s', which requires 4 byte alignment

Signed-off-by: liqinhui <liqinhui@xiaomi.com>
2023-08-14 20:46:09 +08:00
Zhe Weng
d98bfc3e49 net/icmpv6: Fix icmpv6_neighbor for link-local address
The netdev of link-local address cannot be auto decided, and the link-local address should always be reguarded as address on local network.

The problem we met:
When using `icmpv6_autoconfig` with multiple netdev, the `icmpv6_neighbor` may take out wrong netdev with ip address already set, then it may send solicitation with wrong address (`dev->d_ipv6draddr`) on wrong device, and regard the link-local address as conflict (because `dev->d_ipv6draddr` exists on this network).

Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
2023-08-11 02:00:39 +08:00
Fotis Panagiotopoulos
3c54d82d81 net: Fix task block when devif_send fails.
When a task needs to send data, a callback is allocated and the
transmission is happening in a worker task through devif_send.
Synchronization between the two tasks (sender & worker) is
achieved by a semaphore.

If devif_send fails, this semaphore was never posted, leaving
the sending task blocked indefinitely. This commit fixes this
by checking the return code of netif_send, and posting this
semaphore in case of failure.

Polling then stops, and execution is resumed on the sending
task.
2023-06-01 17:05:54 +08:00
liqinhui
f61dc72892 net/tcp:Add NewReno congestion control.
- NewReno congestion control algorithm is used to solve the problem
  of network congestion breakdown. NewReno congestion control includes
  slow start, collision avoidance, fast retransmission, and fast
  recovery. The implementation refers to RFC6582 and RFC5681.

- In addition, we optimize the congestion algorithm. In the conflict
  avoidance stage, the maximum congestion window max_cwnd is used to
  limit the excessive growth of cwnd and prevent network jitter
  caused by congestion. Maximum congestion window max_cwnd is updated
  with the current congestion window cwnd and the update weight is
  0.875 when an RTO timeout occurs.

Signed-off-by: liqinhui <liqinhui@xiaomi.com>
2023-05-16 12:35:01 -03:00
zhanghongyu
91e13c47ae net: remove conn-related casts
remove redundant casts associated with psock

Signed-off-by: zhanghongyu <zhanghongyu@xiaomi.com>
2023-05-10 19:32:09 -03:00
chao an
247b050e5f net/tcp: remove conn check since which can not be NULL
Signed-off-by: chao an <anchao@xiaomi.com>
2023-02-02 13:31:06 +08:00
chao an
98e1f9c36d net/tcp: reuse common api to replace some ip select code
Signed-off-by: chao an <anchao@xiaomi.com>
2023-01-30 11:25:10 +08:00
chao an
85a8249821 Revert "socket:return -EAGAIN if timeout happends in psock_tcp_send"
This reverts commit fbe641a916.

This issue already fixed by below change:

| commit 715785245c
| Author: chao an <anchao@xiaomi.com>
| Date:   Mon Jan 16 12:37:44 2023 +0800
|
|     net/tcp: fix potential busy loop in tcp_send_buffered.c
|
|     if the wrbuffer does not have enough space to send the rest of
|     the data, then the send function will loop infinitely in nonblock
|     mode, add send timeout check to avoid this issue.
|
|     Signed-off-by: chao an <anchao@xiaomi.com>

Signed-off-by: chao an <anchao@xiaomi.com>
2023-01-29 13:46:01 +08:00
zhanghongyu
3725fe28fb tcp: fix trans data error when fast retrans
Signed-off-by: zhanghongyu <zhanghongyu@xiaomi.com>
2023-01-28 09:02:14 +02:00
liangchaozhong
fbe641a916 socket:return -EAGAIN if timeout happends in psock_tcp_send
psock_tcp_send will enter busyloop state when no IOB keeps at unavailable state
when the socket is blocking socket and this will cause WDT.
According to socket's manpage, -1 should be returned while errno set to EAGAIN
if send/recv timeout was set.

Here's the description:
SO_RCVTIMEO and SO_SNDTIMEO
Specify the receiving or sending timeouts until reporting an error. The argument is a struct timeval. If an input or output function blocks for this period of time, and data has been sent or received, the return value of that function will be the amount of data transferred; if no data has been transferred and the timeout has been reached then -1 is returned with errno set to EAGAIN or EWOULDBLOCK, or EINPROGRESS (for connect(2)) just as if the socket was specified to be nonblocking. If the timeout is set to zero (the default) then the operation will never timeout. Timeouts only have effect for system calls that perform socket I/O (e.g., read(2), recvmsg(2), send(2), sendmsg(2)); timeouts have no effect for select(2), poll(2), epoll_wait(2), and so on.

Signed-off-by: liangchaozhong <liangchaozhong@xiaomi.com>
2023-01-28 08:06:57 +02:00
chao an
64dd7e6376 net/tcp: add Selective-ACK support
Reference:
https://datatracker.ietf.org/doc/html/rfc2018

Iperf2 client/server test on esp32c3:

Drop(1/50):
CONFIG_NET_TCP_DEBUG_DROP_SEND=y
CONFIG_NET_TCP_DEBUG_DROP_SEND_PROBABILITY=50  // Drop probability: 1/50
CONFIG_NET_TCP_DEBUG_DROP_RECV=y
CONFIG_NET_TCP_DEBUG_DROP_RECV_PROBABILITY=50  // Drop probability: 1/50

Drop(1/50) + OFO/SACK:
CONFIG_NET_TCP_DEBUG_DROP_SEND=y
CONFIG_NET_TCP_DEBUG_DROP_SEND_PROBABILITY=50  // Drop probability: 1/50
CONFIG_NET_TCP_DEBUG_DROP_RECV=y
CONFIG_NET_TCP_DEBUG_DROP_RECV_PROBABILITY=50  // Drop probability: 1/50

CONFIG_NET_TCP_OUT_OF_ORDER=y
CONFIG_NET_TCP_SELECTIVE_ACK=y

---------------------------------------------------------
|  TCP Config            | Server | Client |            |
|-------------------------------------------------------|
|  Original              |   12   |     9  |  Mbits/sec |
|  Drop(1/50)            |  0.6   |   0.3  |  Mbits/sec |
|  Drop(1/50) + OFO/SACK |    8   |     8  |  Mbits/sec |
---------------------------------------------------------

Signed-off-by: chao an <anchao@xiaomi.com>
2023-01-18 16:24:09 +08:00
chao an
715785245c net/tcp: fix potential busy loop in tcp_send_buffered.c
if the wrbuffer does not have enough space to send the rest of
the data, then the send function will loop infinitely in nonblock
mode, add send timeout check to avoid this issue.

Signed-off-by: chao an <anchao@xiaomi.com>
2023-01-18 02:11:33 +08:00
Xiang Xiao
a851ad84c3 net: consistent the net sem wait naming conversion
to prepare the new mutex wait function

Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
2023-01-15 12:31:30 -03:00
chao an
1510dbee68 net/tcp: Do not trigger retransmission if the new data has not been consumed.
Signed-off-by: chao an <anchao@xiaomi.com>
2023-01-03 16:28:30 +08:00
Xiang Xiao
d5689e070b net/arp: Remove nuttx/net/arp.h
1.move ARPHRD_ETHER to netinet/arp.h
1.move arp_entry_s to net/arp/arp.h
2.move arp_input to nuttx/net/netdev.h

Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
2022-12-16 22:10:59 +02:00
chao an
11de08de27 net/devif_send: replace all block send to nonblock mode
Signed-off-by: chao an <anchao@xiaomi.com>
2022-12-08 09:53:15 +01:00
chao an
34d2cde8a8 net/l2/l3/l4: add support of iob offload
1. Add new config CONFIG_NET_LL_GUARDSIZE to isolation of l2 stack,
   which will benefit l3(IP) layer for multi-MAC(l2) implementation,
   especially in some NICs such as celluler net driver.

new configuration options: CONFIG_NET_LL_GUARDSIZE

CONFIG_NET_LL_GUARDSIZE will reserved l2 buffer header size of
network buffer to isolate the L2/L3 (MAC/IP) data on network layer,
which will be beneficial to L3 network layer protocol transparent
transmission and forwarding

------------------------------------------------------------
Layout of frist iob entry:

        iob_data (aligned by CONFIG_IOB_ALIGNMENT)
            |
            |                  io_offset(CONFIG_NET_LL_GUARDSIZE)
            |                                |
            -------------------------------------------------
      iob   |            Reserved            |    io_len    |
            -------------------------------------------------

-------------------------------------------------------------
Layout of different NICs implementation:

        iob_data (aligned by CONFIG_IOB_ALIGNMENT)
            |
            |                 io_offset(CONFIG_NET_LL_GUARDSIZE)
            |                                |
            -------------------------------------------------
 Ethernet   |       Reserved    | ETH_HDRLEN |    io_len    |
            ---------------------------------|---------------
 8021Q      |   Reserved  | ETH_8021Q_HDRLEN |    io_len    |
            ---------------------------------|---------------
 ipforward  |            Reserved            |    io_len    |
            -------------------------------------------------

--------------------------------------------------------------------

2. Support iob offload to l2 driver to avoid unnecessary memory copy

Support send/receive iob vectors directly between the NICs and l3/l4
stack to avoid unnecessary memory copies, especially on hardware that
supports Scatter/gather, which can greatly improve performance.

new interface to support iob offload:

  ------------------------------------------
  |    IOB version     |     original      |
  |----------------------------------------|
  |  devif_iob_poll()  |   devif_poll()    |
  |       ...          |       ...         |
  ------------------------------------------

--------------------------------------------------------------------

1> NIC hardware support Scatter/gather transfer

TX:

                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll("NIC"_txpoll)                callback() // "NIC"_txpoll
                                                  |
                            dev->d_iob:           |
                                                ---------------         ---------------
                             io_data       iob1 |  |          |    iob3 |  |          |
                                    \           ---------------         ---------------
                                  ---------------  |       --------------- |
                             iob0 |  |          |  |  iob2 |  |          | |
                                  ---------------  |       --------------- |
                                     \             |          /           /
                                        \          |       /           /
                                   ----------------------------------------------
                    NICs io vector |    |    |    |    |    |    |    |    |    |
                                   ----------------------------------------------

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
          pkt/ipv[4/6]_input()/...
                    |
                    |
     NICs io vector receive(iov_base to each iobs)

--------------------------------------------------------------------

2> CONFIG_IOB_BUFSIZE is greater than MTU:

TX:

"(CONFIG_IOB_BUFSIZE) > (MAX_NETDEV_PKTSIZE + CONFIG_NET_GUARDSIZE + CONFIG_NET_LL_GUARDSIZE)"

                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll("NIC"_txpoll)                callback() // "NIC"_txpoll
                                                  |
                                             "NIC"_send()
                          (dev->d_iob->io_data[CONFIG_NET_LL_GUARDSIZE - NET_LL_HDRLEN(dev)])

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
          pkt/ipv[4/6]_input()/...
                    |
                    |
     NICs io vector receive(iov_base to io_data)

--------------------------------------------------------------------

3> Compatible with all old flat buffer NICs

TX:
                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll(devif_poll_callback())  devif_poll_callback() /* new interface, gather iobs to flat buffer */
       /                                           \
      /                                             \
 devif_poll("NIC"_txpoll)                     "NIC"_send()(dev->d_buf)

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
               netdev_input()  /* new interface, Scatter/gather flat/iob buffer */
                    |
                    |
          pkt/ipv[4|6]_input()/...
                    |
                    |
    NICs io vector receive(Orignal flat buffer)

3. Iperf passthrough on NuttX simulator:

  -------------------------------------------------
  |  Protocol      | Server | Client |            |
  |-----------------------------------------------|
  |  TCP           |  813   |   834  |  Mbits/sec |
  |  TCP(Offload)  | 1720   |  1100  |  Mbits/sec |
  |  UDP           |   22   |   757  |  Mbits/sec |
  |  UDP(Offload)  |   25   |  1250  |  Mbits/sec |
  -------------------------------------------------

Signed-off-by: chao an <anchao@xiaomi.com>
2022-12-03 11:47:04 +08:00
chao an
d54a20b393 net/tcp/udp: remove all domain assertion
fix build break if enable CONFIG_NET_IPv6 only

In file included from tcp/tcp_sendfile.c:38:
tcp/tcp_sendfile.c: In function ‘sendfile_eventhandler’:
tcp/tcp_sendfile.c:173:27: error: ‘struct tcp_conn_s’ has no member named ‘domain’
  173 |           DEBUGASSERT(conn->domain == PF_INET6);
      |                           ^~

Signed-off-by: chao an <anchao@xiaomi.com>
2022-11-29 17:28:36 +08:00
chao an
a8d3286258 net: move device buffer define to common header
Signed-off-by: chao an <anchao@xiaomi.com>
2022-10-28 00:32:16 -04:00
chao an
6978446b8e net/tcp: remove debug counter of connect instance
Fixed by commit:

 | net: remove pvconn reference from all devif callback
 |
 | Do not use 'pvconn' argument to get the connection pointer since
 | pvconn is normally NULL for some events like NETDEV_DOWN.
 | Instead, the connection pointer can be reliably obtained from the
 | corresponding private pointer.

Signed-off-by: chao an <anchao@xiaomi.com>
2022-10-19 09:46:02 +08:00
chao.an
76c64b7d30 net/tcp: syscall send() should not return ETIMEDOUT
Signed-off-by: chao an <anchao@xiaomi.com>
2022-10-15 10:17:25 +09:00
chao.an
162fcd10ca net: cleanup pvconn reference to avoid confuse
More reference:
https://github.com/apache/incubator-nuttx/pull/5252
https://github.com/apache/incubator-nuttx/pull/5434

Signed-off-by: chao.an <anchao@xiaomi.com>
2022-08-26 20:58:11 +08:00
Xiang Xiao
ba9486de4a iob: Remove iob_user_e enum and related code
since it is impossible to track producer and consumer
correctly if TCP/IP stack pass IOB directly to netdev

Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
2022-08-15 08:41:20 +03:00
chao.an
1c80675e87 net/tcp: fix regression of invalid update the rexmit_seq in buffer mode
fix regression of invalid update the rexmit_seq in buffer mode
rexmit_seq should not be used instead of sndseq in fast retransmission,
sndseq of retransmission in the packet does not need to be re-updated

Signed-off-by: chao.an <anchao@xiaomi.com>
2022-07-12 11:04:39 +03:00
chao.an
669fb83706 net/tcp(buffered): retransmit only one the earliest not acknowledged segment
Retransmit only one the earliest not acknowledged segment
(according to RFC 6298 (5.4)). The issue is the same as it was
in tcp_send_unbuffered.c and tcp_sendfile.c.

Signed-off-by: chao.an <anchao@xiaomi.com>
2022-06-16 18:14:29 +08:00
zhanghongyu
c50d7e174f net: tcp/udp/icmp/icmpv6 add FIONSPACE support
Signed-off-by: zhanghongyu <zhanghongyu@xiaomi.com>
2022-04-02 13:39:38 +08:00
Alexander Lunev
404ceffae2 tcp: added debug asserts and logging to investigate the rare (conn->dev == NULL) bug in callback handlers 2022-02-26 11:48:07 -03:00
chao.an
f073ed3a44 net/tcp: add support for send timeout on buffer mode
Signed-off-by: chao.an <anchao@xiaomi.com>
2022-02-18 16:04:55 +08:00
Alexander Lunev
774994b951 net/tcp: support for FIN+ACK case in tcp send event handlers 2022-02-13 03:20:18 +08:00
chao.an
8fb2468785 net/tcp: remove the socket hook reference from netdev callback
Signed-off-by: chao.an <anchao@xiaomi.com>
2022-02-10 15:04:33 -03:00
chao.an
99cde13a11 net/inet: move socket flags into socket_conn_s
Signed-off-by: chao.an <anchao@xiaomi.com>
2022-02-10 15:04:33 -03:00
chao.an
39e142243d net/tcp/udp: move the send callback into tcp/udp structure
Signed-off-by: chao.an <anchao@xiaomi.com>
2022-02-10 15:04:33 -03:00
Alexander Lunev
2a6de301ee net/tcp: transformed NET_TCP_FAST_RETRANSMIT_WATERMARK option to boolean.
According to RFC 5681 (3.2) the TCP Fast Retransmit algorithm should start
if the threshold of 3 duplicate ACKs is reached.
Thus the threshold should be a constant, not an integer option.
2022-01-26 11:50:48 +08:00
Petro Karashchenko
08043fb5bc net: unify FAR keyword usage for all net buffer memory mapped buffers
Signed-off-by: Petro Karashchenko <petro.karashchenko@gmail.com>
2022-01-20 01:42:56 +08:00
Alexander Lunev
6bb7a92a9a net/tcp/tcp_send*: added debug asserts for TCP_ACKDATA, TCP_REXMIT and TCP_DISCONN_EVENTS flags 2022-01-19 10:45:38 +08:00
Alexander Lunev
5b13797cce net/tcp/tcp_send*: reliably obtain the TCP connection pointer in TCP event handlers
Do not use pvconn argument to get the TCP connection pointer because pvconn is
normally NULL for some events like NETDEV_DOWN. Instead, the TCP connection pointer
can be reliably obtained from the corresponding TCP socket.
2022-01-18 16:14:38 +08:00
chao.an
0ee7400fdf net/tcp: fix send deadlock if disconnect
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-12-16 01:29:10 -06:00
YAMAMOTO Takashi
42f1851ca6 tcp_send_buffered.c: Fix snd_wnd
snd_wnd is an offset from the acked sequence number.
2021-08-25 20:56:05 +08:00
YAMAMOTO Takashi
b5bb0d56ad tcp: fix an assertion in "fix iob allocation deadlock" commit
Fix a wrong assertion in:

```
commit 98ec46d726
Author: YAMAMOTO Takashi <yamamoto@midokura.com>
Date:   Tue Jul 20 09:10:43 2021 +0900

    tcp_send_buffered.c: fix iob allocation deadlock

    Ensure to put the wrb back onto the write_q when blocking
    on iob allocation. Otherwise, it can deadlock with other
    threads doing the same thing.
```

I forget to submit this with https://github.com/apache/incubator-nuttx/pull/4257
2021-08-05 23:50:06 -07:00
YAMAMOTO Takashi
1d3594ba07 tcp_send_buffered.c: Fix broken retransmit
With an applictation using mbedtls, I observed retransmitted segments
with corrupted user data, detected by the peer tls during mac processing.

Looking at the packet dump, I suspect that a wrb which has been put back
onto the write_q for retransmission was partially sent but fully acked.
Note: it's normal for a retransmission to be acked before sent.

In that case, the bug fixed in this commit would cause the wrb have
a wrong sequence number, possibly the same as the next wrb. It matches
what I saw in the packet dump. That is, the broken segments contain the
payload identical to one of the previous segment.
2021-08-02 11:30:01 -07:00
YAMAMOTO Takashi
98ec46d726 tcp_send_buffered.c: fix iob allocation deadlock
Ensure to put the wrb back onto the write_q when blocking
on iob allocation. Otherwise, it can deadlock with other
threads doing the same thing.
2021-08-02 11:25:55 -07:00
chao.an
7d4502aca6 net/socket: add SO_SNDBUF support
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-07-20 20:24:58 -07:00
chao.an
f52757f90a net/wrb: add tcp/udp_inqueue_wrb_size() interface
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-07-20 20:24:58 -07:00
YAMAMOTO Takashi
09f3a1ec8e tcp_send_buffered: throttle IOB allocations for send
Consider a bi-directional TCP connection:

1. we use all IOBs for tx queue
2. we advertize zero recv window because we have no free IOBs
3. if the peer tcp does the same thing,
   both sides advertize zero window and can not drain the tx queue.

For a similar stall to happen, the peer doesn't need to be
a naive tcp implementation like nuttx. A naive application blocking
on send() without draining its read buffer is enough.
(Probably such an application should be fixed to drain rx even
when tx is full. However, it's another story.)

This commit avoids the situation by prevent tx from grabbing
the all IOBs in the first place. (assuming CONFIG_IOB_THROTTLE > 0)
2021-07-14 15:08:18 +08:00
YAMAMOTO Takashi
eb00e00e48 tcp: Use the tcp seq macros in some obvious places 2021-06-10 22:47:04 -05:00
Xiang Xiao
5b2a17b892 Include assert.h in necessary place
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
2021-06-08 13:06:08 -07:00
YAMAMOTO Takashi
69b3f034a4 tcp: Move buffered/unbuffered common code to tcp_send.c 2021-06-03 21:33:10 -05:00
Alin Jerpelea
b3ad98c89a net: update licenses to Apache
Gregory Nutt is the copyright holder for those files and he has submitted the
SGA as a result we can migrate the licenses to Apache.

Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
2021-05-27 08:07:25 +09:00
chao.an
a876f0253a net/tcp: recounter the ack counter during obtain newdata
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-05-21 18:02:53 -03:00