Commit Graph

142 Commits

Author SHA1 Message Date
chao an
98e1f9c36d net/tcp: reuse common api to replace some ip select code
Signed-off-by: chao an <anchao@xiaomi.com>
2023-01-30 11:25:10 +08:00
Xiang Xiao
a851ad84c3 net: consistent the net sem wait naming conversion
to prepare the new mutex wait function

Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
2023-01-15 12:31:30 -03:00
Xiang Xiao
d5689e070b net/arp: Remove nuttx/net/arp.h
1.move ARPHRD_ETHER to netinet/arp.h
1.move arp_entry_s to net/arp/arp.h
2.move arp_input to nuttx/net/netdev.h

Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
2022-12-16 22:10:59 +02:00
chao an
11de08de27 net/devif_send: replace all block send to nonblock mode
Signed-off-by: chao an <anchao@xiaomi.com>
2022-12-08 09:53:15 +01:00
chao an
34d2cde8a8 net/l2/l3/l4: add support of iob offload
1. Add new config CONFIG_NET_LL_GUARDSIZE to isolation of l2 stack,
   which will benefit l3(IP) layer for multi-MAC(l2) implementation,
   especially in some NICs such as celluler net driver.

new configuration options: CONFIG_NET_LL_GUARDSIZE

CONFIG_NET_LL_GUARDSIZE will reserved l2 buffer header size of
network buffer to isolate the L2/L3 (MAC/IP) data on network layer,
which will be beneficial to L3 network layer protocol transparent
transmission and forwarding

------------------------------------------------------------
Layout of frist iob entry:

        iob_data (aligned by CONFIG_IOB_ALIGNMENT)
            |
            |                  io_offset(CONFIG_NET_LL_GUARDSIZE)
            |                                |
            -------------------------------------------------
      iob   |            Reserved            |    io_len    |
            -------------------------------------------------

-------------------------------------------------------------
Layout of different NICs implementation:

        iob_data (aligned by CONFIG_IOB_ALIGNMENT)
            |
            |                 io_offset(CONFIG_NET_LL_GUARDSIZE)
            |                                |
            -------------------------------------------------
 Ethernet   |       Reserved    | ETH_HDRLEN |    io_len    |
            ---------------------------------|---------------
 8021Q      |   Reserved  | ETH_8021Q_HDRLEN |    io_len    |
            ---------------------------------|---------------
 ipforward  |            Reserved            |    io_len    |
            -------------------------------------------------

--------------------------------------------------------------------

2. Support iob offload to l2 driver to avoid unnecessary memory copy

Support send/receive iob vectors directly between the NICs and l3/l4
stack to avoid unnecessary memory copies, especially on hardware that
supports Scatter/gather, which can greatly improve performance.

new interface to support iob offload:

  ------------------------------------------
  |    IOB version     |     original      |
  |----------------------------------------|
  |  devif_iob_poll()  |   devif_poll()    |
  |       ...          |       ...         |
  ------------------------------------------

--------------------------------------------------------------------

1> NIC hardware support Scatter/gather transfer

TX:

                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll("NIC"_txpoll)                callback() // "NIC"_txpoll
                                                  |
                            dev->d_iob:           |
                                                ---------------         ---------------
                             io_data       iob1 |  |          |    iob3 |  |          |
                                    \           ---------------         ---------------
                                  ---------------  |       --------------- |
                             iob0 |  |          |  |  iob2 |  |          | |
                                  ---------------  |       --------------- |
                                     \             |          /           /
                                        \          |       /           /
                                   ----------------------------------------------
                    NICs io vector |    |    |    |    |    |    |    |    |    |
                                   ----------------------------------------------

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
          pkt/ipv[4/6]_input()/...
                    |
                    |
     NICs io vector receive(iov_base to each iobs)

--------------------------------------------------------------------

2> CONFIG_IOB_BUFSIZE is greater than MTU:

TX:

"(CONFIG_IOB_BUFSIZE) > (MAX_NETDEV_PKTSIZE + CONFIG_NET_GUARDSIZE + CONFIG_NET_LL_GUARDSIZE)"

                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll("NIC"_txpoll)                callback() // "NIC"_txpoll
                                                  |
                                             "NIC"_send()
                          (dev->d_iob->io_data[CONFIG_NET_LL_GUARDSIZE - NET_LL_HDRLEN(dev)])

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
          pkt/ipv[4/6]_input()/...
                    |
                    |
     NICs io vector receive(iov_base to io_data)

--------------------------------------------------------------------

3> Compatible with all old flat buffer NICs

TX:
                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll(devif_poll_callback())  devif_poll_callback() /* new interface, gather iobs to flat buffer */
       /                                           \
      /                                             \
 devif_poll("NIC"_txpoll)                     "NIC"_send()(dev->d_buf)

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
               netdev_input()  /* new interface, Scatter/gather flat/iob buffer */
                    |
                    |
          pkt/ipv[4|6]_input()/...
                    |
                    |
    NICs io vector receive(Orignal flat buffer)

3. Iperf passthrough on NuttX simulator:

  -------------------------------------------------
  |  Protocol      | Server | Client |            |
  |-----------------------------------------------|
  |  TCP           |  813   |   834  |  Mbits/sec |
  |  TCP(Offload)  | 1720   |  1100  |  Mbits/sec |
  |  UDP           |   22   |   757  |  Mbits/sec |
  |  UDP(Offload)  |   25   |  1250  |  Mbits/sec |
  -------------------------------------------------

Signed-off-by: chao an <anchao@xiaomi.com>
2022-12-03 11:47:04 +08:00
chao an
d54a20b393 net/tcp/udp: remove all domain assertion
fix build break if enable CONFIG_NET_IPv6 only

In file included from tcp/tcp_sendfile.c:38:
tcp/tcp_sendfile.c: In function ‘sendfile_eventhandler’:
tcp/tcp_sendfile.c:173:27: error: ‘struct tcp_conn_s’ has no member named ‘domain’
  173 |           DEBUGASSERT(conn->domain == PF_INET6);
      |                           ^~

Signed-off-by: chao an <anchao@xiaomi.com>
2022-11-29 17:28:36 +08:00
chao an
3caa551ff4 net: remove psock reference from connect
Signed-off-by: chao an <anchao@xiaomi.com>
2022-11-24 22:57:42 +08:00
chao an
a8d3286258 net: move device buffer define to common header
Signed-off-by: chao an <anchao@xiaomi.com>
2022-10-28 00:32:16 -04:00
anjiahao
5724c6b2e4 sem:remove sem default protocl
Signed-off-by: anjiahao <anjiahao@xiaomi.com>
2022-10-22 14:50:48 +08:00
chao.an
162fcd10ca net: cleanup pvconn reference to avoid confuse
More reference:
https://github.com/apache/incubator-nuttx/pull/5252
https://github.com/apache/incubator-nuttx/pull/5434

Signed-off-by: chao.an <anchao@xiaomi.com>
2022-08-26 20:58:11 +08:00
Xiang Xiao
abc72ad128 net: Ensure sendmsg and sendfile return -EAGAIN in case of timeout
instead of -ETIMEOUT, as specify here:
https://pubs.opengroup.org/onlinepubs/009604599/functions/sendmsg.html
https://man7.org/linux/man-pages/man2/sendfile.2.html

Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
2022-06-28 06:19:13 +03:00
Alexander Lunev
404ceffae2 tcp: added debug asserts and logging to investigate the rare (conn->dev == NULL) bug in callback handlers 2022-02-26 11:48:07 -03:00
Alexander Lunev
774994b951 net/tcp: support for FIN+ACK case in tcp send event handlers 2022-02-13 03:20:18 +08:00
chao.an
8fb2468785 net/tcp: remove the socket hook reference from netdev callback
Signed-off-by: chao.an <anchao@xiaomi.com>
2022-02-10 15:04:33 -03:00
chao.an
3fce144aeb net/inet: move recv/send timeout into socket_conn_s
Signed-off-by: chao.an <anchao@xiaomi.com>
2022-02-10 15:04:33 -03:00
chao.an
99cde13a11 net/inet: move socket flags into socket_conn_s
Signed-off-by: chao.an <anchao@xiaomi.com>
2022-02-10 15:04:33 -03:00
Alexander Lunev
2a6de301ee net/tcp: transformed NET_TCP_FAST_RETRANSMIT_WATERMARK option to boolean.
According to RFC 5681 (3.2) the TCP Fast Retransmit algorithm should start
if the threshold of 3 duplicate ACKs is reached.
Thus the threshold should be a constant, not an integer option.
2022-01-26 11:50:48 +08:00
Alexander Lunev
ad25c43983 net/tcp/sendfile: fast retransmit on duplicate acknowledgments (RFC 5681).
(the same as it was implemented in tcp_send_unbuffered.c)
2022-01-25 16:30:38 +08:00
Alexander Lunev
c9e32dd4a4 tcp: fixed warning: ISO C90 forbids mixed declarations and code 2022-01-22 00:41:42 +08:00
Alexander Lunev
64dd669749 net/tcp/sendfile: retransmit only one the earliest not acknowledged segment
(according to RFC 6298 (5.4)). The issue is the same as it was in tcp_send_unbuffered.c.
2022-01-20 18:37:39 +08:00
Petro Karashchenko
08043fb5bc net: unify FAR keyword usage for all net buffer memory mapped buffers
Signed-off-by: Petro Karashchenko <petro.karashchenko@gmail.com>
2022-01-20 01:42:56 +08:00
Alexander Lunev
6bb7a92a9a net/tcp/tcp_send*: added debug asserts for TCP_ACKDATA, TCP_REXMIT and TCP_DISCONN_EVENTS flags 2022-01-19 10:45:38 +08:00
Alexander Lunev
79e609a8c8 net/tcp/sendfile: swapped the location of TCP_DISCONN_EVENTS and TCP_ACKDATA conditions towards tcp_send_unbuffered.c unification 2022-01-19 00:13:38 +08:00
Alexander Lunev
5b13797cce net/tcp/tcp_send*: reliably obtain the TCP connection pointer in TCP event handlers
Do not use pvconn argument to get the TCP connection pointer because pvconn is
normally NULL for some events like NETDEV_DOWN. Instead, the TCP connection pointer
can be reliably obtained from the corresponding TCP socket.
2022-01-18 16:14:38 +08:00
Alexander Lunev
0afb1d8dbb net/tcp(unbuffered): fast retransmit on duplicate acknowledgments 2022-01-02 23:25:09 +08:00
Alexander Lunev
2b60468845 net/tcp(unbuffered): removed excessive overwrites of conn->sndseq
(conn->sndseq was updated in multiple places that was unreasonable and complicated).
2021-12-29 05:35:23 -06:00
chao.an
0ee7400fdf net/tcp: fix send deadlock if disconnect
Signed-off-by: chao.an <anchao@xiaomi.com>
2021-12-16 01:29:10 -06:00
Alexander Lunev
1e07b6d528 net/tcp(unbuffered): retransmit only one the earliest not acknowledged segment
(according to RFC 6298 (5.4)).
2021-11-10 12:21:07 -06:00
YAMAMOTO Takashi
28d168e1b8 tcp_send_unbuffered.c: Fix nxstyle errors 2021-11-04 13:32:57 -05:00
YAMAMOTO Takashi
1550a525e9 tcp_send_unbuffered.c: unifdef -UCONFIG_NET_TCP_SPLIT 2021-11-04 13:32:57 -05:00
Anthony Merlino
a885b24cc1 Attempt to fix race condition reported in issue 2021-07-04 08:54:15 -05:00
YAMAMOTO Takashi
eb00e00e48 tcp: Use the tcp seq macros in some obvious places 2021-06-10 22:47:04 -05:00
YAMAMOTO Takashi
69b3f034a4 tcp: Move buffered/unbuffered common code to tcp_send.c 2021-06-03 21:33:10 -05:00
Jiuzhu Dong
4d5a964f29 net: unify socket into file descriptor
Change-Id: I9bcd21564e6c97d3edbb38aed1748c114160ea36
Signed-off-by: Jiuzhu Dong <dongjiuzhu1@xiaomi.com>
2021-03-03 19:01:41 -08:00
Alin Jerpelea
37d5c1b0d9 net: Author Gregory Nutt: update licenses to Apache
Gregory Nutt has submitted the SGA and we can migrate the licenses
 to Apache.

Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
2021-02-20 00:38:18 -08:00
chao.an
794a6ec23d net/tcp: rename the winszie to snd_wnd to make the semantics more accurate
Change-Id: I8fdc7cf78a7f2cd53a30ef1de702b1a697c43238
Signed-off-by: chao.an <anchao@xiaomi.com>
2020-12-10 12:23:47 +09:00
YAMAMOTO Takashi
28fda4e5bb net/tcp/tcp_send_unbuffered.c: Fix syslog formats 2020-11-23 05:00:10 -08:00
YAMAMOTO Takashi
8cb6790cef net/tcp/tcp_send_unbuffered.c: Fix a syslog format 2020-11-20 22:22:53 -08:00
Gregory Nutt
a569006fd8 sched/: Make more naming consistent
Rename various functions per the quidelines of https://cwiki.apache.org/confluence/display/NUTTX/Naming+of+OS+Internal+Functions

    nxsem_setprotocol -> nxsem_set_protocol
    nxsem_getprotocol -> nxsem_get_protocol
    nxsem_getvalue -> nxsem_get_value
2020-05-17 14:01:00 -03:00
Xiang Xiao
cde88cabcc Run codespell -w with the latest dictonary again
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
2020-02-23 22:27:46 +01:00
chao.an
c65d8e6a23 net/socket: add MSG_DONTWAIT support
MSG_DONTWAIT (since Linux 2.2)
  Enables nonblocking operation; if the operation would block, the
  call fails with the error EAGAIN or EWOULDBLOCK. This provides
  similar behavior to setting the O_NONBLOCK flag (via the fcntl(2)
  F_SETFL operation), but differs in that MSG_DONTWAIT is a per-call
  option, whereas O_NONBLOCK is a setting on the open file description
  (see open(2)), which will affect all threads in the calling process
  and as well as other processes that hold file descriptors referring
  to the same open file description.
2020-02-19 12:21:28 -06:00
Gregory Nutt
45a83e9481 net/tcp/tcp_send_unbuffered.c: Correct a bad signal number
There is no errno ETIMEOUT.  Probably should be ETIMEDOUT.
2020-01-11 15:36:51 -03:00
Xiang Xiao
5c5c08efcd network: simplify the timeout process logic
1.Consolidate absolute to relative timeout conversion into one place(_net_timedwait)
2.Drive the wait timeout logic by net_timedwait instead of devif_timer
This patch help us remove devif_timer(period tick) to save the power in the future.

Change-Id: I534748a5d767ca6da8a7843c3c2f993ed9ea77d4
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
2020-01-11 08:24:49 -06:00
Xiang Xiao
6a3c2aded6 Fix wait loop and void cast ()
* Simplify EINTR/ECANCEL error handling

1. Add semaphore uninterruptible wait function
2 .Replace semaphore wait loop with a single uninterruptible wait
3. Replace all sem_xxx to nxsem_xxx

* Unify the void cast usage

1. Remove void cast for function because many place ignore the returned value witout cast
2. Replace void cast for variable with UNUSED macro
2020-01-02 10:54:43 -06:00
Xiang Xiao
90c52e6f8f Squashed commit of the following:
Author: Gregory Nutt <gnutt@nuttx.org>

    Run all .h and .c files modified in last PR through nxstyle.

Author: Xiang Xiao <xiaoxiang@xiaomi.com>

    Net cleanup ()

    * Fix the semaphore usage issue found in tcp/udp

    1. The count semaphore need disable priority inheritance
    2. Loop again if net_lockedwait return -EINTR
    3. Call nxsem_trywait to avoid the race condition
    4. Call nxsem_post instead of sem_post

    * Put the work notifier into free list to avoid the heap fragment in the long run.  Since the allocation strategy is encapsulated internally, we can even refine the implementation later.

    * Network stack shouldn't allocate memory in the poll implementation to avoid the heap fragment in the long run, other modification include:

    1. Select MM_IOB automatically since ICMP[v6] socket can't work without the read ahead buffer
    2. Remove the net lock since xxx_callback_free already do the same thing
    3. TCP/UDP poll should work even the read ahead buffer isn't enabled at all

    * Add NET_ prefix for UDP_NOTIFIER and TCP_NOTIFIER option to align with other UDP/TCP option convention

    * Remove the unused _SF_[IDLE|ACCEPT|SEND|RECV|MASK] flags since there are code to set/clear these flags, but nobody check them.
2019-12-31 09:26:14 -06:00
Gregory Nutt
66ef6d143a This commit adds an initial implemented of TCP delayed ACKs as specified in RFC 1122.
Squashed commit of the following:

    net/tmp:  Rename the unacked field of the tcp connection structure to tx_unacked.  Too confusing with the implementation of delayed RX ACKs.

    net/tcp:  Initial implementation of TCP delayed ACKs.

    net/tcp:  Add delayed ACK configuration selection.  Rename tcp_ack() to tcp_synack().  It may or may not send a ACK.  It will always send SYN or SYN/ACK.
2019-12-08 13:13:51 -06:00
Xiang Xiao
98fc60eb75 [tcp|udp]_poll_eventhandler check psock_[tcp|udp]_cansend before report POLLOUT. Change the unbuffered psock_[tcp|udp]_cansend return OK to unify the code logic and remove the unnecessary [tcp|udp]_poll_txnotify call. 2019-11-24 10:30:48 -06:00
Xiang Xiao
5fd8f78bf9 net/ipforward, tcp, and udp: Fix a chicken and egg problem by eliminating the check of the arp/neighbor tables before packet transmission
1. For buffered tcp/udp case, if CONFIG_NET_ARP_SEND/CONFIG_NET_ARP_IPIN/CONFIG_NET_ICMPv6_NEIGHBOR isn't enabled and the table doesn't contain ip<->ethaddr mapping yet, the logic will skip the realtransmission and then arp/neighbor can't steal the final buffer to generate arp/icmpv6 packet.
2.for all other case, the tcp layer or user program should already contain the retransmit logic, the check is redundancy and may generate many duplicated packets if arp/icmpv6 response is too slow because the cursor stop forward. If user still concern about the very first packet lost, he could fix the issue by enabling CONFIG_NET_ARP_SEND/CONFIG_NET_ICMPv6_NEIGHBOR at begin.
2019-09-18 12:33:41 -06:00
Gregory Nutt
1346f29151 net/: Fix alignment and spacing problems found by tools/nxstyle. 2019-07-02 18:02:23 -06:00
Gregory Nutt
f6b00e1966 tools/nxstyle.c: Fix logic error that prevent detecion of '/' and '/=' as operators. net/: Minor updates resulting from testing tools/nxstyle. 2019-03-11 12:48:39 -06:00