Commit Graph

10 Commits

Author SHA1 Message Date
chao an
34d2cde8a8 net/l2/l3/l4: add support of iob offload
1. Add new config CONFIG_NET_LL_GUARDSIZE to isolation of l2 stack,
   which will benefit l3(IP) layer for multi-MAC(l2) implementation,
   especially in some NICs such as celluler net driver.

new configuration options: CONFIG_NET_LL_GUARDSIZE

CONFIG_NET_LL_GUARDSIZE will reserved l2 buffer header size of
network buffer to isolate the L2/L3 (MAC/IP) data on network layer,
which will be beneficial to L3 network layer protocol transparent
transmission and forwarding

------------------------------------------------------------
Layout of frist iob entry:

        iob_data (aligned by CONFIG_IOB_ALIGNMENT)
            |
            |                  io_offset(CONFIG_NET_LL_GUARDSIZE)
            |                                |
            -------------------------------------------------
      iob   |            Reserved            |    io_len    |
            -------------------------------------------------

-------------------------------------------------------------
Layout of different NICs implementation:

        iob_data (aligned by CONFIG_IOB_ALIGNMENT)
            |
            |                 io_offset(CONFIG_NET_LL_GUARDSIZE)
            |                                |
            -------------------------------------------------
 Ethernet   |       Reserved    | ETH_HDRLEN |    io_len    |
            ---------------------------------|---------------
 8021Q      |   Reserved  | ETH_8021Q_HDRLEN |    io_len    |
            ---------------------------------|---------------
 ipforward  |            Reserved            |    io_len    |
            -------------------------------------------------

--------------------------------------------------------------------

2. Support iob offload to l2 driver to avoid unnecessary memory copy

Support send/receive iob vectors directly between the NICs and l3/l4
stack to avoid unnecessary memory copies, especially on hardware that
supports Scatter/gather, which can greatly improve performance.

new interface to support iob offload:

  ------------------------------------------
  |    IOB version     |     original      |
  |----------------------------------------|
  |  devif_iob_poll()  |   devif_poll()    |
  |       ...          |       ...         |
  ------------------------------------------

--------------------------------------------------------------------

1> NIC hardware support Scatter/gather transfer

TX:

                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll("NIC"_txpoll)                callback() // "NIC"_txpoll
                                                  |
                            dev->d_iob:           |
                                                ---------------         ---------------
                             io_data       iob1 |  |          |    iob3 |  |          |
                                    \           ---------------         ---------------
                                  ---------------  |       --------------- |
                             iob0 |  |          |  |  iob2 |  |          | |
                                  ---------------  |       --------------- |
                                     \             |          /           /
                                        \          |       /           /
                                   ----------------------------------------------
                    NICs io vector |    |    |    |    |    |    |    |    |    |
                                   ----------------------------------------------

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
          pkt/ipv[4/6]_input()/...
                    |
                    |
     NICs io vector receive(iov_base to each iobs)

--------------------------------------------------------------------

2> CONFIG_IOB_BUFSIZE is greater than MTU:

TX:

"(CONFIG_IOB_BUFSIZE) > (MAX_NETDEV_PKTSIZE + CONFIG_NET_GUARDSIZE + CONFIG_NET_LL_GUARDSIZE)"

                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll("NIC"_txpoll)                callback() // "NIC"_txpoll
                                                  |
                                             "NIC"_send()
                          (dev->d_iob->io_data[CONFIG_NET_LL_GUARDSIZE - NET_LL_HDRLEN(dev)])

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
          pkt/ipv[4/6]_input()/...
                    |
                    |
     NICs io vector receive(iov_base to io_data)

--------------------------------------------------------------------

3> Compatible with all old flat buffer NICs

TX:
                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll(devif_poll_callback())  devif_poll_callback() /* new interface, gather iobs to flat buffer */
       /                                           \
      /                                             \
 devif_poll("NIC"_txpoll)                     "NIC"_send()(dev->d_buf)

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
               netdev_input()  /* new interface, Scatter/gather flat/iob buffer */
                    |
                    |
          pkt/ipv[4|6]_input()/...
                    |
                    |
    NICs io vector receive(Orignal flat buffer)

3. Iperf passthrough on NuttX simulator:

  -------------------------------------------------
  |  Protocol      | Server | Client |            |
  |-----------------------------------------------|
  |  TCP           |  813   |   834  |  Mbits/sec |
  |  TCP(Offload)  | 1720   |  1100  |  Mbits/sec |
  |  UDP           |   22   |   757  |  Mbits/sec |
  |  UDP(Offload)  |   25   |  1250  |  Mbits/sec |
  -------------------------------------------------

Signed-off-by: chao an <anchao@xiaomi.com>
2022-12-03 11:47:04 +08:00
Zhe Weng
025430a964 net/icmpv6: Fix icmpv6_reply function
Fix icmpv6_reply logic broken by commit 48311cc61f and 391b501639.

- 48311cc61f "Fix unaligned memory access when creating ICMP Port Unreachable messages"
  - It removed `htonl` function outside `data`, then the byte order may be wrong, so add `htons` back.
- 391b501639 "net: extract l3 header build code into new functions"
  - It mis-removed the `memmove`, and the icmpv6 has no payload copied after this commit.

Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
2022-12-02 15:26:45 +08:00
liyi
391b501639 net: extract l3 header build code into new functions
Signed-off-by: liyi <liyi25@xiaomi.com>
2022-11-29 18:36:15 +08:00
Zhe Weng
869c93638d net/icmpv6: Fix ipv6->len in icmpv6_reply
The `ipv6->len` is the length excluding the IPv6 header, so need to be `dev->d_len - IPv6_HDRLEN`.

Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
2022-11-24 23:19:10 +08:00
Zhe Weng
a26ea96f3b net/icmpv6: Fix datalen in icmpv6_reply
The `datalen` indicates the whole len of original packet, which will become the payload inside icmpv6 packet.

Using `datalen = (ipv4->len[0] << 8) + ipv4->len[1]` in icmp_reply is correct, because it includes IPv4 header, but when coming to IPv6, the `len` does not include the header, so we need to add it back.

Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
2022-11-24 23:19:10 +08:00
Zhe Weng
233974b327 net/icmpv6: Fix icmpv6 checksum calculation in icmpv6_reply
The second param of icmpv6_chksum is `iplen`, which indicates `The size of the IPv6 header`.

Signed-off-by: Zhe Weng <wengzhe@xiaomi.com>
2022-11-24 23:19:10 +08:00
chao an
a8d3286258 net: move device buffer define to common header
Signed-off-by: chao an <anchao@xiaomi.com>
2022-10-28 00:32:16 -04:00
Petro Karashchenko
08043fb5bc net: unify FAR keyword usage for all net buffer memory mapped buffers
Signed-off-by: Petro Karashchenko <petro.karashchenko@gmail.com>
2022-01-20 01:42:56 +08:00
Norman Rasmussen
48311cc61f Fix unaligned memory access when creating ICMP Port Unreachable messages
commit 3b69d09c80 corrected the
unreachable handling for net/udp/icmp but introduced an unaligned store.
This splits the uint32_t data field into a two element uint16_t data
field to avoid the unaligned store.
2021-12-28 03:51:53 -06:00
chao.an
3b69d09c80 net/udp/icmp: correct the unreadchable handling
Reference RFC1122:
https://datatracker.ietf.org/doc/html/rfc1122
----------------------------------------------

4.1.3  SPECIFIC ISSUES

  4.1.3.1  Ports

    If a datagram arrives addressed to a UDP port for which
    there is no pending LISTEN call, UDP SHOULD send an ICMP
    Port Unreachable message.

Signed-off-by: chao.an <anchao@xiaomi.com>
2021-11-26 08:47:54 -06:00