Commit Graph

1172588 Commits

Author SHA1 Message Date
Arnd Bergmann
33d74c8ff5 net: mscc: ocelot: remove incompatible prototypes
The types for the register argument changed recently, but there are
still incompatible prototypes that got left behind, and gcc-13 warns
about these:

In file included from drivers/net/ethernet/mscc/ocelot.c:13:
drivers/net/ethernet/mscc/ocelot.h:97:5: error: conflicting types for 'ocelot_port_readl' due to enum/integer mismatch; have 'u32(struct ocelot_port *, u32)' {aka 'unsigned int(struct ocelot_port *, unsigned int)'} [-Werror=enum-int-mismatch]
   97 | u32 ocelot_port_readl(struct ocelot_port *port, u32 reg);
      |     ^~~~~~~~~~~~~~~~~

Just remove the two prototypes, and rely on the copy in the global
header.

Fixes: 9ecd05794b ("net: mscc: ocelot: strengthen type of "u32 reg" in I/O accessors")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20230417205531.1880657-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-18 21:13:23 -07:00
Corinna Vinschen
6b2c6e4a93 net: stmmac: propagate feature flags to vlan
stmmac_dev_probe doesn't propagate feature flags to VLANs.  So features
like offloading don't correspond with the general features and it's not
possible to manipulate features via ethtool -K to affect VLANs.

Propagate feature flags to vlan features.  Drop TSO feature because
it does not work on VLANs yet.

Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Link: https://lore.kernel.org/r/20230417192845.590034-1-vinschen@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-18 21:13:13 -07:00
Hangbin Liu
980f0799a1 bonding: add software tx timestamping support
Currently, bonding only obtain the timestamp (ts) information of
the active slave, which is available only for modes 1, 5, and 6.
For other modes, bonding only has software rx timestamping support.

However, some users who use modes such as LACP also want tx timestamp
support. To address this issue, let's check the ts information of each
slave. If all slaves support tx timestamping, we can enable tx
timestamping support for the bond.

Add a note that the get_ts_info may be called with RCU, or rtnl or
reference on the device in ethtool.h>

Suggested-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/20230418034841.2566262-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-18 20:48:59 -07:00
Paolo Abeni
dce46f1b0c Merge branch 'add-ethernet-driver-for-starfive-jh7110-soc'
Samin Guo says:

====================
Add Ethernet driver for StarFive JH7110 SoC

This series adds ethernet support for the StarFive JH7110 RISC-V SoC,
which includes a dwmac-5.20 MAC driver (from Synopsys DesignWare).
This series has been tested and works fine on VisionFive-2 v1.2A and
v1.3B SBC boards.

For more information and support, you can visit RVspace wiki[1].
You can simply review or test the patches at the link [2].
This patchset should be applied after the patchset [3] [4].

[1]: https://wiki.rvspace.org/
[2]: https://github.com/saminGuo/linux/tree/vf2-6.3rc4-gmac-net-next
[3]: https://patchwork.kernel.org/project/linux-riscv/cover/20230401111934.130844-1-hal.feng@starfivetech.com
[4]: https://patchwork.kernel.org/project/linux-riscv/cover/20230315055813.94740-1-william.qiu@starfivetech.com
====================

Link: https://lore.kernel.org/r/20230417100251.11871-1-samin.guo@starfivetech.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-18 13:13:44 +02:00
Samin Guo
b4a5afa51c net: stmmac: dwmac-starfive: Add phy interface settings
dwmac supports multiple modess. When working under rmii and rgmii,
you need to set different phy interfaces.

According to the dwmac document, when working in rmii, it needs to be
set to 0x4, and rgmii needs to be set to 0x1.

The phy interface needs to be set in syscon, the format is as follows:
starfive,syscon: <&syscon, offset, shift>

Tested-by: Tommaso Merciai <tomm.merciai@gmail.com>
Signed-off-by: Samin Guo <samin.guo@starfivetech.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-18 13:13:41 +02:00
Samin Guo
4bd3bb7b45 net: stmmac: Add glue layer for StarFive JH7110 SoC
This adds StarFive dwmac driver support on the StarFive JH7110 SoC.

Tested-by: Tommaso Merciai <tomm.merciai@gmail.com>
Co-developed-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Samin Guo <samin.guo@starfivetech.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-18 13:13:41 +02:00
Yanhong Wang
b76eaf7d7e dt-bindings: net: Add support StarFive dwmac
Add documentation to describe StarFive dwmac driver(GMAC).

Signed-off-by: Yanhong Wang <yanhong.wang@starfivetech.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Samin Guo <samin.guo@starfivetech.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-18 13:13:41 +02:00
Samin Guo
843f603762 dt-bindings: net: snps,dwmac: Add 'ahb' reset/reset-name
According to:
stmmac_platform.c: stmmac_probe_config_dt
stmmac_main.c: stmmac_dvr_probe

dwmac controller may require one (stmmaceth) or two (stmmaceth+ahb)
reset signals, and the maxItems of resets/reset-names is going to be 2.

The gmac of Starfive Jh7110 SOC must have two resets.
it uses snps,dwmac-5.20 IP.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Samin Guo <samin.guo@starfivetech.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-18 13:13:41 +02:00
Emil Renner Berthing
65a1d72f0c net: stmmac: platform: Add snps,dwmac-5.20 IP compatible string
Add "snps,dwmac-5.20" compatible string for 5.20 version that can avoid
to define some platform data in the glue layer.

Tested-by: Tommaso Merciai <tomm.merciai@gmail.com>
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Samin Guo <samin.guo@starfivetech.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-18 13:13:40 +02:00
Emil Renner Berthing
13f9351180 dt-bindings: net: snps,dwmac: Add dwmac-5.20 version
Add dwmac-5.20 IP version to snps.dwmac.yaml

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Samin Guo <samin.guo@starfivetech.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-18 13:13:40 +02:00
Paolo Abeni
6714d478eb Merge branch 'r8169-use-new-macros-from-netdev_queues-h'
Heiner Kallweit says:

====================
r8169: use new macros from netdev_queues.h

Add one missing subqueue version of the macros, and use the new macros
in r8169 to simplify the code.
====================

Link: https://lore.kernel.org/r/7147a001-3d9c-a48d-d398-a94c666aa65b@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-18 12:59:06 +02:00
Heiner Kallweit
1a31ae0048 r8169: use new macro netif_subqueue_completed_wake in the tx cleanup path
Use new net core macro netif_subqueue_completed_wake to simplify
the code of the tx cleanup path.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-18 12:59:01 +02:00
Heiner Kallweit
8624e9bbef r8169: use new macro netif_subqueue_maybe_stop in rtl8169_start_xmit
Use new net core macro netif_subqueue_maybe_stop in the start_xmit path
to simplify the code. Whilst at it, set the tx queue start threshold to
twice the stop threshold. Before values were the same, resulting in
stopping/starting the queue more often than needed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-18 12:59:01 +02:00
Heiner Kallweit
cb18e5595d net: add macro netif_subqueue_completed_wake
Add netif_subqueue_completed_wake, complementing the subqueue versions
netif_subqueue_try_stop and netif_subqueue_maybe_stop.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-18 12:59:01 +02:00
Jakub Kicinski
3684a23b5a Merge branch 'ocelot-felix-driver-support-for-preemptible-traffic-classes'
Vladimir Oltean says:

====================
Ocelot/Felix driver support for preemptible traffic classes

The series "Add tc-mqprio and tc-taprio support for preemptible traffic
classes" from:
https://lore.kernel.org/netdev/20230220122343.1156614-1-vladimir.oltean@nxp.com/

was eventually submitted in a form without the support for the
Ocelot/Felix switch driver. This patch set picks up that work again,
and presents a fairly modified form compared to the original.
====================

Link: https://lore.kernel.org/r/20230415170551.3939607-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 19:01:20 -07:00
Vladimir Oltean
403ffc2c34 net: mscc: ocelot: add support for preemptible traffic classes
In order to not transmit (preemptible) frames which will be received by
the link partner as corrupted (because it doesn't support FP), the
hardware requires the driver to program the QSYS_PREEMPTION_CFG_P_QUEUES
register only after the MAC Merge layer becomes active (verification
succeeds, or was disabled).

There are some cases when FP is known (through experimentation) to be
broken. Give priority to FP over cut-through switching, and disable FP
for known broken link modes.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 19:01:19 -07:00
Vladimir Oltean
a1ca9f8b07 net: dsa: felix: act upon the mqprio qopt in taprio offload
The mqprio queue configuration can appear either through
TC_SETUP_QDISC_MQPRIO or through TC_SETUP_QDISC_TAPRIO. Make sure both
are treated in the same way.

Code does nothing new for now (except for rejecting multiple TXQs per
TC, which is a useless concept with DSA switches).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ferenc Fejes <fejes@inf.elte.hu>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 19:01:19 -07:00
Vladimir Oltean
aac80140dc net: mscc: ocelot: add support for mqprio offload
This doesn't apply anything to hardware and in general doesn't do
anything that the software variant doesn't do, except for checking that
there isn't more than 1 TXQ per TC (TXQs for a DSA switch are a dubious
concept anyway). The reason we add this is to be able to parse one more
field added to struct tc_mqprio_qopt_offload, namely preemptible_tcs.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ferenc Fejes <fejes@inf.elte.hu>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 19:01:19 -07:00
Vladimir Oltean
bddd96dd80 net: mscc: ocelot: don't rely on cached verify_status in ocelot_port_get_mm()
ocelot_mm_update_port_status() updates mm->verify_status, but when the
verification state of a port changes, an IRQ isn't emitted, but rather,
only when the verification state reaches one of the final states (like
DISABLED, FAILED, SUCCEEDED) - things that would affect mm->tx_active,
which is what the IRQ *is* actually emitted for.

That is to say, user space may miss reports of an intermediary MAC Merge
verification state (like from INITIAL to VERIFYING), unless there was an
IRQ notifying the driver of the change in mm->tx_active as well.

This is not a huge deal, but for reliable reporting to user space, let's
call ocelot_mm_update_port_status() synchronously from
ocelot_port_get_mm(), which makes user space see the current MM status.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 19:01:18 -07:00
Vladimir Oltean
7bf4a5b071 net: mscc: ocelot: optimize ocelot_mm_irq()
The MAC Merge IRQ of all ports is shared with the PTP TX timestamp IRQ
of all ports, which means that currently, when a PTP TX timestamp is
generated, felix_irq_handler() also polls for the MAC Merge layer status
of all ports, looking for changes. This makes the kernel do more work,
and under certain circumstances may make ptp4l require a
tx_timestamp_timeout argument higher than before.

Changes to the MAC Merge layer status are only to be expected under
certain conditions - its TX direction needs to be enabled - so we can
check early if that is the case, and omit register access otherwise.

Make ocelot_mm_update_port_status() skip register access if
mm->tx_enabled is unset, and also call it once more, outside IRQ
context, from ocelot_port_set_mm(), when mm->tx_enabled transitions from
true to false, because an IRQ is also expected in that case.

Also, a port may have its MAC Merge layer enabled but it may not have
generated the interrupt. In that case, there's no point in writing to
DEV_MM_STATUS to acknowledge that IRQ. We can reduce the number of
register writes per port with MM enabled by keeping an "ack" variable
which writes the "write-one-to-clear" bits. Those are 3 in number:
PRMPT_ACTIVE_STICKY, UNEXP_RX_PFRM_STICKY and UNEXP_TX_PFRM_STICKY.
The other fields in DEV_MM_STATUS are read-only and it doesn't matter
what is written to them, so writing zero is just fine.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 19:01:18 -07:00
Vladimir Oltean
3ff468ef98 net: mscc: ocelot: remove struct ocelot_mm_state :: lock
Unfortunately, the workarounds for the hardware bugs make it pointless
to keep fine-grained locking for the MAC Merge state of each port.

Our vsc9959_cut_through_fwd() implementation requires
ocelot->fwd_domain_lock to be held, in order to serialize with changes
to the bridging domains and to port speed changes (which affect which
ports can be cut-through). Simultaneously, the traffic classes which can
be cut-through cannot be preemptible at the same time, and this will
depend on the MAC Merge layer state (which changes from threaded
interrupt context).

Since vsc9959_cut_through_fwd() would have to hold the mm->lock of all
ports for a correct and race-free implementation with respect to
ocelot_mm_irq(), in practice it means that any time a port's mm->lock is
held, it would potentially block holders of ocelot->fwd_domain_lock.

In the interest of simple locking rules, make all MAC Merge layer state
changes (and preemptible traffic class changes) be serialized by the
ocelot->fwd_domain_lock.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 19:01:18 -07:00
Vladimir Oltean
15f93f46f3 net: mscc: ocelot: export a single ocelot_mm_irq()
When the switch emits an IRQ, we don't know what caused it, and we
iterate through all ports to check the MAC Merge status.

Move that iteration inside the ocelot lib; we will change the locking in
a future change and it would be good to encapsulate that lock completely
within the ocelot lib.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 19:01:18 -07:00
Jakub Kicinski
3b53ada514 Merge branch 'xdp-rx-hwts-metadata-for-stmmac-driver'
Song Yoong Siang says:

====================
XDP Rx HWTS metadata for stmmac driver

Implemented XDP receive hardware timestamp metadata for stmmac driver.

This patchset is tested with tools/testing/selftests/bpf/xdp_hw_metadata.
Below are the test steps and results.

Command on DUT:
	sudo ./xdp_hw_metadata <interface name>

Command on Link Partner:
	echo -n xdp | nc -u -q1 <destination IPv4 addr> 9091
	echo -n skb | nc -u -q1 <destination IPv4 addr> 9092

Result for port 9091:
	poll: 1 (0) skip=1 fail=0 redir=1
	xsk_ring_cons__peek: 1
	0x55f69f65f6d0: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000
	rx_timestamp: 1677762069053692631
	No rx_hash err=-95
	0x55f69f65f6d0: complete idx=8 addr=8000

Result for port 9092:
	poll: 1 (0) skip=2 fail=0 redir=1
	found skb hwtstamp = 1677762071.937207680
====================

Link: https://lore.kernel.org/r/20230415064503.3225835-1-yoong.siang.song@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:57:28 -07:00
Song Yoong Siang
9570df3533 net: stmmac: add Rx HWTS metadata to XDP ZC receive pkt
Add receive hardware timestamp metadata support via kfunc to XDP Zero Copy
receive packets.

Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:57:26 -07:00
Song Yoong Siang
e3f9c3e348 net: stmmac: add Rx HWTS metadata to XDP receive pkt
Add receive hardware timestamp metadata support via kfunc to XDP receive
packets.

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:57:26 -07:00
Song Yoong Siang
5b24324a90 net: stmmac: introduce wrapper for struct xdp_buff
Introduce struct stmmac_xdp_buff as a preparation to support XDP Rx
metadata via kfuncs.

Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:57:26 -07:00
Jakub Kicinski
6c829efed5 Merge branch 'support-tunnel-mode-in-mlx5-ipsec-packet-offload'
Leon Romanovsky says:

====================
Support tunnel mode in mlx5 IPsec packet offload

This series extends mlx5 to support tunnel mode in its IPsec packet
offload implementation.

v0: https://lore.kernel.org/all/cover.1681106636.git.leonro@nvidia.com
====================

Link: https://lore.kernel.org/r/cover.1681388425.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:55:27 -07:00
Leon Romanovsky
c941da23aa net/mlx5e: Accept tunnel mode for IPsec packet offload
Open mlx5 driver to accept IPsec tunnel mode.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:55:25 -07:00
Leon Romanovsky
146c196b60 net/mlx5e: Create IPsec table with tunnel support only when encap is disabled
Current hardware doesn't support double encapsulation which is
happening when IPsec packet offload tunnel mode is configured
together with eswitch encap option.

Any user attempt to add new SA/policy after he/she sets encap mode, will
generate the following FW syndrome:

 mlx5_core 0000:08:00.0: mlx5_cmd_out_err:803:(pid 1904): CREATE_FLOW_TABLE(0x930) op_mod(0x0) failed,
 status bad parameter(0x3), syndrome (0xa43321), err(-22)

Make sure that we block encap changes before creating flow steering tables.
This is applicable only for packet offload in tunnel mode, while packet
offload in transport mode and crypto offload, don't have such limitation
as they don't perform encapsulation.

Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:55:25 -07:00
Leon Romanovsky
acc109291a net/mlx5: Allow blocking encap changes in eswitch
Existing eswitch encap option enables header encapsulation. Unfortunately
currently available hardware isn't able to perform double encapsulation,
which can happen once IPsec packet offload tunnel mode is used together
with encap mode set to BASIC.

So as a solution for misconfiguration, provide an option to block encap
changes, which will be used for IPsec packet offload.

Reviewed-by: Emeel Hakim <ehakim@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:55:25 -07:00
Leon Romanovsky
4c24272b4e net/mlx5e: Listen to ARP events to update IPsec L2 headers in tunnel mode
In IPsec packet offload mode all header manipulations are performed by
hardware, which is responsible to add/remove L2 header with source and
destinations MACs.

CX-7 devices don't support offload of in-kernel routing functionality,
as such HW needs external help to fill other side MAC as it isn't
available for HW.

As a solution, let's listen to neigh ARP updates and reconfigure IPsec
rules on the fly once new MAC data information arrives.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:55:25 -07:00
Leon Romanovsky
efbd31c4d8 net/mlx5e: Support IPsec TX packet offload in tunnel mode
Extend mlx5 driver with logic to support IPsec TX packet offload
in tunnel mode.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:55:25 -07:00
Leon Romanovsky
37a417ca91 net/mlx5e: Support IPsec RX packet offload in tunnel mode
Extend mlx5 driver with logic to support IPsec RX packet offload
in tunnel mode.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:55:25 -07:00
Leon Romanovsky
6480a3b6c9 net/mlx5e: Prepare IPsec packet reformat code for tunnel mode
Refactor setup_pkt_reformat() function to accommodate future extension
to support tunnel mode.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:55:25 -07:00
Leon Romanovsky
006adbc6de net/mlx5e: Configure IPsec SA tables to support tunnel mode
Create SA flow steering tables both for RX and TX with tunnel reformat
property. This allows to add and delete extra headers needed for tunnel
mode.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:55:25 -07:00
Leon Romanovsky
1c80e94929 net/mlx5e: Check IPsec packet offload tunnel capabilities
Validate tunnel mode support for IPsec packet offload.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:55:25 -07:00
Leon Romanovsky
1210af3b99 net/mlx5e: Add IPsec packet offload tunnel bits
Extend packet reformat types and flow table capabilities with
IPsec packet offload tunnel bits.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17 18:55:25 -07:00
Horatiu Vultur
99676a5766 net: lan966x: Fix lan966x_ifh_get
From time to time, it was observed that the nanosecond part of the
received timestamp, which is extracted from the IFH, it was actually
bigger than 1 second. So then when actually calculating the full
received timestamp, based on the nanosecond part from IFH and the second
part which is read from HW, it was actually wrong.

The issue seems to be inside the function lan966x_ifh_get, which
extracts information from an IFH(which is an byte array) and returns the
value in a u64. When extracting the timestamp value from the IFH, which
starts at bit 192 and have the size of 32 bits, then if the most
significant bit was set in the timestamp, then this bit was extended
then the return value became 0xffffffff... . And the reason of this is
because constants without any postfix are treated as signed longs and
that is the reason why '1 << 31' becomes 0xffffffff80000000.
This is fixed by adding the postfix 'ULL' to 1.

Fixes: fd7627833d ("net: lan966x: Stop using packing library")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 09:56:49 +01:00
David S. Miller
0af03871b6 Merge branch 'sctp-info-dump'
Xin Long says:

====================
sctp: add some missing peer_capables in sctp info dump

The 1st patch removes the unused and obsolete hostname_address from
sctp_association peer and also the bit from sctp_info peer_capables,
and then reuses its bit for reconf_capable and use the higher
available bit for intl_capable in the 2nd patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:28:21 +01:00
Xin Long
ab4f1e28c9 sctp: add intl_capable and reconf_capable in ss peer_capable
There are two new peer capables have been added since sctp_diag was
introduced into SCTP. When dumping the peer capables, these two new
peer capables should also be included. To not break the old capables,
reconf_capable takes the old hostname_address bit, and intl_capable
uses the higher available bit in sctpi_peer_capable.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:28:21 +01:00
Xin Long
bd4b281894 sctp: delete the obsolete code for the host name address param
In the latest RFC9260, the Host Name Address param has been deprecated.
For INIT chunk:

  Note 3: An INIT chunk MUST NOT contain the Host Name Address
  parameter.  The receiver of an INIT chunk containing a Host Name
  Address parameter MUST send an ABORT chunk and MAY include an
  "Unresolvable Address" error cause.

For Supported Address Types:

  The value indicating the Host Name Address parameter MUST NOT be
  used when sending this parameter and MUST be ignored when receiving
  this parameter.

Currently Linux SCTP doesn't really support Host Name Address param,
but only saves some flag and print debug info, which actually won't
even be triggered due to the verification in sctp_verify_param().
This patch is to delete those dead code.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:28:20 +01:00
David S. Miller
9bf55bd442 Merge branch 'mptcp-cleanups'
Matthieu Baerts says:

====================
mptcp: various small cleanups

Patch 1 makes a function static because it is only used in one file.

Patch 2 adds info about the git trees we use to help occasional devs.

Patch 3 removes an unused variable.

Patch 4 removes duplicated entries from the help menu of a tool used in
MPTCP selftests.

Patch 5 removes some ShellCheck warnings in mptcp_join.sh selftest.

Only very minor improvements then.
====================

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:25:34 +01:00
Matthieu Baerts
0fcd72df88 selftests: mptcp: join: fix ShellCheck warnings
Most of the code had an issue according to ShellCheck.

That's mainly due to the fact it incorrectly believes most of the code
was unreachable because it's invoked by variable name, see how the
"tests" array is used.

Once SC2317 has been ignored, three small warnings were still visible:

 - SC2155: Declare and assign separately to avoid masking return values.

 - SC2046: Quote this to prevent word splitting: can be ignored because
   "ip netns pids" can display more than one pid.

 - SC2166: Prefer [ p ] || [ q ] as [ p -o q ] is not well defined.

This probably didn't fix any actual issues but it might help spotting
new interesting warnings reported by ShellCheck as just before,
ShellCheck was reporting issues for most lines making it a bit useless.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:25:33 +01:00
Matthieu Baerts
0a85264e48 selftests: mptcp: remove duplicated entries in usage
mptcp_connect tool was printing some duplicated entries when showing how
to use it: -j -l -r

While at it, I also:

 - moved the very few entries that were not sorted,

 - added -R that was missing since
   commit 8a4b910d00 ("mptcp: selftests: add rcvbuf set option"),

 - removed the -u parameter that has been removed in
   commit f730b65c9d ("selftests: mptcp: try to set mptcp ulp mode in different sk states").

No need to backport this, it is just an internal tool used by our
selftests. The help menu is mainly useful for MPTCP kernel devs.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:25:33 +01:00
Matthieu Baerts
ce395d0e3a mptcp: remove unused 'remaining' variable
In some functions, 'remaining' variable was given in argument and/or set
but never read.

  net/mptcp/options.c:779:3: warning: Value stored to 'remaining' is never
  read [clang-analyzer-deadcode.DeadStores].

  net/mptcp/options.c:547:3: warning: Value stored to 'remaining' is never
  read [clang-analyzer-deadcode.DeadStores].

The issue has been reported internally by Alibaba CI.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Suggested-by: Mat Martineau <martineau@kernel.org>
Co-developed-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:25:33 +01:00
Matthieu Baerts
c3d713409b MAINTAINERS: add git trees for MPTCP
This will help occasional developers to find our git repo without having
to look at our wiki.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:25:33 +01:00
Geliang Tang
aa5887dca2 mptcp: make userspace_pm_append_new_local_addr static
mptcp_userspace_pm_append_new_local_addr() has always exclusively been
used in pm_userspace.c since its introduction in
commit 4638de5aef ("mptcp: handle local addrs announced by userspace PMs").

So make it static.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:25:33 +01:00
David S. Miller
28f610d086 Merge branch 'mptcp-subflow-init'
Matthieu Baerts says:

====================
mptcp: refactor first subflow init

This series refactors the initialisation of the first subflow of a
listen socket. The first subflow allocation is no longer done at the
initialisation of the socket but later, when the connection request is
received or when requested by the userspace.

This is needed not just because Paolo likes to refactor things but
because this simplifies the code and makes the behaviour more consistent
with the rest. Also, this is a prerequisite for future patches adding
proper support of SELinux/LSM labels with MPTCP and accept(2).

In [1], Ondrej Mosnacek explained they discovered the (userspace-facing)
sockets returned by accept(2) when using MPTCP always end up with the
label representing the kernel (typically system_u:system_r:kernel_t:s0),
while it would make more sense to inherit the context from the parent
socket (the one that is passed to accept(2)).

Before being able to properly support that on SELinux/LSM side, patches
2-3/5 prepare the code to simplify the patch 4/5 moving the allocation.

Patch 1/5 is a small clean-up seen while working on the series and patch
5/5 is a small improvement when closing unaccepted sockets.

[1] https://lore.kernel.org/netdev/CAFqZXNs2LF-OoQBUiiSEyranJUXkPLcCfBkMkwFeM6qEwMKCTw@mail.gmail.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:18:34 +01:00
Paolo Abeni
8d547809a5 mptcp: fastclose msk when cleaning unaccepted sockets
When cleaning up unaccepted mptcp socket still laying inside
the listener queue at listener close time, such sockets will
go through a regular close, waiting for a timeout before
shutting down the subflows.

There is no need to keep the kernel resources in use for
such a possibly long time: short-circuit to fast-close.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:18:34 +01:00
Paolo Abeni
ddb1a072f8 mptcp: move first subflow allocation at mpc access time
In the long run this will simplify the mptcp code and will
allow for more consistent behavior. Move the first subflow
allocation out of the sock->init ops into the __mptcp_nmpc_socket()
helper.

Since the first subflow creation can now happen after the first
setsockopt() we additionally need to invoke mptcp_sockopt_sync()
on it.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:18:34 +01:00