Commit Graph

1353706 Commits

Author SHA1 Message Date
Lee Trager
cc083264ad eth: fbnic: Add support for multiple concurrent completion messages
Extend the fbnic mailbox to support multiple concurrent completion messages.
This enables fbnic to run multiple operations at once which depend on a
response from firmware via the mailbox.
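
To illustrate the idea of several outstanding completions, here is a minimal user-space sketch in Python. It is not the fbnic implementation: the `Mailbox` class, the fixed slot count, and matching replies by message type are all assumptions made for illustration.

```python
import threading

class Mailbox:
    """Toy mailbox that tracks several outstanding completions at once."""

    def __init__(self, slots=4):
        self._lock = threading.Lock()
        self._pending = [None] * slots  # one entry per in-flight request

    def start(self, msg_type):
        """Reserve a slot for a request that expects a firmware reply."""
        with self._lock:
            for i, entry in enumerate(self._pending):
                if entry is None:
                    self._pending[i] = {"type": msg_type,
                                        "done": threading.Event(),
                                        "reply": None}
                    return i
        raise RuntimeError("no free completion slot")

    def complete(self, msg_type, reply):
        """Called when a reply arrives: wake the waiter for that type."""
        with self._lock:
            for entry in self._pending:
                if entry is not None and entry["type"] == msg_type:
                    entry["reply"] = reply
                    entry["done"].set()
                    return True
        return False  # nobody was waiting for this message type

    def wait(self, slot, timeout=1.0):
        """Block until the reply for `slot` arrives, then free the slot."""
        entry = self._pending[slot]
        ok = entry["done"].wait(timeout)
        with self._lock:
            self._pending[slot] = None
        if not ok:
            raise TimeoutError("firmware did not reply")
        return entry["reply"]
```

With a single completion slot the second `start()` would have to fail or wait; with several slots, two requests can be in flight and their replies may arrive in either order.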

Signed-off-by: Lee Trager <lee@trager.us>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250512190109.2475614-4-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-15 12:59:18 +02:00
Lee Trager
bb7e124e30 eth: fbnic: Accept minimum anti-rollback version from firmware
fbnic supports applying firmware which may not be rolled back. This is
implemented in firmware; however, it is useful for the driver to know the
minimum supported firmware version. This will enable the driver to validate
new firmware before it is sent to the NIC. If it is too old, the driver can
provide a clear message that the version is too old.
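
The check described above can be sketched as follows. This is only an illustration: the dotted `major.minor.patch` version format and the function names are assumptions, since the message does not describe how fbnic encodes versions.

```python
def parse_version(text):
    """Turn 'major.minor.patch' into a tuple that compares numerically."""
    return tuple(int(part) for part in text.split("."))

def check_firmware(candidate, min_allowed):
    """Return (ok, message) for an image against the anti-rollback floor."""
    if parse_version(candidate) < parse_version(min_allowed):
        return False, (f"firmware {candidate} is older than the minimum "
                       f"allowed version {min_allowed}")
    return True, "ok"
```

Comparing integer tuples rather than strings matters here: as strings, "1.4.0" would sort after "1.10.0".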

Signed-off-by: Lee Trager <lee@trager.us>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250512190109.2475614-3-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-15 12:59:18 +02:00
Lee Trager
e505e14073 pldmfw: Don't require send_package_data or send_component_table to be defined
Not all drivers require send_package_data or send_component_table when
updating firmware. Instead of forcing drivers to implement a stub, allow
these functions to go undefined.
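
A rough Python model of the optional-callback pattern follows. The `ops` dictionary and `run_update()` are hypothetical stand-ins for the driver ops table and the update flow, which is heavily simplified here.

```python
def run_update(ops, package_data, components):
    """Drive a simplified update, calling only the hooks that are defined."""
    calls = []

    send_package_data = ops.get("send_package_data")
    if send_package_data is not None:          # optional hook
        send_package_data(package_data)
        calls.append("send_package_data")

    send_component_table = ops.get("send_component_table")
    if send_component_table is not None:       # optional hook
        for comp in components:
            send_component_table(comp)
            calls.append("send_component_table")

    for comp in components:
        ops["flash_component"](comp)           # still mandatory in this sketch
        calls.append("flash_component")
    return calls
```

A driver that only needs to flash components can then pass an ops table with just `flash_component` instead of supplying empty stubs.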

Signed-off-by: Lee Trager <lee@trager.us>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250512190109.2475614-2-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-15 12:59:18 +02:00
Dimitri Fedrau
10465365f3 net: phy: marvell-88q2xxx: Enable temperature measurement in probe again
Enabling of the temperature sensor was moved from mv88q2xxx_hwmon_probe to
mv88q222x_config_init with the consequence that the sensor is only
usable when the PHY is configured. Enable the sensor in
mv88q2xxx_hwmon_probe as well to fix this.

Signed-off-by: Dimitri Fedrau <dima.fedrau@gmail.com>
Link: https://patch.msgid.link/20250512-marvell-88q2xxx-hwmon-enable-at-probe-v4-1-9256a5c8f603@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-15 12:32:09 +02:00
Vladimir Oltean
4cde0e4224 net: cpsw: isolate cpsw_ndo_ioctl() to just the old driver
cpsw->slaves[slave_no].phy should be equal to netdev->phydev, because it
is assigned from phy_attach_direct(). The latter is indirectly called
from the two identically named cpsw_slave_open() functions, one in
cpsw.c and another in cpsw_new.c.

Thus, the driver should not need custom logic to find the PHY, the core
can find it, and phy_do_ioctl_running() achieves exactly that.

However, that is only the case for cpsw_new and for the cpsw driver in
dual EMAC mode. This is explained in more detail in the previous commit.
Thus, allow the simpler core logic to execute for cpsw_new, and move
cpsw_ndo_ioctl() to cpsw.c.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250512114422.4176010-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:41:46 -07:00
Vladimir Oltean
36d9b54258 net: cpsw: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()
New timestamping API was introduced in commit 66f7223039 ("net: add
NDOs for configuring hardware timestamping") from kernel v6.6. It is
time to convert the two cpsw drivers to the new API, so that the
ndo_eth_ioctl() path can be removed completely.

The cpsw_hwtstamp_get() and cpsw_hwtstamp_set() methods (and their shim
definitions, for the case where CONFIG_TI_CPTS is not enabled) must have
their prototypes adjusted.

These methods are used by two drivers (cpsw and cpsw_new), with vastly
different configurations:
- cpsw has two operating modes:
  - "dual EMAC" - enabled through the "dual_emac" device tree property -
    creates one net_device per EMAC / slave interface (but there is no
    bridging offload)
  - "switch mode" - default - there is a single net_device, with two
    EMACs/slaves behind it (and switching between them happens
    unbeknownst to the network stack).
- cpsw_new always registers one net_device for each EMAC which doesn't
  have status = "disabled". In terms of switching, it has two modes:
  - "dual EMAC": default, no switching between ports, no switchdev
    offload.
  - "switch mode": enabled through the "switch_mode" devlink parameter,
    offloads the Linux bridge through switchdev

Essentially, in 3 out of 4 operating modes, there is a bijective
relation between the net_device and the slave. Timestamping can thus be
configured on individual slaves. But in the "switch mode" of the cpsw
driver, ndo_eth_ioctl() targets a single slave, designated using the
"active_slave" device tree property.

To deal with these different cases, the common portion of the drivers,
cpsw_priv.c, has the cpsw_slave_index() function pointer, set to
separate, identically named cpsw_slave_index_priv() by the 2 drivers.

This is all relevant because cpsw_ndo_ioctl() has the old-style
phy_has_hwtstamp() logic which lets the PHY handle the timestamping
ioctls. Normally, that logic should be obsoleted by the more complex
logic in the core, which permits dynamically selecting the timestamp
provider - see dev_set_hwtstamp_phylib().

But I have doubts as to how this works for the "switch mode" of the dual
EMAC driver, because the core logic only engages if the PHY is visible
through ndev->phydev (this is set by phy_attach_direct()).

In cpsw.c, we have:
cpsw_ndo_open()
-> for_each_slave(priv, cpsw_slave_open, priv); // continues on errors
   -> of_phy_connect()
      -> phy_connect_direct()
         -> phy_attach_direct()
   OR
   -> phy_connect()
      -> phy_connect_direct()
         -> phy_attach_direct()

The problem for "switch mode" is that the behavior of phy_attach_direct()
called twice in a row for the same net_device (once for each slave) is
probably undefined.

For sure it will overwrite dev->phydev. I don't see any explicit error
checks for this case, and even if there were, the for_each_slave() call
makes them non-fatal to cpsw_ndo_open() anyway.

I have no idea to what extent this provides a usable
result, but the point is: only the last attached PHY will be visible
in dev->phydev, and this may well be a different PHY than
cpsw->slaves[slave_no].phy for the "active_slave".

In dual EMAC mode, as well as in cpsw_new, this should not be a problem.
I don't know whether PHY timestamping is a use case for the cpsw "switch
mode" as well, and I hope that there isn't, because for the sake of
simplicity, I've decided to deliberately break that functionality, by
refusing all PHY timestamping. Keeping it would mean blocking the old
API from ever being removed. In the new dev_set_hwtstamp_phylib() API,
it is not possible to operate on a phylib PHY other than dev->phydev,
and I would very much prefer not adding that much complexity for bizarre
driver decisions.

Final point about the cpsw_hwtstamp_get() conversion: we don't need to
propagate the unnecessary "config.flags = 0;", because dev_get_hwtstamp()
provides a zero-initialized struct kernel_hwtstamp_config.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250512114422.4176010-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:41:46 -07:00
Jakub Kicinski
265e1d5c63 Merge branch 'misc-drivers-sw-timestamp-changes'
Jason Xing says:

====================
misc drivers' sw timestamp changes

This series modifies three drivers, out of more than 100, in which the
software timestamp is generated too early. The idea for this series came
from a brief talk[1] with Willem. In short, this series moves the
generation of the software timestamp to just before the driver kicks the
doorbell.

[1]: https://lore.kernel.org/all/681b9d2210879_1f6aad294bc@willemb.c.googlers.com.notmuch/

v2: https://lore.kernel.org/20250508033328.12507-1-kerneljasonxing@gmail.com
====================

Link: https://patch.msgid.link/20250510134812.48199-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:32:56 -07:00
Jason Xing
33d4cc81fc net: stmmac: generate software timestamp just before the doorbell
Make sure the call of skb_tx_timestamp is as close as possible to the
doorbell.

The patch also adjusts the order of setting SKBTX_IN_PROGRESS and
generating the software timestamp so that, without
SOF_TIMESTAMPING_OPT_TX_SWHW being set, the software and hardware
timestamps will not appear in the socket's error queue at nearly the same
time (please see __skb_tstamp_tx()).
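
The effect of the ordering can be modelled in a few lines of Python. This is a sketch based on my reading of the check referenced above, not kernel code: the flag values are placeholders rather than the kernel's constants, and `xmit()` is a hypothetical stand-in for a driver transmit path.

```python
SKBTX_IN_PROGRESS = 1 << 0   # placeholder value, not the kernel's
OPT_TX_SWHW = 1 << 1         # stands in for SOF_TIMESTAMPING_OPT_TX_SWHW

def sw_timestamp_reported(tx_flags, sock_flags):
    """Skip the software timestamp when a hardware one is pending,
    unless the socket asked for both."""
    if (tx_flags & SKBTX_IN_PROGRESS) and not (sock_flags & OPT_TX_SWHW):
        return False
    return True

def xmit(set_in_progress_first, sock_flags=0):
    """Return True if the software timestamp would reach the error queue."""
    tx_flags = 0
    if set_in_progress_first:
        tx_flags |= SKBTX_IN_PROGRESS      # hardware will timestamp this packet
        return sw_timestamp_reported(tx_flags, sock_flags)
    reported = sw_timestamp_reported(tx_flags, sock_flags)
    tx_flags |= SKBTX_IN_PROGRESS          # flag set only after the SW timestamp
    return reported
```

In this model, setting the flag before generating the software timestamp suppresses the software timestamp for sockets that did not request both.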

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Link: https://patch.msgid.link/20250510134812.48199-4-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:32:54 -07:00
Jason Xing
aaed2789b3 net: cxgb4: generate software timestamp just before the doorbell
Make sure the call of skb_tx_timestamp is as close as possible to the
doorbell.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Link: https://patch.msgid.link/20250510134812.48199-3-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:32:54 -07:00
Jason Xing
285ad74775 net: atlantic: generate software timestamp just before the doorbell
Make sure the call of skb_tx_timestamp is as close as possible to the
doorbell.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Link: https://patch.msgid.link/20250510134812.48199-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:32:53 -07:00
Eric Biggers
a1dc1deeac net: apple: bmac: use crc32() instead of hand-rolled equivalent
The calculation done by bmac_crc(addr) followed by taking the low 6 bits
and reversing them is equivalent to taking the high 6 bits from
crc32(~0, addr, ETH_ALEN).  Just do that instead.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://patch.msgid.link/20250513050142.635391-1-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:24:43 -07:00
Eelco Chaudron
88906f5595 openvswitch: Stricter validation for the userspace action
This change enhances the robustness of validate_userspace() by ensuring
that all Netlink attributes are fully contained within the parent
attribute. The previous use of nla_parse_nested_deprecated() could
silently skip trailing or malformed attributes, as it stops parsing at
the first invalid entry.

By switching to nla_parse_deprecated_strict(), we make sure only fully
validated attributes are copied for later use.
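
The difference between lenient and strict parsing can be sketched with a small attribute parser. This is a simplified model, not the kernel's netlink code: `parse_attrs()` and its `strict` flag are hypothetical, and only the header layout (16-bit length, 16-bit type, 4-byte alignment) follows netlink attributes.

```python
import struct

NLA_HDRLEN = 4

def nla_align(n):
    """Round up to the 4-byte attribute alignment."""
    return (n + 3) & ~3

def parse_attrs(buf, strict):
    """Parse {type: payload}; in strict mode leftover bytes are an error."""
    attrs = {}
    off = 0
    while len(buf) - off >= NLA_HDRLEN:
        nla_len, nla_type = struct.unpack_from("=HH", buf, off)
        if nla_len < NLA_HDRLEN or nla_len > len(buf) - off:
            break                      # malformed entry: stop parsing here
        attrs[nla_type] = buf[off + NLA_HDRLEN:off + nla_len]
        off += nla_align(nla_len)
    if strict and off < len(buf):
        raise ValueError(f"{len(buf) - off} bytes left after parsing")
    return attrs
```

In lenient mode a malformed trailing attribute is silently ignored and the caller sees only the attributes before it; in strict mode the whole buffer is rejected.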

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://patch.msgid.link/67eb414e2d250e8408bb8afeb982deca2ff2b10b.1747037304.git.echaudro@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:13:34 -07:00
Heiner Kallweit
73d952840d net: phy: remove Kconfig symbol MDIO_DEVRES
MDIO_DEVRES is only set where PHYLIB/PHYLINK are set which
select MDIO_DEVRES. So we can remove this symbol.

Note: Due to circular module dependencies we can't simply
      make mdio_devres.c part of phylib.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/27cba535-f507-4b32-84a3-0744c783a465@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:12:19 -07:00
Eric Biggers
0aa4024b43 net/tg3: use crc32() instead of hand-rolled equivalent
The calculation done by calc_crc() is equivalent to
~crc32(~0, buf, len), so just use that instead.
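
The equivalence can be checked outside the kernel with a short Python sketch. The bitwise loop below is a reconstruction of a typical hand-rolled reflected CRC-32 and is assumed rather than copied from the driver. Python's `zlib.crc32` applies the initial and final inversion itself, so it plays the role of ~crc32(~0, buf, len) here.

```python
import zlib

CRC32_POLY_LE = 0xEDB88320  # reflected CRC-32 polynomial

def bitwise_crc(buf):
    """Bit-at-a-time reflected CRC-32: seed with all ones, invert at the end."""
    reg = 0xFFFFFFFF
    for byte in buf:
        reg ^= byte
        for _ in range(8):
            lsb = reg & 1
            reg >>= 1
            if lsb:
                reg ^= CRC32_POLY_LE
    return reg ^ 0xFFFFFFFF

def library_crc(buf):
    """Same checksum from the library routine."""
    return zlib.crc32(buf) & 0xFFFFFFFF
```

For the standard check string `b"123456789"` both functions return 0xCBF43926.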

Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://patch.msgid.link/20250513041402.541527-1-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 18:52:46 -07:00
Geert Uytterhoeven
685e7b1522 dt-bindings: net: snps,dwmac: Align mdio node in example with bindings
According to the bindings, the MDIO subnode should be called "mdio".
Update the example to match this.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/308d72c2fe8e575e6e137b99743329c2d53eceea.1747121550.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 18:51:58 -07:00
Alper Ak
4abc1f14e2 documentation: networking: devlink: Fix a typo in devlink-trap.rst
Fix a typo in the documentation: "errorrs" -> "errors".

Signed-off-by: Alper Ak <alperyasinak1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250513092451.22387-1-alperyasinak1@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 18:51:58 -07:00
Wei Fang
664bf117a3 net: enetc: fix implicit declaration of function FIELD_PREP
The kernel test robot reported the following error:

drivers/net/ethernet/freescale/enetc/ntmp.c: In function 'ntmp_fill_request_hdr':
drivers/net/ethernet/freescale/enetc/ntmp.c:203:38: error: implicit
declaration of function 'FIELD_PREP' [-Wimplicit-function-declaration]
203 |         cbd->req_hdr.access_method = FIELD_PREP(NTMP_ACCESS_METHOD,
    |                                      ^~~~~~~~~~

Therefore, add "bitfield.h" to ntmp_private.h to fix this issue.
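
For readers unfamiliar with the macro, the following Python sketch shows what a FIELD_PREP-style helper computes for a contiguous bit mask. The function names and the example mask are illustrative only; the actual value of NTMP_ACCESS_METHOD is not given here.

```python
def field_prep(mask, val):
    """Shift val into the position of a contiguous bit mask."""
    shift = (mask & -mask).bit_length() - 1   # index of the lowest set bit
    return (val << shift) & mask

def field_get(mask, reg):
    """Extract the field selected by mask from a register value."""
    shift = (mask & -mask).bit_length() - 1
    return (reg & mask) >> shift
```

For example, with a hypothetical mask of 0x0F00, `field_prep(0x0F00, 0xA)` yields 0x0A00.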

Fixes: 4701073c3d ("net: enetc: add initial netc-lib driver to support NTMP")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505101047.NTMcerZE-lkp@intel.com/
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-14 09:48:49 +01:00
Jiawen Wu
838b2a28c0 net: wangxun: Correct clerical errors in comments
There are wrong "#endif" comments in .h files that need to be corrected.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-14 09:47:33 +01:00
Heiner Kallweit
dc75c3ced1 net: phy: remove stub for mdiobus_register_board_info
The functionality of mdiobus_register_board_info() typically isn't
optional for the caller. Therefore remove the stub.

Note: Currently we have only one caller of mdiobus_register_board_info(),
in a DSA/PHYLINK context. Therefore CONFIG_MDIO_DEVICE is selected anyway.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/410a2222-c4e8-45b0-9091-d49674caeb00@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 16:39:45 -07:00
Vladimir Oltean
ae605349e1 net: mlxsw: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()
New timestamping API was introduced in commit 66f7223039 ("net: add
NDOs for configuring hardware timestamping") from kernel v6.6. It is
time to convert the mlxsw driver to the new API, so that the
ndo_eth_ioctl() path can be removed completely.

The UAPI is still ioctl-only, but it's best to remove the "ioctl"
mentions from the driver in case a netlink variant appears.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250512154411.848614-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 16:38:25 -07:00
Konrad Dybcio
0d161eb27d net: ipa: Make the SMEM item ID constant
It can't vary, so stop storing the same magic number everywhere.

Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Alex Elder <elder@kernel.org>
Link: https://patch.msgid.link/20250512-topic-ipa_smem-v1-1-302679514a0d@oss.qualcomm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:42:50 -07:00
Vladimir Oltean
51672a6587 net: enetc: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()
New timestamping API was introduced in commit 66f7223039 ("net: add
NDOs for configuring hardware timestamping") from kernel v6.6. It is
time to convert the ENETC driver to the new API, so that the
ndo_eth_ioctl() path can be removed completely.

Move the enetc_hwtstamp_get() and enetc_hwtstamp_set() calls away from
enetc_ioctl() to dedicated net_device_ops for the LS1028A PF and VF
(NETC v4 does not yet implement enetc_ioctl()), adapt the prototypes and
export these symbols (enetc_ioctl() is also exported).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20250512112402.4100618-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:38:53 -07:00
Jiawen Wu
904c6ad822 net: txgbe: Fix pending interrupt
For unknown reasons, the value of the MISC interrupt is sometimes 0 in the
IRQ handler function. In this case, wx_intr_enable() should also be
invoked to clear the interrupt. Otherwise, the next interrupt would
never be reported.

Fixes: a9843689e2 ("net: txgbe: add sriov function support")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/F4F708403CE7090B+20250512100652.139510-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:38:26 -07:00
Jakub Kicinski
c20219ee62 Merge branch 'net-mlx5-hws-complex-matchers-and-rehash-mechanism-fixes'
Tariq Toukan says:

====================
net/mlx5: HWS, Complex Matchers and rehash mechanism fixes

Motivation:
----------

A matcher can match a certain set of match parameters. However,
the number and size of match params for a single matcher are
limited — all the parameters must fit within a single definer.

A common example of this limitation is IPv6 address matching, where
matching both source and destination IPs requires more bits than a
single definer can support.

SW Steering addresses this limitation by chaining multiple Steering
Table Entries (STEs) within the same matcher, where each STE matches
on a subset of the parameters.

In HW Steering, such chaining is not possible — the matcher's STEs
are managed in a hash table, and a single definer is used to calculate
the hash index for STEs.

Overview:
--------

To address this limitation in HW Steering, we introduce
*Complex Matchers*, which consist of two chained matchers. This allows
matching on twice as many parameters. Complex Matchers are filled with
*Complex Rules* — rules that are split into two parts and inserted into
their respective matchers.

The first half of the Complex Matcher is a regular matcher and points
to the second half, which is an *Isolated Matcher*. An Isolated Matcher
has its own isolated table and is accessible only by traffic coming
from the first half of the Complex Matcher.

This splitting of matchers/rules into multiple parts is transparent to
users. It is hidden behind the BWC HWS API. It becomes visible only
when dumping steering debug information, where the Complex Matcher
appears as two separate matchers: one in the user-created table and
another in its isolated table.

Implementation Details:
----------------------

All user actions are performed on the second part of the rules only.
The first part handles matching and applies two actions: modify header
(set metadata, see details below) and go-to-table (directing traffic
to the isolated table containing the isolated matcher).

Rule updates (updating rule actions) are applied to the second part
of the rule since user-provided actions are not executed in the first
matcher.

We use REG_C_6 metadata register to set and match on unique per-rule
tag (see details below).

Splitting rules into two parts introduces new challenges:

1. Invalid Combinations

   Consider two rules with different matching values:
   - Rule 1: A+B
   - Rule 2: C+D

   Let's split the rules into two parts as follows:

   |-----Complex Matcher-------|
   |                           |
   | 1st matcher   2nd matcher |
   |    |---|        |---|     |
   |    | A |        | B |     |
   |    |---| -----> |---|     |
   |    | C |        | D |     |
   |    |---|        |---|     |
   |                           |
   |---------------------------|

   Splitting these rules results in invalid combinations: A+D and C+B:
   any packet that matched on A will be forwarded to the 2nd matcher,
   where it will try to match on B (which is legal, and it is what the
   user asked for), but it will also try to match on D (which is not
   what the user asked for). To resolve this, we assign unique tags
   to each rule on the first matcher and match on these tags on the
   second matcher:

   |----------|     |---------|
   |     A    |     | B, TagA |
   | action:  |     |         |
   | set TagA |     |         |
   |----------| --> |---------|
   |     C    |     | D, TagB |
   | action:  |     |         |
   | set TagB |     |         |
   |----------|     |---------|

2. Duplicated Entries:

   Consider two rules with overlapping values:
   - Rule 1: A+B
   - Rule 2: A+D

   Let's split the rules into two parts as follows:

    |---|     |---|
    | A |     | B |
    |---| --> |---|
    |   |     | D |
    |---|     |---|

   This leads to the duplicated entries on the first matcher, which HWS
   doesn't allow: subsequent delete of either of the rules will delete
   the only entry in the first matcher, leaving the remaining rule
   broken. To address this, we use a reference count for entries in the
   first matcher and delete STEs only when their refcount reaches zero.

Both challenges are resolved by having a per-matcher data structure
(implemented with rhashtable) that manages refcounts for the first part
of the rules and holds unique tags (managed via IDA) for these rules to
set and to match on the second matcher.
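
The tag and refcount scheme can be modelled in a few lines of Python. This is a toy model, not the mlx5 code: plain dictionaries stand in for the rhashtable, a free list stands in for the IDA, and all names are hypothetical.

```python
class ComplexMatcher:
    """Toy two-stage matcher: stage 1 sets a tag, stage 2 matches on it."""

    def __init__(self):
        self.first = {}      # first-half key -> [tag, refcount]
        self.second = {}     # (second-half key, tag) -> action
        self._next_tag = 1
        self._free_tags = []

    def _alloc_tag(self):
        if self._free_tags:
            return self._free_tags.pop()
        tag = self._next_tag
        self._next_tag += 1
        return tag

    def add_rule(self, first_key, second_key, action):
        entry = self.first.get(first_key)
        if entry is None:                      # new first-half entry
            entry = self.first[first_key] = [self._alloc_tag(), 0]
        entry[1] += 1                          # shared entries are refcounted
        self.second[(second_key, entry[0])] = action

    def del_rule(self, first_key, second_key):
        entry = self.first[first_key]
        del self.second[(second_key, entry[0])]
        entry[1] -= 1
        if entry[1] == 0:                      # last rule using this first half
            self._free_tags.append(entry[0])
            del self.first[first_key]

    def lookup(self, first_val, second_val):
        entry = self.first.get(first_val)
        if entry is None:
            return None
        return self.second.get((second_val, entry[0]))
```

With rules A+B and C+D installed, a lookup of A+D misses because D is stored under C's tag. With rules A+B and A+D installed, deleting A+B leaves the shared first-half entry in place until A+D is deleted as well.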

Limitations:
-----------

We utilize metadata register REG_C_6 in this implementation, so its
usage anywhere along the flow that might include the need for Complex
Matcher is prohibited.

The number and size of match parameters remain limited — now
constrained by what can be represented by two definers instead of one.
This architectural limitation arises from the structure of Complex
Matchers. If future requirements demand more parameters, Complex
Matchers can be extended beyond two matchers.

Additionally, there is an implementation limit of 32 match parameters
per matcher (disregarding parameter size). This limit can be lifted
if needed.

Patches:
-------

 - Patches 1-3: small additions/refactoring in preparation for
   Complex Matcher: exposed mlx5hws_table_ft_set_next_ft() in header,
   added definer function to convert field name enum to string,
   expose the polling function mlx5hws_bwc_queue_poll() in a header.
 - Patch 4: in preparation for Complex Matcher, this patch adds
   support for Isolated Matcher.
 - Patch 5: the main patch - Complex Matchers implementation.

[2]

Patch 6: fixing the usecase where rule insertion was failing,
but rehash couldn't be initiated if the number of rules in
the table is below the rehash threshold.

Patch 7: fixing the usecase where many rules in parallel
would require rehash, due to the way the counting of rules
was done.

Patch 8: fixing the case where rules were requiring action
template extension in parallel, leading to unneeded extensions
with the same templates.

Patch 9: refactor and simplify the rehash loop.

Patch 10: dump error completion details, which helps a lot
in trying to understand what went wrong, especially during
rehash.
====================

Link: https://patch.msgid.link/1746992290-568936-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:30:25 -07:00
Yevgeny Kliteynik
578b856b5e net/mlx5: HWS, dump bad completion details
Failing to insert/delete a rule should not happen. If it does happen,
it would be good to know at which stage it happened and what was the
failure. This patch adds printing of bad CQE details.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-11-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:30:25 -07:00
Yevgeny Kliteynik
ef94799a87 net/mlx5: HWS, rework rehash loop
Reworking the rehash loop - simplifying the code and making it less
error prone:
 - Instead of doing round-robin on all the queues with batch of rules in
   each cycle, just go over all the queues and move all the rules that
   belong to this queue.
 - If at some stage of moving the rule we get a failure (which should
   not happen), this can't be rolled back. So instead of aborting
   rehash and leaving the matcher in a broken state, allow the loop
   to continue: attempt to move the rest of the rules and delete the
   old matcher. A rule that failed to move to a new matcher will lose
   its match STE once the rehash is completed and the old matcher is
   deleted, so the rule won't match any traffic any more. This rule's
   packets will fall back to the steering pipeline w/o HW offload.
   Rehash procedure will return an error, which will cause the rule
   insertion to fail for the rule that started this whole rehash.
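
The control flow described above can be sketched as follows. This is a simplified Python model with hypothetical names: `move_rule` stands in for moving one rule to the new matcher and signals failure by raising `OSError`.

```python
def rehash(queues, move_rule):
    """Move every rule, queue by queue; record failures instead of aborting."""
    failed = []
    for queue in queues:
        for rule in queue:
            try:
                move_rule(rule)
            except OSError:
                failed.append(rule)   # cannot be rolled back, so keep going
    # The old matcher would be deleted here whether or not anything failed.
    return failed                     # a non-empty list means the rehash failed
```

A caller would treat a non-empty result as a rehash error while the remaining rules continue to work in the new matcher.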

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-10-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:30:25 -07:00
Yevgeny Kliteynik
041861b40f net/mlx5: HWS, fix redundant extension of action templates
When a rule is inserted into a matcher, we search for the suitable
action template. If such template is not found, action template array
is extended with the new template. However, when several threads are
performing this in parallel, there is a race - we can end up with
extending the action templates array with the same template.

This patch is doing the following:
 - refactor the code to find action template index in rule create and
   update, have the common code in an auxiliary function
 - after locking all the queues, check again if the action template
   array still needs to be extended
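
The second bullet is the familiar "check again after taking the lock" pattern. A minimal Python sketch, with a single lock standing in for "all the queues" and hypothetical class and method names:

```python
import threading

class ActionTemplates:
    """Template array that is extended at most once per distinct template."""

    def __init__(self):
        self.templates = []
        self._lock = threading.Lock()   # stands in for locking all the queues

    def find(self, template):
        try:
            return self.templates.index(template)
        except ValueError:
            return -1

    def get_index(self, template):
        idx = self.find(template)       # first look, without the lock
        if idx >= 0:
            return idx
        with self._lock:
            idx = self.find(template)   # recheck: another thread may have won
            if idx < 0:
                self.templates.append(template)
                idx = len(self.templates) - 1
            return idx
```

Without the recheck inside the locked section, two threads that both missed on the first look would each append the same template.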

Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-9-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:30:25 -07:00
Yevgeny Kliteynik
4c56b5cbc3 net/mlx5: HWS, fix counting of rules in the matcher
Currently the counter that counts the number of rules in a matcher is
increased only when rule insertion is completed. In a multi-threaded
use case this can lead to a scenario in which many rules are in the
process of insertion in the same matcher, while none of them has
completed the insertion, so the rule counter is not updated. This
results in a rule insertion failure for many of them at the first
attempt, which leads to all of them requiring rehash and requiring
locking of all the queue locks.

This patch fixes the case by increasing the rule counter at the
beginning of the insertion process and decreasing it in case of any
failure.
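
A minimal Python sketch of the "count first, roll back on failure" approach follows. The `Matcher` class and the `do_insert` callback are hypothetical, and a lock is used here in place of whatever counter primitive the driver uses.

```python
import threading

class Matcher:
    """Counts a rule as soon as its insertion starts."""

    def __init__(self):
        self.num_rules = 0
        self._lock = threading.Lock()

    def insert(self, do_insert):
        with self._lock:
            self.num_rules += 1          # visible to other threads right away
        try:
            do_insert()
        except Exception:
            with self._lock:
                self.num_rules -= 1      # roll the count back on failure
            raise
```

Because the counter already includes in-flight insertions, concurrent threads see an accurate picture of how full the matcher is about to become.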

Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-8-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:30:25 -07:00
Yevgeny Kliteynik
9d4024edce net/mlx5: HWS, force rehash when rule insertion failed
Rules are inserted into the hash table in accordance with their hash index.
When a certain number of rules is reached, the table is rehashed:
a bigger new table is allocated and all the rules are moved there.
But sometimes a new rule can't be inserted into the hash table
because its index is full, even though the number of rules in the
table is well below the threshold. The hash function is not perfect,
so such cases are not rare. When that happens, we want to do the same
rehash, in order to increase the table size and lower the probability
for such cases.

This patch fixes the use case where rule insertion was failing, but
rehash couldn't be initiated due to the low number of rules: it adds a
flag that denotes that rehash is required, even if the number of rules
in the table is below the rehash threshold.
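
The situation can be reproduced with a toy hash table in Python. This is only a model under stated assumptions: fixed-capacity buckets, distinct integer keys hashed by modulo, a 75% threshold, and hypothetical names throughout.

```python
class HashMatcher:
    """Toy table with fixed-capacity buckets and a forced-rehash flag."""

    def __init__(self, buckets=4, bucket_capacity=1, threshold=0.75):
        self.buckets = [[] for _ in range(buckets)]
        self.cap = bucket_capacity
        self.threshold = threshold
        self.num_rules = 0
        self.rehash_required = False

    def _try_insert(self, key):
        bucket = self.buckets[key % len(self.buckets)]
        if len(bucket) >= self.cap:
            return False                 # this index is full
        bucket.append(key)
        return True

    def needs_rehash(self):
        limit = self.threshold * len(self.buckets) * self.cap
        return self.num_rules >= limit or self.rehash_required

    def _rehash(self):
        keys = [k for b in self.buckets for k in b]
        size = len(self.buckets)
        while True:                      # grow until every key fits
            size *= 2
            self.buckets = [[] for _ in range(size)]
            if all(self._try_insert(k) for k in keys):
                break
        self.rehash_required = False

    def insert(self, key):
        while not self._try_insert(key):
            self.rehash_required = True  # full index, table may be nearly empty
            self._rehash()
        self.num_rules += 1
        if self.needs_rehash():
            self._rehash()
```

With four buckets of capacity one, inserting keys 0 and 4 collides at index 0 while the table holds a single rule, well below the threshold. The flag is what allows the table to grow in that case.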

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-7-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:30:25 -07:00
Yevgeny Kliteynik
17e0accac5 net/mlx5: HWS, support complex matchers
This patch adds support for Complex Matchers/Rules

Overview:
--------

A matcher can match on a certain set of match parameters. However, the
number and size of match params for a single matcher are limited: all
the parameters must fit within a single definer.

A common example of this limitation is IPv6 address matching, where
matching both source and destination IPs requires more bits than a
single definer can support.

SW Steering addresses this limitation by chaining multiple Steering
Table Entries (STEs) within the same matcher, where each STE matches
on a subset of the parameters.

In HW Steering, such chaining is not possible — the matcher's STEs
are managed in a hash table, and a single definer is used to calculate
the hash index for STEs.

To address this limitation in HW Steering, we introduce Complex
Matchers, which consist of two chained matchers. This allows matching
on twice as many parameters. Complex Matchers are filled with Complex
Rules — rules that are split into two parts and inserted into their
respective matchers.

The first half of the Complex Matcher is a regular matcher and points
to the second half, which is an Isolated Matcher. An Isolated Matcher
has its own isolated table and is accessible only by traffic coming
from the first half of the Complex Matcher.

This splitting of matchers/rules into multiple parts is transparent to
users. It is hidden under the BWC HWS API. It becomes visible only when
dumping steering debug information, where the Complex Matcher appears
as two separate matchers: one in the user-created table and another
in its isolated table.

Some implementation details:
---------------------------

All user actions are performed on the second part of the rules only.
The first part handles matching and applies two actions: modify header
(set metadata, see details below) and go-to-table (directing traffic to
the isolated table containing the isolated matcher).

Rule updates (updating rule actions) are applied to the second part of
the rule since user-provided actions are not executed in the first
matcher.

We use REG_C_6 metadata register to set and match on unique per-rule
tag (see details below).

Splitting rules into two parts introduces new challenges:

1. Invalid Combinations

   Consider two rules with different matching values:
   - Rule 1: A+B
   - Rule 2: C+D

   Let's split the rules into two parts as follows:

   |---|     |---|
   | A |     | B |
   |---| --> |---|
   | C |     | D |
   |---|     |---|

   Splitting these rules results in invalid combinations like A+D
   and C+B.

   To resolve this, we assign unique tags to each rule on the first
   matcher and match these tags on the second matcher (the tag is
   implemented through modify_hdr action that sets value to metadata
   register REG_C_6):

   |----------|     |---------|
   |     A    |     | B, TagA |
   | action:  |     |         |
   | set TagA |     |         |
   |----------| --> |---------|
   |     C    |     | D, TagB |
   | action:  |     |         |
   | set TagB |     |         |
   |----------|     |---------|

2. Duplicated Entries

   Consider two rules with overlapping values:
   - Rule 1: A+B
   - Rule 2: A+D

   Let's split the rules into two parts as follows:

    |---|     |---|
    | A |     | B |
    |---| --> |---|
    |   |     | D |
    |---|     |---|

   This leads to duplicated entries on the first matcher, which HWS
   doesn't allow: a subsequent delete of either rule would delete
   the only entry in the first matcher, leaving the remaining rule
   broken.

   To address this, we use a reference count for entries in the first
   matcher and delete STEs only when their refcount reaches zero.

Both challenges are resolved by having a per-matcher data structure
(implemented with rhashtable) that manages refcounts for the first part
of the rules and holds unique tags (managed via IDA) for these rules to
set and to match on the second matcher.
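
The bookkeeping described above can be illustrated with a minimal
userspace sketch. It is not the driver code: a fixed array stands in
for the rhashtable, the slot index stands in for the IDA-allocated tag,
and all names (fp_table, fp_get, fp_put) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FP_MAX_ENTRIES 32 /* hypothetical capacity for this sketch */

/* One entry per distinct first-part match value. */
struct fp_entry {
	uint32_t key;          /* stands in for the first-half match value */
	uint32_t tag;          /* unique non-zero tag for the metadata register */
	unsigned int refcount; /* 0 means the slot is free */
};

/* Must be zero-initialized before first use. */
struct fp_table {
	struct fp_entry entries[FP_MAX_ENTRIES];
};

/*
 * Take a reference on the first part identified by key. Rules that
 * share a first part share one entry and one tag. Returns the tag,
 * or 0 if the table is full.
 */
static uint32_t fp_get(struct fp_table *t, uint32_t key)
{
	int free_slot = -1;

	assert(t);
	for (int i = 0; i < FP_MAX_ENTRIES; i++) {
		struct fp_entry *e = &t->entries[i];

		if (e->refcount && e->key == key) {
			e->refcount++;
			return e->tag;
		}
		if (!e->refcount && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return 0;

	t->entries[free_slot].key = key;
	t->entries[free_slot].tag = (uint32_t)free_slot + 1;
	t->entries[free_slot].refcount = 1;
	return t->entries[free_slot].tag;
}

/*
 * Drop a reference. Returns 1 when the last reference is gone (the
 * caller would delete the first-matcher entry now), 0 when other rules
 * still use it, and -1 if the key is unknown.
 */
static int fp_put(struct fp_table *t, uint32_t key)
{
	assert(t);
	for (int i = 0; i < FP_MAX_ENTRIES; i++) {
		struct fp_entry *e = &t->entries[i];

		if (e->refcount && e->key == key)
			return --e->refcount == 0;
	}
	return -1;
}
```

In this sketch, rules A+B and A+D both call fp_get() with the key for A
and receive the same tag, while a rule starting with C receives a
different one. Only the second fp_put() for A reports that the shared
first-matcher entry can be deleted.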

Limitations:
-----------

We utilize metadata register REG_C_6 in this implementation, so its
usage anywhere along the steering of the flow that might include the
need for Complex Matcher is prohibited.

The number and size of match parameters remain limited — now it is
constrained by what can be represented by two definers instead of one.
This architectural limitation arises from the structure of Complex
Matchers. If future requirements demand more parameters,
Complex Matchers can be extended beyond two matchers.

Additionally, there is an implementation limit of 32 match parameters
per rule (disregarding parameter size). This limit can be lifted if
needed.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-6-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:30:25 -07:00
Yevgeny Kliteynik
b816743a18 net/mlx5: HWS, introduce isolated matchers
In preparation for complex matcher support, introduce the isolated
matcher.

Isolated matcher is a matcher that has its own isolated table.
It is used as the second half of the complex matcher: when the rule
is split into two parts (complex rule), then matching on the first
part will send the packet to the isolated matcher that will try to
match on the second part. In case of miss, the packet goes back to
the matcher's end flow table.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-5-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:30:25 -07:00
Yevgeny Kliteynik
3c739d1624 net/mlx5: HWS, expose polling function in header file
In preparation for complex matcher, expose the function that polls
the queue for completion (mlx5hws_bwc_queue_poll) in the header
file, so that it can be used by the complex matcher code.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:30:25 -07:00
Yevgeny Kliteynik
fed5f48312 net/mlx5: HWS, add definer function to get field name str
In preparation for complex matcher support, add function for
converting definer fname to str, which will be used in following
patches.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:30:25 -07:00
Yevgeny Kliteynik
d2338a27fc net/mlx5: HWS, expose function mlx5hws_table_ft_set_next_ft in header
In preparation for complex matcher support, make function
mlx5hws_table_ft_set_next_ft() non-static and expose it in header.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:30:25 -07:00
Paolo Abeni
9f607dc39b Merge branch 'amd-xgbe-add-support-for-amd-renoir'
Raju Rangoju says:

====================
amd-xgbe: add support for AMD Renoir

Add support for a new AMD Ethernet device called "Renoir". It has a new
PCI ID, so add it to the list of devices supported by amd-xgbe. Also,
the BAR1 addresses cannot be used to access the PCS registers on the
Renoir platform, so use indirect addressing via SMN instead.
====================

Link: https://patch.msgid.link/20250509155325.720499-1-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 13:29:43 +02:00
Raju Rangoju
795f86ff05 amd-xgbe: add support for new pci device id 0x1641
Add support for new pci device id 0x1641 to register
Renoir device with PCIe.

Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509155325.720499-6-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 13:29:41 +02:00
Raju Rangoju
ab95bc9aa7 amd-xgbe: Add XGBE_XPCS_ACCESS_V3 support to xgbe_pci_probe()
A new version of the XPCS access routines has been introduced. Add
support to xgbe_pci_probe() for using these routines.

Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509155325.720499-5-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 13:29:41 +02:00
Raju Rangoju
e49479f30e amd-xgbe: add support for new XPCS routines
Add the necessary support to enable Renoir ethernet device. Since the
BAR1 address cannot be used to access the XPCS registers on Renoir, use
the smn functions.

Some of the ethernet add-in-cards have dual PHY but share a single MDIO
line (between the ports). In such cases, link inconsistencies are
noticed during heavy traffic and during reboot stress tests. Using
smn calls helps avoid such race conditions.

Suggested-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509155325.720499-4-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 13:29:41 +02:00
Raju Rangoju
bbbd7303ea amd-xgbe: reorganize the xgbe_pci_probe() code path
Reorganize the xgbe_pci_probe() code path to convert if/else statements
to switch case to help add future code. This helps code look cleaner.

Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509155325.720499-3-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 13:29:40 +02:00
Raju Rangoju
2d4407160f amd-xgbe: reorganize the code of XPCS access
The xgbe_{read/write}_mmd_regs_v* functions have common code which can
be moved to helper functions. Add new helper functions to calculate the
mmd_address for v1/v2 of xpcs access.

Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509155325.720499-2-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 13:29:40 +02:00
Paolo Abeni
42bd96cb9e Merge branch 'tools-ynl-gen-support-sub-types-for-binary-attributes'
Jakub Kicinski says:

====================
tools: ynl-gen: support sub-types for binary attributes

Binary attributes have sub-type annotations which either indicate
that the binary object should be interpreted as a raw / C array of
a simple type (e.g. u32), or that it's a struct.

Use this information in the C codegen instead of outputting void *
for all binary attrs. It doesn't make a huge difference in the genl
families, but in classic Netlink there are a lot more structs.

v1: https://lore.kernel.org/20250508022839.1256059-1-kuba@kernel.org
====================

Link: https://patch.msgid.link/20250509154213.1747885-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 13:22:34 +02:00
Jakub Kicinski
25e37418c8 tools: ynl-gen: support struct for binary attributes
Support using a struct pointer for binary attrs. Len field is maintained
because the structs may grow with newer kernel versions. Or, which matters
more, be shorter if the binary is built against newer uAPI than kernel
against which it's executed. Since we are storing a pointer to a struct
type - always allocate at least the amount of memory needed by the struct
per current uAPI headers (unused mem is zeroed). Technically users should
check the length field but per modern ASAN checks storing a short object
under a pointer seems like a bad idea.
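
The allocation rule described above can be sketched in plain C. This is
only an illustration under assumptions, not the generated YNL code: the
struct and function names are hypothetical, and uint32_t/uint64_t stand
in for the kernel's __u32/__u64.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical uAPI struct as seen by the headers we were built against. */
struct example_stats {
	uint32_t rx;
	uint32_t tx;
	uint64_t drops; /* imagine this field was added in a newer uAPI */
};

/*
 * Copy a binary attribute payload of len bytes into a buffer that is
 * at least sizeof(struct example_stats) bytes, zeroing whatever the
 * kernel did not send. The real payload length is reported through
 * stored_len so that callers can still tell what was received.
 */
static struct example_stats *
example_stats_dup(const void *payload, size_t len, size_t *stored_len)
{
	size_t alloc = len > sizeof(struct example_stats) ?
		       len : sizeof(struct example_stats);
	struct example_stats *s = calloc(1, alloc);

	if (!s)
		return NULL;
	memcpy(s, payload, len);
	*stored_len = len;
	return s;
}
```

With an older kernel that sends only the first 8 bytes, rx and tx are
filled in, drops reads as zero, and the stored length is 8, so the
pointer is always safe to dereference as a full struct.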

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250509154213.1747885-4-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 13:22:32 +02:00
Jakub Kicinski
9ba8e351ef tools: ynl-gen: auto-indent else
We auto-indent if statements (increase the indent of the subsequent
line by 1), do the same thing for else branches without a block.
There haven't been any else branches before, but we're about to add one.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250509154213.1747885-3-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 13:22:32 +02:00
Jakub Kicinski
02a562bb2b tools: ynl-gen: support sub-type for binary attributes
Sub-type annotation on binary attributes may indicate that the attribute
carries an array of simple types (also referred to as "C array" in docs).
Support rendering them as such in the C user code. For example for u32,
instead of:

  struct {
    u32 arr;
  } _len;

  void *arr;

render:

  struct {
    u32 arr;
  } _count;

  __u32 *arr;

Note that count is the number of elements while len was the length in bytes.
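
A minimal sketch of how the count relates to the payload length,
assuming a u32 sub-type. The struct and function names are hypothetical
and uint32_t stands in for __u32; this is not the generated code.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical response object mirroring the rendered layout above. */
struct example_rsp {
	struct {
		uint32_t arr; /* number of elements, not bytes */
	} _count;

	uint32_t *arr;
};

/*
 * Store a binary payload of len bytes as a typed array. Returns 0 on
 * success and -1 if len is not a multiple of the element size or the
 * allocation fails.
 */
static int example_rsp_set_arr(struct example_rsp *rsp,
			       const void *payload, size_t len)
{
	size_t count = len / sizeof(uint32_t);

	if (len % sizeof(uint32_t))
		return -1;

	/* Allocate at least one element so a zero count is not an error. */
	rsp->arr = calloc(count ? count : 1, sizeof(uint32_t));
	if (!rsp->arr)
		return -1;

	memcpy(rsp->arr, payload, len);
	rsp->_count.arr = (uint32_t)count;
	return 0;
}
```

For a 12-byte payload this yields a count of 3, so callers iterate over
rsp->_count.arr elements rather than dividing a byte length themselves.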

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250509154213.1747885-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 13:22:32 +02:00
Paolo Abeni
ac4d1baf97 Merge branch 'device-memory-tcp-tx'
Mina Almasry says:

====================
Device memory TCP TX

The TX path had been dropped from the Device Memory TCP patch series
post RFCv1 [1], to make that series slightly easier to review. This
series rebases the implementation of the TX path on top of the
net_iov/netmem framework agreed upon and merged. The motivation for
the feature is thoroughly described in the docs & cover letter of the
original proposal, so I don't repeat the lengthy descriptions here, but
they are available in [1].

Full outline on usage of the TX path is detailed in the documentation
included with this series.

Test example is available via the kselftest included in the series as well.

The series is relatively small, as the TX path for this feature largely
piggybacks on the existing MSG_ZEROCOPY implementation.

Patch Overview:
---------------

1. Documentation & tests to give high level overview of the feature
   being added.

1. Add netmem refcounting needed for the TX path.

2. Devmem TX netlink API.

3. Devmem TX net stack implementation.

4. Make dma-buf unbinding scheduled work to handle TX cases where it gets
   freed from contexts where we can't sleep.

5. Add devmem TX documentation.

6. Add scaffolding enabling driver support for netmem_tx. Add helpers,
   driver feature flag, and docs to enable drivers to declare netmem_tx
   support.

7. Guard netmem_tx against being enabled against drivers that don't
   support it.

8. Add devmem_tx selftests. Add TX path to ncdevmem and add a test to
   devmem.py.

Testing:
--------

Testing is very similar to the devmem TCP RX path. The ncdevmem test
used for the RX path is now augmented with client functionality to test
the TX path.

* Test Setup:

Kernel: net-next with this RFC and memory provider API cherry-picked
locally.

Hardware: Google Cloud A3 VMs.

NIC: GVE with header split & RSS & flow steering support.

Performance results are not included with this version, unfortunately.
I'm having issues running the dma-buf exporter driver against the
upstream kernel on my test setup. The issues are specific to that
dma-buf exporter and do not affect this patch series. I plan to follow
up this series with perf fixes if the tests point to issues once they're
up and running.

Special thanks to Stan who took a stab at rebasing the TX implementation
on top of the netmem/net_iov framework merged. Parts of his proposal [2]
that are reused as-is are forked off into their own patches to give full
credit.

[1] https://lore.kernel.org/netdev/20240909054318.1809580-1-almasrymina@google.com/
[2] https://lore.kernel.org/netdev/20240913150913.1280238-2-sdf@fomichev.me/T/#m066dd407fbed108828e2c40ae50e3f4376ef57fd

Cc: sdf@fomichev.me
Cc: asml.silence@gmail.com
Cc: dw@davidwei.uk
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Victor Nogueira <victor@mojatatu.com>
Cc: Pedro Tammela <pctammela@mojatatu.com>
Cc: Samiullah Khawaja <skhawaja@google.com>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>

v14: https://lore.kernel.org/netdev/20250429032645.363766-1-almasrymina@google.com/
v13: https://lore.kernel.org/netdev/20250425204743.617260-1-almasrymina@google.com/
v12: https://lore.kernel.org/netdev/20250423031117.907681-1-almasrymina@google.com/
v11: https://lore.kernel.org/netdev/20250423031117.907681-1-almasrymina@google.com/
v10: https://lore.kernel.org/netdev/20250417231540.2780723-1-almasrymina@google.com/
v9: https://lore.kernel.org/netdev/20250415224756.152002-1-almasrymina@google.com/
v8: https://lore.kernel.org/netdev/20250308214045.1160445-1-almasrymina@google.com/
v7: https://lore.kernel.org/netdev/20250227041209.2031104-1-almasrymina@google.com/
v6: https://lore.kernel.org/netdev/20250222191517.743530-1-almasrymina@google.com/
v5: https://lore.kernel.org/netdev/20250220020914.895431-1-almasrymina@google.com/
v4: https://lore.kernel.org/netdev/20250203223916.1064540-1-almasrymina@google.com/
v3: https://patchwork.kernel.org/project/netdevbpf/list/?series=929401&state=*
RFC v2: https://patchwork.kernel.org/project/netdevbpf/list/?series=920056&state=*
====================

Link: https://patch.msgid.link/20250508004830.4100853-1-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 11:14:46 +02:00
Mina Almasry
2f1a805f32 selftests: ncdevmem: Implement devmem TCP TX
Add support for devmem TX in ncdevmem.

This is a combination of the ncdevmem from the devmem TCP series RFCv1
which included the TX path, and work by Stan to include the netlink API
and refactored on top of his generic memory_provider support.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-10-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 11:12:49 +02:00
Mina Almasry
ae28cb1147 net: check for driver support in netmem TX
We should not enable netmem TX for drivers that don't declare support.

Check for driver netmem TX support during devmem TX binding and fail if
the driver does not have the functionality.

Check for driver support in validate_xmit_skb as well.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-9-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 11:12:49 +02:00
Mina Almasry
c32532670c gve: add netmem TX support to GVE DQO-RDA mode
Use netmem_dma_*() helpers in gve_tx_dqo.c DQO-RDA paths to
enable netmem TX support in that mode.

Declare support for netmem TX in GVE DQO-RDA mode.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20250508004830.4100853-8-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 11:12:49 +02:00
Mina Almasry
383faec0fd net: enable driver support for netmem TX
Drivers need to make sure not to pass netmem dma-addrs to the
dma-mapping API in order to support netmem TX.

Add netmem_dma_*() helpers that enable special handling of netmem
dma-addrs, for drivers to use.

Document in netmem.rst what drivers need to do to support netmem TX.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-7-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 11:12:49 +02:00
Mina Almasry
17af8cc06a net: add devmem TCP TX documentation
Add documentation outlining the usage and details of the devmem TCP TX
API.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-6-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 11:12:48 +02:00