When tcp-data-split is UNKNOWN mode, drivers arbitrarily handle it.
For example, bnxt_en driver automatically enables if at least one of
LRO/GRO/JUMBO is enabled.
If tcp-data-split is UNKNOWN and LRO is enabled, a driver returns
ENABLES of tcp-data-split, not UNKNOWN.
So, `ethtool -g eth0` shows tcp-data-split is enabled.
The problem is in the setting situation.
In the ethnl_set_rings(), it first calls get_ringparam() to get the
current driver's config.
At that moment, if driver's tcp-data-split config is UNKNOWN, it returns
ENABLE if LRO/GRO/JUMBO is enabled.
Then, it sets values from the user and driver's current config to
kernel_ethtool_ringparam.
Last it calls .set_ringparam().
The driver, especially bnxt_en driver receives
ETHTOOL_TCP_DATA_SPLIT_ENABLED.
But it can't distinguish whether it is set by the user or just the
current config.
When user updates ring parameter, the new hds_config value is updated
and current hds_config value is stored to old_hdsconfig.
Driver's .set_ringparam() callback can distinguish a passed
tcp-data-split value is came from user explicitly.
If .set_ringparam() is failed, hds_config is rollbacked immediately.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250114142852.3364986-2-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Running "make kselftest TARGETS=net/forwarding" results in
multiple ccurrences of the same error:
- ./lib.sh: line 787: teamd: command not found
This patch adds the variable $REQUIRE_TEAMD in every test that uses the
command teamd and checks the $REQUIRE_TEAMD variable in the file "lib.sh"
to skip the test if the command is not installed.
Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com>
Link: https://patch.msgid.link/20250114003323.97207-1-alessandro.zanni87@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds support for hardware monitoring to the fbnic driver,
allowing for temperature and voltage sensor data to be exposed to
userspace via the HWMON interface. The driver registers a HWMON device
and provides callbacks for reading sensor data, enabling system
admins to monitor the health and operating conditions of fbnic.
Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250114000705.2081288-4-sanman.p211993@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add infrastructure to support firmware request/response handling with
completions. Add a completion structure to track message state including
message type for matching, completion for waiting for response, and
result for error propagation. Use existing spinlock to protect the writes.
The data from the various response types will be added to the "union u"
by subsequent commits.
Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250114000705.2081288-2-sanman.p211993@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Machon says:
====================
net: lan969x: add FDMA support
== Description:
This series is the last of a multi-part series, that prepares and adds
support for the new lan969x switch driver.
The upstreaming efforts has been split into multiple series:
1) Prepare the Sparx5 driver for lan969x (merged)
2) Add support for lan969x (same basic features as Sparx5
provides excl. FDMA and VCAP, merged).
3) Add lan969x VCAP functionality (merged).
4) Add RGMII support (merged).
--> 5) Add FDMA support.
== FDMA support:
The lan969x switch device uses the same FDMA engine as the Sparx5 switch
device, with the same number of channels etc. This means we can utilize
the newly added FDMA library, that is already in use by the lan966x and
sparx5 drivers.
As previous lan969x series, the FDMA implementation will hook into the
Sparx5 implementation where possible, however both RX and TX handling
will be done differently on lan969x and therefore requires a separate
implementation of the RX and TX path.
Details are in the commit description of the individual patches
== Patch breakdown:
Patch #1: Enable FDMA support on lan969x
Patch #2: Split start()/stop() functions
Patch #3: Activate TX FDMA in start()
Patch #4: Ops out a few functions that differ on the two platforms
Patch #5: Add FDMA implementation for lan969x
v1: https://lore.kernel.org/20250109-sparx5-lan969x-switch-driver-5-v1-0-13d6d8451e63@microchip.com
====================
Link: https://patch.msgid.link/20250113-sparx5-lan969x-switch-driver-5-v2-0-c468f02fd623@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The lan969x switch device supports manual frame injection and extraction
to and from the switch core, using a number of injection and extraction
queues. This technique is currently supported, but delivers poor
performance compared to Frame DMA (FDMA).
This lan969x implementation of FDMA, hooks into the existing FDMA for
Sparx5, but requires its own RX and TX handling, as lan969x does not
support the same native cache coherency that Sparx5 does. Effectively,
this means that we are going to use the DMA mapping API for mapping and
unmapping TX buffers. The RX loop will utilize the page pool API for
efficient RX handling. Other than that, the implementation is largely
the same, and utilizes the FDMA library for DCB and DB handling.
Some numbers:
Manual injection/extraction (before this series):
// iperf3 -c 1.0.1.1
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.02 sec 345 MBytes 289 Mbits/sec sender
[ 5] 0.00-10.06 sec 345 MBytes 288 Mbits/sec receiver
FDMA (after this series):
// iperf3 -c 1.0.1.1
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.03 sec 1.10 GBytes 940 Mbits/sec sender
[ 5] 0.00-10.07 sec 1.10 GBytes 936 Mbits/sec receiver
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Link: https://patch.msgid.link/20250113-sparx5-lan969x-switch-driver-5-v2-5-c468f02fd623@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The function sparx5_fdma_tx_activate() is responsible for configuring
the TX FDMA instance and activating the channel. TX activation has
previously been done in the xmit() function, when the first frame is
transmitted. Now that we have separate functions for starting and
stopping the FDMA, it seems reasonable to move the TX activation to the
start function. This change has no implications on the functionality.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Link: https://patch.msgid.link/20250113-sparx5-lan969x-switch-driver-5-v2-3-c468f02fd623@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The two functions: sparx5_fdma_{start(),stop()} are responsible for a
number of things, namely: allocation and initialization of FDMA buffers,
activation FDMA channels in hardware and activation of the NAPI
instance.
This patch splits the buffer allocation and initialization into init and
deinit functions, and the channel and NAPI activation into start and
stop functions. This serves two purposes: 1) the start() and stop()
functions can be reused for lan969x and 2) prepares for future MTU
change support, where we must be able to stop and start the FDMA
channels and NAPI instance, without free'ing and reallocating the FDMA
buffers.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Link: https://patch.msgid.link/20250113-sparx5-lan969x-switch-driver-5-v2-2-c468f02fd623@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King says:
====================
net: phylink: fix PCS without autoneg
Eric Woudstra reported that a PCS attached using 2500base-X does not
see link when phylink is using in-band mode, but autoneg is disabled,
despite there being a valid 2500base-X signal being received. We have
these settings:
act_link_an_mode = MLO_AN_INBAND
pcs_neg_mode = PHYLINK_PCS_NEG_INBAND_DISABLED
Eric diagnosed it to phylink_decode_c37_word() setting state->link
false because the full-duplex bit isn't set in the non-existent link
partner advertisement word (which doesn't exist because in-band
autoneg is disabled!)
The test in phylink_mii_c22_pcs_decode_state() is supposed to catch
this state, but since we converted PCS to use neg_mode, testing the
Autoneg in the local advertisement is no longer sufficient - we need
to be looking at the neg_mode, which currently isn't provided.
We need to provide this via the .pcs_get_state() method, and this
will require modifying all PCS implementations to add the extra
argument to this method.
Patch 1 uses the PCS neg_mode in phylink_mac_pcs_get_state() to correct
the now obsolute usage of the Autoneg bit in the advertisement.
Patch 2 passes neg_mode into the .pcs_get_state() method, and updates
all users.
Patch 3 adds neg_mode as an argument to the various clause 22 state
decoder functions in phylink, modifying drivers to pass the neg_mode
through.
Patch 4 makes use of phylink_mii_c22_pcs_decode_state() rather than
using the Autoneg bit in the advertising field.
Patch 5 may be required for Eric's case - it ensures that we report
the correct state for interface types that we support only one set
of modes for when autoneg is disabled.
====================
Link: https://patch.msgid.link/Z4TbR93B-X8A8iHe@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts says:
====================
mptcp: selftests: more debug in case of errors
Here are just a bunch of small improvements for the MPTCP selftests:
Patch 1: Unify errors messages in simult_flows: print MIB and 'ss -Me'.
Patch 2: Unify errors messages in sockopt: print MIB.
Patch 3: Move common code to print debug info to mptcp_lib.sh.
Patch 4: Use 'ss' with '-m' in case of errors.
Patch 5: Remove an unused variable.
Patch 6: Print only the size instead of size + filename again.
====================
Link: https://patch.msgid.link/20250114-net-next-mptcp-st-more-debug-err-v1-0-2ffb16a6cf35@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
'du' will print the name of the file, which was already displayed
before, e.g.
Created /tmp/tmp.UOyy0ghfmQ (size 4703740/tmp/tmp.UOyy0ghfmQ) containing data sent by client
Created /tmp/tmp.xq3zvFinGo (size 1391724/tmp/tmp.xq3zvFinGo) containing data sent by server
'stat' can be used instead, to display this instead:
Created /tmp/tmp.UOyy0ghfmQ (size 4703740 B) containing data sent by client
Created /tmp/tmp.xq3zvFinGo (size 1391724 B) containing data sent by server
So easier to spot the file sizes.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250114-net-next-mptcp-st-more-debug-err-v1-6-2ffb16a6cf35@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In order to unify what is printed in case of error, similar to what is
done in mptcp_connect.sh and mptcp_join.sh, it is interesting to do the
following modifications in simult_flows.sh:
- Print the rc errors at the end of the line.
- Print the MIB counters.
- Use the same ss options: add -M (MPTCP sockets) and -e (detailed
socket information).
While at it, also print of the 'max' time only in case of success,
because 'mptcp_connect.c' will already print this info in case of error,
e.g.:
transfer slower than expected! runtime 11948 ms, expected 11921 ms
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250114-net-next-mptcp-st-more-debug-err-v1-1-2ffb16a6cf35@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King says:
====================
net: bcm: asp2: fix fallout from phylib EEE changes
This series addresses the fallout from the phylib changes in the
Broadcom ASP2 driver.
The first patch uses phylib's copy of the LPI timer setting, which
means the driver no longer has to track this. It will be set in
hardware each time the adjust_link function is called when the link
is up, and will be read at initialisation time to set the current
value.
The second patch removes the driver's storage of tx_lpi_enabled,
which has become redundant since phylib managed EEE was merged. The
driver does nothing with this flag other than storing it.
The last patch converts the driver to use phylib's enable_tx_lpi
flag rather than trying to maintain its own copy.
====================
Link: https://patch.msgid.link/Z4aV3RmSZJ1WS3oR@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King says:
====================
net: stmmac: further EEE cleanups (and one fix!)
This series continues the EEE cleanup of the stmmac driver, and
includes one fix.
As mentioned in the previous series, I wasn't entirely happy with the
"stmmac_disable_sw_eee_mode" name, so the first patch renames this to
"stmmac_stop_sw_lpi" instead, which I think better describes what this
function is doing - stopping the transmit of the LPI state because we
have a packet ot send.
Patch 2 corrects the priv->eee_sw_timer_en flag when EEE has been
disabled. Currently upon disable, priv->eee_enabled is set false,
but through the weird logic that was present prior to the previous
series, priv->eee_sw_timer_en was set true. This behaviour was kept
as the previous series was cleanup, not fixes. This patch fixes this.
Having fixed priv->eee_sw_timer_en to actually indicate whether
software timed EEE mode is being used, it becomes no longer necessary
to test priv->eee_enabled in addition. Patch 3 removes the redundant
test. Patch 4 also uses priv->eee_sw_timer_en before manipulating the
software EEE state in the suspend method rather than using
priv->eee_enabled, which brings consistency.
Patch 5 provides stmmac_try_to_start_sw_lpi() which complements
stmmac_stop_sw_lpi(), and allows us to move duplicated code into one
location.
Patch 6 splits stmmac_enable_eee_mode() - one part of this function
tests whether there are any queues that have unfinished work (in
other words are busy). Separate out this code into a separate function.
Patch 7 also splits out the mod_timer() for the software EEE timer
intoi a seperate function (the reason will be in patch 9.)
Patch 8 merges the remains of stmmac_enable_eee_mode() into
stmmac_try_to_start_sw_lpi().
Patch 9 fixes the delay between transmit and entering LPI. Currently,
when cleaning the transmit queues, if we discover that we have finished
cleaning up all queues, we immediately instruct the hardware to enter
LPI mode without waiting for the LPI timer. However, we should wait for
the LPI timer to expire. Therefore, the transmit cleanup path needs
to call stmmac_restart_sw_lpi_timer() instead of
stmmac_try_to_start_sw_lpi().
====================
Link: https://patch.msgid.link/Z4T84SbaC4D-fN5y@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Combine stmmac_enable_eee_mode() with stmmac_try_to_start_sw_lpi()
which makes the code easier to read and the flow more logical. We
can now trivially see that if the transmit queues are busy, we
(re-)start the eee_ctrl_timer. Otherwise, if the transmit path is
not already in LPI mode, we ask the hardware to enter LPI mode.
I believe that now we can see better what is going on here, this
shows that there is a bug with the software LPI timer implementation.
The LPI timer is supposed to define how long after the last
transmittion completed before we start signalling LPI. However,
this code structure shows that if all transmit queues are empty,
and stmmac_try_to_start_sw_lpi() is called immediately after cleaning
the transmit queue, we will instruct the hardware to start signalling
LPI immediately.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXItb-000MBa-OU@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As mentioned in "net: stmmac: correct priv->eee_sw_timer_en setting",
we can simplify some fast-path tests.
The transmit cleaning path checks whether EEE is enabled, the transmit
path is not in LPI mode, and that we're using software timed mode.
Since the above mentioned commit, checking whether EEE is enabled is
no longer necessary as priv->eee_sw_timer_en will be false when EEE is
disabled. Simplify this test.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXItC-000MB6-54@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If we are disabling EEE/LPI, then we should not be enabling software
mode. The only time when we should is if EEE is active, and we are
wanting to use software-timed EEE mode.
Therefore, in the disable path of stmmac_eee_init(), ensure that
priv->eee_sw_timer_en is set false as we are going to be calling
del_timer_sync() on the timer.
This will allow us to simplify some fast-path tests in later patches.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tXIt7-000MB0-0W@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The device ioctl handler no longer calls ndo_do_ioctl, but calls
ndo_eth_ioctl to handle mii ioctls since commit a76053707d
("dev_ioctl: split out ndo_eth_ioctl"). However, sunplus still used
ndo_do_ioctl when it was introduced. So switch to ndo_eth_ioctl.
Bad commit fd3040b939 ("net: ethernet: Add driver for Sunplus SP7021")
was the initial driver commit, meaning that PHY IOCTLs where never
available on this driver. Therefore don't consider this as a fix.
Found by code inspection.
Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Link: https://patch.msgid.link/tencent_8CF8A72C708E96B9C7DC1AF96FEE19AF3D05@qq.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over
syscon_regmap_lookup_by_phandle() combined with getting the syscon
argument. Except simpler code this annotates within one line that given
phandle has arguments, so grepping for code would be easier.
There is also no real benefit in printing errors on missing syscon
argument, because this is done just too late: runtime check on
static/build-time data. Dtschema and Devicetree bindings offer the
static/build-time check for this already.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-5-3423889935f7@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over
syscon_regmap_lookup_by_phandle() combined with getting the syscon
argument. Except simpler code this annotates within one line that given
phandle has arguments, so grepping for code would be easier.
There is also no real benefit in printing errors on missing syscon
argument, because this is done just too late: runtime check on
static/build-time data. Dtschema and Devicetree bindings offer the
static/build-time check for this already.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-4-3423889935f7@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over
syscon_regmap_lookup_by_phandle() combined with getting the syscon
argument. Except simpler code this annotates within one line that given
phandle has arguments, so grepping for code would be easier.
There is also no real benefit in printing errors on missing syscon
argument, because this is done just too late: runtime check on
static/build-time data. Dtschema and Devicetree bindings offer the
static/build-time check for this already.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250112-syscon-phandle-args-net-v1-3-3423889935f7@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>