Russell King says:
====================
net: stmmac: rk: cleanups v3: mode and speed for most
Third installment in the rk cleanups, this converts the interface mode
and speed configuration for most RK SoCs.
====================
Link: https://patch.msgid.link/aYB2cKRu3DQh6yXK@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As we can detect whether the SoC provides the parameters necessary for
rk_set_reg_speed(), we don't need to have explicit calls to this.
Instead, we can move the contents of this function to
rk_set_clk_tx_rate().
This remsoves all the .set_speed() implementations that merely go on to
invoke rk_set_reg_speed().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vmqnM-00000007VD8-1xWo@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The RMII clock is a single bit, which is set for 100M and clear for
10M. Move this out of struct rk_reg_speed_data (which gets rid of
this structure) into the struct rk_clock_fields as the bitmask for
this bit.
This gets rid of the per-SoC variability in the calls to
rk_set_reg_speed().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vmqnH-00000007VCz-1WmP@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The RMII speed configuration is encoded as a single bit, which is set
for 100M and clean for 10M. Provide the bitfield definition in
struct rk_clock_fields, moving it out of struct rk_reg_speed_data's
rmii_10 and rmii_100 initialisers. Update rk_set_reg_speed() to handle
the new definition location of this bit.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vmqnC-00000007VCt-0oRg@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As all of the RGMII clock selection bitfields (gmii_clk_sel) use the
same encoding, parameterise this by providing the bitfield mask in
the BSP private data.
This is the last user of GRF_FIELD_CONST(), so remove that definition
as well.
One additional change is for RK3328 - as only gmac2io supports RGMII,
only initialise the mask for this instance.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vmqn7-00000007VCn-0OZA@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
rk3588 has a quirk compared to the other Rockchip implementations in
that the interface mode configuration register is in the php_grf
regmap rather than the grf regmap. Add a flag to indicate this, and
a separate function to write to the appropriate regmap. This allows
rk3588 to be converted.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vmqmm-00000007VCP-2XZc@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The majority of Rockchip implementations require three common pieces
of information to configure the PHY interface mode:
- The grf register offset for configuring the GMAC phy_intf_sel field
and the RMII mode bit.
- The bitfield in this register for the GMAC's phy_intf_sel.
- The bit position for RMII mode but clear for RGMII mode.
Introduce members for this information into struct rk_priv_data and
struct rk_gmac_ops, which will be used to pre-initialise the struct
rk_priv_data members. We describe the register contents using
bitfields, even for those that are a single bit for consistency.
As each register comprises of two halves, where the upper half enables
changing the bit state in the lower half, we can describe these
bitfields using a 16-bit data type, and provide rk_encode_wm16() to
generate the actual register values from the field mask and field
value. We are unable to use the FIELD_PREP_WM16() macros for this as
these require the field mask to be a constant.
Add code to rk_gmac_powerup() to get the phy_intf_sel value, validating
that the resulting mode is either RMII or RGMII. No other modes are
supported by any of the Rockchip SoCs supported by this driver.
If either of the bitfield mask values are populated in struct
rk_priv_data, use these to generate the register contents, and write
the resulting value to the specified GRF register.
Convert many Rockchip implementations to use this new infrastructure.
For those where there is a single GMAC instance, it is merely a case of
filling in the new members of struct rk_gmac_ops. For those with
multiple instances, one or more of these members depends on the GMAC
instance, so setup of the members in struct rk_gmac has to be done via
the .init method of struct rk_gmac_ops. The corresponding code is
removed from the set_to_rgmii() and set_to_rmii() implementations.
Since the member name documents the purpose of the field that is being
initialised, providing preprocessor macros to define the bitfields is
deemed to be less than useful given the massive size of this driver.
The existing mechanisms remain behind for those SoCs that can not be
converted to this scheme.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
v2: disable clocks on failure
Link: https://patch.msgid.link/E1vmqmh-00000007VCJ-1xns@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add 2-bit tcpi_ecn_mode feild within tcp_info to indicate which ECN
mode is negotiated: ECN_MODE_DISABLED, ECN_MODE_RFC3168, ECN_MODE_ACCECN,
or ECN_MODE_PENDING. This is done by utilizing available bits from
tcpi_accecn_opt_seen (reduced from 16 bits to 2 bits) and
tcpi_accecn_fail_mode (reduced from 16 bits to 4 bits).
Also, an extra 24-bit tcpi_options2 field is identified to represent
newer options and connection features, as all 8 bits of tcpi_options
field have been used.
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Co-developed-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260131222515.8485-14-chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Detect spurious retransmission of a previously sent ACK carrying the
AccECN option after the second retransmission. Since this might be caused
by the middlebox dropping ACK with options it does not recognize, disable
the sending of the AccECN option in all subsequent ACKs. This patch
follows Section 3.2.3.2.2 of AccECN spec (RFC9768), and a new field
(accecn_opt_sent_w_dsack) is added to indicate that an AccECN option was
sent with duplicate SACK info.
Also, a new AccECN option sending mode is added to tcp_ecn_option sysctl:
(TCP_ECN_OPTION_PERSIST), which ignores the AccECN fallback policy and
persistently sends AccECN option once it fits into TCP option space.
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260131222515.8485-13-chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
According to Section 3.2.2.1 of AccECN spec (RFC9768), if the Server
is in AccECN mode and in SYN-RCVD state, and if it receives a value of
zero on a pure ACK with SYN=0 and no SACK blocks, for the rest of the
connection the Server MUST NOT set ECT on outgoing packets and MUST
NOT respond to AccECN feedback. Nonetheless, as a Data Receiver it
MUST NOT disable AccECN feedback.
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260131222515.8485-12-chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
For Accurate ECN, the first SYN/ACK sent by the TCP server shall set
the ACE flag (Table 1 of RFC9768) and the AccECN option to complete the
capability negotiation. However, if the TCP server needs to retransmit
such a SYN/ACK (for example, because it did not receive an ACK
acknowledging its SYN/ACK, or received a second SYN requesting AccECN
support), the TCP server retransmits the SYN/ACK without the AccECN
option. This is because the SYN/ACK may be lost due to congestion, or a
middlebox may block the AccECN option. Furthermore, if this retransmission
also times out, to expedite connection establishment, the TCP server
should retransmit the SYN/ACK with (AE,CWR,ECE) = (0,0,0) and without the
AccECN option, while maintaining AccECN feedback mode.
This complies with Section 3.2.3.2.2 of the AccECN spec RFC9768.
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260131222515.8485-10-chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Based on AccECN spec (RFC9768) Section 3.1.4.1, if the sender of an
AccECN SYN (the TCP Client) times out before receiving the SYN/ACK, it
SHOULD attempt to negotiate the use of AccECN at least one more time
by continuing to set all three TCP ECN flags (AE,CWR,ECE) = (1,1,1) on
the first retransmitted SYN (using the usual retransmission time-outs).
If this first retransmission also fails to be acknowledged, in
deployment scenarios where AccECN path traversal might be problematic,
the TCP Client SHOULD send subsequent retransmissions of the SYN with
the three TCP-ECN flags cleared (AE,CWR,ECE) = (0,0,0).
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260131222515.8485-8-chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
According to Sections 3.1.2 and 3.1.3 of AccECN spec (RFC9768).
In Section 3.1.2, it says an AccECN implementation has no need to
recognize or support the Server response labelled 'Nonce' or ECN-nonce
feedback more generally, as RFC 3540 has been reclassified as Historic.
AccECN is compatible with alternative ECN feedback integrity approaches
to the nonce. The SYN/ACK labelled 'Nonce' with (AE,CWR,ECE) = (1,0,1)
is reserved for future use. A TCP Client (A) that receives such a SYN/ACK
follows the procedure for forward compatibility given in Section 3.1.3.
Then in Section 3.1.3, it says if a TCP Client has sent a SYN requesting
AccECN feedback with (AE,CWR,ECE) = (1,1,1) then receives a SYN/ACK with
the currently reserved combination (AE,CWR,ECE) = (1,0,1) but it does not
have logic specific to such a combination, the Client MUST enable AccECN
mode as if the SYN/ACK onfirmed that the Server supported AccECN and as
if it fed back that the IP-ECN field on the SYN had arrived unchanged.
Fixes: 3cae34274c ("tcp: accecn: AccECN negotiation").
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260131222515.8485-7-chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When AccECN is not successfully negociated for a TCP flow, it defaults
fallback to classic ECN (RFC3168). However, L4S service will fallback
to non-ECN.
This patch enables congestion control module to control whether it
should not fallback to classic ECN after unsuccessful AccECN negotiation.
A new CA module flag (TCP_CONG_NO_FALLBACK_RFC3168) identifies this
behavior expected by the CA.
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260131222515.8485-6-chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Two flags for congestion control (CC) module are added in this patch
related to AccECN negotiation. First, a new flag (TCP_CONG_NEEDS_ACCECN)
defines that the CC expects to negotiate AccECN functionality using the
ECE, CWR and AE flags in the TCP header.
Second, during ECN negotiation, ECT(0) in the IP header is used. This
patch enables CC to control whether ECT(0) or ECT(1) should be used on
a per-segment basis. A new flag (TCP_CONG_ECT_1_NEGOTIATION) defines the
expected ECT value in the IP header by the CA when not-yet initialized
for the connection.
The detailed AccECN negotiaotn can be found in IETF RFC9768.
Co-developed-by: Olivier Tilmans <olivier.tilmans@nokia.com>
Signed-off-by: Olivier Tilmans <olivier.tilmans@nokia.com>
Signed-off-by: Ilpo Järvinen <ij@kernel.org>
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260131222515.8485-5-chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently, GRO does not flush packets when the CWR bit is set.
A corresponding self-test is being added, in which the CWR flag
is set for two consecutive packets, but the first packet with the
CWR flag set will not be flushed immediately.
+===================+==========+===============+===========+
| Packet id | CWR flag | Payload | Flushing? |
+===================+==========+===============+===========+
| 0 | 0 | PAYLOAD_LEN | 0 |
| ... | 0 | PAYLOAD_LEN | 1 |
+-------------------+----------+---------------+-----------+
| NUM_PACKETS/2 - 1 | 1 | payload_len | 0 |
| NUM_PACKETS/2 | 1 | payload_len | 1 |
+-------------------+----------+---------------+-----------+
| ... | 0 | PAYLOAD_LEN | 0 |
| NUM_PACKETS | 0 | PAYLOAD_LEN | 1 |
+===================+==========+===============+===========+
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260131222515.8485-4-chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add newly acked pkts EWMA. When ACK thinning occurs, select
between safer and unsafe cep delta in AccECN processing based
on it. If the packets ACKed per ACK tends to be large, don't
conservatively assume ACE field overflow.
This patch uses the existing 2-byte holes in the rx group for new
u16 variables withtout creating more holes. Below are the pahole
outcomes before and after this patch:
[BEFORE THIS PATCH]
struct tcp_sock {
[...]
u32 delivered_ecn_bytes[3]; /* 2744 12 */
/* XXX 4 bytes hole, try to pack */
[...]
__cacheline_group_end__tcp_sock_write_rx[0]; /* 2816 0 */
[...]
/* size: 3264, cachelines: 51, members: 177 */
}
[AFTER THIS PATCH]
struct tcp_sock {
[...]
u32 delivered_ecn_bytes[3]; /* 2744 12 */
u16 pkts_acked_ewma; /* 2756 2 */
/* XXX 2 bytes hole, try to pack */
[...]
__cacheline_group_end__tcp_sock_write_rx[0]; /* 2816 0 */
[...]
/* size: 3264, cachelines: 51, members: 178 */
}
Signed-off-by: Ilpo Järvinen <ij@kernel.org>
Co-developed-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260131222515.8485-2-chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Heiner Kallweit says:
====================
net: phy: remove modalias-based MDIO device bus matching
modalias-based MDIO device bus matching has only one user (dsa-loop),
where we can replace modalias-based matching with a simple custom
match function. This, and first patch of the series, lay the foundation
for removing modalias-based matching.
====================
Link: https://patch.msgid.link/d9543e7d-23e1-4dba-a6b3-35dcd6a35dec@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Primary reason for this change is to remove the misuse of MDIO_NAME_SIZE
here, so that this constant can be removed in a follow-up patch.
Use case here is simply a chip name w/o any relationship to a MDIO
device. Also there's no need to reserve a longer char array, so make
the name a pointer.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/61bc14fa-eed3-43b6-ae40-b98063e81578@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
devlink_rate_nodes_check() was used to verify there are no devlink rate
nodes created when switching the esw mode.
Rate management code is about to become more complex, so refactor this
function:
- remove unused param 'mode'.
- add a new 'rate_filter' param.
- rename to devlink_rates_check().
- expose devlink_rate_is_node() to be used as a rate filter.
This makes it more usable from multiple places, so use it from those
places as well.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260128112544.1661250-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit [1] defined the locking expectations for nested devlink
instances: the nested-in devlink instance lock needs to be acquired
before the nested devlink instance lock. The code handling devlink rels
was architected with that assumption in mind.
There are no actual users of double locking yet but that is about to
change in the upcoming patches in the series.
Code operating on nested devlink instances will require also obtaining
the nested-in instance lock, but such code may already be called from a
variety of places with the nested devlink instance lock. Then, there's
no way to acquire the nested-in lock other than making sure that all
callers acquire it first.
Reversing the nested lock order allows incrementally acquiring the
nested-in instance lock when needed (perhaps even a chain of locks up to
the root) without affecting any caller.
The only affected use of nesting is devlink_nl_nested_fill(), which
iterates over nested devlink instances with the RCU lock, without
locking them, so there's no possibility of deadlock.
So this commit just updates a comment regarding the nested locks.
[1] commit c137743bce ("devlink: introduce object and nested devlink
relationship infra")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260128112544.1661250-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King says:
====================
net: stmmac: pcs preparation
These three patches prepare for the PCS changes, which, subject
to Qualcomm testing, should be coming in the next cycle.
====================
Link: https://patch.msgid.link/aXyRlFw7ZuhRPiKo@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The dwmac core has no support for SGMII without using its integrated
PCS. Thus, PHY_INTF_SEL_SGMII is only supported when this block is
present, and it makes no sense for stmmac_get_phy_intf_sel() to decode
this.
None of the platform glue users that use stmmac_get_phy_intf_sel()
directly accept PHY_INTF_SEL_SGMII as a valid mode.
Check whether a PCS will be used by the driver for the interface mode,
and if it is the integrated PCS, query the integrated PCS for the
phy_intf_sel_i value to use.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Link: https://patch.msgid.link/E1vlmOa-00000006zvB-1fIe@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move most of the PCS register offset definitions to stmmac_pcs.c.
Since stmmac_pcs.c only ever passes zero into the register offset
macros, remove that ability, making them simple constant integer
definitions.
Add appropriate descriptions of the registers, pointing out their
similarity with their IEEE 802.3 counterparts. Make use of the
BMSR definitions for the GMAC_AN_STATUS register and remove the
driver private versions.
Note that BMSR_LSTATUS is non-low-latching, unlike it's 802.3z
counterpart.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Link: https://patch.msgid.link/E1vlmOV-00000006zv5-1CwO@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Biju Das says:
====================
Add support for Renesas RZ/G3L GBETH
From: Biju Das <biju.das.jz@bp.renesas.com>
The Renesas RZ/G3L GBETH IP uses Synopsys DesignWare MAC version 5.30
compared to other Renesas SoC such as RZ/V2H that use MAC version 5.20.
The RZ/G3L GBETH requires an extra clock compared to RZ/G3E and has pps
interrupts. Document the Renesas RZ/G3L GBETH IP in bindings and add
support for the RZ/G3L GBETH in dwmac-renesas-gbeth glue driver.
====================
Link: https://patch.msgid.link/20260131161250.5047-1-biju.das.jz@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add device tree binding support for the Gigabit Ethernet (GBETH) IP on
Renesas RZ/G3L SoC. This SoC uses different Synopsys DesignWare MAC
version 5.30 compared to RZ/G3E.
RZ/G3L requires an extra clock compared to RZ/G3E and has pps interrupts.
Add a new compatible string "renesas,r9a08g046-gbeth" for RZ/G3L SoC and
update the schema to handle hardware differences between SoC variants.
Extend the base snps,dwmac.yaml schema to accommodate the PPS interrupts.
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://patch.msgid.link/20260131161250.5047-2-biju.das.jz@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>