linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 12:31:52 -04:00

Author	SHA1	Message	Date
Russell King (Oracle)	0835bc7251	net: stmmac: move initialisation of dma_cfg->atds Move the initialisation of priv->plat->dma_cfg->atds, which indicates that 8 32-bit word descriptors are being used for pre-v4.0 cores, after the call to stmmac_hwif_init(), which will initialise priv->extend_desc and priv->mode (the descriptor mode.) We don't need to re-evaluate this in stmmac_init_dma_engine() - as the state that it depends on only changes in stmmac_hwif_init() which is only called in the probe path. Also, once set, no code clears this flag. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXt-0000000Avnc-0UYC@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:06 -08:00
Russell King (Oracle)	1558705afb	net: stmmac: make dma_cfg mixed/fixed burst boolean struct stmmac_dma_cfg mixed_burst/fixed_burst members are both boolean in nature - of_property_read_bool() are used to read these from DT, and they are only tested for non-zero values. Use bool to avoid unnecessary padding in this structure. Update dwmac-intel to initialise these using true rather than '1', and remove the '0' initialisers as the struct is already zero initialised on allocation. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXn-0000000AvnX-4A1u@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:06 -08:00
Russell King (Oracle)	a2a3832ad7	net: stmmac: make chain_mode a boolean priv->chain_mode is only tested for non-zero, so it can be a boolean. Change its type to boolean, and add a comment describing this member. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXi-0000000AvnR-3btC@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:06 -08:00
Russell King (Oracle)	ecb037f58d	net: stmmac: make extend_desc boolean extend_desc is a boolean, so make it so, and use "true" to assign it. Add a comment to describe what this member does. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXd-0000000AvnL-36K3@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:06 -08:00
Russell King (Oracle)	70bafb53b3	net: stmmac: remove mac->xlgmac mac->xlgmac is only ever written to by the dwxlgmac2_quirk() function. Remove mac->xlgmac, and the quirk function that then becomes redundant. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXY-0000000AvnF-2ccv@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:06 -08:00
Russell King (Oracle)	0e7cb34d0f	net: stmmac: remove dwmac410_(enable|disable)_dma_irq As a result of the previous cleanup, it is now obvious that there are no differences between the dwmac4 and dwmac410 versions of the DMA interrupt enable/disable functions. Moreover, dwmac410_disable_dma_irq() is completely unused; instead, dwmac4_disable_dma_irq() is used to disable the interrupts for v4.10a cores while dwmac410_enable_dma_irq() was being used to enable these same interrupts. Remove the unnecessary v4.10a functions. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXT-0000000Avn9-29US@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	d192529123	net: stmmac: remove dwmac4 DMA_CHAN_INTR_DEFAULT_[TR]X* Remove the DMA_CHAN_INTR_DEFAULT_[TR]X* definitions, which are aliases of their respective DMA_CHAN_INTR_ENA_[TR]IE definitions. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXO-0000000Avn3-1hhD@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	19f2d59c3c	net: stmmac: remove .get_tx_len() No code calls stmmac_get_tx_len(). Remove this macro, its associated function pointer, and all implementations. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXJ-0000000Avmx-1B8Y@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	1fe444bdc5	net: stmmac: remove .get_tx_ls() No code calls stmmac_get_tx_ls(). Remove this macro, its associated function pointer, and all implementations. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXE-0000000Avmr-0eB0@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	d48ba98bbc	net: stmmac: remove .get_tx_owner() No code calls stmmac_get_tx_owner(). Remove the macro, its associated function pointer, and all implementations. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuX9-0000000Avml-08Lo@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	44a2ec96d3	net: stmmac: remove plat_dat->port_node There are repeated instances of: fwnode = priv->plat->port_node; if (!fwnode) fwnode = dev_fwnode(priv->device); However, the only place that ->port_node is set is stmmac_probe_config_dt(): struct device_node *np = pdev->dev.of_node; ... /* PHYLINK automatically parses the phy-handle property */ plat->port_node = of_fwnode_handle(np); which is equivalent to dev_fwnode(&pdev->dev) and, as priv->device will be &pdev->dev, is also equivalent to dev_fwnode(priv->device). Thus, plat_dat->port_node doesn't provide any extra benefit over using dev_fwnode(priv->device) directly. There is one case where port_node is used directly, which can be found in stmmac_pcs_setup(). This may cause a change of behaviour as PCI drivers do not populate plat_dat->port_node, but dev_fwnode(priv->device) may be valid. PCI-based stmmac should be tested. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuX3-0000000Avme-3oej@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	940ec40dd2	net: stmmac: clean up formatting in stmmac_mac_finish() Wrap the arguments for priv->plat->mac_finish() to avoid an overly long line. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuWy-0000000AvmY-3GWN@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:04 -08:00
Jakub Kicinski	df548e627b	Merge branch 'selftests-drv-net-iou-zcrx-improve-stability-and-make-the-large-chunk-test-work' Jakub Kicinski says: ==================== selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work The iou-zcrx test hasn't been passing in NIPA, I assumed it's because we're missing iouring changes, but it's still failing after the merge window. Turns out there was a bug in the implementation which was fixed separately via the iouring tree. With that out of the way the tests are passing but flaky. Patch 1 deals with the flakiness. While looking at this I also noticed that the large chunk test isn't running at all. So fix and enable it (patches 2 and 3). ==================== Link: https://patch.msgid.link/20260227171305.2848240-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:21:20 -08:00
Jakub Kicinski	c7b228418e	selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test The large chunks test needs 2MB hugepages for its mmap allocation, but the test system may not have any pre-allocated. Ensure at least 64 hugepages are available before running the test, and restore the original value on cleanup. While at it strip the stdout, it has a trailing new line. Before: ok 5 iou-zcrx.test_zcrx_large_chunks # SKIP Can't allocate huge pages Link: https://patch.msgid.link/20260227171305.2848240-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:21:16 -08:00
Jakub Kicinski	67792dde27	selftests: drv-net: iou-zcrx: rework large chunks test to use common setup Commit `a32bb32d01` ("selftests: iou-zcrx: test large chunk sizes") and commit `de7c600e2d` ("selftests/net: parametrise iou-zcrx.py with ksft_variants") landed at a similar time. The large chunks test was actually not included in the list of tests, so it never ran. We haven't noticed that it uses the old-style helpers (_get_combined_channels, _get_current_settings, _set_flow_rule) that were removed by the other commit. Rework test_zcrx_large_chunks to reuse the single() setup function and add it to the ksft_run cases list so it actually gets executed. Link: https://patch.msgid.link/20260227171305.2848240-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:21:15 -08:00
Jakub Kicinski	27c4ab9438	selftests: drv-net: iou-zcrx: wait for memory provider cleanup io_uring defers zcrx context teardown to the iou_exit workqueue. # ps aux | grep iou ... 07:58 0:00 [kworker/u19:0-iou_exit] ... 07:58 0:00 [kworker/u18:2-iou_exit] When the test's receiver process exits, bkg() returns but the memory provider may still be attached to the rx queue. The subsequent defer() that restores tcp-data-split then fails: # Exception while handling defer / cleanup (callback 3 of 3)! # Defer Exception| net.ynl.pyynl.lib.ynl.NlError: Netlink error: can't disable tcp-data-split while device has memory provider enabled: Invalid argument not ok 1 iou-zcrx.test_zcrx.single Add a helper that polls netdev queue-get until no rx queue reports the io-uring memory provider attribute. Register it as a defer() just before tcp-data-split is restored as a "barrier". Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Link: https://patch.msgid.link/20260227171305.2848240-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:21:15 -08:00
Eric Dumazet	8341c989ac	net: remove addr_len argument of recvmsg() handlers Use msg->msg_namelen as a place holder instead of a temporary variable, notably in inet[6]_recvmsg(). This removes stack canaries and allows tail-calls. $ scripts/bloat-o-meter -t vmlinux.old vmlinux add/remove: 0/0 grow/shrink: 2/19 up/down: 26/-532 (-506) Function old new delta rawv6_recvmsg 744 767 +23 vsock_dgram_recvmsg 55 58 +3 vsock_connectible_recvmsg 50 47 -3 unix_stream_recvmsg 161 158 -3 unix_seqpacket_recvmsg 62 59 -3 unix_dgram_recvmsg 42 39 -3 tcp_recvmsg 546 543 -3 mptcp_recvmsg 1568 1565 -3 ping_recvmsg 806 800 -6 tcp_bpf_recvmsg_parser 983 974 -9 ip_recv_error 588 576 -12 ipv6_recv_rxpmtu 442 428 -14 udp_recvmsg 1243 1224 -19 ipv6_recv_error 1046 1024 -22 udpv6_recvmsg 1487 1461 -26 raw_recvmsg 465 437 -28 udp_bpf_recvmsg 1027 984 -43 sock_common_recvmsg 103 27 -76 inet_recvmsg 257 175 -82 inet6_recvmsg 257 175 -82 tcp_bpf_recvmsg 663 568 -95 Total: Before=25143834, After=25143328, chg -0.00% Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260227151120.1346573-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:17:17 -08:00
Jakub Kicinski	f5ada26d6c	Merge tag 'phy-qcom-sgmii-eth-add-set_mode-and-validate-methods' net: stmmac: qcom-ethqos: further serdes reorganisation [part] First PHY patch of Russell's series. Vladimir will need this to avoid a conflict with his work. Link: https://patch.msgid.link/aaDSJAc-x2-klvHJ@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 15:46:10 -08:00
Russell King (Oracle)	4ff5801f45	phy: qcom-sgmii-eth: add .set_mode() and .validate() methods qcom-sgmii-eth is an Ethernet SerDes supporting only Ethernet mode using SGMII, 1000BASE-X and 2500BASE-X. Add an implementation of the .set_mode() method, which can be used instead of or as well as the .set_speed() method. The Ethernet interface modes mentioned above all have a fixed data rate, so setting the mode is sufficient to fully specify the operating parameters. Add an implementation of the .validate() method, which will be necessary to allow discovery of the SerDes capabilities for platform independent SerDes support in the stmmac network driver. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Acked-by: Vinod Koul <vkoul@kernel.org> Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvkU3-0000000AuP2-0hu3@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 15:41:38 -08:00
Jakub Kicinski	01857fc712	Merge branch 'net-sched-refactor-qdisc-drop-reasons-into-dedicated-tracepoint' Jesper Dangaard Brouer says: ==================== net: sched: refactor qdisc drop reasons into dedicated tracepoint This series refactors qdisc drop reason handling by introducing a dedicated enum qdisc_drop_reason and trace_qdisc_drop tracepoint, providing qdisc layer drop diagnostics with direct qdisc context visibility. Background: ----------- Identifying which qdisc dropped a packet via skb_drop_reason is difficult. Normally, the kfree_skb tracepoint caller "location" hints at the dropping code, but qdisc drops happen at a central point (__dev_queue_xmit), making this unusable. As a workaround, commits `5765c7f6e3` ("net_sched: sch_fq: add three drop_reason") and `a42d71e322` ("net_sched: sch_cake: Add drop reasons") encoded qdisc names directly in the drop reason enums. This series provides a cleaner solution by creating a dedicated qdisc tracepoint that naturally includes qdisc context (handle, parent, kind). Solution: --------- Create a new tracepoint trace_qdisc_drop that builds on top of existing trace_qdisc_enqueue infrastructure. It includes qdisc handle, parent, qdisc kind (name), and device information directly. The existing SKB_DROP_REASON_QDISC_DROP is retained for backwards compatibility via kfree_skb_reason(). The qdisc-specific drop reasons (QDISC_DROP_*) provide fine-grained detail via the new tracepoint. The enum uses subsystem encoding (offset by SKB_DROP_REASON_SUBSYS_QDISC) to catch type mismatches during debugging. This implements the alternative approach described in: https://lore.kernel.org/all/6be17a08-f8aa-4f91-9bd0-d9e1f0a92d90@kernel.org/ ==================== Link: https://patch.msgid.link/177211325634.3011628.9343837509740374154.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:38 -08:00
Jesper Dangaard Brouer	67713dff63	net: sched: sch_dualpi2: use qdisc_dequeue_drop() for dequeue drops DualPI2 drops packets during dequeue but was using kfree_skb_reason() directly, bypassing trace_qdisc_drop. Convert to qdisc_dequeue_drop() and add QDISC_DROP_L4S_STEP_NON_ECN to the qdisc drop reason enum. - Set TCQ_F_DEQUEUE_DROPS flag in dualpi2_init() - Use enum qdisc_drop_reason in drop_and_retry() - Replace kfree_skb_reason() with qdisc_dequeue_drop() Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/177211351978.3011628.11267023360997620069.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:35 -08:00
Jesper Dangaard Brouer	9d3e7f9718	net: sched: rename QDISC_DROP_CAKE_FLOOD to QDISC_DROP_FLOOD_PROTECTION Rename QDISC_DROP_CAKE_FLOOD to QDISC_DROP_FLOOD_PROTECTION to use a generic name without embedding the qdisc name. This follows the principle that drop reasons should describe the drop mechanism rather than being tied to a specific qdisc implementation. The flood protection drop reason is used by qdiscs implementing probabilistic drop algorithms (like BLUE) that detect unresponsive flows indicating potential DoS or flood attacks. CAKE uses this via its Cobalt AQM component. Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/177211347537.3011628.13759059534638729639.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:35 -08:00
Jesper Dangaard Brouer	f30d9073ec	net: sched: rename QDISC_DROP_FQ_* to generic names Rename FQ-specific drop reasons to generic names: - QDISC_DROP_FQ_BAND_LIMIT -> QDISC_DROP_BAND_LIMIT - QDISC_DROP_FQ_HORIZON_LIMIT -> QDISC_DROP_HORIZON_LIMIT This follows the principle that drop reasons should describe the drop mechanism rather than being tied to a specific qdisc implementation. These concepts (priority band limits, timestamp horizon) could apply to other qdiscs as well. Remove the local macro define FQDR() and instead use the full QDISC_DROP_* name to make it easier to navigate code. Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/177211346902.3011628.12523261489552097455.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:35 -08:00
Jesper Dangaard Brouer	3e28f8ad47	net: sched: sfq: convert to qdisc drop reasons Convert SFQ to use the new qdisc-specific drop reason infrastructure. This patch demonstrates how to convert a flow-based qdisc to use the new enum qdisc_drop_reason. As part of this conversion: - Add QDISC_DROP_MAXFLOWS for flow table exhaustion - Rename FQ_FLOW_LIMIT to generic FLOW_LIMIT, now shared by FQ and SFQ - Use QDISC_DROP_OVERLIMIT for sfq_drop() when overall limit exceeded - Use QDISC_DROP_FLOW_LIMIT for per-flow depth limit exceeded The FLOW_LIMIT reason is now a common drop reason for per-flow limits, applicable to both FQ and SFQ qdiscs. Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/177211345946.3011628.12770616071857185664.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:34 -08:00
Jesper Dangaard Brouer	ff2998f29f	net: sched: introduce qdisc-specific drop reason tracing Create new enum qdisc_drop_reason and trace_qdisc_drop tracepoint for qdisc layer drop diagnostics with direct qdisc context visibility. The new tracepoint includes qdisc handle, parent, kind (name), and device information. Existing SKB_DROP_REASON_QDISC_DROP is retained for backwards compatibility via kfree_skb_reason(). Convert qdiscs with drop reasons to use the new infrastructure. Change CAKE's cobalt_should_drop() return type from enum skb_drop_reason to enum qdisc_drop_reason to fix implicit enum conversion warnings. Use QDISC_DROP_UNSPEC as the 'not dropped' sentinel instead of SKB_NOT_DROPPED_YET. Both have the same compiled value (0), so the comparison logic remains semantically equivalent. Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/177211345275.3011628.1974310302645218067.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:34 -08:00
Jakub Kicinski	52d534aa66	Merge branch 'icmp-fix-icmp-error-source-address-over-xfrm-tunnel' Antony Antony says: ==================== icmp: Fix icmp error source address over xfrm tunnel This fix, originally sent to XFRM/IPsec, has been recommended by Steffen Klassert to submit to the net tree, since it changes ICMP behavior. The patch addresses a minor issue related to the IPv4 source address of ICMP error messages. The bug only occurs when xfrm policies are configured. It originated from an old 2011 commit: commit `415b3334a2` ("icmp: Fix regression in nexthop resolution during replies.") ==================== Link: https://patch.msgid.link/cover.1772101380.git.antony.antony@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:08:20 -08:00
Antony Antony	5b43d35e57	selftests: net: add ICMP error source address test over xfrm tunnel Test that ICMP error messages generated by an IPsec gateway use the correct source address (the gateway's address, not the unreachable destination). Signed-off-by: Antony Antony <antony.antony@secunet.com> Link: https://patch.msgid.link/79d526f96cf2252d71550d38772876bc72c7e3c7.1772101380.git.antony.antony@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:08:15 -08:00
Antony Antony	595da751c8	icmp: fix ICMP error source address when xfrm policy matches When an IPsec gateway generates an ICMP error (e.g., Destination Host Unreachable), the source address incorrectly shows the unreachable destination instead of the gateway's address. IPv6 behaves correctly. Before fix: ping 10.1.6.3 From 10.1.6.3 icmp_seq=1 Destination Host Unreachable (wrong - 10.1.6.3 is the unreachable host) After fix: ping 10.1.6.3 From 10.1.5.2 icmp_seq=1 Destination Host Unreachable (correct - 10.1.5.2 is the gateway) The fix removes the memcpy that overwrote fl4 with fl4_dec after xfrm_lookup(). A follow-up commit adds a selftest. Fixes: `415b3334a2` ("icmp: Fix regression in nexthop resolution during replies.") Cc: stable+noautosel@kernel.org # Avoid false positives in tests Signed-off-by: Antony Antony <antony.antony@secunet.com> Acked-by: Tobias Brunner <tobias@strongswan.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/19a0156ff6e76baa323a81d710510d399a6ff63a.1772101380.git.antony.antony@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:08:15 -08:00
Jakub Kicinski	d578b47293	Merge branch 'npc-hw-block-support-for-cn20k' Ratheesh Kannoth says: ==================== NPC HW block support for cn20k This patchset adds comprehensive support for the CN20K NPC architecture. CN20K introduces significant changes in MCAM layout, parser design, KPM/KPU mapping, index management, virtual index handling, and dynamic rule installation. The patches update the AF, PF/VF, and common layers to correctly support these new capabilities while preserving compatibility with previous silicon variants. MCAM on CN20K differs from older designs: the hardware now contains two vertical banks of depth 8192, and thirty-two horizontal subbanks of depth 256. Each subbank can be configured as x2 or x4, enabling 256-bit or 512-bit key storage. Several allocation models are added to support this layout, including contiguous and non-contiguous allocation with or without reference ranges and priorities. Parser and extraction logic are also enhanced. CN20K introduces a new profile model where up to twenty-four extractors may be configured for each parsing profile. A new KPM profile scheme is added, grouping sixteen KPUs into eight KPM profiles, each formed by two KPUs. Support is added for default index allocation for CN20K-specific MCAM entry structures, virtual index allocation, improved defragmentation, and TC rule installation by allowing the AF driver to determine required x2/x4 rule width during flow install. ==================== Link: https://patch.msgid.link/20260224080009.4147301-1-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:30:27 -08:00
Ratheesh Kannoth	2e8aeb7ff0	octeontx2-af: npc: Use common structures CN20K and legacy silicon differ in the size of key words used in NPC MCAM. However, SoC-specific structures are not required for low-level functions. Remove the SoC-specific structures and rename the macros to improve readability. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-14-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:27 -08:00
Ratheesh Kannoth	528530dff5	octeontx2-af: npc: cn20k: add debugfs support CN20K silicon divides the NPC MCAM into banks and subbanks, with each subbank configurable for x2 or x4 key widths. This patch adds debugfs entries to expose subbank usage details and their configured key type. A debugfs entry is also added to display the default MCAM indexes allocated for each pcifunc. Additionally, debugfs support is introduced to show the mapping between virtual indexes and real MCAM indexes, and vice versa. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-13-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:27 -08:00
Subbaraya Sundeep	0d12d26701	octeontx2-pf: cn20k: Add TC rules support Unlike previous silicons, MCAM entries required for TC rules in CN20K are allocated dynamically. The key size can also be dynamic, i.e., X2 or X4. Based on the size of the TC rule match criteria, the AF driver allocates an X2 or X4 rule. This patch implements the required changes for CN20K TC by requesting an MCAM entry from the AF driver on the fly when the user installs a rule. Based on the TC rule priority added or deleted by the user, the PF driver shifts MCAM entries accordingly. If there is a mix of X2 and X4 rules and the user tries to install a rule in the middle of existing rules, the PF driver detects this and rejects the rule since X2 and X4 rules cannot be shifted in hardware. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-12-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:27 -08:00
Ratheesh Kannoth	9000cada7a	octeontx2-af: npc: cn20k: Allocate MCAM entry for flow installation In CN20K, the PF/VF driver is unaware of the NPC MCAM entry type (x2/x4) required for a particular TC rule when the user installs rules through the TC command. This forces the PF/VF driver to first query the AF driver for the rule size, then allocate an entry, and finally install the flow. This sequence requires three mailbox request/response exchanges from the PF. To speed up the installation, the `install_flow` mailbox request message is extended with additional fields that allow the AF driver to determine the required NPC MCAM entry type, allocate the MCAM entry, and complete the flow installation in a single step. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-11-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:26 -08:00
Ratheesh Kannoth	645c6e3c19	octeontx2-af: npc: cn20k: virtual index support This patch adds support for virtual MCAM index allocation and improves CN20K MCAM defragmentation handling. A new field is introduced in the non-ref, non-contiguous MCAM allocation mailbox request to indicate that virtual indexes should be returned instead of physical ones. Virtual indexes allow the hardware to move mapped MCAM entries internally, enabling defragmentation and preventing scattered allocations across subbanks. The patch also enhances defragmentation by treating non-ref, non-contiguous allocations as ideal candidates for packing sparsely used regions, which can free up subbanks for potential x2 or x4 configuration. All such allocations are tracked and always returned as virtual indexes so they remain stable even when entries are moved during defrag. During defragmentation, MCAM entries may shift between subbanks, but their virtual indexes remain unchanged. Additionally, this update fixes an issue where entry statistics were not being restored correctly after defragmentation. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-10-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:26 -08:00
Suman Ghosh	4e527f1e5c	octeontx2-af: npc: cn20k: Add new mailboxes for CN20K silicon To enable enhanced MCAM capabilities for CN20K, the struct mcam_entry has been extended to support expanded keyword requirements. Specifically, the kw and kw_mask arrays have been increased from a size of 7 to 8 to accommodate the additional keyword field introduced for CN20K. To ensure seamless integration while preserving compatibility with existing platforms, dedicated CN20K-specific mailboxes have been introduced that leverage the updated struct mcam_entry. This approach allows CN20K to utilize the extended structure without impacting current implementations. This patch identifies the relevant mailboxes and introduces the following CN20K-specific additions: New mailboxes added: 1. `NPC_CN20K_MCAM_WRITE_ENTRY` 2. `NPC_CN20K_MCAM_ALLOC_AND_WRITE_ENTRY` 3. `NPC_CN20K_MCAM_READ_ENTRY` 4. `NPC_CN20K_MCAM_READ_BASE_RULE` Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-9-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:26 -08:00
Ratheesh Kannoth	de3f88b465	octeontx2-af: npc: cn20k: Prepare for new SoC The current code passes the mcam_entry structure to all low-level functions. This is not ideal: 1) all functions need to be modified to support a new SoC; 2) passing a SoC-specific structure to all common functions is poor design. This patch adds an MCAM metadata structure, which is populated and passed to low-level functions. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-8-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:26 -08:00
Suman Ghosh	6d1e70282f	octeontx2-af: npc: cn20k: Use common APIs In cn20k silicon, the register definitions and the algorithms used to read, write, copy, and enable MCAM entries have changed. This patch updates the common APIs to support both cn20k and previous silicon variants. Additionally, cn20k introduces a new algorithm for MCAM index management. The common APIs are updated to invoke the cn20k-specific index management routines for allocating, freeing, and retrieving default MCAM entries. Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-7-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:26 -08:00
Ratheesh Kannoth	09d3b7a140	octeontx2-af: npc: cn20k: Allocate default MCAM indexes Reserving MCAM entries in the AF driver for installing default MCAM entries is not an efficient allocation method, as it results in significant wastage of entries. This patch allocates MCAM indexes for promiscuous, multicast, broadcast, and unicast traffic in descending order of indexes (from lower to higher priority) when the NIX LF is attached to the PF/VF. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-6-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:26 -08:00
Suman Ghosh	ef992a0f12	octeontx2-af: npc: cn20k: MKEX profile support In the new silicon variant cn20k, a new parser profile is introduced. Instead of having two layer-data information per key field type, a new key extractor concept is introduced. As part of this change, a maximum of 24 extractors can now be configured per packet parsing profile. For example, LA type (ether) can have 24 unique parsing keys, and LC type (ip) and LD type (tcp/udp) can each have 24 unique parsing keys as well. Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-5-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:25 -08:00
Suman Ghosh	a2df2f95ea	octeontx2-af: npc: cn20k: Add default profile Add the default mkex profile for cn20k silicon. This commit changes the attribute of objects to may_be_unused to avoid a compiler warning. Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-4-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:25 -08:00
Suman Ghosh	5868682b68	octeontx2-af: npc: cn20k: KPM profile changes KPU (Kangaroo Processing Unit) profiles are primarily used to set the required packet pointers that will be used in later stages for key generation. In the new CN20K silicon variant, a new KPM profile is introduced alongside the existing KPU profiles. In CN20K, a total of 16 KPUs are grouped into 8 KPM profiles. As per the current hardware design, each KPM configuration contains a combination of 2 KPUs: KPM0 = KPU0 + KPU8 KPM1 = KPU1 + KPU9 ... KPM7 = KPU7 + KPU15 This configuration enables more efficient use of KPU resources. This patch adds support for the new KPM profile configuration. Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-3-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:25 -08:00
Ratheesh Kannoth	1396771b0b	octeontx2-af: npc: cn20k: Index management In CN20K silicon, the MCAM is divided vertically into two banks. Each bank has a depth of 8192. The MCAM is divided horizontally into 32 subbanks, with each subbank having a depth of 256. Each subbank can accommodate either x2 keys or x4 keys. x2 keys are 256 bits in size, and x4 keys are 512 bits in size. [Diagram: Bank1 and Bank0 side by side, stacked from subbank 31 (top, depth 256) through subbank 30 down to subbank 0 (bottom).] This patch implements the following allocation schemes in NPC. The allocation API accepts reference (ref), limit, contig, priority, and count values. For example, specifying ref=100, limit=200, contig=1, priority=LOW, and count=20 will allocate 20 contiguous MCAM entries between entries 100 and 200. 1. Contiguous allocation with ref, limit, and priority. 2. Non-contiguous allocation with ref, limit, and priority. 3. Non-contiguous allocation without ref. 4. Contiguous allocation without ref. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260224080009.4147301-2-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:29:25 -08:00
Thorsten Blum	ded4a02e7d	ipv6: sit: Replace deprecated strcpy with strscpy strcpy() has been deprecated [1] because it performs no bounds checking on the destination buffer, which can lead to buffer overflows. Replace it with the safer strscpy(). Use the two-argument version of strscpy() to copy 'parms->name' in ipip6_tunnel_locate(). Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy [1] Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://patch.msgid.link/20260227004541.798966-3-thorsten.blum@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 10:06:29 -08:00
Jakub Kicinski	eed562b2a6	Merge branch 'gve-support-larger-ring-sizes-in-dqo-qpl-mode' Max Yuan says: ==================== gve: Support larger ring sizes in DQO-QPL mode This patch series updates the gve driver to improve Queue Page List (QPL) management and enable support for larger ring sizes when using the DQO-QPL queue format. Previously, the driver used hardcoded multipliers to determine the number of pages to register for QPLs (e.g., 2x ring size for RX). This rigid approach made it difficult to support larger ring sizes without potentially exceeding the "max_registered_pages" limit reported by the device. The first patch introduces a unified and flexible logic for calculating QPL page requirements. It balances TX and RX page allocations based on the configured ring sizes and scales the total count down proportionally if it would otherwise exceed the device's global registration limit. The second patch leverages this new flexibility to stop ignoring the maximum ring size supported by the device in DQO-QPL mode. Users can now configure ring sizes up to the device-reported maximum, as the driver will automatically adjust the QPL size to stay within allowed memory bounds. ==================== Link: https://patch.msgid.link/20260225182342.1049816-1-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 08:58:41 -08:00
Matt Olson	a2f1918401	gve: Enable reading max ring size from the device in DQO-QPL mode The gVNIC device indicates a device option (MODIFY_RING) to the driver, which presents a range of ring sizes from which the user is allowed to select. But in DQO-QPL queue format, the driver ignores the "max" of this range and instead allows the user to configure the ring size in the range [min, default]. This was done because increasing the ring size could result in the number of registered pages being higher than the max allowed by the device. In order to support large ring sizes, stop ignoring the "max" of the range presented in the MODIFY_RING option. Signed-off-by: Matt Olson <maolson@google.com> Signed-off-by: Max Yuan <maxyuan@google.com> Reviewed-by: Jordan Rhee <jordanrhee@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Link: https://patch.msgid.link/20260225182342.1049816-3-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 08:58:29 -08:00
Matt Olson	07993df560	gve: Update QPL page registration logic For DQO, change QPL page registration logic to be more flexible to honor the "max_registered_pages" parameter from the gVNIC device. Previously the number of RX pages per QPL was hardcoded to twice the ring size, and the number of TX pages per QPL was dictated by the device in the DQO-QPL device option. Now [in DQO-QPL mode], the driver will ignore the "tx_pages_per_qpl" parameter indicated in the DQO-QPL device option and instead allocate up to (tx_queue_length / 2) pages per TX QPL and up to (rx_queue_length * 2) pages per RX QPL while keeping the total number of pages under the "max_registered_pages". Merge DQO and GQI QPL page calculation logic into a unified gve_update_num_qpl_pages function. Add rx_pages_per_qpl to the priv struct for consumption by both DQO and GQI. Signed-off-by: Matt Olson <maolson@google.com> Signed-off-by: Max Yuan <maxyuan@google.com> Reviewed-by: Jordan Rhee <jordanrhee@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Link: https://patch.msgid.link/20260225182342.1049816-2-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 08:58:29 -08:00
Thorsten Blum	a9a13c7379	keys, dns: Use kmalloc_flex to improve dns_resolver_preparse Use kmalloc_flex() when allocating a new 'struct user_key_payload' in dns_resolver_preparse() to replace the open-coded size arithmetic. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://patch.msgid.link/20260226214930.785423-3-thorsten.blum@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 08:48:21 -08:00
Jiayuan Chen	58e443b773	net: fix sock compilation error under CONFIG_PREEMPT_RT When CONFIG_PREEMPT_RT is enabled, __SPIN_LOCK_UNLOCKED() expands to a brace-enclosed initializer rather than a compound literal, which cannot be used in assignment expressions. This causes a build failure: net/core/sock.c:3787:29: error: expected expression before '{' token 3787 | tmp.slock = __SPIN_LOCK_UNLOCKED(tmp.slock); Use declaration-with-initializer instead of assignment, consistent with how __SPIN_LOCK_UNLOCKED() is used elsewhere in the kernel (e.g. DEFINE_SPINLOCK). Fixes: 5151ec54f5 ("net: use try_cmpxchg() in lock_sock_nested()") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228111319.79506-1-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 07:42:39 -08:00
Jakub Kicinski	1e08faf996	Merge branch 'net-ethernet-litex-minor-improvment-for-the-codebase' Inochi Amaoto says: ==================== net: ethernet: litex: minor improvment for the codebase Improve the litex code for using the device managed function to register netdev and replace all the "pdev->dev" with dev pointer instead. ==================== Link: https://patch.msgid.link/20260227003351.752934-1-inochiama@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-27 19:25:20 -08:00
Inochi Amaoto	621e3634df	net: ethernet: litex: use device pointer to simplify code. As there is already a device pointer in the probe function, replace all "&pdev->dev" pattern with this predefined device pointer. Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260227003351.752934-3-inochiama@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-27 19:25:16 -08:00

1427154 Commits