linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-14 10:02:33 -04:00

Author	SHA1	Message	Date
Lee Trager	c2b93d6bec	eth: fbnic: Create ring buffer for firmware logs When enabled, firmware may send log messages which are specific to the device and not the host. Create a ring buffer to store these messages which are read by a user through DebugFS. Buffer access is protected by a spinlock. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-4-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 17:05:46 -07:00
Lee Trager	e48f6620ee	eth: fbnic: Use FIELD_PREP to generate minimum firmware version Create a new macro based on FIELD_PREP to generate easily readable minimum firmware version ints. This macro will prevent the mistake from the previous patch from happening again. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-3-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 17:05:45 -07:00
Lee Trager	dd62e960a7	eth: fbnic: Fix incorrect minimum firmware version The full minimum version is 0.10.6-0. The six is now correctly defined as patch and shifted appropriately. 0.10.6-0 is a preproduction version of firmware which was released over a year and a half ago. All production devices meet this requirement. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-2-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 17:05:45 -07:00
Jakub Kicinski	80b0dd1c4e	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-07-08 The following pull-request contains common mlx5 updates for your net-next tree. v2: https://lore.kernel.org/1751574385-24672-1-git-send-email-tariqt@nvidia.com * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Check device memory pointer before usage net/mlx5: fs, fix RDMA TRANSPORT init cleanup flow net/mlx5: Add IFC bits for PCIe Congestion Event object net/mlx5: Small refactor for general object capabilities net/mlx5: fs, add multiple prios to RDMA TRANSPORT steering domain ==================== Link: https://patch.msgid.link/1752002102-11316-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 16:59:57 -07:00
Jakub Kicinski	0a49abff43	Merge branch 'net-migrate-remaining-drivers-to-dedicated-_rxfh_context-ops' Jakub Kicinski says: ==================== net: migrate remaining drivers to dedicated _rxfh_context ops Around a year ago Ed added dedicated ops for managing RSS contexts. This significantly improved the clarity of the driver facing API. Migrate the remaining 3 drivers and remove the old way of muxing the RSS context operations via .set_rxfh(). v2: https://lore.kernel.org/20250702030606.1776293-1-kuba@kernel.org v1: https://lore.kernel.org/20250630160953.1093267-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250707184115.2285277-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 11:56:42 -07:00
Jakub Kicinski	cd7e8841b6	net: ethtool: reduce indent for _rxfh_context ops Now that we don't have the compat code we can reduce the indent a little. No functional changes. Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20250707184115.2285277-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 11:56:40 -07:00
Jakub Kicinski	4e655028c2	net: ethtool: remove the compat code for _rxfh_context ops All drivers are now converted to dedicated _rxfh_context ops. Remove the use of ->set_rxfh() to manage additional contexts. Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20250707184115.2285277-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 11:56:40 -07:00
Jakub Kicinski	afc55a0659	eth: mlx5: migrate to the *_rxfh_context ops Convert mlx5 to dedicated RXFH ops. This is a fairly shallow conversion, TBH, most of the driver code stays as is, but we let the core allocate the context ID for the driver. mlx5e_rx_res_rss_get_rxfh() and friends are made void, since core only calls the driver for context 0. The second call is right after context creation so it must exist (tm). Tested with drivers/net/hw/rss_ctx.py on MCX6. Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250707184115.2285277-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 11:56:40 -07:00
Jakub Kicinski	be78c83a8b	eth: ice: drop the dead code related to rss_contexts ICE appears to have some odd form of rss_context use plumbed in for .get_rxfh. The .set_rxfh side does not support creating contexts, however, so this must be dead code. For at least a year now (since commit `7964e78846` ("net: ethtool: use the tracking array for get_rxfh on custom RSS contexts")) we have not been calling .get_rxfh with a non-zero rss_context. We just get the info from the RSS XArray under dev->ethtool. Remove what must be dead code in the driver, clear the support flags. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250707184115.2285277-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 11:56:40 -07:00
Jakub Kicinski	62e01d8c41	eth: otx2: migrate to the *_rxfh_context ops otx2 only supports additional indirection tables (no separate keys etc.) so the conversion to dedicated callbacks and core-allocated context is mostly removing the code which stores the extra tables in the driver. Core already stores the indirection tables for additional contexts, and doesn't call .get for them. One subtle change here is that we'll now start with the table covering all queues, not directing all traffic to queue 0. This is what core expects if the user doesn't pass the initial indir table explicitly (there's a WARN_ON() in the core trying to make sure driver authors don't forget to populate ctx to defaults). Drivers implementing .create_rxfh_context don't have to set cap_rss_ctx_supported, so remove it. Tested-by: Geetha Sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250707184115.2285277-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 11:56:40 -07:00
Xin Guo	19c066f940	tcp: update the outdated ref draft-ietf-tcpm-rack RACK-TLP has been published as standards-track RFC 8985, so the outdated reference to draft-ietf-tcpm-rack needs to be updated. Signed-off-by: Xin Guo <guoxin0309@gmail.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20250705163647.301231-1-guoxin0309@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 09:01:52 -07:00
Heiner Kallweit	c523058713	net: phy: declare package-related struct members only if CONFIG_PHY_PACKAGE is enabled Now that we have an own config symbol for the PHY package module, we can use it to reduce size of these structs if it isn't enabled. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/f0daefa4-406a-4a06-a4f0-7e31309f82bc@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:57:25 -07:00
Fengyuan Gong	a41851bea7	net: account for encap headers in qdisc pkt len Refine qdisc_pkt_len_init to include headers up through the inner transport header when computing header size for encapsulations. Also refine net/sched/sch_cake.c borrowed from qdisc_pkt_len_init(). Signed-off-by: Fengyuan Gong <gfengyuan@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250702160741.1204919-1-gfengyuan@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:55:33 -07:00
Jakub Kicinski	7725a35e74	Merge branch 'support-some-features-for-the-hibmcge-driver' Jijie Shao says: ==================== Support some features for the HIBMCGE driver v4: https://lore.kernel.org/20250701125446.720176-1-shaojijie@huawei.com v3: https://lore.kernel.org/20250626020613.637949-1-shaojijie@huawei.com v2: https://lore.kernel.org/20250623034129.838246-1-shaojijie@huawei.com v1: https://lore.kernel.org/20250619144423.2661528-1-shaojijie@huawei.com ==================== Link: https://patch.msgid.link/20250702125716.2875169-1-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:54:28 -07:00
Jijie Shao	401581f286	net: hibmcge: configure FIFO thresholds according to the MAC controller documentation Configure FIFO thresholds according to the MAC controller documentation Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702125716.2875169-4-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:54:26 -07:00
Jijie Shao	1051404bab	net: hibmcge: adjust the burst len configuration of the MAC controller to improve TX performance. Adjust the burst len configuration of the MAC controller to improve TX performance. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702125716.2875169-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:54:25 -07:00
Jijie Shao	1d7cd7a9c6	net: hibmcge: support scenario without PHY Currently, the driver uses phylib to operate PHY by default. On some boards, the PHY device is separated from the MAC device. As a result, the hibmcge driver cannot operate the PHY device. In this patch, the driver determines whether a PHY is available based on register configuration. If no PHY is available, the driver will use fixed_phy to register fake phydev. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20250702125716.2875169-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:54:25 -07:00
Faisal Bukhari	effdbb29fd	netlink: spelling: fix appened -> appended in a comment Fix spelling mistake in net/netlink/af_netlink.c appened -> appended Signed-off-by: Faisal Bukhari <faisalbukhari523@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250705030841.353424-1-faisalbukhari523@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:42:39 -07:00
Jakub Kicinski	301af832db	Merge branch 'net-remove-unused-function-parameters-in-skbuff-c' Michal Luczaj says: ==================== net: Remove unused function parameters in skbuff.c Couple of cleanup patches to get rid of unused function parameters around skbuff.c, plus little things spotted along the way. Offshoot of my question in [1], but way more contained. Found by adding "-Wunused-parameter -Wno-error" to KBUILD_CFLAGS and grepping for specific skbuff.c warnings. [1]: https://lore.kernel.org/netdev/972af569-0c90-4585-9e1f-f2266dab6ec6@rbox.co/ v2: https://lore.kernel.org/20250626-splice-drop-unused-v2-0-3268fac1af89@rbox.co v1: https://lore.kernel.org/20250624-splice-drop-unused-v1-0-cf641a676d04@rbox.co ==================== Link: https://patch.msgid.link/20250702-splice-drop-unused-v3-0-55f68b60d2b7@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:37:22 -07:00
Michal Luczaj	ab34e14258	net: skbuff: Drop unused @skb Since its introduction in commit `6fa01ccd88` ("skbuff: Add pskb_extract() helper function"), pskb_carve_frag_list() never used the argument @skb. Drop it and adapt the only caller. No functional change intended. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:37:22 -07:00
Michal Luczaj	ad0ac6cd9c	net: skbuff: Drop unused @skb Since its introduction in commit `ce098da149` ("skbuff: Introduce slab_build_skb()"), __slab_build_skb() never used the @skb argument. Remove it and adapt both callers. No functional change intended. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:37:22 -07:00
Michal Luczaj	25489a4f55	net: splice: Drop unused @gfp Since its introduction in commit `2e910b9532` ("net: Add a function to splice pages into an skbuff for MSG_SPLICE_PAGES"), skb_splice_from_iter() never used the @gfp argument. Remove it and adapt callers. No functional change intended. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://patch.msgid.link/20250702-splice-drop-unused-v3-2-55f68b60d2b7@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:37:15 -07:00
Michal Luczaj	1024f12071	net: splice: Drop unused @pipe Since commit `41c73a0d44` ("net: speedup skb_splice_bits()"), __splice_segment() and spd_fill_page() do not use the @pipe argument. Drop it. While adapting the callers, move one line to enforce reverse xmas tree order. No functional change intended. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://patch.msgid.link/20250702-splice-drop-unused-v3-1-55f68b60d2b7@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:37:14 -07:00
Rob Herring (Arm)	e27dba1951	net: Use of_reserved_mem_region_to_resource{_byname}() for "memory-region" Use the newly added of_reserved_mem_region_to_resource{_byname}() functions to handle "memory-region" properties. The error handling is a bit different for mtk_wed_mcu_load_firmware(). A failed match of the "memory-region-names" would skip the entry, but then other errors in the lookup and retrieval of the address would not skip the entry. However, that distinction is not really important. Either the region is available and usable or it is not. So now, errors from of_reserved_mem_region_to_resource() are ignored so the region is simply skipped. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250703183459.2074381-1-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:29:46 -07:00
Ahelenia Ziemiańska	f142028e30	gve: global: fix "for a while" typo Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/5zsbhtyox3cvbntuvhigsn42uooescbvdhrat6s3d6rczznzg5@tarta.nabijaczleweli.xyz Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:27:29 -07:00
Ahelenia Ziemiańska	60687c2c5c	atm: lanai: fix "take a while" typo Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/mn5rh6i773csmcrpfcr6bogvv2auypz2jwjn6dap2rxousxnw5@tarta.nabijaczleweli.xyz Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:27:19 -07:00
Colin Ian King	0e86f3eb83	net/mlx5: Fix spelling mistake "disabliing" -> "disabling" There is a spelling mistake in a NL_SET_ERR_MSG_MOD message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250703102219.1248399-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-08 08:05:50 -07:00
Hannes Reinecke	e22da46850	net/handshake: Add new parameter 'HANDSHAKE_A_ACCEPT_KEYRING' Add a new netlink parameter 'HANDSHAKE_A_ACCEPT_KEYRING' to provide the serial number of the keyring to use. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250701144657.104401-1-hare@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 15:31:44 +02:00
Wang Liang	5d288658ee	net: replace ADDRLABEL with dynamic debug ADDRLABEL only works when it was set in compilation phase. Replace it with net_dbg_ratelimited(). Signed-off-by: Wang Liang <wangliang74@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702104417.1526138-1-wangliang74@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 15:04:05 +02:00
Eric Dumazet	84a7d6797e	net/sched: acp_api: no longer acquire RTNL in tc_action_net_exit() tc_action_net_exit() got an rtnl exclusion in commit `a159d3c4b8` ("net_sched: acquire RTNL in tc_action_net_exit()") Since then, commit `16af606739` ("net: sched: implement reference counted action release") made this RTNL exclusion obsolete for most cases. Only tcf_action_offload_del() might still require it. Move the rtnl locking into tcf_idrinfo_destroy() when an offload action is found. Most netns do not have actions, yet deleting them is adding a lot of pressure on RTNL, which is for many the most contended mutex in the kernel. We are moving to a per-netns 'rtnl', so tc_action_net_exit() will not be able to grab 'rtnl' a single time for a batch of netns. Before the patch: perf probe -a rtnl_lock perf record -e probe:rtnl_lock -a /bin/bash -c 'unshare -n "/bin/true"; sleep 1' [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.305 MB perf.data (25 samples) ] After the patch: perf record -e probe:rtnl_lock -a /bin/bash -c 'unshare -n "/bin/true"; sleep 1' [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.304 MB perf.data (9 samples) ] Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vlad Buslov <vladbu@nvidia.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://patch.msgid.link/20250702071230.1892674-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 13:00:24 +02:00
Paolo Abeni	d23647fd54	Merge branch 'net-mctp-add-support-for-gateway-routing' Jeremy Kerr says: ==================== net: mctp: Add support for gateway routing This series adds a gateway route type for the MCTP core, allowing non-local EIDs as the match for a route. Example setup using the mctp tools: mctp route add 9 via mctpi2c0 mctp neigh add 9 dev mctpi2c0 lladdr 0x1d mctp route add 10 gw 9 - will route packets to eid 10 through mctpi2c0, using a dest lladdr of 0x1d (ie, that of the directly-attached eid 9). The core change to support this is the introduction of a struct mctp_dst, which represents the result of a route lookup. Since this involves a bit of surgery through the routing code, we add a few tests along the way. We're introducing an ABI change in the new RTM_{NEW,GET,DEL}ROUTE netlink formats, with the support for a RTA_GATEWAY attribute. Because we need a network ID specified to fully-qualify a gateway EID, the RTA_GATEWAY attribute carries the (net, eid) tuple in full: struct mctp_fq_addr { unsigned int net; mctp_eid_t eid; } Of course, any questions, comments etc are most welcome. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> ==================== Link: https://patch.msgid.link/20250702-dev-forwarding-v5-0-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:41:45 +02:00
Jeremy Kerr	48e1736e5d	net: mctp: test: Add tests for gateway routes Add a few kunit tests for the gateway routing. Because we have multiple route types now (direct and gateway), rename mctp_test_create_route to mctp_test_create_route_direct, and add a _gateway variant too. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-14-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:24 +02:00
Jeremy Kerr	ad39c12fce	net: mctp: add gateway routing support This change allows for gateway routing, where a route table entry may reference a routable endpoint (by network and EID), instead of routing directly to a netdevice. We add support for a RTA_GATEWAY attribute for netlink route updates, with an attribute format of: struct mctp_fq_addr { unsigned int net; mctp_eid_t eid; } - we need the net here to uniquely identify the target EID, as we no longer have the device reference directly (which would provide the net id in the case of direct routes). This makes route lookups recursive, as a route lookup that returns a gateway route must be resolved into a direct route (ie, to a device) eventually. We provide a limit to the route lookups, to prevent infinite loop routing. The route lookup populates a new 'nexthop' field in the dst structure, which now specifies the key for the neighbour table lookup on device output, rather than using the packet destination address directly. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-13-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:24 +02:00
Jeremy Kerr	28ddbb2abe	net: mctp: allow NL parsing directly into a struct mctp_route The netlink route parsing functions end up setting a bunch of output variables from the rt attributes. This will get messy when the routes become more complex. So, split the rt parsing into two types: a lookup (returning route target data suitable for a route lookup, like when deleting a route) and a populate (setting fields of a struct mctp_route). In doing this, we need to separate the route allocation from mctp_route_add, so add some comments on the lifetime semantics for the latter. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-12-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:24 +02:00
Jeremy Kerr	4a1de053d7	net: mctp: remove routes by netid, not by device In upcoming changes, a route may not have a device associated. Since the route is matched on the (network, eid) tuple, pass the netid itself into mctp_route_remove. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-11-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:24 +02:00
Jeremy Kerr	48e6aa60bf	net: mctp: pass net into route creation We may not have a mdev pointer, from which we currently extract the net. Instead, pass the net directly. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-10-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:24 +02:00
Jeremy Kerr	9b4a8c38f4	net: mctp: test: Add initial socket tests Recent changes have modified the extaddr path a little, so add a couple of kunit tests to af-mctp.c. These check that we're correctly passing lladdr data between sendmsg/recvmsg and the routing layer. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-9-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:24 +02:00
Jeremy Kerr	19396179a0	net: mctp: test: add sock test infrastructure Add a new test object, for use with the af_mctp socket code. This is initially empty, but we'll start populating actual tests in an upcoming change. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-8-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:23 +02:00
Jeremy Kerr	80bcf05e54	net: mctp: test: move functions into utils.[ch] A future change will add another mctp test .c file, so move some of the common test setup from route.c into the utils object. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-7-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:23 +02:00
Jeremy Kerr	46ee16462f	net: mctp: test: Add extaddr routing output test Test that the routing code preserves the haddr data in a skb through an input route operation. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-6-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:23 +02:00
Jeremy Kerr	96b341a8e7	net: mctp: test: Add an addressed device constructor Upcoming tests will check semantics of hardware addressing, which require a dev with ->addr_len != 0. Add a constructor to create a MCTP interface using a physically-addressed bus type. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-5-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:23 +02:00
Jeremy Kerr	3007f90ec0	net: mctp: separate cb from direct-addressing routing Now that we have the dst->haddr populated by sendmsg (when extended addressing is in use), we no longer need to stash the link-layer address in the skb->cb. Instead, only use skb->cb for incoming lladdr data. While we're at it: remove cb->src, as it was never used. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-4-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:23 +02:00
Jeremy Kerr	269936db5e	net: mctp: separate routing database from routing operations This change adds a struct mctp_dst, representing the result of a routing lookup. This decouples the struct mctp_route from the actual implementation of a routing operation. This will allow for future routing changes which may require more involved lookup logic, such as gateway routing - which may require multiple traversals of the routing table. Since we only use the struct mctp_route at lookup time, we no longer hold routes over a routing operation, as we only need it to populate the dst. However, we do hold the dev while the dst is active. This requires some changes to the route test infrastructure, as we no longer have a mock route to handle the route output operation, and transient dsts are created by the routing code, so we can't override them as easily. Instead, we use kunit->priv to stash a packet queue, and a custom dst_output function queues into that packet queue, which we can use for later expectations. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-3-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:23 +02:00
Jeremy Kerr	fc2b87d036	net: mctp: test: make cloned_frag buffers more appropriately-sized In our input_cloned_frag test, we currently allocate our test buffers arbitrarily-sized at 100 bytes. We only expect to receive a max of 15 bytes from the socket, so reduce to a more appropriate size. There are some upcoming changes to the routing code which hit a frame-size limit on s390, so reduce the usage before that lands. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-2-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:23 +02:00
Jeremy Kerr	e0f3c79cc0	net: mctp: don't use source cb data when forwarding, ensure pkt_type is set In the output path, only check the skb->cb data when we know it's from a local socket; input packets will have source address information there instead. In order to detect when we're forwarding, set skb->pkt_type on input/output. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-1-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:23 +02:00
Paolo Abeni	05cc60ef27	Merge branch 'add-broadcast_neighbor-for-no-stacking-networking-arch' Tonghao Zhang says: ==================== add broadcast_neighbor for no-stacking networking arch In a no-stacking network architecture with bond mode 4 (LACP) enabled in the datacenter, the switch requires ARP/ND packets for session synchronization. For more details, please see the patch. Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Zengbing Tu <tuzengbing@didiglobal.com> ==================== Link: https://patch.msgid.link/cover.1751031306.git.tonghao@bamaicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 10:59:58 +02:00
Tonghao Zhang	2f9afffc39	net: bonding: send peer notify when failure recovery In LACP mode with broadcast_neighbor enabled, after LACP protocol recovery, the port can transmit packets. However, if the bond port doesn't send gratuitous ARP/ND packets to the switch, the switch won't return packets through the current interface. This causes traffic imbalance. To resolve this issue, when LACP protocol recovers, send ARP/ND packets if broadcast_neighbor is enabled. Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com> Signed-off-by: Zengbing Tu <tuzengbing@didiglobal.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/3993652dc093fffa9504ce1c2448fb9dea31d2d2.1751031306.git.tonghao@bamaicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 10:59:42 +02:00
Tonghao Zhang	3d98ee5265	net: bonding: add broadcast_neighbor netlink option Users can configure or display the bonding broadcast_neighbor option via iproute2/netlink. Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com> Signed-off-by: Zengbing Tu <tuzengbing@didiglobal.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/76b90700ba5b98027dfb51a2f3c5cfea0440a21b.1751031306.git.tonghao@bamaicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 10:59:42 +02:00
Tonghao Zhang	ce7a381697	net: bonding: add broadcast_neighbor option for 802.3ad Stacking technology is a type of technology used to expand ports on Ethernet switches. It is widely used as a common access method in large-scale Internet data center architectures. Years of practice have proved that stacking technology has advantages and disadvantages in high-reliability network architecture scenarios. For instance, in stacking networking arch, conventional switch system upgrades require multiple stacked devices to restart at the same time. Therefore, it is inevitable that the business will be interrupted for a while. It is for this reason that "no-stacking" in data centers has become a trend. Additionally, when the stacking link connecting the switches fails or is abnormal, the stack will split. Although it is not common, it still happens in actual operation. The problem is that after the split, it is equivalent to two switches with the same configuration appearing in the network, causing network configuration conflicts and ultimately interrupting the services carried by the stacking system. To improve network stability, "non-stacking" solutions have been increasingly adopted, particularly by public cloud providers and tech companies like Alibaba, Tencent, and Didi. "non-stacking" is a method of mimicing switch stacking that convinces a LACP peer, bonding in this case, connected to a set of "non-stacked" switches that all of its ports are connected to a single switch (i.e., LACP aggregator), as if those switches were stacked. This enables the LACP peer's ports to aggregate together, and requires (a) special switch configuration, described in the linked article, and (b) modifications to the bonding 802.3ad (LACP) mode to send all ARP/ND packets across all ports of the active aggregator. Note that, with multiple aggregators, the current broadcast mode logic will send only packets to the selected aggregator(s). 
+-----------+   +-----------+
|  switch1  |   |  switch2  |
+-----------+   +-----------+
      ^               ^
      |               |
     +-----------------+
     |   bond4 lacp    |
     +-----------------+
      |               |
      | NIC1          | NIC2
     +-----------------+
     |     server      |
     +-----------------+

- https://www.ruijie.com/fr-fr/support/tech-gallery/de-stack-data-center-network-architecture/ Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com> Signed-off-by: Zengbing Tu <tuzengbing@didiglobal.com> Link: https://patch.msgid.link/84d0a044514157bb856a10b6d03a1028c4883561.1751031306.git.tonghao@bamaicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 10:59:41 +02:00
Jakub Kicinski	0234362d0a	Merge branch 'net-mlx5-hws-optimize-matchers-icm-usage' Mark Bloch says: ==================== net/mlx5: HWS, Optimize matchers ICM usage This series optimizes ICM usage for unidirectional rules and empty matchers and with the last patch we make hardware steering the default FDB steering provider for NICs that don't support software steering. Hardware steering (HWS) uses a type of rule table container (RTC) that is unidirectional, so matchers consist of two RTCs to accommodate bidirectional rules. This small series enables resizing the two RTCs independently by tracking the number of rules separately. For extreme cases where all rules are unidirectional, this results in saving close to half the memory footprint. Results for inserting 1M unidirectional rules using a simple module (Pages / Memory): before this patch: 300k / 1.5GiB; after this patch: 160k / 900MiB. The 'Pages' column measures the number of 4KiB pages the device requests for itself (the ICM). The 'Memory' column is the difference between peak usage and baseline usage (before starting the test) as reported by `free -h`. In addition, the second-to-last patch of the series handles a case where all the matcher's rules were deleted: the large RTCs of the matcher are no longer required, and we can save some more ICM by shrinking the matcher to its initial size. Finally, the last patch makes hardware steering the default mode when in switchdev for NICs that don't have software steering support. ==================== Link: https://patch.msgid.link/20250703185431.445571-1-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:19 -07:00


1369356 Commits