linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-22 01:26:30 -04:00

Author	SHA1	Message	Date
Jakub Kicinski	8b5a19b4ff	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.16-rc8). Conflicts: drivers/net/ethernet/microsoft/mana/gdma_main.c `9669ddda18` ("net: mana: Fix warnings for missing export.h header inclusion") `7553911210` ("net: mana: Allocate MSI-X vectors dynamically") https://lore.kernel.org/20250711130752.23023d98@canb.auug.org.au Adjacent changes: drivers/net/ethernet/ti/icssg/icssg_prueth.h `6e86fb73de` ("net: ti: icssg-prueth: Fix buffer allocation for ICSSG") `ffe8a49091` ("net: ti: icssg-prueth: Read firmware-names from device tree") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-24 11:10:46 -07:00
Paolo Abeni	94619ea2d9	Merge tag 'ipsec-next-2025-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2025-07-23 1) Optimize to hold device only for the asynchronous decryption, where it is really needed. From Jianbo Liu. 2) Align our inbund SA lookup to RFC 4301. Only SPI and protocol should be used for an inbound SA lookup. From Aakash Kumar S. 3) Skip redundant statistics update for xfrm crypto offload. From Jianbo Liu. Please pull or let me know if there are problems. * tag 'ipsec-next-2025-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: xfrm: Skip redundant statistics update for crypto offload xfrm: Duplicate SPI Handling xfrm: hold device only for the asynchronous decryption ==================== Link: https://patch.msgid.link/20250723080402.3439619-1-steffen.klassert@secunet.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-24 15:13:20 +02:00
Paolo Abeni	291d5dc80e	Merge tag 'ipsec-2025-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2025-07-23 1) Premption fixes for xfrm_state_find. From Sabrina Dubroca. 2) Initialize offload path also for SW IPsec GRO. This fixes a performance regression on SW IPsec offload. From Leon Romanovsky. 3) Fix IPsec UDP GRO for IKE packets. From Tobias Brunner, 4) Fix transport header setting for IPcomp after decompressing. From Fernando Fernandez Mancera. 5) Fix use-after-free when xfrmi_changelink tries to change collect_md for a xfrm interface. From Eyal Birger . 6) Delete the special IPcomp x->tunnel state along with the state x to avoid refcount problems. From Sabrina Dubroca. Please pull or let me know if there are problems. * tag 'ipsec-2025-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec: Revert "xfrm: destroy xfrm_state synchronously on net exit path" xfrm: delete x->tunnel as we delete x xfrm: interface: fix use-after-free after changing collect_md xfrm interface xfrm: ipcomp: adjust transport header after decompressing xfrm: Set transport header to fix UDP GRO handling xfrm: always initialize offload path xfrm: state: use a consistent pcpu_id in xfrm_state_find xfrm: state: initialize state_ptrs earlier in xfrm_state_find ==================== Link: https://patch.msgid.link/20250723075417.3432644-1-steffen.klassert@secunet.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-24 12:30:40 +02:00
Paul Chaignon	e09299225d	bpf: Reject narrower access to pointer ctx fields The following BPF program, simplified from a syzkaller repro, causes a kernel warning: r0 = (u8 )(r1 + 169); exit; With pointer field sk being at offset 168 in __sk_buff. This access is detected as a narrower read in bpf_skb_is_valid_access because it doesn't match offsetof(struct __sk_buff, sk). It is therefore allowed and later proceeds to bpf_convert_ctx_access. Note that for the "is_narrower_load" case in the convert_ctx_accesses(), the insn->off is aligned, so the cnt may not be 0 because it matches the offsetof(struct __sk_buff, sk) in the bpf_convert_ctx_access. However, the target_size stays 0 and the verifier errors with a kernel warning: verifier bug: error during ctx access conversion(1) This patch fixes that to return a proper "invalid bpf_context access off=X size=Y" error on the load instruction. The same issue affects multiple other fields in context structures that allow narrow access. Some other non-affected fields (for sk_msg, sk_lookup, and sockopt) were also changed to use bpf_ctx_range_ptr for consistency. Note this syzkaller crash was reported in the "Closes" link below, which used to be about a different bug, fixed in commit `fce7bd8e38` ("bpf/verifier: Handle BPF_LOAD_ACQ instructions in insn_def_regno()"). Because syzbot somehow confused the two bugs, the new crash and repro didn't get reported to the mailing list. Fixes: `f96da09473` ("bpf: simplify narrower ctx access") Fixes: `0df1a55afa` ("bpf: Warn on internal verifier errors") Reported-by: syzbot+0ef84a7bdf5301d4cbec@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0ef84a7bdf5301d4cbec Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/3b8dcee67ff4296903351a974ddd9c4dca768b64.1753194596.git.paul.chaignon@gmail.com	2025-07-23 19:33:49 -07:00
Kuniyuki Iwashima	17ce3e5949	bpf: Disable migration in nf_hook_run_bpf(). syzbot reported that the netfilter bpf prog can be called without migration disabled in xmit path. Then the assertion in __bpf_prog_run() fails, triggering the splat below. [0] Let's use bpf_prog_run_pin_on_cpu() in nf_hook_run_bpf(). [0]: BUG: assuming non migratable context at ./include/linux/filter.h:703 in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 5829, name: sshd-session 3 locks held by sshd-session/5829: #0: ffff88807b4e4218 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1667 [inline] #0: ffff88807b4e4218 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendmsg+0x20/0x50 net/ipv4/tcp.c:1395 #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline] #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline] #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: __ip_queue_xmit+0x69/0x26c0 net/ipv4/ip_output.c:470 #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline] #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline] #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: nf_hook+0xb2/0x680 include/linux/netfilter.h:241 CPU: 0 UID: 0 PID: 5829 Comm: sshd-session Not tainted 6.16.0-rc6-syzkaller-00002-g155a3c003e55 #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x16c/0x1f0 lib/dump_stack.c:120 __cant_migrate kernel/sched/core.c:8860 [inline] __cant_migrate+0x1c7/0x250 kernel/sched/core.c:8834 __bpf_prog_run include/linux/filter.h:703 [inline] bpf_prog_run include/linux/filter.h:725 [inline] nf_hook_run_bpf+0x83/0x1e0 net/netfilter/nf_bpf_link.c:20 nf_hook_entry_hookfn include/linux/netfilter.h:157 [inline] nf_hook_slow+0xbb/0x200 net/netfilter/core.c:623 nf_hook+0x370/0x680 include/linux/netfilter.h:272 NF_HOOK_COND include/linux/netfilter.h:305 [inline] ip_output+0x1bc/0x2a0 net/ipv4/ip_output.c:433 dst_output include/net/dst.h:459 [inline] ip_local_out net/ipv4/ip_output.c:129 [inline] __ip_queue_xmit+0x1d7d/0x26c0 net/ipv4/ip_output.c:527 __tcp_transmit_skb+0x2686/0x3e90 net/ipv4/tcp_output.c:1479 tcp_transmit_skb net/ipv4/tcp_output.c:1497 [inline] tcp_write_xmit+0x1274/0x84e0 net/ipv4/tcp_output.c:2838 __tcp_push_pending_frames+0xaf/0x390 net/ipv4/tcp_output.c:3021 tcp_push+0x225/0x700 net/ipv4/tcp.c:759 tcp_sendmsg_locked+0x1870/0x42b0 net/ipv4/tcp.c:1359 tcp_sendmsg+0x2e/0x50 net/ipv4/tcp.c:1396 inet_sendmsg+0xb9/0x140 net/ipv4/af_inet.c:851 sock_sendmsg_nosec net/socket.c:712 [inline] __sock_sendmsg net/socket.c:727 [inline] sock_write_iter+0x4aa/0x5b0 net/socket.c:1131 new_sync_write fs/read_write.c:593 [inline] vfs_write+0x6c7/0x1150 fs/read_write.c:686 ksys_write+0x1f8/0x250 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fe7d365d407 Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff RSP: Fixes: `fd9c663b9a` ("bpf: minimal support for programs hooked into netfilter framework") Reported-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6879466d.a00a0220.3af5df.0022.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Tested-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com Acked-by: Florian Westphal <fw@strlen.de> Link: https://patch.msgid.link/20250722224041.112292-1-kuniyu@google.com	2025-07-23 18:58:03 -07:00
Koen De Schepper	8f9516daed	sched: Add enqueue/dequeue of dualpi2 qdisc DualPI2 provides L4S-type low latency & loss to traffic that uses a scalable congestion controller (e.g. TCP-Prague, DCTCP) without degrading the performance of 'classic' traffic (e.g. Reno, Cubic etc.). It is to be the reference implementation of IETF RFC9332 DualQ Coupled AQM (https://datatracker.ietf.org/doc/html/rfc9332). Note that creating two independent queues cannot meet the goal of DualPI2 mentioned in RFC9332: "...to preserve fairness between ECN-capable and non-ECN-capable traffic." Further, it could even lead to starvation of Classic traffic, which is also inconsistent with the requirements in RFC9332: "...although priority MUST be bounded in order not to starve Classic traffic." DualPI2 is designed to maintain approximate per-flow fairness on L-queue and C-queue by forming a single qdisc using the coupling factor and scheduler between two queues. The qdisc provides two queues called low latency and classic. It classifies packets based on the ECN field in the IP headers. By default it directs non-ECN and ECT(0) into the classic queue and ECT(1) and CE into the low latency queue, as per the IETF spec. Each queue runs its own AQM: * The classic AQM is called PI2, which is similar to the PIE AQM but more responsive and simpler. Classic traffic requires a decent target queue (default 15ms for Internet deployment) to fully utilize the link and to avoid high drop rates. * The low latency AQM is, by default, a very shallow ECN marking threshold (1ms) similar to that used for DCTCP. The DualQ isolates the low queuing delay of the Low Latency queue from the larger delay of the 'Classic' queue. However, from a bandwidth perspective, flows in either queue will share out the link capacity as if there was just a single queue. This bandwidth pooling effect is achieved by coupling together the drop and ECN-marking probabilities of the two AQMs. The PI2 AQM has two main parameters in addition to its target delay. The integral gain factor alpha is used to slowly correct any persistent standing queue error from the target delay, while the proportional gain factor beta is used to quickly compensate for queue changes (growth or shrinkage). Either alpha and beta are given as a parameter, or they can be calculated by tc from alternative typical and maximum RTT parameters. Internally, the output of a linear Proportional Integral (PI) controller is used for both queues. This output is squared to calculate the drop or ECN-marking probability of the classic queue. This counterbalances the square-root rate equation of Reno/Cubic, which is the trick that balances flow rates across the queues. For the ECN-marking probability of the low latency queue, the output of the base AQM is multiplied by a coupling factor. This determines the balance between the flow rates in each queue. The default setting makes the flow rates roughly equal, which should be generally applicable. If DUALPI2 AQM has detected overload (due to excessive non-responsive traffic in either queue), it will switch to signaling congestion solely using drop, irrespective of the ECN field. Alternatively, it can be configured to limit the drop probability and let the queue grow and eventually overflow (like tail-drop). GSO splitting in DUALPI2 is configurable from userspace while the default behavior is to split gso. When running DUALPI2 at unshaped 10gigE with 4 download streams test, splitting gso apart results in halving the latency with no loss in throughput: Summary of tcp_4down run 'no_split_gso': avg median # data pts Ping (ms) ICMP : 0.53 0.30 ms 350 TCP download avg : 2326.86 N/A Mbits/s 350 TCP download sum : 9307.42 N/A Mbits/s 350 TCP download::1 : 2672.99 2568.73 Mbits/s 350 TCP download::2 : 2586.96 2570.51 Mbits/s 350 TCP download::3 : 1786.26 1798.82 Mbits/s 350 TCP download::4 : 2261.21 2309.49 Mbits/s 350 Summart of tcp_4down run 'split_gso': avg median # data pts Ping (ms) ICMP : 0.22 0.23 ms 350 TCP download avg : 2335.02 N/A Mbits/s 350 TCP download sum : 9340.09 N/A Mbits/s 350 TCP download::1 : 2335.30 2334.22 Mbits/s 350 TCP download::2 : 2334.72 2334.20 Mbits/s 350 TCP download::3 : 2335.28 2334.58 Mbits/s 350 TCP download::4 : 2334.79 2334.39 Mbits/s 350 A similar result is observed when running DUALPI2 at unshaped 1gigE with 1 download stream test: Summary of tcp_1down run 'no_split_gso': avg median # data pts Ping (ms) ICMP : 1.13 1.25 ms 350 TCP download : 941.41 941.46 Mbits/s 350 Summart of tcp_1down run 'split_gso': avg median # data pts Ping (ms) ICMP : 0.51 0.55 ms 350 TCP download : 941.41 941.45 Mbits/s 350 Additional details can be found in the draft: https://datatracker.ietf.org/doc/html/rfc9332 Signed-off-by: Koen De Schepper <koen.de_schepper@nokia-bell-labs.com> Co-developed-by: Olga Albisser <olga@albisser.org> Signed-off-by: Olga Albisser <olga@albisser.org> Co-developed-by: Olivier Tilmans <olivier.tilmans@nokia.com> Signed-off-by: Olivier Tilmans <olivier.tilmans@nokia.com> Co-developed-by: Henrik Steen <henrist@henrist.net> Signed-off-by: Henrik Steen <henrist@henrist.net> Co-developed-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Signed-off-by: Bob Briscoe <research@bobbriscoe.net> Signed-off-by: Ilpo Järvinen <ij@kernel.org> Acked-by: Dave Taht <dave.taht@gmail.com> Link: https://patch.msgid.link/20250722095915.24485-4-chia-yu.chang@nokia-bell-labs.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-23 17:52:07 -07:00
Chia-Yu Chang	d4de8bffbe	sched: Dump configuration and statistics of dualpi2 qdisc The configuration and statistics dump of the DualPI2 Qdisc provides information related to both queues, such as packet numbers and queuing delays in the L-queue and C-queue, as well as general information such as probability value, WRR credits, memory usage, packet marking counters, max queue size, etc. The following patch includes enqueue/dequeue for DualPI2. Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Link: https://patch.msgid.link/20250722095915.24485-3-chia-yu.chang@nokia-bell-labs.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-23 17:52:07 -07:00
Chia-Yu Chang	320d031ad6	sched: Struct definition and parsing of dualpi2 qdisc DualPI2 is the reference implementation of IETF RFC9332 DualQ Coupled AQM (https://datatracker.ietf.org/doc/html/rfc9332) providing two queues called low latency (L-queue) and classic (C-queue). By default, it enqueues non-ECN and ECT(0) packets into the C-queue and ECT(1) and CE packets into the low latency queue (L-queue), as per IETF RFC9332 spec. This patch defines the dualpi2 Qdisc structure and parsing, and the following two patches include dumping and enqueue/dequeue for the DualPI2. Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Link: https://patch.msgid.link/20250722095915.24485-2-chia-yu.chang@nokia-bell-labs.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-23 17:52:07 -07:00
Carolina Jubran	1bbdb81a98	devlink: Fix excessive stack usage in rate TC bandwidth parsing The devlink_nl_rate_tc_bw_parse function uses a large stack array for devlink attributes, which triggers a warning about excessive stack usage: net/devlink/rate.c: In function 'devlink_nl_rate_tc_bw_parse': net/devlink/rate.c:382:1: error: the frame size of 1648 bytes is larger than 1536 bytes [-Werror=frame-larger-than=] Introduce a separate attribute set specifically for rate TC bandwidth parsing that only contains the two attributes actually used: index and bandwidth. This reduces the stack array from DEVLINK_ATTR_MAX entries to just 2 entries, solving the stack usage issue. Update devlink selftest to use the new 'index' and 'bw' attribute names consistent with the YAML spec. Example usage with ynl with the new spec: ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/devlink.yaml \ --do rate-set --json '{ "bus-name": "pci", "dev-name": "0000:08:00.0", "port-index": 1, "rate-tc-bws": [ {"index": 0, "bw": 50}, {"index": 1, "bw": 50}, {"index": 2, "bw": 0}, {"index": 3, "bw": 0}, {"index": 4, "bw": 0}, {"index": 5, "bw": 0}, {"index": 6, "bw": 0}, {"index": 7, "bw": 0} ] }' ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/devlink.yaml \ --do rate-get --json '{ "bus-name": "pci", "dev-name": "0000:08:00.0", "port-index": 1 }' output for rate-get: {'bus-name': 'pci', 'dev-name': '0000:08:00.0', 'port-index': 1, 'rate-tc-bws': [{'bw': 50, 'index': 0}, {'bw': 50, 'index': 1}, {'bw': 0, 'index': 2}, {'bw': 0, 'index': 3}, {'bw': 0, 'index': 4}, {'bw': 0, 'index': 5}, {'bw': 0, 'index': 6}, {'bw': 0, 'index': 7}], 'rate-tx-max': 0, 'rate-tx-priority': 0, 'rate-tx-share': 0, 'rate-tx-weight': 0, 'rate-type': 'leaf'} Fixes: `566e8f108f` ("devlink: Extend devlink rate API with traffic classes bandwidth management") Reported-by: Arnd Bergmann <arnd@arndb.de> Closes: https://lore.kernel.org/netdev/20250708160652.1810573-1-arnd@kernel.org/ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202507171943.W7DJcs6Y-lkp@intel.com/ Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/1753175609-330621-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-23 17:07:35 -07:00
Yang Li	a7bcffc673	Bluetooth: Add PA_LINK to distinguish BIG sync and PA sync connections Currently, BIS_LINK is used for both BIG sync and PA sync connections, which makes it impossible to distinguish them when searching for a PA sync connection. Adding PA_LINK will make the distinction clearer and simplify future extensions for PA-related features. Signed-off-by: Yang Li <yang.li@amlogic.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:35:14 -04:00
Chris Down	0cadf8534f	Bluetooth: hci_event: Mask data status from LE ext adv reports The Event_Type field in an LE Extended Advertising Report uses bits 5 and 6 for data status (e.g. truncation or fragmentation), not the PDU type itself. The ext_evt_type_to_legacy() function fails to mask these status bits before evaluation. This causes valid advertisements with status bits set (e.g. a truncated non-connectable advertisement, which ends up showing as PDU type 0x40) to be misclassified as unknown and subsequently dropped. This is okay for most checks which use bitwise AND on the relevant event type bits, but it doesn't work for non-connectable types, which are checked with '== LE_EXT_ADV_NON_CONN_IND' (that is, zero). In terms of behaviour, first the device sends a truncated report: > HCI Event: LE Meta Event (0x3e) plen 26 LE Extended Advertising Report (0x0d) Entry 0 Event type: 0x0040 Data status: Incomplete, data truncated, no more to come Address type: Random (0x01) Address: 1D:12:46:FA:F8:6E (Non-Resolvable) SID: 0x03 RSSI: -98 dBm (0x9e) Data length: 0x00 Then, a few seconds later, it sends the subsequent complete report: > HCI Event: LE Meta Event (0x3e) plen 122 LE Extended Advertising Report (0x0d) Entry 0 Event type: 0x0000 Data status: Complete Address type: Random (0x01) Address: 1D:12:46:FA:F8:6E (Non-Resolvable) SID: 0x03 RSSI: -97 dBm (0x9f) Data length: 0x60 Service Data: Google (0xfef3) Data[92]: ... These devices often send multiple truncated reports per second. This patch introduces a PDU type mask to ensure only the relevant bits are evaluated, allowing for the correct translation of all valid extended advertising packets. Fixes: `b2cc9761f1` ("Bluetooth: Handle extended ADV PDU types") Signed-off-by: Chris Down <chris@chrisdown.name> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:34:47 -04:00
Ivan Pravdin	7af4d7b535	Bluetooth: hci_devcd_dump: fix out-of-bounds via dev_coredumpv Currently both dev_coredumpv and skb_put_data in hci_devcd_dump use hdev->dump.head. However, dev_coredumpv can free the buffer. From dev_coredumpm_timeout documentation, which is used by dev_coredumpv: > Creates a new device coredump for the given device. If a previous one hasn't > been read yet, the new coredump is discarded. The data lifetime is determined > by the device coredump framework and when it is no longer needed the @free > function will be called to free the data. If the data has not been read by the userspace yet, dev_coredumpv will discard new buffer, freeing hdev->dump.head. This leads to vmalloc-out-of-bounds error when skb_put_data tries to access hdev->dump.head. A crash report from syzbot illustrates this: ================================================================== BUG: KASAN: vmalloc-out-of-bounds in skb_put_data include/linux/skbuff.h:2752 [inline] BUG: KASAN: vmalloc-out-of-bounds in hci_devcd_dump+0x142/0x240 net/bluetooth/coredump.c:258 Read of size 140 at addr ffffc90004ed5000 by task kworker/u9:2/5844 CPU: 1 UID: 0 PID: 5844 Comm: kworker/u9:2 Not tainted 6.14.0-syzkaller-10892-g4e82c87058f4 #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Workqueue: hci0 hci_devcd_timeout Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0xc3/0x670 mm/kasan/report.c:521 kasan_report+0xe0/0x110 mm/kasan/report.c:634 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0xef/0x1a0 mm/kasan/generic.c:189 __asan_memcpy+0x23/0x60 mm/kasan/shadow.c:105 skb_put_data include/linux/skbuff.h:2752 [inline] hci_devcd_dump+0x142/0x240 net/bluetooth/coredump.c:258 hci_devcd_timeout+0xb5/0x2e0 net/bluetooth/coredump.c:413 process_one_work+0x9cc/0x1b70 kernel/workqueue.c:3238 process_scheduled_works kernel/workqueue.c:3319 [inline] worker_thread+0x6c8/0xf10 kernel/workqueue.c:3400 kthread+0x3c2/0x780 kernel/kthread.c:464 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:153 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 </TASK> The buggy address ffffc90004ed5000 belongs to a vmalloc virtual mapping Memory state around the buggy address: ffffc90004ed4f00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ffffc90004ed4f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 >ffffc90004ed5000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ^ ffffc90004ed5080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ffffc90004ed5100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ================================================================== To avoid this issue, reorder dev_coredumpv to be called after skb_put_data that does not free the data. Reported-by: syzbot+ac3c79181f6aecc5120c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ac3c79181f6aecc5120c Fixes: `b257e02ecc` ("HCI: coredump: Log devcd dumps into the monitor") Tested-by: syzbot+ac3c79181f6aecc5120c@syzkaller.appspotmail.com Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:33:57 -04:00
Yang Li	ef568ae04e	Bluetooth: ISO: Support SCM_TIMESTAMPING for ISO TS User-space applications (e.g. PipeWire) depend on ISO-formatted timestamps for precise audio sync. The ISO ts is based on the controller’s clock domain, so hardware timestamping (hwtimestamp) must be used. Ref: Documentation/networking/timestamping.rst, section 3.1 Hardware Timestamping. Signed-off-by: Yang Li <yang.li@amlogic.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:31:49 -04:00
Pauli Virtanen	7565bc5659	Bluetooth: ISO: add socket option to report packet seqnum via CMSG User applications need a way to track which ISO interval a given SDU belongs to, to properly detect packet loss. All controllers do not set timestamps, and it's not guaranteed user application receives all packet reports (small socket buffer, or controller doesn't send all reports like Intel AX210 is doing). Add socket option BT_PKT_SEQNUM that enables reporting of received packet ISO sequence number in BT_SCM_PKT_SEQNUM CMSG. Use BT_PKT_SEQNUM == 22 for the socket option, as 21 was used earlier for a removed experimental feature that never got into mainline. Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:31:19 -04:00
Bastien Nocera	0e492dbacc	Bluetooth: Fix typos in comments Found by codespell. Signed-off-by: Bastien Nocera <hadess@hadess.net> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:30:48 -04:00
Bastien Nocera	e6555fffd5	Bluetooth: RFCOMM: Fix typos in comments Found by codespell. Signed-off-by: Bastien Nocera <hadess@hadess.net> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:30:32 -04:00
Bastien Nocera	8074811359	Bluetooth: aosp: Fix typo in comment Found by codespell. Signed-off-by: Bastien Nocera <hadess@hadess.net> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:30:18 -04:00
Yang Li	be31d11ec9	Bluetooth: Fix spelling mistakes Correct the misspelling of “estabilished” in the code. Signed-off-by: Yang Li <yang.li@amlogic.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:27:33 -04:00
Yang Li	b2a5f2e1c1	Bluetooth: hci_event: Add support for handling LE BIG Sync Lost event When the BIS source stops, the controller sends an LE BIG Sync Lost event (subevent 0x1E). Currently, this event is not handled, causing the BIS stream to remain active in BlueZ and preventing recovery. Signed-off-by: Yang Li <yang.li@amlogic.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:27:04 -04:00
Zijun Hu	e44328c99b	Bluetooth: hci_event: Correct comment about HCI_EV_EXTENDED_INQUIRY_RESULT HCI_EV_EXTENDED_INQUIRY_RESULT's comment wrongly uses 0x2d as its event code. Use right 0x2f instead. Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:26:18 -04:00
Zijun Hu	88d6ba89d8	Bluetooth: hci_core: Eliminate an unnecessary goto label in hci_find_irk_by_addr() Eliminate an unnecessary goto label by using break instead of goto to exit the loop in hci_find_irk_by_addr(). Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:26:04 -04:00
Zijun Hu	da0186f19a	Bluetooth: hci_sync: Use bt_dev_err() to log error message in hci_update_event_filter_sync() Use bt_dev_err() instead of bt_dev_dbg() to log error message in hci_update_event_filter_sync(). Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:25:49 -04:00
Zijun Hu	4d7936e8a5	Bluetooth: hci_sock: Reset cookie to zero in hci_sock_free_cookie() Reset cookie value to 0 instead of 0xffffffff in hci_sock_free_cookie() since: 0 : means cookie has not been assigned yet 0xffffffff: means cookie assignment failure Also fix generating cookie failure with usage shown below: hci_sock_gen_cookie(sk) // generate cookie hci_sock_free_cookie(sk) // free cookie hci_sock_gen_cookie(sk) // Can't generate cookie any more Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-07-23 10:25:34 -04:00
Johannes Berg	c57e5b9819	wifi: mac80211: fix WARN_ON for monitor mode on some devices On devices without WANT_MONITOR_VIF (and probably without channel context support) we get a WARN_ON for changing the per-link setting of a monitor interface. Since we already skip AP_VLAN interfaces and MONITOR with WANT_MONITOR_VIF and/or NO_VIRTUAL_MONITOR should update the settings, catch this in the link change code instead of the warning. Reported-by: Martin Kaistra <martin.kaistra@linutronix.de> Link: https://lore.kernel.org/r/a9de62a0-28f1-4981-84df-253489da74ed@linutronix.de/ Fixes: `c4382d5ca1` ("wifi: mac80211: update the right link for tx power") Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-23 12:29:07 +02:00
Paolo Abeni	b115c77588	tcp: do not increment BeyondWindow MIB for old seq The mentioned MIB is currently incremented even when a packet with an old sequence number (i.e. a zero window probe) is received, which is IMHO misleading. Explicitly restrict such MIB increment at the relevant events. Fixes: `6c758062c6` ("tcp: add LINUX_MIB_BEYOND_WINDOW") Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/20d147292eb4b13b6535e0ad6f56be64d9c330d3.1753118029.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-22 18:21:15 -07:00
Paolo Abeni	972ca7a3bc	tcp: do not set a zero size receive buffer The nipa CI is reporting frequent failures in the mptcp_connect self-tests. In the failing scenarios (TCP -> MPTCP) the involved sockets are actually plain TCP ones, as fallback for passive socket at 2whs time cause the MPTCP listener to actually create a TCP socket. The transfer is stuck due to the receiver buffer being zero. With the stronger check in place, tcp_clamp_window() can be invoked while the TCP socket has sk_rmem_alloc == 0, and the receive buffer will be zeroed, too. Check for the critical condition in tcp_prune_queue() and just drop the packet without shrinking the receiver buffer. Fixes: `1d2fbaad7c` ("tcp: stronger sk_rcvbuf checks") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20c18165d3f848e1c5c1b782d88c1a5ab38b3f70.1753118029.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-22 18:21:15 -07:00
Fan Yu	ad892e912b	tcp: trace retransmit failures in tcp_retransmit_skb Background ========== When TCP retransmits a packet due to missing ACKs, the retransmission may fail for various reasons (e.g., packets stuck in driver queues, receiver zero windows, or routing issues). The original tcp_retransmit_skb tracepoint: 'commit `e086101b15` ("tcp: add a tracepoint for tcp retransmission")' lacks visibility into these failure causes, making production diagnostics difficult. Solution ======== Adds the retval("err") to the tcp_retransmit_skb tracepoint. Enables users to know why some tcp retransmission failed and users can filter retransmission failures by retval. Compatibility description ========================= This patch extends the tcp_retransmit_skb tracepoint by adding a new "err" field at the end of its existing structure (within TP_STRUCT__entry). The compatibility implications are detailed as follows: 1) Structural compatibility for legacy user-space tools Legacy tools/BPF programs accessing existing fields (by offset or name) can still work without modification or recompilation.The new field is appended to the end, preserving original memory layout. 2) Note: semantic changes The original tracepoint primarily only focused on successfully retransmitted packets. With this patch, the tracepoint now can figure out packets that may terminate early due to specific reasons. For accurate statistics, users should filter using "err" to distinguish outcomes. Before patched: field:const void * skbaddr; offset:8; size:8; signed:0; field:const void * skaddr; offset:16; size:8; signed:0; field:int state; offset:24; size:4; signed:1; field:__u16 sport; offset:28; size:2; signed:0; field:__u16 dport; offset:30; size:2; signed:0; field:__u16 family; offset:32; size:2; signed:0; field:__u8 saddr[4]; offset:34; size:4; signed:0; field:__u8 daddr[4]; offset:38; size:4; signed:0; field:__u8 saddr_v6[16]; offset:42; size:16; signed:0; field:__u8 daddr_v6[16]; offset:58; size:16; signed:0; print fmt: "skbaddr=%p skaddr=%p family=%s sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c state=%s" After patched: field:const void * skbaddr; offset:8; size:8; signed:0; field:const void * skaddr; offset:16; size:8; signed:0; field:int state; offset:24; size:4; signed:1; field:__u16 sport; offset:28; size:2; signed:0; field:__u16 dport; offset:30; size:2; signed:0; field:__u16 family; offset:32; size:2; signed:0; field:__u8 saddr[4]; offset:34; size:4; signed:0; field:__u8 daddr[4]; offset:38; size:4; signed:0; field:__u8 saddr_v6[16]; offset:42; size:16; signed:0; field:__u8 daddr_v6[16]; offset:58; size:16; signed:0; field:int err; offset:76; size:4; signed:1; print fmt: "skbaddr=%p skaddr=%p family=%s sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c state=%s err=%d" Co-developed-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Fan Yu <fan.yu9@zte.com.cn> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250721111607626_BDnIJB0ywk6FghN63bor@zte.com.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-22 18:19:11 -07:00
Randy Dunlap	b2dd6eb0ac	net: Kconfig: add endif/endmenu comments Add comments on endif & endmenu blocks. This can save time when searching & trying to understand kconfig menu dependencies. The other endif & endmenu statements are already commented like this. This makes it similar to drivers/net/Kconfig, which is already commented like this. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250721020420.3555128-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-22 18:17:23 -07:00
Antonio Quartulli	708243c62e	wifi: mac80211: fix unassigned variable access In ieee80211_latest_active_link_conn_timeout() we loop over all sta->links in order to compute the timeout expiring last across all links. Such timeout is stored in `latest_timeout` which is used in the time_after() comparison before having been initialized. Fix this behaviour by initializing the variable to `jiffies` and adapt surrouding conditions accordingly. Note that the caller assumed latest_timeout to be 0 if no active link was found. This is not appropriate because jiffies=0 is a valid (and recurrent, although not often) point in time. By using `jiffies` as default value for latest_timeout, we can fix the caller as well. Address-Coverity-ID: 1647986 ("Uninitialized variables (UNINIT)") Fixes: `1bc892d76a` ("wifi: mac80211: extend connection monitoring for MLO") Signed-off-by: Antonio Quartulli <antonio@mandelbit.com> Link: https://patch.msgid.link/20250722120634.3501-1-antonio@mandelbit.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-22 14:13:03 +02:00
Trond Myklebust	f66e6bffc5	SUNRPC: Silence warnings about parameters not being described Warning: net/sunrpc/auth_gss/gss_krb5_crypto.c:902 function parameter 'len' not described in 'krb5_etm_decrypt' Warning: net/sunrpc/auth_gss/gss_krb5_crypto.c:902 function parameter 'buf' not described in 'krb5_etm_decrypt' Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-07-22 08:10:41 -04:00
Yue Haibing	db8a5149fa	ip6_gre: Factor out common ip6gre tunnel match into helper Extract common ip6gre tunnel match from ip6gre_tunnel_lookup() into new helper function ip6gre_tunnel_match() to reduce code duplication. No functional change intended. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250719081551.963670-1-yuehaibing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-22 12:15:26 +02:00
Xiang Mei	cf074eca00	net/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class might_sleep could be trigger in the atomic context in qfq_delete_class. qfq_destroy_class was moved into atomic context locked by sch_tree_lock to avoid a race condition bug on qfq_aggregate. However, might_sleep could be triggered by qfq_destroy_class, which introduced sleeping in atomic context (path: qfq_destroy_class->qdisc_put->__qdisc_destroy->lockdep_unregister_key ->might_sleep). Considering the race is on the qfq_aggregate objects, keeping qfq_rm_from_agg in the lock but moving the left part out can solve this issue. Fixes: `5e28d5a3f7` ("net/sched: sch_qfq: Fix race condition on qfq_aggregate") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/4a04e0cc-a64b-44e7-9213-2880ed641d77@sabinyo.mountain Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/20250717230128.159766-1-xmei5@asu.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-22 11:48:34 +02:00
Miri Korenblit	69fdb08435	wifi: mac80211: don't require cipher and keylen in gtk rekey ieee80211_add_gtk_rekey receives a keyconf as an argument, and the cipher and keylen are taken from there to the new allocated key. But in rekey, both the cipher and the keylen should be the same as of the old key, so let ieee80211_add_gtk_rekey find those, so drivers won't have to fill it in. Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250721214922.3c5c023bfae9.Ie6594ae2b4b6d5b3d536e642b349046ebfce7a5d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-22 10:43:19 +02:00
Kees Cook	2ed9a9fc99	wifi: nl80211: Set num_sub_specs before looping through sub_specs The processing of the struct cfg80211_sar_specs::sub_specs flexible array requires its counter, num_sub_specs, to be assigned before the loop in nl80211_set_sar_specs(). Leave the final assignment after the loop in place in case fewer ended up in the array. Fixes: `aa4ec06c45` ("wifi: cfg80211: use __counted_by where appropriate") Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://patch.msgid.link/20250721183125.work.183-kees@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-22 10:38:39 +02:00
Kees Cook	a37192c432	wifi: mac80211: Write cnt before copying in ieee80211_copy_rnr_beacon() While I caught the need for setting cnt early in nl80211_parse_rnr_elems() in the original annotation of struct cfg80211_rnr_elems with __counted_by, I missed a similar pattern in ieee80211_copy_rnr_beacon(). Fix this by moving the cnt assignment to before the loop. Fixes: `7b6d708703` ("wifi: cfg80211: Annotate struct cfg80211_rnr_elems with __counted_by") Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://patch.msgid.link/20250721182521.work.540-kees@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-22 10:38:21 +02:00
Jakub Kicinski	fbe09277fa	ethtool: rss: support removing contexts via Netlink Implement removing additional RSS contexts via Netlink. Technically it'd be possible to shoehorn the delete operation into ethnl_request_ops-compatible handler. The code ends up longer than open coded version, and I think we'll need a custom way of sending notifications at some stage (if we allow tying the context lifetime to the netlink socket, in the future). Link: https://patch.msgid.link/20250717234343.2328602-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 18:21:19 -07:00
Jakub Kicinski	a166ab7816	ethtool: rss: support creating contexts via Netlink Support creating contexts via Netlink. Setting flow hashing fields on the new context is not supported at this stage, it can be added later. An empty indirection table is not supported. This is a carry over from the IOCTL interface where empty indirection table meant delete. We can repurpose empty indirection table in Netlink but for now to avoid confusion reject it using the policy. Support letting user choose the ID for the new context. This was not possible in IOCTL since the context ID field for the create action had to be set to the ETH_RXFH_CONTEXT_ALLOC magic value. Link: https://patch.msgid.link/20250717234343.2328602-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 18:20:43 -07:00
Jakub Kicinski	55ef461ce1	ethtool: move ethtool_rxfh_ctx_alloc() to common code Move ethtool_rxfh_ctx_alloc() to common code, Netlink will need it. Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250717234343.2328602-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 18:20:19 -07:00
Jakub Kicinski	5c090d9eae	ethtool: rss: factor out populating response from context Similarly to previous change, factor out populating the response. We will use this after the context was allocated to send a notification so this time factor out from the additional context handling, rather than context 0 handling (for request context didn't exist, for response it does). Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250717234343.2328602-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 18:20:19 -07:00
Jakub Kicinski	a45f98efa4	ethtool: rss: factor out allocating memory for response To ease the code reuse for RSS_CREATE we'll want to prepare struct rss_reply_data for the new context. Unfortunately we can't depend on the exiting scaffolding because the context doesn't exist (ctx=NULL) when we start preparing. Factor out the portion of the context 0 handling responsible for allocation of request memory, so that we can call it directly. Link: https://patch.msgid.link/20250717234343.2328602-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 18:20:19 -07:00
Jakub Kicinski	5f5c59b78e	ethtool: rejig the RSS notification machinery for more types In anticipation for CREATE and DELETE notifications - explicitly pass the notification type to ethtool_rss_notify(), when calling from the IOCTL code. Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250717234343.2328602-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 18:20:19 -07:00
Jakub Kicinski	80e55735d5	ethtool: assert that drivers with sym hash are consistent for RSS contexts Supporting per-RSS context configuration of hashing fields but not the hashing algorithm would complicate the code a lot. We'd need to cross check the config against all RSS contexts. None of the drivers need this today, so explicitly prevent new drivers with such skewed capabilities from registering. If such driver appears it will need to first adjust the checks in the core. Link: https://patch.msgid.link/20250717234343.2328602-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 18:20:19 -07:00
moyuanhao	154e56a77d	mptcp: fix typo in a comment This patch fixes the follow spelling mistake in a comment: greter -> greater Signed-off-by: moyuanhao <moyuanhao3676@163.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250719-net-next-mptcp-tcp_maxseg-v2-4-8c910fbc5307@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 17:48:33 -07:00
Geliang Tang	51c5fd09e1	mptcp: add TCP_MAXSEG sockopt support The TCP_MAXSEG socket option is currently not supported by MPTCP, mainly because it has never been requested before. But there are still valid use-cases, e.g. with HAProxy. This patch adds its support in MPTCP by propagating the value to all subflows. The get part looks at the value on the first subflow, to be as closed as possible to TCP. Only one value can be returned for the cached MSS, so this can come only from one subflow. Similar to mptcp_setsockopt_first_sf_only(), a generic helper mptcp_setsockopt_all_subflows() is added to set sockopt for each subflows of the mptcp socket. Add a new member for struct mptcp_sock to store the TCP_MAXSEG value, and return this value in getsockopt. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/515 Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250719-net-next-mptcp-tcp_maxseg-v2-3-8c910fbc5307@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 17:48:33 -07:00
Geliang Tang	51a62199a8	tcp: add tcp_sock_set_maxseg Add a helper tcp_sock_set_maxseg() to directly set the TCP_MAXSEG sockopt from kernel space. This new helper will be used in the following patch from MPTCP. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250719-net-next-mptcp-tcp_maxseg-v2-2-8c910fbc5307@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 17:48:32 -07:00
Geliang Tang	edd669057c	mptcp: sockopt: drop redundant tcp_getsockopt tcp_getsockopt() is called twice in mptcp_getsockopt_first_sf_only() in different conditions, which makes the code a bit redundant. The first call to tcp_getsockopt() when the first subflow exists can be replaced by going to a new label "get" before the second call. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250719-net-next-mptcp-tcp_maxseg-v2-1-8c910fbc5307@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 17:48:32 -07:00
Kito Xu (veritas501)	6c4a92d07b	net: appletalk: Fix use-after-free in AARP proxy probe The AARP proxy‐probe routine (aarp_proxy_probe_network) sends a probe, releases the aarp_lock, sleeps, then re-acquires the lock. During that window an expire timer thread (__aarp_expire_timer) can remove and kfree() the same entry, leading to a use-after-free. race condition: cpu 0 \| cpu 1 atalk_sendmsg() \| atif_proxy_probe_device() aarp_send_ddp() \| aarp_proxy_probe_network() mod_timer() \| lock(aarp_lock) // LOCK!! timeout around 200ms \| alloc(aarp_entry) and then call \| proxies[hash] = aarp_entry aarp_expire_timeout() \| aarp_send_probe() \| unlock(aarp_lock) // UNLOCK!! lock(aarp_lock) // LOCK!! \| msleep(100); __aarp_expire_timer(&proxies[ct]) \| free(aarp_entry) \| unlock(aarp_lock) // UNLOCK!! \| \| lock(aarp_lock) // LOCK!! \| UAF aarp_entry !! ================================================================== BUG: KASAN: slab-use-after-free in aarp_proxy_probe_network+0x560/0x630 net/appletalk/aarp.c:493 Read of size 4 at addr ffff8880123aa360 by task repro/13278 CPU: 3 UID: 0 PID: 13278 Comm: repro Not tainted 6.15.2 #3 PREEMPT(full) Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1b0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0xc1/0x630 mm/kasan/report.c:521 kasan_report+0xca/0x100 mm/kasan/report.c:634 aarp_proxy_probe_network+0x560/0x630 net/appletalk/aarp.c:493 atif_proxy_probe_device net/appletalk/ddp.c:332 [inline] atif_ioctl+0xb58/0x16c0 net/appletalk/ddp.c:857 atalk_ioctl+0x198/0x2f0 net/appletalk/ddp.c:1818 sock_do_ioctl+0xdc/0x260 net/socket.c:1190 sock_ioctl+0x239/0x6a0 net/socket.c:1311 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:906 [inline] __se_sys_ioctl fs/ioctl.c:892 [inline] __x64_sys_ioctl+0x194/0x200 fs/ioctl.c:892 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcb/0x250 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> Allocated: aarp_alloc net/appletalk/aarp.c:382 [inline] aarp_proxy_probe_network+0xd8/0x630 net/appletalk/aarp.c:468 atif_proxy_probe_device net/appletalk/ddp.c:332 [inline] atif_ioctl+0xb58/0x16c0 net/appletalk/ddp.c:857 atalk_ioctl+0x198/0x2f0 net/appletalk/ddp.c:1818 Freed: kfree+0x148/0x4d0 mm/slub.c:4841 __aarp_expire net/appletalk/aarp.c:90 [inline] __aarp_expire_timer net/appletalk/aarp.c:261 [inline] aarp_expire_timeout+0x480/0x6e0 net/appletalk/aarp.c:317 The buggy address belongs to the object at ffff8880123aa300 which belongs to the cache kmalloc-192 of size 192 The buggy address is located 96 bytes inside of freed 192-byte region [ffff8880123aa300, ffff8880123aa3c0) Memory state around the buggy address: ffff8880123aa200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8880123aa280: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc >ffff8880123aa300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8880123aa380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff8880123aa400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kito Xu (veritas501) <hxzene@gmail.com> Link: https://patch.msgid.link/20250717012843.880423-1-hxzene@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-21 16:55:08 -07:00
Michael-CY Lee	84b62b72b4	wifi: cfg80211/mac80211: report link ID for unexpected frames The upper layer may require the link ID to properly handle unexpected frames. For instance, if hostapd, operating as an AP MLD, receives a data frame from a non-associated STA, it must send deauthentication to the link on which the STA is operating. Signed-off-by: Michael-CY Lee <michael-cy.lee@mediatek.com> Reviewed-by: Money Wang <money.wang@mediatek.com> Link: https://patch.msgid.link/20250721065159.1740992-1-michael-cy.lee@mediatek.com [edit commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-21 19:40:59 +02:00
Michael-CY Lee	4970e393eb	wifi: mac80211: determine missing link_id in ieee80211_rx_for_interface() based on frequency For broadcast frames, every interface might have to process it and therefore the link_id cannot be determined in the driver. In mac80211, when the frame is about to be forwarded to each interface, we can use the member "freq" in struct ieee80211_rx_status to determine the "link_id" for each interface. Signed-off-by: Michael-CY Lee <michael-cy.lee@mediatek.com> Reviewed-by: Money Wang <money.wang@mediatek.com> Link: https://patch.msgid.link/20250721062929.1662700-1-michael-cy.lee@mediatek.com [simplify, remove unnecessary link->conf check] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-21 19:39:43 +02:00
Johannes Berg	be06a8c731	wifi: cfg80211: reject HTC bit for management frames Management frames sent by userspace should never have the order/HTC bit set, reject that. It could also cause some confusion with the length of the buffer and the header so the validation might end up wrong. Link: https://patch.msgid.link/20250718202307.97a0455f0f35.I1805355c7e331352df16611839bc8198c855a33f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-21 19:35:45 +02:00

... 15 16 17 18 19 ...

82231 Commits