linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-05 20:33:49 -04:00

Author	SHA1	Message	Date
Matt Johnston	4fa9c5181c	net: mctp-serial: Add kunit test for next_chunk_len() Test various edge cases of inputs that contain characters that need escaping. This adds a new kunit suite for mctp-serial. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-09-01 18:14:01 +01:00
Cong Wang	fe1910f933	tcp_bpf: fix return value of tcp_bpf_sendmsg() When we cork messages in psock->cork, the last message triggers the flushing will result in sending a sk_msg larger than the current message size. In this case, in tcp_bpf_send_verdict(), 'copied' becomes negative at least in the following case: 468 case __SK_DROP: 469 default: 470 sk_msg_free_partial(sk, msg, tosend); 471 sk_msg_apply_bytes(psock, tosend); 472 *copied -= (tosend + delta); // <==== HERE 473 return -EACCES; Therefore, it could lead to the following BUG with a proper value of 'copied' (thanks to syzbot). We should not use negative 'copied' as a return value here. ------------[ cut here ]------------ kernel BUG at net/socket.c:733! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 0 UID: 0 PID: 3265 Comm: syz-executor510 Not tainted 6.11.0-rc3-syzkaller-00060-gd07b43284ab3 #0 Hardware name: linux,dummy-virt (DT) pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : sock_sendmsg_nosec net/socket.c:733 [inline] pc : sock_sendmsg_nosec net/socket.c:728 [inline] pc : __sock_sendmsg+0x5c/0x60 net/socket.c:745 lr : sock_sendmsg_nosec net/socket.c:730 [inline] lr : __sock_sendmsg+0x54/0x60 net/socket.c:745 sp : ffff800088ea3b30 x29: ffff800088ea3b30 x28: fbf00000062bc900 x27: 0000000000000000 x26: ffff800088ea3bc0 x25: ffff800088ea3bc0 x24: 0000000000000000 x23: f9f00000048dc000 x22: 0000000000000000 x21: ffff800088ea3d90 x20: f9f00000048dc000 x19: ffff800088ea3d90 x18: 0000000000000001 x17: 0000000000000000 x16: 0000000000000000 x15: 000000002002ffaf x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: ffff8000815849c0 x9 : ffff8000815b49c0 x8 : 0000000000000000 x7 : 000000000000003f x6 : 0000000000000000 x5 : 00000000000007e0 x4 : fff07ffffd239000 x3 : fbf00000062bc900 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 00000000fffffdef Call trace: sock_sendmsg_nosec net/socket.c:733 [inline] __sock_sendmsg+0x5c/0x60 net/socket.c:745 ____sys_sendmsg+0x274/0x2ac net/socket.c:2597 ___sys_sendmsg+0xac/0x100 net/socket.c:2651 __sys_sendmsg+0x84/0xe0 net/socket.c:2680 __do_sys_sendmsg net/socket.c:2689 [inline] __se_sys_sendmsg net/socket.c:2687 [inline] __arm64_sys_sendmsg+0x24/0x30 net/socket.c:2687 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x48/0x110 arch/arm64/kernel/syscall.c:49 el0_svc_common.constprop.0+0x40/0xe0 arch/arm64/kernel/syscall.c:132 do_el0_svc+0x1c/0x28 arch/arm64/kernel/syscall.c:151 el0_svc+0x34/0xec arch/arm64/kernel/entry-common.c:712 el0t_64_sync_handler+0x100/0x12c arch/arm64/kernel/entry-common.c:730 el0t_64_sync+0x19c/0x1a0 arch/arm64/kernel/entry.S:598 Code: f9404463 d63f0060 3108441f 54fffe81 (d4210000) ---[ end trace 0000000000000000 ]--- Fixes: `4f738adba3` ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data") Reported-by: syzbot+58c03971700330ce14d8@syzkaller.appspotmail.com Cc: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20240821030744.320934-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-30 11:09:10 -07:00
Jeongjun Park	98d4435efc	net/smc: prevent NULL pointer dereference in txopt_get Since smc_inet6_prot does not initialize ipv6_pinfo_offset, inet6_create() copies an incorrect address value, sk + 0 (offset), to inet_sk(sk)->pinet6. In addition, since inet_sk(sk)->pinet6 and smc_sk(sk)->clcsock practically point to the same address, when smc_create_clcsk() stores the newly created clcsock in smc_sk(sk)->clcsock, inet_sk(sk)->pinet6 is corrupted into clcsock. This causes NULL pointer dereference and various other memory corruptions. To solve this problem, you need to initialize ipv6_pinfo_offset, add a smc6_sock structure, and then add ipv6_pinfo as the second member of the smc_sock structure. Reported-by: syzkaller <syzkaller@googlegroups.com> Fixes: `d25a92ccae` ("net/smc: Introduce IPPROTO_SMC") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-08-30 13:26:12 +01:00
Jakub Kicinski	1bb3c548e4	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-08-28 (igb, ice) This series contains updates to igb and ice drivers. Daiwei Li restores writing the TSICR (TimeSync Interrupt Cause) register on 82850 devices to workaround a hardware issue for igb. Dawid detaches netdev device for reset to avoid ethtool accesses during reset causing NULL pointer dereferences on ice. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Add netif_device_attach/detach into PF reset flow igb: Fix not clearing TimeSync interrupts for 82580 ==================== Link: https://patch.msgid.link/20240828225444.645154-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-29 19:13:30 -07:00
Jakub Kicinski	b57d643a67	MAINTAINERS: exclude bluetooth and wireless DT bindings from netdev ML We exclude wireless drivers from the netdev@ traffic, to delegate it to linux-wireless@, and avoid overwhelming netdev@. Bluetooth drivers are implicitly excluded because they live under drivers/bluetooth, not drivers/net. In both cases DT bindings sit under Documentation/devicetree/bindings/net/ and aren't excluded. So if a patch series touches DT bindings netdev@ ends up getting CCed, and these are usually fairly boring series. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240828175821.2960423-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-29 12:51:44 -07:00
Linus Torvalds	0dd5dd63ba	Merge tag 'net-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth, wireless and netfilter. No known outstanding regressions. Current release - regressions: - wifi: iwlwifi: fix hibernation - eth: ionic: prevent tx_timeout due to frequent doorbell ringing Previous releases - regressions: - sched: fix sch_fq incorrect behavior for small weights - wifi: - iwlwifi: take the mutex before running link selection - wfx: repair open network AP mode - netfilter: restore IP sanity checks for netdev/egress - tcp: fix forever orphan socket caused by tcp_abort - mptcp: close subflow when receiving TCP+FIN - bluetooth: fix random crash seen while removing btnxpuart driver Previous releases - always broken: - mptcp: more fixes for the in-kernel PM - eth: bonding: change ipsec_lock from spin lock to mutex - eth: mana: fix race of mana_hwc_post_rx_wqe and new hwc response Misc: - documentation: drop special comment style for net code" * tag 'net-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits) nfc: pn533: Add poll mod list filling check mailmap: update entry for Sriram Yagnaraman selftests: mptcp: join: check re-re-adding ID 0 signal mptcp: pm: ADD_ADDR 0 is not a new address selftests: mptcp: join: validate event numbers mptcp: avoid duplicated SUB_CLOSED events selftests: mptcp: join: check re-re-adding ID 0 endp mptcp: pm: fix ID 0 endp usage after multiple re-creations mptcp: pm: do not remove already closed subflows selftests: mptcp: join: no extra msg if no counter selftests: mptcp: join: check re-adding init endp with != id mptcp: pm: reset MPC endp ID when re-added mptcp: pm: skip connecting to already established sf mptcp: pm: send ACK on an active subflow selftests: mptcp: join: check removing ID 0 endpoint mptcp: pm: fix RM_ADDR ID for the initial subflow mptcp: pm: reuse ID 0 after delete and re-add net: busy-poll: use ktime_get_ns() instead of local_clock() sctp: fix association labeling in the duplicate COOKIE-ECHO case mptcp: pr_debug: add missing \n at the end ...	2024-08-30 06:14:39 +12:00
Aleksandr Mishin	febccb3925	nfc: pn533: Add poll mod list filling check In case of im_protocols value is 1 and tm_protocols value is 0 this combination successfully passes the check 'if (!im_protocols && !tm_protocols)' in the nfc_start_poll(). But then after pn533_poll_create_mod_list() call in pn533_start_poll() poll mod list will remain empty and dev->poll_mod_count will remain 0 which lead to division by zero. Normally no im protocol has value 1 in the mask, so this combination is not expected by driver. But these protocol values actually come from userspace via Netlink interface (NFC_CMD_START_POLL operation). So a broken or malicious program may pass a message containing a "bad" combination of protocol parameter values so that dev->poll_mod_count is not incremented inside pn533_poll_create_mod_list(), thus leading to division by zero. Call trace looks like: nfc_genl_start_poll() nfc_start_poll() ->start_poll() pn533_start_poll() Add poll mod list filling check. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `dfccd0f580` ("NFC: pn533: Add some polling entropy") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20240827084822.18785-1-amishin@t-argos.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 12:08:44 +02:00
Paolo Abeni	0240bceb0d	Merge tag 'nf-24-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: Patch #1 sets on NFT_PKTINFO_L4PROTO for UDP packets less than 4 bytes payload from netdev/egress by subtracting skb_network_offset() when validating IPv4 packet length, otherwise 'meta l4proto udp' never matches. Patch #2 subtracts skb_network_offset() when validating IPv6 packet length for netdev/egress. netfilter pull request 24-08-28 * tag 'nf-24-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables_ipv6: consider network offset in netdev/egress validation netfilter: nf_tables: restore IP sanity checks for netdev/egress ==================== Link: https://patch.msgid.link/20240828214708.619261-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 11:35:54 +02:00
Sriram Yagnaraman	6213dcc752	mailmap: update entry for Sriram Yagnaraman Link my old est.tech address to my active mail address Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Link: https://patch.msgid.link/20240828072417.4111996-1-sriram.yagnaraman@ericsson.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:53:46 +02:00
Paolo Abeni	b666a651d0	Merge branch 'mptcp-more-fixes-for-the-in-kernel-pm' Matthieu Baerts says: ==================== mptcp: more fixes for the in-kernel PM Here is a new batch of fixes for the MPTCP in-kernel path-manager: Patch 1 ensures the address ID is set to 0 when the path-manager sends an ADD_ADDR for the address of the initial subflow. The same fix is applied when a new subflow is created re-using this special address. A fix for v6.0. Patch 2 is similar, but for the case where an endpoint is removed: if this endpoint was used for the initial address, it is important to send a RM_ADDR with this ID set to 0, and look for existing subflows with the ID set to 0. A fix for v6.0 as well. Patch 3 validates the two previous patches. Patch 4 makes the PM selecting an "active" path to send an address notification in an ACK, instead of taking the first path in the list. A fix for v5.11. Patch 5 fixes skipping the establishment of a new subflow if a previous subflow using the same pair of addresses is being closed. A fix for v5.13. Patch 6 resets the ID linked to the initial subflow when the linked endpoint is re-added, possibly with a different ID. A fix for v6.0. Patch 7 validates the three previous patches. Patch 8 is a small fix for the MPTCP Join selftest, when being used with older subflows not supporting all MIB counters. A fix for a commit introduced in v6.4, but backported up to v5.10. Patch 9 avoids the PM to try to close the initial subflow multiple times, and increment counters while nothing happened. A fix for v5.10. Patch 10 stops incrementing local_addr_used and add_addr_accepted counters when dealing with the address ID 0, because these counters are not taking into account the initial subflow, and are then not decremented when the linked addresses are removed. A fix for v6.0. Patch 11 validates the previous patch. Patch 12 avoids the PM to send multiple SUB_CLOSED events for the initial subflow. A fix for v5.12. Patch 13 validates the previous patch. Patch 14 stops treating the ADD_ADDR 0 as a new address, and accepts it in order to re-create the initial subflow if it has been closed, even if the limit for new addresses -- not taking into account the address of the initial subflow -- has been reached. A fix for v5.10. Patch 15 validates the previous patch. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- Matthieu Baerts (NGI0) (15): mptcp: pm: reuse ID 0 after delete and re-add mptcp: pm: fix RM_ADDR ID for the initial subflow selftests: mptcp: join: check removing ID 0 endpoint mptcp: pm: send ACK on an active subflow mptcp: pm: skip connecting to already established sf mptcp: pm: reset MPC endp ID when re-added selftests: mptcp: join: check re-adding init endp with != id selftests: mptcp: join: no extra msg if no counter mptcp: pm: do not remove already closed subflows mptcp: pm: fix ID 0 endp usage after multiple re-creations selftests: mptcp: join: check re-re-adding ID 0 endp mptcp: avoid duplicated SUB_CLOSED events selftests: mptcp: join: validate event numbers mptcp: pm: ADD_ADDR 0 is not a new address selftests: mptcp: join: check re-re-adding ID 0 signal net/mptcp/pm.c \| 4 +- net/mptcp/pm_netlink.c \| 87 ++++++++++---- net/mptcp/protocol.c \| 6 + net/mptcp/protocol.h \| 5 +- tools/testing/selftests/net/mptcp/mptcp_join.sh \| 153 ++++++++++++++++++++---- tools/testing/selftests/net/mptcp/mptcp_lib.sh \| 4 + 6 files changed, 209 insertions(+), 50 deletions(-) --- base-commit: `3a0504d54b` change-id: 20240826-net-mptcp-more-pm-fix-ffa61a36f817 Best regards, ==================== Link: https://patch.msgid.link/20240828-net-mptcp-more-pm-fix-v2-0-7f11b283fff7@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:53 +02:00
Matthieu Baerts (NGI0)	f18fa2abf8	selftests: mptcp: join: check re-re-adding ID 0 signal This test extends "delete re-add signal" to validate the previous commit: when the 'signal' endpoint linked to the initial subflow (ID 0) is re-added multiple times, it will re-send the ADD_ADDR with id 0. The client should still be able to re-create this subflow, even if the add_addr_accepted limit has been reached as this special address is not considered as a new address. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: `d0876b2284` ("mptcp: add the incoming RM_ADDR support") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	57f86203b4	mptcp: pm: ADD_ADDR 0 is not a new address The ADD_ADDR 0 with the address from the initial subflow should not be considered as a new address: this is not something new. If the host receives it, it simply means that the address is available again. When receiving an ADD_ADDR for the ID 0, the PM already doesn't consider it as new by not incrementing the 'add_addr_accepted' counter. But the 'accept_addr' might not be set if the limit has already been reached: this can be bypassed in this case. But before, it is important to check that this ADD_ADDR for the ID 0 is for the same address as the initial subflow. If not, it is not something that should happen, and the ADD_ADDR can be ignored. Note that if an ADD_ADDR is received while there is already a subflow opened using the same address, this ADD_ADDR is ignored as well. It means that if multiple ADD_ADDR for ID 0 are received, there will not be any duplicated subflows created by the client. Fixes: `d0876b2284` ("mptcp: add the incoming RM_ADDR support") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	20ccc7c5f7	selftests: mptcp: join: validate event numbers This test extends "delete and re-add" and "delete re-add signal" to validate the previous commit: the number of MPTCP events are checked to make sure there are no duplicated or unexpected ones. A new helper has been introduced to easily check these events. The missing events have been added to the lib. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: `b911c97c7d` ("mptcp: add netlink event support") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	d82809b6c5	mptcp: avoid duplicated SUB_CLOSED events The initial subflow might have already been closed, but still in the connection list. When the worker is instructed to close the subflows that have been marked as closed, it might then try to close the initial subflow again. A consequence of that is that the SUB_CLOSED event can be seen twice: # ip mptcp endpoint 1.1.1.1 id 1 subflow dev eth0 2.2.2.2 id 2 subflow dev eth1 # ip mptcp monitor & [ CREATED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ ESTABLISHED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ SF_ESTABLISHED] remid=0 locid=2 saddr4=2.2.2.2 daddr4=9.9.9.9 # ip mptcp endpoint delete id 1 [ SF_CLOSED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ SF_CLOSED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 The first one is coming from mptcp_pm_nl_rm_subflow_received(), and the second one from __mptcp_close_subflow(). To avoid doing the post-closed processing twice, the subflow is now marked as closed the first time. Note that it is not enough to check if we are dealing with the first subflow and check its sk_state: the subflow might have been reset or closed before calling mptcp_close_ssk(). Fixes: `b911c97c7d` ("mptcp: add netlink event support") Cc: stable@vger.kernel.org Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	d397d7246c	selftests: mptcp: join: check re-re-adding ID 0 endp This test extends "delete and re-add" to validate the previous commit: when the endpoint linked to the initial subflow (ID 0) is re-added multiple times, it was no longer being used, because the internal linked counters are not decremented for this special endpoint: it is not an additional endpoint. Here, the "del/add id 0" steps are done 3 times to unsure this case is validated. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: `3ad14f54bd` ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	9366922adc	mptcp: pm: fix ID 0 endp usage after multiple re-creations 'local_addr_used' and 'add_addr_accepted' are decremented for addresses not related to the initial subflow (ID0), because the source and destination addresses of the initial subflows are known from the beginning: they don't count as "additional local address being used" or "ADD_ADDR being accepted". It is then required not to increment them when the entrypoint used by the initial subflow is removed and re-added during a connection. Without this modification, this entrypoint cannot be removed and re-added more than once. Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/512 Fixes: `3ad14f54bd` ("mptcp: more accurate MPC endpoint tracking") Reported-by: syzbot+455d38ecd5f655fc45cf@syzkaller.appspotmail.com Closes: https://lore.kernel.org/00000000000049861306209237f4@google.com Cc: stable@vger.kernel.org Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	58e1b66b4e	mptcp: pm: do not remove already closed subflows It is possible to have in the list already closed subflows, e.g. the initial subflow has been already closed, but still in the list. No need to try to close it again, and increments the related counters again. Fixes: `0ee4261a36` ("mptcp: implement mptcp_pm_remove_subflow") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	76a2d8394c	selftests: mptcp: join: no extra msg if no counter The checksum and fail counters might not be available. Then no need to display an extra message with missing info. While at it, fix the indentation around, which is wrong since the same commit. Fixes: `47867f0a7e` ("selftests: mptcp: join: skip check if MIB counter not supported") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	1c2326fcae	selftests: mptcp: join: check re-adding init endp with != id The initial subflow has a special local ID: 0. It is specific per connection. When a global endpoint is deleted and re-added later, it can have a different ID, but the kernel should still use the ID 0 if it corresponds to the initial address. This test validates this behaviour: the endpoint linked to the initial subflow is removed, and re-added with a different ID. Note that removing the initial subflow will not decrement the 'subflows' counters, which corresponds to the additional subflows. On the other hand, when the same endpoint is re-added, it will increment this counter, as it will be seen as an additional subflow this time. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: `3ad14f54bd` ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	dce1c6d1e9	mptcp: pm: reset MPC endp ID when re-added The initial subflow has a special local ID: 0. It is specific per connection. When a global endpoint is deleted and re-added later, it can have a different ID -- most services managing the endpoints automatically don't force the ID to be the same as before. It is then important to track these modifications to be consistent with the ID being used for the address used by the initial subflow, not to confuse the other peer or to send the ID 0 for the wrong address. Now when removing an endpoint, msk->mpc_endpoint_id is reset if it corresponds to this endpoint. When adding a new endpoint, the same variable is updated if the address match the one of the initial subflow. Fixes: `3ad14f54bd` ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	bc19ff5763	mptcp: pm: skip connecting to already established sf The lookup_subflow_by_daddr() helper checks if there is already a subflow connected to this address. But there could be a subflow that is closing, but taking time due to some reasons: latency, losses, data to process, etc. If an ADD_ADDR is received while the endpoint is being closed, it is better to try connecting to it, instead of rejecting it: the peer which has sent the ADD_ADDR will not be notified that the ADD_ADDR has been rejected for this reason, and the expected subflow will not be created at the end. This helper should then only look for subflows that are established, or going to be, but not the ones being closed. Fixes: `d84ad04941` ("mptcp: skip connecting the connected address") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	c07cc3ed89	mptcp: pm: send ACK on an active subflow Taking the first one on the list doesn't work in some cases, e.g. if the initial subflow is being removed. Pick another one instead of not sending anything. Fixes: `84dfe3677a` ("mptcp: send out dedicated ADD_ADDR packet") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:50 +02:00
Matthieu Baerts (NGI0)	5f94b08c00	selftests: mptcp: join: check removing ID 0 endpoint Removing the endpoint linked to the initial subflow should trigger a RM_ADDR for the right ID, and the removal of the subflow. That's what is now being verified in the "delete and re-add" test. Note that removing the initial subflow will not decrement the 'subflows' counters, which corresponds to the additional subflows. On the other hand, when the same endpoint is re-added, it will increment this counter, as it will be seen as an additional subflow this time. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: `3ad14f54bd` ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:49 +02:00
Matthieu Baerts (NGI0)	87b5896f3f	mptcp: pm: fix RM_ADDR ID for the initial subflow The initial subflow has a special local ID: 0. When an endpoint is being deleted, it is then important to check if its address is not linked to the initial subflow to send the right ID. If there was an endpoint linked to the initial subflow, msk's mpc_endpoint_id field will be set. We can then use this info when an endpoint is being removed to see if it is linked to the initial subflow. So now, the correct IDs are passed to mptcp_pm_nl_rm_addr_or_subflow(), it is no longer needed to use mptcp_local_id_match(). Fixes: `3ad14f54bd` ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:49 +02:00
Matthieu Baerts (NGI0)	8b8ed1b429	mptcp: pm: reuse ID 0 after delete and re-add When the endpoint used by the initial subflow is removed and re-added later, the PM has to force the ID 0, it is a special case imposed by the MPTCP specs. Note that the endpoint should then need to be re-added reusing the same ID. Fixes: `3ad14f54bd` ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-08-29 10:39:49 +02:00
Linus Torvalds	d5d547aa7b	Merge tag 'random-6.11-rc6-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator fix from Jason Donenfeld: "Reject invalid flags passed to vgetrandom() in the same way that getrandom() does, so that the behavior is the same, from Yann. The flags argument to getrandom() only has a behavioral effect on the function if the RNG isn't initialized yet, so vgetrandom() falls back to the syscall in that case. But if the RNG is initialized, all of the flags behave the same way, so vgetrandom() didn't bother checking them, and just ignored them entirely. But that doesn't account for invalid flags passed in, which need to be rejected so we can use them later" * tag 'random-6.11-rc6-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: vDSO: reject unknown getrandom() flags	2024-08-29 13:59:18 +12:00
Eric Dumazet	0870b0d8b3	net: busy-poll: use ktime_get_ns() instead of local_clock() Typically, busy-polling durations are below 100 usec. When/if the busy-poller thread migrates to another cpu, local_clock() can be off by +/-2msec or more for small values of HZ, depending on the platform. Use ktimer_get_ns() to ensure deterministic behavior, which is the whole point of busy-polling. Fixes: `0602129286` ("net: add low latency socket poll") Fixes: `9a3c71aa80` ("net: convert low latency sockets to sched_clock()") Fixes: `3708983452` ("sched, net: Fixup busy_loop_us_clock()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20240827114916.223377-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-28 17:53:13 -07:00
Jakub Kicinski	41901c227e	Merge tag 'wireless-2024-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Regressions: * wfx: fix for open network connection * iwlwifi: fix for hibernate (due to fast resume feature) * iwlwifi: fix for a few warnings that were recently added (had previously been messages not warnings) Previously broken: * mwifiex: fix static structures used for per-device data * iwlwifi: some harmless FW related messages were tagged too high priority * iwlwifi: scan buffers weren't checked correctly * mac80211: SKB leak on beacon error path * iwlwifi: fix ACPI table interop with certain BIOSes * iwlwifi: fix locking for link selection * mac80211: fix SSID comparison in beacon validation * tag 'wireless-2024-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: iwlwifi: clear trans->state earlier upon error wifi: wfx: repair open network AP mode wifi: mac80211: free skb on error path in ieee80211_beacon_get_ap() wifi: iwlwifi: mvm: don't wait for tx queues if firmware is dead wifi: iwlwifi: mvm: allow 6 GHz channels in MLO scan wifi: iwlwifi: mvm: pause TCM when the firmware is stopped wifi: iwlwifi: fw: fix wgds rev 3 exact size wifi: iwlwifi: mvm: take the mutex before running link selection wifi: iwlwifi: mvm: fix iwl_mvm_max_scan_ie_fw_cmd_room() wifi: iwlwifi: mvm: fix iwl_mvm_scan_fits() calculation wifi: iwlwifi: lower message level for FW buffer destination wifi: iwlwifi: mvm: fix hibernation wifi: mac80211: fix beacon SSID mismatch handling wifi: mwifiex: duplicate static structs used in driver instances ==================== Link: https://patch.msgid.link/20240828100151.23662-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-28 16:54:45 -07:00
Dawid Osuchowski	d11a676342	ice: Add netif_device_attach/detach into PF reset flow Ethtool callbacks can be executed while reset is in progress and try to access deleted resources, e.g. getting coalesce settings can result in a NULL pointer dereference seen below. Reproduction steps: Once the driver is fully initialized, trigger reset: # echo 1 > /sys/class/net/<interface>/device/reset when reset is in progress try to get coalesce settings using ethtool: # ethtool -c <interface> BUG: kernel NULL pointer dereference, address: 0000000000000020 PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP PTI CPU: 11 PID: 19713 Comm: ethtool Tainted: G S 6.10.0-rc7+ #7 RIP: 0010:ice_get_q_coalesce+0x2e/0xa0 [ice] RSP: 0018:ffffbab1e9bcf6a8 EFLAGS: 00010206 RAX: 000000000000000c RBX: ffff94512305b028 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff9451c3f2e588 RDI: ffff9451c3f2e588 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: ffff9451c3f2e580 R11: 000000000000001f R12: ffff945121fa9000 R13: ffffbab1e9bcf760 R14: 0000000000000013 R15: ffffffff9e65dd40 FS: 00007faee5fbe740(0000) GS:ffff94546fd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000020 CR3: 0000000106c2e005 CR4: 00000000001706f0 Call Trace: <TASK> ice_get_coalesce+0x17/0x30 [ice] coalesce_prepare_data+0x61/0x80 ethnl_default_doit+0xde/0x340 genl_family_rcv_msg_doit+0xf2/0x150 genl_rcv_msg+0x1b3/0x2c0 netlink_rcv_skb+0x5b/0x110 genl_rcv+0x28/0x40 netlink_unicast+0x19c/0x290 netlink_sendmsg+0x222/0x490 __sys_sendto+0x1df/0x1f0 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x82/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7faee60d8e27 Calling netif_device_detach() before reset makes the net core not call the driver when ethtool command is issued, the attempt to execute an ethtool command during reset will result in the following message: netlink error: No such device instead of NULL pointer dereference. Once reset is done and ice_rebuild() is executing, the netif_device_attach() is called to allow for ethtool operations to occur again in a safe manner. Fixes: `fcea6f3da5` ("ice: Add stats and ethtool support") Suggested-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2024-08-28 15:29:59 -07:00
Daiwei Li	ba8cf80724	igb: Fix not clearing TimeSync interrupts for 82580 82580 NICs have a hardware bug that makes it necessary to write into the TSICR (TimeSync Interrupt Cause) register to clear it: https://lore.kernel.org/all/CDCB8BE0.1EC2C%25matthew.vick@intel.com/ Add a conditional so only for 82580 we write into the TSICR register, so we don't risk losing events for other models. Without this change, when running ptp4l with an Intel 82580 card, I get the following output: > timed out while polling for tx timestamp increasing tx_timestamp_timeout or > increasing kworker priority may correct this issue, but a driver bug likely > causes it This goes away with this change. This (partially) reverts commit `ee14cc9ea1` ("igb: Fix missing time sync events"). Fixes: `ee14cc9ea1` ("igb: Fix missing time sync events") Closes: https://lore.kernel.org/intel-wired-lan/CAN0jFd1kO0MMtOh8N2Ztxn6f7vvDKp2h507sMryobkBKe=xk=w@mail.gmail.com/ Tested-by: Daiwei Li <daiweili@google.com> Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Daiwei Li <daiweili@google.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2024-08-28 15:29:36 -07:00
Linus Torvalds	928f79a188	Merge tag 'loongarch-fixes-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Remove the unused dma-direct.h, and some bug & warning fixes" * tag 'loongarch-fixes-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Invalidate guest steal time address on vCPU reset LoongArch: Add ifdefs to fix LSX and LASX related warnings LoongArch: Define ARCH_IRQ_INIT_FLAGS as IRQ_NOPROBE LoongArch: Remove the unused dma-direct.h	2024-08-29 07:15:41 +12:00
Linus Torvalds	f9a59dd097	Merge tag 'platform-drivers-x86-v6.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers fixes from Ilpo Järvinen: - platform/x86/amd/pmc: AMD 1Ah model 60h series support (2nd attempt) - asus-wmi: Prevent spurious rfkill on Asus Zenbook Duo - x86-android-tablets: Relax DMI match to cover another model * tag 'platform-drivers-x86-v6.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: x86-android-tablets: Make Lenovo Yoga Tab 3 X90F DMI match less strict platform/x86: asus-wmi: Fix spurious rfkill on UX8406MA platform/x86/amd/pmc: Extend support for PMC features on new AMD platform platform/x86/amd/pmc: Fix SMU command submission path on new AMD platform	2024-08-29 07:12:02 +12:00
Linus Torvalds	a18093afa3	Merge tag 'nfsd-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix a number of crashers - Update email address for an NFSD reviewer * tag 'nfsd-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: fs/nfsd: fix update of inode attrs in CB_GETATTR nfsd: fix potential UAF in nfsd4_cb_getattr_release nfsd: hold reference to delegation when updating it for cb_getattr MAINTAINERS: Update Olga Kornievskaia's email address nfsd: prevent panic for nfsv4.0 closed files in nfs4_show_open nfsd: ensure that nfsd4_fattr_args.context is zeroed out	2024-08-29 06:20:44 +12:00
Linus Torvalds	2840526875	Merge tag 'for-6.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix use-after-free when submitting bios for read, after an error and partially submitted bio the original one is freed while it can be still be accessed again - fix fstests case btrfs/301, with enabled quotas wait for delayed iputs when flushing delalloc - fix periodic block group reclaim, an unitialized value can be returned if there are no block groups to reclaim - fix build warning (-Wmaybe-uninitialized) * tag 'for-6.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix uninitialized return value from btrfs_reclaim_sweep() btrfs: fix a use-after-free when hitting errors inside btrfs_submit_chunk() btrfs: initialize last_extent_end to fix -Wmaybe-uninitialized warning in extent_fiemap() btrfs: run delayed iputs when flushing delalloc	2024-08-29 06:17:46 +12:00
Linus Torvalds	86987d84b9	Merge tag 'v6.11-rc5-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: - two RDMA/smbdirect fixes and a minor cleanup - punch hole fix * tag 'v6.11-rc5-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix FALLOC_FL_PUNCH_HOLE support smb/client: fix rdma usage in smb2_async_writev() smb/client: remove unused rq_iter_size from struct smb_rqst smb/client: avoid dereferencing rdata=NULL in smb2_new_read_req()	2024-08-28 15:05:02 +12:00
Linus Torvalds	46d22bfdf0	Merge tag 'tpmdd-next-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull TPM fix from Jarkko Sakkinen: "A bug fix for tpm_ibmvtpm driver so that it will take the bus encryption into use" * tag 'tpmdd-next-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: ibmvtpm: Call tpm2_sessions_init() to initialize session support	2024-08-28 14:55:48 +12:00
Ondrej Mosnacek	3a0504d54b	sctp: fix association labeling in the duplicate COOKIE-ECHO case sctp_sf_do_5_2_4_dupcook() currently calls security_sctp_assoc_request() on new_asoc, but as it turns out, this association is always discarded and the LSM labels never get into the final association (asoc). This can be reproduced by having two SCTP endpoints try to initiate an association with each other at approximately the same time and then peel off the association into a new socket, which exposes the unitialized labels and triggers SELinux denials. Fix it by calling security_sctp_assoc_request() on asoc instead of new_asoc. Xin Long also suggested limit calling the hook only to cases A, B, and D, since in cases C and E the COOKIE ECHO chunk is discarded and the association doesn't enter the ESTABLISHED state, so rectify that as well. One related caveat with SELinux and peer labeling: When an SCTP connection is set up simultaneously in this way, we will end up with an association that is initialized with security_sctp_assoc_request() on both sides, so the MLS component of the security context of the association will get swapped between the peers, instead of just one side setting it to the other's MLS component. However, at that point security_sctp_assoc_request() had already been called on both sides in sctp_sf_do_unexpected_init() (on a temporary association) and thus if the exchange didn't fail before due to MLS, it won't fail now either (most likely both endpoints have the same MLS range). Tested by: - reproducer from https://src.fedoraproject.org/tests/selinux/pull-request/530 - selinux-testsuite (https://github.com/SELinuxProject/selinux-testsuite/) - sctp-tests (https://github.com/sctp/sctp-tests) - no tests failed that wouldn't fail also without the patch applied Fixes: `c081d53f97` ("security: pass asoc to sctp_assoc_request and sctp_sk_clone") Suggested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Xin Long <lucien.xin@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> (LSM/SELinux) Link: https://patch.msgid.link/20240826130711.141271-1-omosnace@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 16:07:12 -07:00
Jakub Kicinski	237c3851dc	Merge branch 'mptcp-close-subflow-when-receiving-tcp-fin-and-misc' Matthieu Baerts says: ==================== mptcp: close subflow when receiving TCP+FIN and misc. Here are different fixes: Patch 1 closes the subflow after having received a FIN, instead of leaving it half-closed until the end of the MPTCP connection. A fix for v5.12. Patch 2 validates the previous patch. Patch 3 is a fix for a recent fix to check both directions for the backup flag. It can follow the 'Fixes' commit and be backported up to v5.7. Patch 4 adds a missing \n at the end of pr_debug(), causing debug messages to be displayed with a delay, which confuses the debugger. A fix for v5.6. ==================== Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-0-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 14:45:18 -07:00
Matthieu Baerts (NGI0)	cb41b195e6	mptcp: pr_debug: add missing \n at the end pr_debug() have been added in various places in MPTCP code to help developers to debug some situations. With the dynamic debug feature, it is easy to enable all or some of them, and asks users to reproduce issues with extra debug. Many of these pr_debug() don't end with a new line, while no 'pr_cont()' are used in MPTCP code. So the goal was not to display multiple debug messages on one line: they were then not missing the '\n' on purpose. Not having the new line at the end causes these messages to be printed with a delay, when something else needs to be printed. This issue is not visible when many messages need to be printed, but it is annoying and confusing when only specific messages are expected, e.g. # echo "func mptcp_pm_add_addr_echoed +fmp" \ > /sys/kernel/debug/dynamic_debug/control # ./mptcp_join.sh "signal address"; \ echo "$(awk '{print $1}' /proc/uptime) - end"; \ sleep 5s; \ echo "$(awk '{print $1}' /proc/uptime) - restart"; \ ./mptcp_join.sh "signal address" 013 signal address (...) 10.75 - end 15.76 - restart 013 signal address [ 10.367935] mptcp:mptcp_pm_add_addr_echoed: MPTCP: msk=(...) (...) => a delay of 5 seconds: printed with a 10.36 ts, but after 'restart' which was printed at the 15.76 ts. The 'Fixes' tag here below points to the first pr_debug() used without '\n' in net/mptcp. This patch could be split in many small ones, with different Fixes tag, but it doesn't seem worth it, because it is easy to re-generate this patch with this simple 'sed' command: git grep -l pr_debug -- net/mptcp \| xargs sed -i "s/$pr_debug(\".*[^n]$$\"[,)]$/\1\\\n\2/g" So in case of conflicts, simply drop the modifications, and launch this command. Fixes: `f870fa0b57` ("mptcp: Add MPTCP socket stubs") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-4-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 14:45:16 -07:00
Matthieu Baerts (NGI0)	2a1f596ebb	mptcp: sched: check both backup in retrans The 'mptcp_subflow_context' structure has two items related to the backup flags: - 'backup': the subflow has been marked as backup by the other peer - 'request_bkup': the backup flag has been set by the host Looking only at the 'backup' flag can make sense in some cases, but it is not the behaviour of the default packet scheduler when selecting paths. As explained in the commit `b6a66e521a` ("mptcp: sched: check both directions for backup"), the packet scheduler should look at both flags, because that was the behaviour from the beginning: the 'backup' flag was set by accident instead of the 'request_bkup' one. Now that the latter has been fixed, get_retrans() needs to be adapted as well. Fixes: `b6a66e521a` ("mptcp: sched: check both directions for backup") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-3-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 14:45:16 -07:00
Matthieu Baerts (NGI0)	e93681afcb	selftests: mptcp: join: cannot rm sf if closed Thanks to the previous commit, the MPTCP subflows are now closed on both directions even when only the MPTCP path-manager of one peer asks for their closure. In the two tests modified here -- "userspace pm add & remove address" and "userspace pm create destroy subflow" -- one peer is controlled by the userspace PM, and the other one by the in-kernel PM. When the userspace PM sends a RM_ADDR notification, the in-kernel PM will automatically react by closing all subflows using this address. Now, thanks to the previous commit, the subflows are properly closed on both directions, the userspace PM can then no longer closes the same subflows if they are already closed. Before, it was OK to do that, because the subflows were still half-opened, still OK to send a RM_ADDR. In other words, thanks to the previous commit closing the subflows, an error will be returned to the userspace if it tries to close a subflow that has already been closed. So no need to run this command, which mean that the linked counters will then not be incremented. These tests are then no longer sending both a RM_ADDR, then closing the linked subflow just after. The test with the userspace PM on the server side is now removing one subflow linked to one address, then sending a RM_ADDR for another address. The test with the userspace PM on the client side is now only removing the subflow that was previously created. Fixes: `4369c198e5` ("selftests: mptcp: test userspace pm out of transfer") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-2-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 14:45:16 -07:00
Matthieu Baerts (NGI0)	f09b0ad55a	mptcp: close subflow when receiving TCP+FIN When a peer decides to close one subflow in the middle of a connection having multiple subflows, the receiver of the first FIN should accept that, and close the subflow on its side as well. If not, the subflow will stay half closed, and would even continue to be used until the end of the MPTCP connection or a reset from the network. The issue has not been seen before, probably because the in-kernel path-manager always sends a RM_ADDR before closing the subflow. Upon the reception of this RM_ADDR, the other peer will initiate the closure on its side as well. On the other hand, if the RM_ADDR is lost, or if the path-manager of the other peer only closes the subflow without sending a RM_ADDR, the subflow would switch to TCP_CLOSE_WAIT, but that's it, leaving the subflow half-closed. So now, when the subflow switches to the TCP_CLOSE_WAIT state, and if the MPTCP connection has not been closed before with a DATA_FIN, the kernel owning the subflow schedules its worker to initiate the closure on its side as well. This issue can be easily reproduced with packetdrill, as visible in [1], by creating an additional subflow, injecting a FIN+ACK before sending the DATA_FIN, and expecting a FIN+ACK in return. Fixes: `40947e1399` ("mptcp: schedule worker when subflow is closed") Cc: stable@vger.kernel.org Link: https://github.com/multipath-tcp/packetdrill/pull/154 [1] Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-1-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 14:45:16 -07:00
Xueming Feng	bac76cf898	tcp: fix forever orphan socket caused by tcp_abort We have some problem closing zero-window fin-wait-1 tcp sockets in our environment. This patch come from the investigation. Previously tcp_abort only sends out reset and calls tcp_done when the socket is not SOCK_DEAD, aka orphan. For orphan socket, it will only purging the write queue, but not close the socket and left it to the timer. While purging the write queue, tp->packets_out and sk->sk_write_queue is cleared along the way. However tcp_retransmit_timer have early return based on !tp->packets_out and tcp_probe_timer have early return based on !sk->sk_write_queue. This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being resched and socket not being killed by the timers, converting a zero-windowed orphan into a forever orphan. This patch removes the SOCK_DEAD check in tcp_abort, making it send reset to peer and close the socket accordingly. Preventing the timer-less orphan from happening. According to Lorenzo's email in the v1 thread, the check was there to prevent force-closing the same socket twice. That situation is handled by testing for TCP_CLOSE inside lock, and returning -ENOENT if it is already closed. The -ENOENT code comes from the associate patch Lorenzo made for iproute2-ss; link attached below, which also conform to RFC 9293. At the end of the patch, tcp_write_queue_purge(sk) is removed because it was already called in tcp_done_with_error(). p.s. This is the same patch with v2. Resent due to mis-labeled "changes requested" on patchwork.kernel.org. Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@google.com/ Fixes: `c1e64e298b` ("net: diag: Support destroying TCP sockets.") Signed-off-by: Xueming Feng <kuro@kuroa.me> Tested-by: Lorenzo Colitti <lorenzo@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 14:27:49 -07:00
Cong Wang	defd8b3c37	gtp: fix a potential NULL pointer dereference When sockfd_lookup() fails, gtp_encap_enable_socket() returns a NULL pointer, but its callers only check for error pointers thus miss the NULL pointer case. Fix it by returning an error pointer with the error code carried from sockfd_lookup(). (I found this bug during code inspection.) Fixes: `1e3a3abd8b` ("gtp: make GTP sockets in gtp_newlink optional") Cc: Andreas Schultz <aschultz@tpip.net> Cc: Harald Welte <laforge@gnumonks.org> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://patch.msgid.link/20240825191638.146748-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 14:19:40 -07:00
Jakub Kicinski	2fecbf75c4	Merge branch 'fixes-for-ipsec-over-bonding' Jianbo Liu says: ==================== Fixes for IPsec over bonding This patchset provides bug fixes for IPsec over bonding driver. It adds the missing xdo_dev_state_free API, and fixes "scheduling while atomic" by using mutex lock instead. Series generated against: commit `c07ff8592d` ("netem: fix return value if duplicate enqueue fails") ==================== Link: https://patch.msgid.link/20240823031056.110999-1-jianbol@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 13:11:39 -07:00
Jianbo Liu	2aeeef906d	bonding: change ipsec_lock from spin lock to mutex In the cited commit, bond->ipsec_lock is added to protect ipsec_list, hence xdo_dev_state_add and xdo_dev_state_delete are called inside this lock. As ipsec_lock is a spin lock and such xfrmdev ops may sleep, "scheduling while atomic" will be triggered when changing bond's active slave. [ 101.055189] BUG: scheduling while atomic: bash/902/0x00000200 [ 101.055726] Modules linked in: [ 101.058211] CPU: 3 PID: 902 Comm: bash Not tainted 6.9.0-rc4+ #1 [ 101.058760] Hardware name: [ 101.059434] Call Trace: [ 101.059436] <TASK> [ 101.060873] dump_stack_lvl+0x51/0x60 [ 101.061275] __schedule_bug+0x4e/0x60 [ 101.061682] __schedule+0x612/0x7c0 [ 101.062078] ? __mod_timer+0x25c/0x370 [ 101.062486] schedule+0x25/0xd0 [ 101.062845] schedule_timeout+0x77/0xf0 [ 101.063265] ? asm_common_interrupt+0x22/0x40 [ 101.063724] ? __bpf_trace_itimer_state+0x10/0x10 [ 101.064215] __wait_for_common+0x87/0x190 [ 101.064648] ? usleep_range_state+0x90/0x90 [ 101.065091] cmd_exec+0x437/0xb20 [mlx5_core] [ 101.065569] mlx5_cmd_do+0x1e/0x40 [mlx5_core] [ 101.066051] mlx5_cmd_exec+0x18/0x30 [mlx5_core] [ 101.066552] mlx5_crypto_create_dek_key+0xea/0x120 [mlx5_core] [ 101.067163] ? bonding_sysfs_store_option+0x4d/0x80 [bonding] [ 101.067738] ? kmalloc_trace+0x4d/0x350 [ 101.068156] mlx5_ipsec_create_sa_ctx+0x33/0x100 [mlx5_core] [ 101.068747] mlx5e_xfrm_add_state+0x47b/0xaa0 [mlx5_core] [ 101.069312] bond_change_active_slave+0x392/0x900 [bonding] [ 101.069868] bond_option_active_slave_set+0x1c2/0x240 [bonding] [ 101.070454] __bond_opt_set+0xa6/0x430 [bonding] [ 101.070935] __bond_opt_set_notify+0x2f/0x90 [bonding] [ 101.071453] bond_opt_tryset_rtnl+0x72/0xb0 [bonding] [ 101.071965] bonding_sysfs_store_option+0x4d/0x80 [bonding] [ 101.072567] kernfs_fop_write_iter+0x10c/0x1a0 [ 101.073033] vfs_write+0x2d8/0x400 [ 101.073416] ? alloc_fd+0x48/0x180 [ 101.073798] ksys_write+0x5f/0xe0 [ 101.074175] do_syscall_64+0x52/0x110 [ 101.074576] entry_SYSCALL_64_after_hwframe+0x4b/0x53 As bond_ipsec_add_sa_all and bond_ipsec_del_sa_all are only called from bond_change_active_slave, which requires holding the RTNL lock. And bond_ipsec_add_sa and bond_ipsec_del_sa are xfrm state xdo_dev_state_add and xdo_dev_state_delete APIs, which are in user context. So ipsec_lock doesn't have to be spin lock, change it to mutex, and thus the above issue can be resolved. Fixes: `9a5605505d` ("bonding: Add struct bond_ipesc to manage SA") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Link: https://patch.msgid.link/20240823031056.110999-4-jianbol@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 13:11:37 -07:00
Jianbo Liu	907ed83a75	bonding: extract the use of real_device into local variable Add a local variable for slave->dev, to prepare for the lock change in the next patch. There is no functionality change. Fixes: `9a5605505d` ("bonding: Add struct bond_ipesc to manage SA") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Link: https://patch.msgid.link/20240823031056.110999-3-jianbol@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 13:11:37 -07:00
Jianbo Liu	ec13009472	bonding: implement xdo_dev_state_free and call it after deletion Add this implementation for bonding, so hardware resources can be freed from the active slave after xfrm state is deleted. The netdev used to invoke xdo_dev_state_free callback, is saved in the xfrm state (xs->xso.real_dev), which is also the bond's active slave. To prevent it from being freed, acquire netdev reference before leaving RCU read-side critical section, and release it after callback is done. And call it when deleting all SAs from old active real interface while switching current active slave. Fixes: `9a5605505d` ("bonding: Add struct bond_ipesc to manage SA") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Link: https://patch.msgid.link/20240823031056.110999-2-jianbol@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 13:11:37 -07:00
Petr Machata	65a3cce43d	selftests: forwarding: local_termination: Down ports on cleanup This test neglects to put ports down on cleanup. Fix it. Fixes: `90b9566aa5` ("selftests: forwarding: add a test for local_termination.sh") Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/bf9b79f45de378f88344d44550f0a5052b386199.1724692132.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 12:53:29 -07:00
Petr Machata	e8497d6951	selftests: forwarding: no_forwarding: Down ports on cleanup This test neglects to put ports down on cleanup. Fix it. Fixes: `476a4f05d9` ("selftests: forwarding: add a no_forwarding.sh test") Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/0baf91dc24b95ae0cadfdf5db05b74888e6a228a.1724430120.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-27 12:53:29 -07:00

1 2 3 4 5 ...

1295958 Commits