Commit Graph

79717 Commits

Author SHA1 Message Date
David Howells
fc9de52de3 rxrpc: Fix missing locking causing hanging calls
If a call gets aborted (e.g. because kafs saw a signal) between it being
queued for connection and the I/O thread picking up the call, the abort
will be prioritised over the connection and it will be removed from
local->new_client_calls by rxrpc_disconnect_client_call() without a lock
being held.  This may cause other calls on the list to disappear if a race
occurs.

Fix this by taking the client_call_lock when removing a call from whatever
list its ->wait_link happens to be on.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Fixes: 9d35d880e0 ("rxrpc: Move client call connection to the I/O thread")
Link: https://patch.msgid.link/726660.1730898202@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-07 11:30:34 -08:00
Wenjia Zhang
de88df0179 net/smc: Fix lookup of netdev by using ib_device_get_netdev()
The SMC-R variant of the SMC protocol used direct call to function
ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device
driver to run SMC-R, it failed to find a device, because in mlx5_ib the
internal net device management for retrieving net devices was replaced
by a common interface ib_device_get_netdev() in commit 8d159eb211
("RDMA/mlx5: Use IB set_netdev and get_netdev functions").

Since such direct accesses to the internal net device management is not
recommended at all, update the SMC-R code to use proper API
ib_device_get_netdev().

Fixes: 54903572c2 ("net/smc: allow pnetid-less configuration")
Reported-by: Aswin K <aswin@linux.ibm.com>
Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://patch.msgid.link/20241106082612.57803-1-wenjia@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-07 11:24:19 -08:00
Jakub Kicinski
013d2c5c6b Merge tag 'nf-24-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter fix for net

The following series contains a Netfilter fix:

1) Wait for rcu grace period after netdevice removal is reported via event.

* tag 'nf-24-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: wait for rcu grace period on net_device removal
====================

Link: https://patch.msgid.link/20241107113212.116634-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-07 08:16:42 -08:00
Gilad Naaman
702c290a1c sctp: Avoid enqueuing addr events redundantly
Avoid modifying or enqueuing new events if it's possible to tell that no
one will consume them.

Since enqueueing requires searching the current queue for opposite
events for the same address, adding addresses en-masse turns this
inetaddr_event into a bottle-neck, as it will get slower and slower
with each address added.

Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20241104083545.114-1-gnaaman@drivenets.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07 15:45:19 +01:00
Christophe JAILLET
bb9df91cfe wifi: cfg80211: Fix an error handling path in nl80211_start_ap()
All error handling paths go to "out", except this one. Before the
commit in Fixes, error in the previous code would also end to "out",
freeing the memory.

Move the code up to avoid the leak.

Fixes: 62262dd00c ("wifi: cfg80211: disallow SMPS in AP mode")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://patch.msgid.link/eae54ce066d541914f272b10cab7b263c08eced3.1729956868.git.christophe.jaillet@wanadoo.fr
[move code, update commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-11-07 14:40:44 +01:00
Aleksei Vetrov
9c46a3a5b3 wifi: nl80211: fix bounds checker error in nl80211_parse_sched_scan
The channels array in the cfg80211_scan_request has a __counted_by
attribute attached to it, which points to the n_channels variable. This
attribute is used in bounds checking, and if it is not set before the
array is filled, then the bounds sanitizer will issue a warning or a
kernel panic if CONFIG_UBSAN_TRAP is set.

This patch sets the size of allocated memory as the initial value for
n_channels. It is updated with the actual number of added elements after
the array is filled.

Fixes: aa4ec06c45 ("wifi: cfg80211: use __counted_by where appropriate")
Cc: stable@vger.kernel.org
Signed-off-by: Aleksei Vetrov <vvvvvv@google.com>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://patch.msgid.link/20241029-nl80211_parse_sched_scan-bounds-checker-fix-v2-1-c804b787341f@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-11-07 14:38:33 +01:00
Lingbo Kong
b4ebb58cb9 wifi: cfg80211: Remove the Medium Synchronization Delay validity check
Currently, when the driver attempts to connect to an AP MLD with multiple
APs, the cfg80211_mlme_check_mlo_compat() function requires the Medium
Synchronization Delay values from different APs of the same AP MLD to be
equal, which may result in connection failures.

This is because when the driver receives a multi-link probe response from
an AP MLD with multiple APs, cfg80211 updates the Elements for each AP
based on the multi-link probe response. If the Medium Synchronization Delay
is set in the multi-link probe response, the Elements for each AP belonging
to the same AP MLD will have the Medium Synchronization Delay set
simultaneously. If non-multi-link probe responses are received from
different APs of the same MLD AP, cfg80211 will still update the Elements
based on the non-multi-link probe response. Since the non-multi-link probe
response does not set the Medium Synchronization Delay
(IEEE 802.11be-2024-35.3.4.4), if the Elements from a non-multi-link probe
response overwrite those from a multi-link probe response that has set the
Medium Synchronization Delay, the Medium Synchronization Delay values for
APs belonging to the same AP MLD will not be equal. This discrepancy causes
the cfg80211_mlme_check_mlo_compat() function to fail, leading to
connection failures. Commit ccb964b4ab
("wifi: cfg80211: validate MLO connections better") did not take this into
account.

To address this issue, remove this validity check.

Fixes: ccb964b4ab ("wifi: cfg80211: validate MLO connections better")
Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com>
Link: https://patch.msgid.link/20241031134223.970-1-quic_lingbok@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-11-07 14:38:18 +01:00
Paolo Abeni
17bcfe6637 Merge tag 'nf-next-24-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following series contains Netfilter updates for net-next:

1) Make legacy xtables configs user selectable, from Breno Leitao.

2) Fix a few sparse warnings related to percpu, from Uros Bizjak.

3) Use strscpy_pad, from Justin Stitt.

4) Use nft_trans_elem_alloc() in catchall flush, from Florian Westphal.

5) A series of 7 patches to fix false positive with CONFIG_RCU_LIST=y.
   Florian also sees possible issue with 10 while module load/removal
   when requesting an expression that is available via module. As for
   patch 11, object is being updated so reference on the module already
   exists so I don't see any real issue.

   Florian says:

   "Unfortunately there are many more errors, and not all are false positives.

   First patches pass lockdep_commit_lock_is_held() to the rcu list traversal
   macro so that those splats are avoided.

   The last two patches are real code change as opposed to
   'pass the transaction mutex to relax rcu check':

   Those two lists are not protected by transaction mutex so could be altered
   in parallel.

   This targets nf-next because these are long-standing issues."

netfilter pull request 24-11-07

* tag 'nf-next-24-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_tables: must hold rcu read lock while iterating object type list
  netfilter: nf_tables: must hold rcu read lock while iterating expression type list
  netfilter: nf_tables: avoid false-positive lockdep splats with basechain hook
  netfilter: nf_tables: avoid false-positive lockdep splats in set walker
  netfilter: nf_tables: avoid false-positive lockdep splats with flowtables
  netfilter: nf_tables: avoid false-positive lockdep splats with sets
  netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion
  netfilter: nf_tables: prefer nft_trans_elem_alloc helper
  netfilter: nf_tables: replace deprecated strncpy with strscpy_pad
  netfilter: nf_tables: Fix percpu address space issues in nf_tables_api.c
  netfilter: Make legacy configs user selectable
====================

Link: https://patch.msgid.link/20241106234625.168468-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07 12:46:04 +01:00
Pablo Neira Ayuso
c03d278fdf netfilter: nf_tables: wait for rcu grace period on net_device removal
8c873e2199 ("netfilter: core: free hooks with call_rcu") removed
synchronize_net() call when unregistering basechain hook, however,
net_device removal event handler for the NFPROTO_NETDEV was not updated
to wait for RCU grace period.

Note that 835b803377 ("netfilter: nf_tables_netdev: unregister hooks
on net_device removal") does not remove basechain rules on device
removal, I was hinted to remove rules on net_device removal later, see
5ebe0b0eec ("netfilter: nf_tables: destroy basechain and rules on
netdevice removal").

Although NETDEV_UNREGISTER event is guaranteed to be handled after
synchronize_net() call, this path needs to wait for rcu grace period via
rcu callback to release basechain hooks if netns is alive because an
ongoing netlink dump could be in progress (sockets hold a reference on
the netns).

Note that nf_tables_pre_exit_net() unregisters and releases basechain
hooks but it is possible to see NETDEV_UNREGISTER at a later stage in
the netns exit path, eg. veth peer device in another netns:

 cleanup_net()
  default_device_exit_batch()
   unregister_netdevice_many_notify()
    notifier_call_chain()
     nf_tables_netdev_event()
      __nft_release_basechain()

In this particular case, same rule of thumb applies: if netns is alive,
then wait for rcu grace period because netlink dump in the other netns
could be in progress. Otherwise, if the other netns is going away then
no netlink dump can be in progress and basechain hooks can be released
inmediately.

While at it, turn WARN_ON() into WARN_ON_ONCE() for the basechain
validation, which should not ever happen.

Fixes: 835b803377 ("netfilter: nf_tables_netdev: unregister hooks on net_device removal")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-07 12:28:47 +01:00
Juraj Šarinay
9907cda95f net: nfc: Propagate ISO14443 type A target ATS to userspace via netlink
Add a 20-byte field ats to struct nfc_target and expose it as
NFC_ATTR_TARGET_ATS via the netlink interface. The payload contains
'historical bytes' that help to distinguish cards from one another.
The information is commonly used to assemble an emulated ATR similar
to that reported by smart cards with contacts.

Add a 20-byte field target_ats to struct nci_dev to hold the payload
obtained in nci_rf_intf_activated_ntf_packet() and copy it to over to
nfc_target.ats in nci_activate_target(). The approach is similar
to the handling of 'general bytes' within ATR_RES.

Replace the hard-coded size of rats_res within struct
activation_params_nfca_poll_iso_dep by the equal constant NFC_ATS_MAXSIZE
now defined in nfc.h

Within NCI, the information corresponds to the 'RATS Response' activation
parameter that omits the initial length byte TL. This loses no
information and is consistent with our handling of SENSB_RES that
also drops the first (constant) byte.

Tested with nxp_nci_i2c on a few type A targets including an
ICAO 9303 compliant passport.

I refrain from the corresponding change to digital_in_recv_ats()
to have the few drivers based on digital.h fill nfc_target.ats,
as I have no way to test it. That class of drivers appear not to set
NFC_ATTR_TARGET_SENSB_RES either. Consider a separate patch to propagate
(all) the parameters.

Signed-off-by: Juraj Šarinay <juraj@sarinay.com>
Link: https://patch.msgid.link/20241103124525.8392-1-juraj@sarinay.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07 10:21:58 +01:00
Nam Cao
eb688451dc net: pktgen: Switch to use hrtimer_setup_sleeper_on_stack()
hrtimer_setup_sleeper_on_stack() replaces hrtimer_init_sleeper_on_stack()
to keep the naming convention consistent.

Convert the usage site over to it. The conversion was done with Coccinelle.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/c4b40b8fef250b6a325e1b8bd6057005fb3cb660.1730386209.git.namcao@linutronix.de
2024-11-07 02:47:06 +01:00
Paolo Abeni
eb02688c5c ipv6: release nexthop on device removal
The CI is hitting some aperiodic hangup at device removal time in the
pmtu.sh self-test:

unregister_netdevice: waiting for veth_A-R1 to become free. Usage count = 6
ref_tracker: veth_A-R1@ffff888013df15d8 has 1/5 users at
	dst_init+0x84/0x4a0
	dst_alloc+0x97/0x150
	ip6_dst_alloc+0x23/0x90
	ip6_rt_pcpu_alloc+0x1e6/0x520
	ip6_pol_route+0x56f/0x840
	fib6_rule_lookup+0x334/0x630
	ip6_route_output_flags+0x259/0x480
	ip6_dst_lookup_tail.constprop.0+0x5c2/0x940
	ip6_dst_lookup_flow+0x88/0x190
	udp_tunnel6_dst_lookup+0x2a7/0x4c0
	vxlan_xmit_one+0xbde/0x4a50 [vxlan]
	vxlan_xmit+0x9ad/0xf20 [vxlan]
	dev_hard_start_xmit+0x10e/0x360
	__dev_queue_xmit+0xf95/0x18c0
	arp_solicit+0x4a2/0xe00
	neigh_probe+0xaa/0xf0

While the first suspect is the dst_cache, explicitly tracking the dst
owing the last device reference via probes proved such dst is held by
the nexthop in the originating fib6_info.

Similar to commit f5b51fe804 ("ipv6: route: purge exception on
removal"), we need to explicitly release the originating fib info when
disconnecting a to-be-removed device from a live ipv6 dst: move the
fib6_info cleanup into ip6_dst_ifdown().

Tested running:

./pmtu.sh cleanup_ipv6_exception

in a tight loop for more than 400 iterations with no spat, running an
unpatched kernel  I observed a splat every ~10 iterations.

Fixes: f88d8ea67f ("ipv6: Plumb support for nexthop object in a fib6_info")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/604c45c188c609b732286b47ac2a451a40f6cf6d.1730828007.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06 17:27:35 -08:00
Thomas Gleixner
2634303f87 alarmtimers: Remove return value from alarm functions
Now that the SIG_IGN problem is solved in the core code, the alarmtimer
callbacks do not require a return value anymore.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/all/20241105064214.318837272@linutronix.de
2024-11-07 02:14:46 +01:00
Zijian Zhang
955afd57dc bpf, sockmap: Fix sk_msg_reset_curr
Found in the test_txmsg_pull in test_sockmap,
```
txmsg_cork = 512; // corking is importrant here
opt->iov_length = 3;
opt->iov_count = 1;
opt->rate = 512; // sendmsg will be invoked 512 times
```
The first sendmsg will send an sk_msg with size 3, and bpf_msg_pull_data
will be invoked the first time. sk_msg_reset_curr will reset the copybreak
from 3 to 0. In the second sendmsg, since we are in the stage of corking,
psock->cork will be reused in func sk_msg_alloc. msg->sg.copybreak is 0
now, the second msg will overwrite the first msg. As a result, we could
not pass the data integrity test.

The same problem happens in push and pop test. Thus, fix sk_msg_reset_curr
to restore the correct copybreak.

Fixes: bb9aefde5b ("bpf: sockmap, updating the sg structure should also update curr")
Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
Link: https://lore.kernel.org/r/20241106222520.527076-9-zijianzhang@bytedance.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-11-06 16:01:54 -08:00
Zijian Zhang
5d609ba262 bpf, sockmap: Several fixes to bpf_msg_pop_data
Several fixes to bpf_msg_pop_data,
1. In sk_msg_shift_left, we should put_page
2. if (len == 0), return early is better
3. pop the entire sk_msg (last == msg->sg.size) should be supported
4. Fix for the value of variable "a"
5. In sk_msg_shift_left, after shifting, i has already pointed to the next
element. Addtional sk_msg_iter_var_next may result in BUG.

Fixes: 7246d8ed4d ("bpf: helper to pop data from messages")
Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20241106222520.527076-8-zijianzhang@bytedance.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-11-06 16:01:53 -08:00
Zijian Zhang
15ab0548e3 bpf, sockmap: Several fixes to bpf_msg_push_data
Several fixes to bpf_msg_push_data,
1. test_sockmap has tests where bpf_msg_push_data is invoked to push some
data at the end of a message, but -EINVAL is returned. In this case, in
bpf_msg_push_data, after the first loop, i will be set to msg->sg.end, add
the logic to handle it.
2. In the code block of "if (start - offset)", it's possible that "i"
points to the last of sk_msg_elem. In this case, "sk_msg_iter_next(msg,
end)" might still be called twice, another invoking is in "if (!copy)"
code block, but actually only one is needed. Add the logic to handle it,
and reconstruct the code to make the logic more clear.

Fixes: 6fff607e2f ("bpf: sk_msg program helper bpf_msg_push_data")
Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
Link: https://lore.kernel.org/r/20241106222520.527076-7-zijianzhang@bytedance.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-11-06 16:01:53 -08:00
Guillaume Nault
e57dfaa4b0 xfrm: Convert struct xfrm_dst_lookup_params -> tos to dscp_t.
Add type annotation to the "tos" field of struct xfrm_dst_lookup_params,
to ensure that the ECN bits aren't mistakenly taken into account when
doing route lookups. Rename that field (tos -> dscp) to make that
change explicit.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-11-06 12:42:51 +01:00
Guillaume Nault
3021a2a340 xfrm: Convert xfrm_dst_lookup() to dscp_t.
Pass a dscp_t variable to xfrm_dst_lookup(), instead of an int, to
prevent accidental setting of ECN bits in ->flowi4_tos.

Only xfrm_bundle_create() actually calls xfrm_dst_lookup(). Since it
already has a dscp_t variable to pass as parameter, we only need to
remove the inet_dscp_to_dsfield() conversion.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-11-06 12:42:42 +01:00
Guillaume Nault
01f61cbfc8 xfrm: Convert xfrm_bundle_create() to dscp_t.
Use a dscp_t variable to store the result of xfrm_get_dscp().
This prepares for the future conversion of xfrm_dst_lookup().

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-11-06 12:42:34 +01:00
Guillaume Nault
766f532089 xfrm: Convert xfrm_get_tos() to dscp_t.
Return a dscp_t variable to prepare for the future conversion of
xfrm_bundle_create() to dscp_t.

While there, rename the function "xfrm_get_dscp", to align its name
with the new return type.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-11-06 12:42:16 +01:00
Hyunwoo Kim
6ca575374d vsock/virtio: Initialization of the dangling pointer occurring in vsk->trans
During loopback communication, a dangling pointer can be created in
vsk->trans, potentially leading to a Use-After-Free condition.  This
issue is resolved by initializing vsk->trans to NULL.

Cc: stable <stable@kernel.org>
Fixes: 06a8fc7836 ("VSOCK: Introduce virtio_vsock_common.ko")
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Signed-off-by: Wongi Lee <qwerty@theori.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Message-Id: <2024102245-strive-crib-c8d3@gregkh>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-06 03:30:20 -05:00
Matthieu Baerts (NGI0)
f2c71c49da mptcp: remove unneeded lock when listing scheds
mptcp_get_available_schedulers() needs to iterate over the schedulers'
list only to read the names: it doesn't modify anything there.

In this case, it is enough to hold the RCU read lock, no need to combine
this with the associated spin lock as it was done since its introduction
in commit 73c900aa36 ("mptcp: add net.mptcp.available_schedulers").

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241104-net-next-mptcp-sched-unneeded-lock-v2-1-2ccc1e0c750c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:54:39 -08:00
Geliang Tang
99635c91fb mptcp: use sock_kfree_s instead of kfree
The local address entries on userspace_pm_local_addr_list are allocated
by sock_kmalloc().

It's then required to use sock_kfree_s() instead of kfree() to free
these entries in order to adjust the allocated size on the sk side.

Fixes: 24430f8bf5 ("mptcp: add address into userspace pm list")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241104-net-mptcp-misc-6-12-v1-2-c13f2ff1656f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:51:09 -08:00
Matthieu Baerts (NGI0)
cfbbd48598 mptcp: no admin perm to list endpoints
During the switch to YNL, the command to list all endpoints has been
accidentally restricted to users with admin permissions.

It looks like there are no reasons to have this restriction which makes
it harder for a user to quickly check if the endpoint list has been
correctly populated by an automated tool. Best to go back to the
previous behaviour then.

mptcp_pm_gen.c has been modified using ynl-gen-c.py:

   $ ./tools/net/ynl/ynl-gen-c.py --mode kernel \
     --spec Documentation/netlink/specs/mptcp_pm.yaml --source \
     -o net/mptcp/mptcp_pm_gen.c

The header file doesn't need to be regenerated.

Fixes: 1d0507f468 ("net: mptcp: convert netlink from small_ops to ops")
Cc: stable@vger.kernel.org
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241104-net-mptcp-misc-6-12-v1-1-c13f2ff1656f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:51:08 -08:00
Aaron Conole
7d1c2d517f openvswitch: Pass on secpath details for internal port rx.
Clearing the secpath for internal ports will cause packet drops when
ipsec offload or early SW ipsec decrypt are used.  Systems that rely
on these will not be able to actually pass traffic via openvswitch.

There is still an open issue for a flow miss packet - this is because
we drop the extensions during upcall and there is no facility to
restore such data (and it is non-trivial to add such functionality
to the upcall interface).  That means that when a flow miss occurs,
there will still be packet drops.  With this patch, when a flow is
found then traffic which has an associated xfrm extension will
properly flow.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://patch.msgid.link/20241101204732.183840-1-aconole@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:38:25 -08:00
Florian Westphal
cddc04275f netfilter: nf_tables: must hold rcu read lock while iterating object type list
Update of stateful object triggers:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:7759 RCU-list traversed in non-reader section!!

other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by nft/3060:
 #0: ffff88810f0578c8 (&nft_net->commit_mutex){+.+.}-{4:4}, [..]

... but this list is not protected by the transaction mutex but the
nfnl nftables subsystem mutex.

Switch to nft_obj_type_get which will acquire rcu read lock,
bump refcount, and returns the result.

v3: Dan Carpenter points out nft_obj_type_get returns error pointer, not
NULL, on error.

Fixes: dad3bdeef4 ("netfilter: nf_tables: fix memory leak during stateful obj update").
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-05 22:07:12 +01:00
Florian Westphal
ee666a541e netfilter: nf_tables: must hold rcu read lock while iterating expression type list
nft shell tests trigger:
 WARNING: suspicious RCU usage
 net/netfilter/nf_tables_api.c:3125 RCU-list traversed in non-reader section!!
 1 lock held by nft/2068:
  #0: ffff888106c6f8c8 (&nft_net->commit_mutex){+.+.}-{4:4}, at: nf_tables_valid_genid+0x3c/0xf0

But the transaction mutex doesn't protect this list, the nfnl subsystem
mutex would, but we can't acquire it here without risk of ABBA
deadlocks.

Acquire the rcu read lock to avoid this issue.

v3: add a comment that explains the ->inner_ops check implies
expression is builtin and lack of a module owner reference is ok.

Fixes: 3a07327d10 ("netfilter: nft_inner: support for inner tunnel header matching")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-05 22:06:32 +01:00
Florian Westphal
3567146b94 netfilter: nf_tables: avoid false-positive lockdep splats with basechain hook
Like previous patches: iteration is ok if the list cannot be altered in
parallel.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-05 22:06:23 +01:00
Florian Westphal
28b7a6b84c netfilter: nf_tables: avoid false-positive lockdep splats in set walker
Its not possible to add or delete elements from hash and bitmap sets,
as long as caller is holding the transaction mutex, so its ok to iterate
the list outside of rcu read side critical section.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-05 22:06:22 +01:00
Florian Westphal
b3e8f29d6b netfilter: nf_tables: avoid false-positive lockdep splats with flowtables
The transaction mutex prevents concurrent add/delete, its ok to iterate
those lists outside of rcu read side critical sections.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-05 22:06:18 +01:00
Florian Westphal
8f5f3786db netfilter: nf_tables: avoid false-positive lockdep splats with sets
Same as previous patch.  All set handling functions here can be called
with transaction mutex held (but not the rcu read lock).

The transaction mutex prevents concurrent add/delete, so this is fine.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-05 22:06:14 +01:00
Florian Westphal
9adbb4198b netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion
On rule delete we get:
 WARNING: suspicious RCU usage
 net/netfilter/nf_tables_api.c:3420 RCU-list traversed in non-reader section!!
 1 lock held by iptables/134:
   #0: ffff888008c4fcc8 (&nft_net->commit_mutex){+.+.}-{3:3}, at: nf_tables_valid_genid (include/linux/jiffies.h:101) nf_tables

Code is fine, no other CPU can change the list because we're holding
transaction mutex.

Pass the needed lockdep annotation to the iterator and fix
two comments for functions that are no longer restricted to rcu-only
context.

This is enough to resolve rule delete, but there are several other
missing annotations, added in followup-patches.

Fixes: 28875945ba ("rcu: Add support for consolidated-RCU reader checking")
Reported-by: Matthieu Baerts <matttbe@kernel.org>
Tested-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://lore.kernel.org/netfilter-devel/da27f17f-3145-47af-ad0f-7fd2a823623e@kernel.org/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-05 22:06:10 +01:00
NeilBrown
10f0740234 sunrpc: handle -ENOTCONN in xs_tcp_setup_socket()
xs_tcp_finish_connecting() can return -ENOTCONN but the switch statement
in xs_tcp_setup_socket() treats that as an unhandled error.

If we treat it as a known error it would propagate back to
call_connect_status() which does handle that error code.  This appears
to be the intention of the commit (given below) which added -ENOTCONN as
a return status for xs_tcp_finish_connecting().

So add -ENOTCONN to the switch statement as an error to pass through to
the caller.

Link: https://bugzilla.suse.com/show_bug.cgi?id=1231050
Link: https://access.redhat.com/discussions/3434091
Fixes: 01d37c428a ("SUNRPC: xprt_connect() don't abort the task if the transport isn't bound")
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-11-04 10:24:18 -05:00
Jakub Kicinski
cbf49bed6a Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2024-10-31

We've added 13 non-merge commits during the last 16 day(s) which contain
a total of 16 files changed, 710 insertions(+), 668 deletions(-).

The main changes are:

1) Optimize and homogenize bpf_csum_diff helper for all archs and also
   add a batch of new BPF selftests for it, from Puranjay Mohan.

2) Rewrite and migrate the test_tcp_check_syncookie.sh BPF selftest
   into test_progs so that it can be run in BPF CI, from Alexis Lothoré.

3) Two BPF sockmap selftest fixes, from Zijian Zhang.

4) Small XDP synproxy BPF selftest cleanup to remove IP_DF check,
   from Vincent Li.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: Add a selftest for bpf_csum_diff()
  selftests/bpf: Don't mask result of bpf_csum_diff() in test_verifier
  bpf: bpf_csum_diff: Optimize and homogenize for all archs
  net: checksum: Move from32to16() to generic header
  selftests/bpf: remove xdp_synproxy IP_DF check
  selftests/bpf: remove test_tcp_check_syncookie
  selftests/bpf: test MSS value returned with bpf_tcp_gen_syncookie
  selftests/bpf: add ipv4 and dual ipv4/ipv6 support in btf_skc_cls_ingress
  selftests/bpf: get rid of global vars in btf_skc_cls_ingress
  selftests/bpf: add missing ns cleanups in btf_skc_cls_ingress
  selftests/bpf: factorize conn and syncookies tests in a single runner
  selftests/bpf: Fix txmsg_redir of test_txmsg_pull in test_sockmap
  selftests/bpf: Fix msg_verify_data in test_sockmap

====================

Link: https://patch.msgid.link/20241031221543.108853-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 14:44:51 -08:00
Dmitry Safonov
6b2d11e2d8 net/tcp: Add missing lockdep annotations for TCP-AO hlist traversals
Under CONFIG_PROVE_RCU_LIST + CONFIG_RCU_EXPERT
hlist_for_each_entry_rcu() provides very helpful splats, which help
to find possible issues. I missed CONFIG_RCU_EXPERT=y in my testing
config the same as described in
a3e4bf7f96 ("configs/debug: make sure PROVE_RCU_LIST=y takes effect").

The fix itself is trivial: add the very same lockdep annotations
as were used to dereference ao_info from the socket.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20241028152645.35a8be66@kernel.org/
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Link: https://patch.msgid.link/20241030-tcp-ao-hlist-lockdep-annotate-v1-1-bf641a64d7c6@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 12:10:11 -08:00
Gustavo A. R. Silva
3bd9b9abdf net: ethtool: Avoid thousands of -Wflex-array-member-not-at-end warnings
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

Change the type of the middle struct member currently causing trouble from
`struct ethtool_link_settings` to `struct ethtool_link_settings_hdr`.

Additionally, update the type of some variables in various functions that
don't access the flexible-array member, changing them to the newly created
`struct ethtool_link_settings_hdr`. These changes are needed because the
type of the conflicting middle members changed. So, those instances that
expect the type to be `struct ethtool_link_settings` should be adjusted to
the newly created type `struct ethtool_link_settings_hdr`.

Also, adjust variable declarations to follow the reverse xmas tree
convention.

Fix 3338 of the following -Wflex-array-member-not-at-end warnings:

include/linux/ethtool.h:214:38: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://patch.msgid.link/0bc2809fe2a6c11dd4c8a9a10d9bd65cccdb559b.1730238285.git.gustavoars@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 11:06:58 -08:00
Xin Long
0ead60804b sctp: properly validate chunk size in sctp_sf_ootb()
A size validation fix similar to that in Commit 50619dbf8d ("sctp: add
size validation when walking chunks") is also required in sctp_sf_ootb()
to address a crash reported by syzbot:

  BUG: KMSAN: uninit-value in sctp_sf_ootb+0x7f5/0xce0 net/sctp/sm_statefuns.c:3712
  sctp_sf_ootb+0x7f5/0xce0 net/sctp/sm_statefuns.c:3712
  sctp_do_sm+0x181/0x93d0 net/sctp/sm_sideeffect.c:1166
  sctp_endpoint_bh_rcv+0xc38/0xf90 net/sctp/endpointola.c:407
  sctp_inq_push+0x2ef/0x380 net/sctp/inqueue.c:88
  sctp_rcv+0x3831/0x3b20 net/sctp/input.c:243
  sctp4_rcv+0x42/0x50 net/sctp/protocol.c:1159
  ip_protocol_deliver_rcu+0xb51/0x13d0 net/ipv4/ip_input.c:205
  ip_local_deliver_finish+0x336/0x500 net/ipv4/ip_input.c:233

Reported-by: syzbot+f0cbb34d39392f2746ca@syzkaller.appspotmail.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/a29ebb6d8b9f8affd0f9abb296faafafe10c17d8.1730223981.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 11:03:23 -08:00
Rosen Penev
f12b363887 net: dsa: use ethtool string helpers
These are the preferred way to copy ethtool strings.

Avoids incrementing pointers all over the place.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
(for hellcreek driver)
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://patch.msgid.link/20241028044828.1639668-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 10:36:34 -08:00
Yafang Shao
dbd5e2e79e net: tcp: Add noinline_for_tracing annotation for tcp_drop_reason()
We previously hooked the tcp_drop_reason() function using BPF to monitor
TCP drop reasons. However, after upgrading our compiler from GCC 9 to GCC
11, tcp_drop_reason() is now inlined, preventing us from hooking into it.
To address this, it would be beneficial to make noinline explicitly for
tracing.

Link: https://lore.kernel.org/netdev/CANn89iJuShCmidCi_ZkYABtmscwbVjhuDta1MS5LxV_4H9tKOA@mail.gmail.com/
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Menglong Dong <menglong8.dong@gmail.com>
Link: https://patch.msgid.link/20241024093742.87681-3-laoar.shao@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 09:02:32 -08:00
Al Viro
6348be02ee fdget(), trivial conversions
fdget() is the first thing done in scope, all matching fdput() are
immediately followed by leaving the scope.

Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-11-03 01:28:06 -05:00
Al Viro
f302edb9d8 switch netlink_getsockbyfilp() to taking descriptor
the only call site (in do_mq_notify()) obtains the argument
from an immediately preceding fdget() and it is immediately
followed by fdput(); might as well just replace it with
a variant that would take a descriptor instead of struct file *
and have file lookups handled inside that function.

Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-11-03 01:28:06 -05:00
Al Viro
53c0a58beb net/socket.c: switch to CLASS(fd)
The important part in sockfd_lookup_light() is avoiding needless
file refcount operations, not the marginal reduction of the register
pressure from not keeping a struct file pointer in the caller.

	Switch to use fdget()/fdpu(); with sane use of CLASS(fd) we can
get a better code generation...

	Would be nice if somebody tested it on networking test suites
(including benchmarks)...

	sockfd_lookup_light() does fdget(), uses sock_from_file() to
get the associated socket and returns the struct socket reference to
the caller, along with "do we need to fput()" flag.  No matching fdput(),
the caller does its equivalent manually, using the fact that sock->file
points to the struct file the socket has come from.

	Get rid of that - have the callers do fdget()/fdput() and
use sock_from_file() directly.  That kills sockfd_lookup_light()
and fput_light() (no users left).

	What's more, we can get rid of explicit fdget()/fdput() by
switching to CLASS(fd, ...) - code generation does not suffer, since
now fdput() inserted on "descriptor is not opened" failure exit
is recognized to be a no-op by compiler.

[folded a fix for braino in do_recvmmsg() caught by Simon Horman]

Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-11-03 01:27:11 -05:00
Linus Torvalds
3e5e6c9900 Merge tag 'nfsd-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:

 - Fix two async COPY bugs found during NFS bake-a-thon

 - Fix an svcrdma memory leak

* tag 'nfsd-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  rpcrdma: Always release the rpcrdma_device's xa_array
  NFSD: Never decrement pending_async_copies on error
  NFSD: Initialize struct nfsd4_copy earlier
2024-11-02 09:27:11 -10:00
Jinjie Ruan
bc74d329ce netlink: Remove the dead code in netlink_proto_init()
In the error path of netlink_proto_init(), frees the already allocated
bucket table for new hash tables in a loop, but it is going to panic,
so it is not necessary to clean up the resources, just remove the
dead code.

Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://patch.msgid.link/20241030012147.357400-1-ruanjinjie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31 19:36:25 -07:00
Pengcheng Yang
cf44bd08cd tcp: only release congestion control if it has been initialized
Currently, when cleaning up congestion control, we always call the
release regardless of whether it has been initialized. There is no
need to release when closing TCP_LISTEN and TCP_CLOSE (close
immediately after socket()).

In this case, tcp_cdg calls kfree(NULL) in release without causing
an exception, but for some customized ca, this could lead to
unexpected exceptions. We need to ensure that init and release are
called in pairs.

Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Link: https://patch.msgid.link/1729845944-6003-1-git-send-email-yangpc@wangsu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31 18:22:48 -07:00
Jakub Kicinski
5b1c965956 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc6).

Conflicts:

drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c
  cbe84e9ad5 ("wifi: iwlwifi: mvm: really send iwl_txpower_constraints_cmd")
  188a1bf894 ("wifi: mac80211: re-order assigning channel in activate links")
https://lore.kernel.org/all/20241028123621.7bbb131b@canb.auug.org.au/

net/mac80211/cfg.c
  c4382d5ca1 ("wifi: mac80211: update the right link for tx power")
  8dd0498983 ("wifi: mac80211: Fix setting txpower with emulate_chanctx")

drivers/net/ethernet/intel/ice/ice_ptp_hw.h
  6e58c33106 ("ice: fix crash on probe for DPLL enabled E810 LOM")
  e4291b64e1 ("ice: Align E810T GPIO to other products")
  ebb2693f8f ("ice: Read SDP section from NVM for pin definitions")
  ac532f4f42 ("ice: Cleanup unused declarations")
https://lore.kernel.org/all/20241030120524.1ee1af18@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31 18:10:07 -07:00
Linus Torvalds
5635f18942 Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Daniel Borkmann:

 - Fix BPF verifier to force a checkpoint when the program's jump
   history becomes too long (Eduard Zingerman)

 - Add several fixes to the BPF bits iterator addressing issues like
   memory leaks and overflow problems (Hou Tao)

 - Fix an out-of-bounds write in trie_get_next_key (Byeonguk Jeong)

 - Fix BPF test infra's LIVE_FRAME frame update after a page has been
   recycled (Toke Høiland-Jørgensen)

 - Fix BPF verifier and undo the 40-bytes extra stack space for
   bpf_fastcall patterns due to various bugs (Eduard Zingerman)

 - Fix a BPF sockmap race condition which could trigger a NULL pointer
   dereference in sock_map_link_update_prog (Cong Wang)

 - Fix tcp_bpf_recvmsg_parser to retrieve seq_copied from tcp_sk under
   the socket lock (Jiayuan Chen)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, test_run: Fix LIVE_FRAME frame update after a page has been recycled
  selftests/bpf: Add three test cases for bits_iter
  bpf: Use __u64 to save the bits in bits iterator
  bpf: Check the validity of nr_words in bpf_iter_bits_new()
  bpf: Add bpf_mem_alloc_check_size() helper
  bpf: Free dynamically allocated bits in bpf_iter_bits_destroy()
  bpf: disallow 40-bytes extra stack for bpf_fastcall patterns
  selftests/bpf: Add test for trie_get_next_key()
  bpf: Fix out-of-bounds write in trie_get_next_key()
  selftests/bpf: Test with a very short loop
  bpf: Force checkpoint when jmp history is too long
  bpf: fix filed access without lock
  sock_map: fix a NULL pointer dereference in sock_map_link_update_prog()
2024-10-31 14:56:19 -10:00
Linus Torvalds
90602c251c Merge tag 'net-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from WiFi, bluetooth and netfilter.

  No known new regressions outstanding.

  Current release - regressions:

   - wifi: mt76: do not increase mcu skb refcount if retry is not
     supported

  Current release - new code bugs:

   - wifi:
      - rtw88: fix the RX aggregation in USB 3 mode
      - mac80211: fix memory corruption bug in struct ieee80211_chanctx

  Previous releases - regressions:

   - sched:
      - stop qdisc_tree_reduce_backlog on TC_H_ROOT
      - sch_api: fix xa_insert() error path in tcf_block_get_ext()

   - wifi:
      - revert "wifi: iwlwifi: remove retry loops in start"
      - cfg80211: clear wdev->cqm_config pointer on free

   - netfilter: fix potential crash in nf_send_reset6()

   - ip_tunnel: fix suspicious RCU usage warning in ip_tunnel_find()

   - bluetooth: fix null-ptr-deref in hci_read_supported_codecs

   - eth: mlxsw: add missing verification before pushing Tx header

   - eth: hns3: fixed hclge_fetch_pf_reg accesses bar space out of
     bounds issue

  Previous releases - always broken:

   - wifi: mac80211: do not pass a stopped vif to the driver in
     .get_txpower

   - netfilter: sanitize offset and length before calling skb_checksum()

   - core:
      - fix crash when config small gso_max_size/gso_ipv4_max_size
      - skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension

   - mptcp: protect sched with rcu_read_lock

   - eth: ice: fix crash on probe for DPLL enabled E810 LOM

   - eth: macsec: fix use-after-free while sending the offloading packet

   - eth: stmmac: fix unbalanced DMA map/unmap for non-paged SKB data

   - eth: hns3: fix kernel crash when 1588 is sent on HIP08 devices

   - eth: mtk_wed: fix path of MT7988 WO firmware"

* tag 'net-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits)
  net: hns3: fix kernel crash when 1588 is sent on HIP08 devices
  net: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issue
  net: hns3: initialize reset_timer before hclgevf_misc_irq_init()
  net: hns3: don't auto enable misc vector
  net: hns3: Resolved the issue that the debugfs query result is inconsistent.
  net: hns3: fix missing features due to dev->features configuration too early
  net: hns3: fixed reset failure issues caused by the incorrect reset type
  net: hns3: add sync command to sync io-pgtable
  net: hns3: default enable tx bounce buffer when smmu enabled
  netfilter: nft_payload: sanitize offset and length before calling skb_checksum()
  net: ethernet: mtk_wed: fix path of MT7988 WO firmware
  selftests: forwarding: Add IPv6 GRE remote change tests
  mlxsw: spectrum_ipip: Fix memory leak when changing remote IPv6 address
  mlxsw: pci: Sync Rx buffers for device
  mlxsw: pci: Sync Rx buffers for CPU
  mlxsw: spectrum_ptp: Add missing verification before pushing Tx header
  net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension
  Bluetooth: hci: fix null-ptr-deref in hci_read_supported_codecs
  netfilter: nf_reject_ipv6: fix potential crash in nf_send_reset6()
  netfilter: Fix use-after-free in get_info()
  ...
2024-10-31 12:39:58 -10:00
Toke Høiland-Jørgensen
c40dd8c473 bpf, test_run: Fix LIVE_FRAME frame update after a page has been recycled
The test_run code detects whether a page has been modified and
re-initialises the xdp_frame structure if it has, using
xdp_update_frame_from_buff(). However, xdp_update_frame_from_buff()
doesn't touch frame->mem, so that wasn't correctly re-initialised, which
led to the pages from page_pool not being returned correctly. Syzbot
noticed this as a memory leak.

Fix this by also copying the frame->mem structure when re-initialising
the frame, like we do on initialisation of a new page from page_pool.

Fixes: e5995bc7e2 ("bpf, test_run: fix crashes due to XDP frame overwriting/corruption")
Fixes: b530e9e106 ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
Reported-by: syzbot+d121e098da06af416d23@syzkaller.appspotmail.com
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: syzbot+d121e098da06af416d23@syzkaller.appspotmail.com
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://lore.kernel.org/bpf/20241030-test-run-mem-fix-v1-1-41e88e8cae43@redhat.com
2024-10-31 16:15:21 +01:00
Paolo Abeni
50ae879de1 Merge tag 'nf-24-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
The following patchset contains Netfilter fixes for net:

1) Remove unused parameters in conntrack_dump_flush.c used by
   selftests, from Liu Jing.

2) Fix possible UaF when removing xtables module via getsockopt()
   interface, from Dong Chenchen.

3) Fix potential crash in nf_send_reset6() reported by syzkaller.
   From Eric Dumazet

4) Validate offset and length before calling skb_checksum()
   in nft_payload, otherwise hitting BUG() is possible.

netfilter pull request 24-10-31

* tag 'nf-24-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_payload: sanitize offset and length before calling skb_checksum()
  netfilter: nf_reject_ipv6: fix potential crash in nf_send_reset6()
  netfilter: Fix use-after-free in get_info()
  selftests: netfilter: remove unused parameter
====================

Link: https://patch.msgid.link/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31 12:13:08 +01:00