This commit appends a common "sysdata" suffix to functions responsible
for appending data to sysdata.
This change enhances code clarity and prevents naming conflicts with
other "append" functions, particularly in anticipation of the upcoming
inclusion of the `release` field in the next patch.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250314-netcons_release-v1-3-07979c4b86af@debian.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The #if check causes a build failure when CONFIG_DEBUG_FS is turned
off:
In file included from drivers/net/ethernet/airoha/airoha_eth.c:17:
drivers/net/ethernet/airoha/airoha_eth.h:543:5: error: "CONFIG_DEBUG_FS" is not defined, evaluates to 0 [-Werror=undef]
543 | #if CONFIG_DEBUG_FS
| ^~~~~~~~~~~~~~~
Replace it with the correct #ifdef.
Fixes: 3fe15c640f ("net: airoha: Introduce PPE debugfs support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250314155009.4114308-1-arnd@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Martin KaFai Lau says:
====================
pull-request: bpf-next 2025-03-13
The following pull-request contains BPF updates for your *net-next* tree.
We've added 4 non-merge commits during the last 3 day(s) which contain
a total of 2 files changed, 35 insertions(+), 12 deletions(-).
The main changes are:
1) bpf_getsockopt support for TCP_BPF_RTO_MIN and TCP_BPF_DELACK_MAX,
from Jason Xing
bpf-next-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
selftests/bpf: Add bpf_getsockopt() for TCP_BPF_DELACK_MAX and TCP_BPF_RTO_MIN
tcp: bpf: Support bpf_getsockopt for TCP_BPF_DELACK_MAX
tcp: bpf: Support bpf_getsockopt for TCP_BPF_RTO_MIN
tcp: bpf: Introduce bpf_sol_tcp_getsockopt to support TCP_BPF flags
====================
Link: https://patch.msgid.link/20250313221620.2512684-1-martin.lau@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Cross-merge networking fixes after downstream PR (net-6.14-rc8).
Conflict:
tools/testing/selftests/net/Makefile
03544faad7 ("selftest: net: add proc_net_pktgen")
3ed61b8938 ("selftests: net: test for lwtunnel dst ref loops")
tools/testing/selftests/net/config:
85cb3711ac ("selftests: net: Add test cases for link and peer netns")
3ed61b8938 ("selftests: net: test for lwtunnel dst ref loops")
Adjacent commits:
tools/testing/selftests/net/Makefile
c935af429e ("selftests: net: add support for testing SO_RCVMARK and SO_RCVPRIORITY")
355d940f4d ("Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pull networking fixes from Paolo Abeni:
"Including fixes from can, bluetooth and ipsec.
This contains a last minute revert of a recent GRE patch, mostly to
allow me stating there are no known regressions outstanding.
Current release - regressions:
- revert "gre: Fix IPv6 link-local address generation."
- eth: ti: am65-cpsw: fix NAPI registration sequence
Previous releases - regressions:
- ipv6: fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().
- mptcp: fix data stream corruption in the address announcement
- bluetooth: fix connection regression between LE and non-LE adapters
- can:
- flexcan: only change CAN state when link up in system PM
- ucan: fix out of bound read in strscpy() source
Previous releases - always broken:
- lwtunnel: fix reentry loops
- ipv6: fix TCP GSO segmentation with NAT
- xfrm: force software GSO only in tunnel mode
- eth: ti: icssg-prueth: add lock to stats
Misc:
- add Andrea Mayer as a maintainer of SRv6"
* tag 'net-6.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
MAINTAINERS: Add Andrea Mayer as a maintainer of SRv6
Revert "gre: Fix IPv6 link-local address generation."
Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."
net/neighbor: add missing policy for NDTPA_QUEUE_LENBYTES
tools headers: Sync uapi/asm-generic/socket.h with the kernel sources
mptcp: Fix data stream corruption in the address announcement
selftests: net: test for lwtunnel dst ref loops
net: ipv6: ioam6: fix lwtunnel_output() loop
net: lwtunnel: fix recursion loops
net: ti: icssg-prueth: Add lock to stats
net: atm: fix use after free in lec_send()
xsk: fix an integer overflow in xp_create_and_assign_umem()
net: stmmac: dwc-qos-eth: use devm_kzalloc() for AXI data
selftests: drv-net: use defer in the ping test
phy: fix xa_alloc_cyclic() error handling
dpll: fix xa_alloc_cyclic() error handling
devlink: fix xa_alloc_cyclic() error handling
ipv6: Set errno after ip_fib_metrics_init() in ip6_route_info_create().
ipv6: Fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().
net: ipv6: fix TCP GSO segmentation with NAT
...
Pull rdma fixes from Jason Gunthorpe:
"Collected driver fixes from the last few weeks, I was surprised how
significant many of them seemed to be.
- Fix rdma-core test failures due to wrong startup ordering in rxe
- Don't crash in bnxt_re if the FW supports more than 64k QPs
- Fix wrong QP table indexing math in bnxt_re
- Calculate the max SRQs for userspace properly in bnxt_re
- Don't try to do math on errno for mlx5's rate calculation
- Properly allow userspace to control the VLAN in the QP state during
INIT->RTR for bnxt_re
- 6 bug fixes for HNS:
- Soft lockup when processing huge MRs, add a cond_resched()
- Fix missed error unwind for doorbell allocation
- Prevent bad send queue parameters from userspace
- Wrong error unwind in qp creation
- Missed xa_destroy during driver shutdown
- Fix reporting to userspace of max_sge_rd, hns doesn't have a
read/write difference"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/hns: Fix wrong value of max_sge_rd
RDMA/hns: Fix missing xa_destroy()
RDMA/hns: Fix a missing rollback in error path of hns_roce_create_qp_common()
RDMA/hns: Fix invalid sq params not being blocked
RDMA/hns: Fix unmatched condition in error path of alloc_user_qp_db()
RDMA/hns: Fix soft lockup during bt pages loop
RDMA/bnxt_re: Avoid clearing VLAN_ID mask in modify qp path
RDMA/mlx5: Handle errors returned from mlx5r_ib_rate()
RDMA/bnxt_re: Fix reporting maximum SRQs on P7 chips
RDMA/bnxt_re: Add missing paranthesis in map_qp_id_to_tbl_indx
RDMA/bnxt_re: Fix allocation of QP table
RDMA/rxe: Fix the failure of ibv_query_device() and ibv_query_device_ex() tests
Pull EFI fixes from Ard Biesheuvel:
"Here's a final batch of EFI fixes for v6.14.
The efivarfs ones are fixes for changes that were made this cycle.
James's fix is somewhat of a band-aid, but it was blessed by the VFS
folks, who are working with James to come up with something better for
the next cycle.
- Avoid physical address 0x0 for random page allocations
- Add correct lockdep annotation when traversing efivarfs on resume
- Avoid NULL mount in kernel_file_open() when traversing efivarfs on
resume"
* tag 'efi-fixes-for-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efivarfs: fix NULL dereference on resume
efivarfs: use I_MUTEX_CHILD nested lock to traverse variables on resume
efi/libstub: Avoid physical address 0x0 when doing random allocation
Guillaume Nault says:
====================
gre: Revert IPv6 link-local address fix.
Following Paolo's suggestion, let's revert the IPv6 link-local address
generation fix for GRE devices. The patch introduced regressions in the
upstream CI, which are still under investigation.
Start by reverting the kselftest that depend on that fix (patch 1), then
revert the kernel code itself (patch 2).
====================
Link: https://patch.msgid.link/cover.1742418408.git.gnault@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steffen Klassert says:
====================
pull request (net): ipsec 2025-03-19
1) Fix tunnel mode TX datapath in packet offload mode
by directly putting it to the xmit path.
From Alexandre Cassen.
2) Force software GSO only in tunnel mode in favor
of potential HW GSO. From Cosmin Ratiu.
ipsec-2025-03-19
* tag 'ipsec-2025-03-19' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm_output: Force software GSO only in tunnel mode
xfrm: fix tunnel mode TX datapath in packet offload mode
====================
Link: https://patch.msgid.link/20250319065513.987135-1-steffen.klassert@secunet.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Simon Wunderlich says:
====================
Here is batman-adv bugfix:
- Ignore own maximum aggregation size during RX, Sven Eckelmann
* tag 'batadv-net-pullrequest-20250318' of git://git.open-mesh.org/linux-merge:
batman-adv: Ignore own maximum aggregation size during RX
====================
Link: https://patch.msgid.link/20250318150035.35356-1-sw@simonwunderlich.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Previous commit 8b5c171bb3 ("neigh: new unresolved queue limits")
introduces new netlink attribute NDTPA_QUEUE_LENBYTES to represent
approximative value for deprecated QUEUE_LEN. However, it forgot to add
the associated nla_policy in nl_ntbl_parm_policy array. Fix it with one
simple NLA_U32 type policy.
Fixes: 8b5c171bb3 ("neigh: new unresolved queue limits")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Link: https://patch.msgid.link/20250315165113.37600-1-linma@zju.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Because of the size restriction in the TCP options space, the MPTCP
ADD_ADDR option is exclusive and cannot be sent with other MPTCP ones.
For this reason, in the linked mptcp_out_options structure, group of
fields linked to different options are part of the same union.
There is a case where the mptcp_pm_add_addr_signal() function can modify
opts->addr, but not ended up sending an ADD_ADDR. Later on, back in
mptcp_established_options, other options will be sent, but with
unexpected data written in other fields due to the union, e.g. in
opts->ext_copy. This could lead to a data stream corruption in the next
packet.
Using an intermediate variable, prevents from corrupting previously
established DSS option. The assignment of the ADD_ADDR option
parameters is now done once we are sure this ADD_ADDR option can be set
in the packet, e.g. after having dropped other suboptions.
Fixes: 1bff1e43a3 ("mptcp: optimize out option generation")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Arthur Mongodin <amongodin@randorisec.fr>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
[ Matt: the commit message has been updated: long lines splits and some
clarifications. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250314-net-mptcp-fix-data-stream-corr-sockopt-v1-1-122dbb249db3@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Justin Iurman says:
====================
net: fix lwtunnel reentry loops
When the destination is the same after the transformation, we enter a
lwtunnel loop. This is true for most of lwt users: ioam6, rpl, seg6,
seg6_local, ila_lwt, and lwt_bpf. It can happen in their input() and
output() handlers respectively, where either dst_input() or dst_output()
is called at the end. It can also happen in xmit() handlers.
Here is an example for rpl_input():
dump_stack_lvl+0x60/0x80
rpl_input+0x9d/0x320
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
[...]
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
ip6_sublist_rcv_finish+0x85/0x90
ip6_sublist_rcv+0x236/0x2f0
... until rpl_do_srh() fails, which means skb_cow_head() failed.
This series provides a fix at the core level of lwtunnel to catch such
loops when they're not caught by the respective lwtunnel users, and
handle the loop case in ioam6 which is one of the users. This series
also comes with a new selftest to detect some dst cache reference loops
in lwtunnel users.
====================
Link: https://patch.msgid.link/20250314120048.12569-1-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
As recently specified by commit 0ea09cbf83 ("docs: netdev: add a note
on selftest posting") in net-next, the selftest is therefore shipped in
this series. However, this selftest does not really test this series. It
needs this series to avoid crashing the kernel. What it really tests,
thanks to kmemleak, is what was fixed by the following commits:
- commit c71a192976 ("net: ipv6: fix dst refleaks in rpl, seg6 and
ioam6 lwtunnels")
- commit 92191dd107 ("net: ipv6: fix dst ref loops in rpl, seg6 and
ioam6 lwtunnels")
- commit c64a0727f9 ("net: ipv6: fix dst ref loop on input in seg6
lwt")
- commit 13e55fbaec ("net: ipv6: fix dst ref loop on input in rpl
lwt")
- commit 0e7633d7b9 ("net: ipv6: fix dst ref loop in ila lwtunnel")
- commit 5da15a9c11 ("net: ipv6: fix missing dst ref drop in ila
lwtunnel")
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://patch.msgid.link/20250314120048.12569-4-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Fix the lwtunnel_output() reentry loop in ioam6_iptunnel when the
destination is the same after transformation. Note that a check on the
destination address was already performed, but it was not enough. This
is the example of a lwtunnel user taking care of loops without relying
only on the last resort detection offered by lwtunnel.
Fixes: 8cb3bf8bff ("ipv6: ioam: Add support for the ip6ip6 encapsulation")
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://patch.msgid.link/20250314120048.12569-3-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This patch acts as a parachute, catch all solution, by detecting
recursion loops in lwtunnel users and taking care of them (e.g., a loop
between routes, a loop within the same route, etc). In general, such
loops are the consequence of pathological configurations. Each lwtunnel
user is still free to catch such loops early and do whatever they want
with them. It will be the case in a separate patch for, e.g., seg6 and
seg6_local, in order to provide drop reasons and update statistics.
Another example of a lwtunnel user taking care of loops is ioam6, which
has valid use cases that include loops (e.g., inline mode), and which is
addressed by the next patch in this series. Overall, this patch acts as
a last resort to catch loops and drop packets, since we don't want to
leak something unintentionally because of a pathological configuration
in lwtunnels.
The solution in this patch reuses dev_xmit_recursion(),
dev_xmit_recursion_inc(), and dev_xmit_recursion_dec(), which seems fine
considering the context.
Closes: https://lore.kernel.org/netdev/2bc9e2079e864a9290561894d2a602d6@akamai.com/
Closes: https://lore.kernel.org/netdev/Z7NKYMY7fJT5cYWu@shredder/
Fixes: ffce41962e ("lwtunnel: support dst output redirect function")
Fixes: 2536862311 ("lwt: Add support to redirect dst.input")
Fixes: 14972cbd34 ("net: lwtunnel: Handle fragmentation")
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://patch.msgid.link/20250314120048.12569-2-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently the API emac_update_hardware_stats() reads different ICSSG
stats without any lock protection.
This API gets called by .ndo_get_stats64() which is only under RCU
protection and nothing else. Add lock to this API so that the reading of
statistics happens during lock.
Fixes: c1e10d5dc7 ("net: ti: icssg-prueth: Add ICSSG Stats")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250314102721.1394366-1-danishanwar@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts says:
====================
mptcp: pm: prep work for new ops and sysctl knobs
Here are a few cleanups, preparation work for the new PM ops, and sysctl
knobs.
- Patch 1: reorg: move generic NL code used by all PMs to pm_netlink.c.
- Patch 2: use kmemdup() instead of kmalloc + copy.
- Patch 3: small cleanup to use pm var instead of msk->pm.
- Patch 4: reorg: id_avail_bitmap is only used by the in-kernel PM.
- Patch 5: use struct_group to easily reset a subset of PM data vars.
- Patch 6: introduce the minimal skeleton for the new PM ops.
- Patch 7: register in-kernel and userspace PM ops.
- Patch 8: new net.mptcp.path_manager sysctl knob, deprecating pm_type.
- Patch 9: map the new path_manager sysctl knob with pm_type.
- Patch 10: map the old pm_type sysctl knob with path_manager.
- Patch 11: new net.mptcp.available_path_managers sysctl knob.
- Patch 12: new test to validate path_manager and pm_type mapping.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
====================
Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-0-f4e4a88efc50@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This patch defines the original in-kernel netlink path manager as a
new struct mptcp_pm_ops named "mptcp_pm_kernel", and register it in
mptcp_pm_kernel_register(). And define the userspace path manager as
a new struct mptcp_pm_ops named "mptcp_pm_userspace", and register it
in mptcp_pm_init().
To ensure that there's always a valid path manager available, the default
path manager "mptcp_pm_kernel" will be skipped in mptcp_pm_unregister().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-7-f4e4a88efc50@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
In order to allow users to develop their own BPF-based path manager,
this patch defines a struct ops "mptcp_pm_ops" for an MPTCP path
manager, which contains a set of interfaces. Currently only init()
and release() interfaces are included, subsequent patches will add
others step by step.
Add a set of functions to register, unregister, find and validate a
given path manager struct ops.
"list" is used to add this path manager to mptcp_pm_list list when
it is registered. "name" is used to identify this path manager.
mptcp_pm_find() uses "name" to find a path manager on the list.
mptcp_pm_unregister is not used in this set, but will be invoked in
.unreg of struct bpf_struct_ops. mptcp_pm_validate() will be invoked
in .validate of struct bpf_struct_ops. That's why they are exported.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-6-f4e4a88efc50@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
With the device instance lock, there is now a possibility of a deadlock:
[ 1.211455] ============================================
[ 1.211571] WARNING: possible recursive locking detected
[ 1.211687] 6.14.0-rc5-01215-g032756b4ca7a-dirty #5 Not tainted
[ 1.211823] --------------------------------------------
[ 1.211936] ip/184 is trying to acquire lock:
[ 1.212032] ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_set_allmulti+0x4e/0xb0
[ 1.212207]
[ 1.212207] but task is already holding lock:
[ 1.212332] ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_open+0x50/0xb0
[ 1.212487]
[ 1.212487] other info that might help us debug this:
[ 1.212626] Possible unsafe locking scenario:
[ 1.212626]
[ 1.212751] CPU0
[ 1.212815] ----
[ 1.212871] lock(&dev->lock);
[ 1.212944] lock(&dev->lock);
[ 1.213016]
[ 1.213016] *** DEADLOCK ***
[ 1.213016]
[ 1.213143] May be due to missing lock nesting notation
[ 1.213143]
[ 1.213294] 3 locks held by ip/184:
[ 1.213371] #0: ffffffff838b53e0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock+0x1b/0xa0
[ 1.213543] #1: ffffffff84e5fc70 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock+0x37/0xa0
[ 1.213727] #2: ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_open+0x50/0xb0
[ 1.213895]
[ 1.213895] stack backtrace:
[ 1.213991] CPU: 0 UID: 0 PID: 184 Comm: ip Not tainted 6.14.0-rc5-01215-g032756b4ca7a-dirty #5
[ 1.213993] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 1.213994] Call Trace:
[ 1.213995] <TASK>
[ 1.213996] dump_stack_lvl+0x8e/0xd0
[ 1.214000] print_deadlock_bug+0x28b/0x2a0
[ 1.214020] lock_acquire+0xea/0x2a0
[ 1.214027] __mutex_lock+0xbf/0xd40
[ 1.214038] dev_set_allmulti+0x4e/0xb0 # real_dev->flags & IFF_ALLMULTI
[ 1.214040] vlan_dev_open+0xa5/0x170 # ndo_open on vlandev
[ 1.214042] __dev_open+0x145/0x270
[ 1.214046] __dev_change_flags+0xb0/0x1e0
[ 1.214051] netif_change_flags+0x22/0x60 # IFF_UP vlandev
[ 1.214053] dev_change_flags+0x61/0xb0 # for each device in group from dev->vlan_info
[ 1.214055] vlan_device_event+0x766/0x7c0 # on netdevsim0
[ 1.214058] notifier_call_chain+0x78/0x120
[ 1.214062] netif_open+0x6d/0x90
[ 1.214064] dev_open+0x5b/0xb0 # locks netdevsim0
[ 1.214066] bond_enslave+0x64c/0x1230
[ 1.214075] do_set_master+0x175/0x1e0 # on netdevsim0
[ 1.214077] do_setlink+0x516/0x13b0
[ 1.214094] rtnl_newlink+0xaba/0xb80
[ 1.214132] rtnetlink_rcv_msg+0x440/0x490
[ 1.214144] netlink_rcv_skb+0xeb/0x120
[ 1.214150] netlink_unicast+0x1f9/0x320
[ 1.214153] netlink_sendmsg+0x346/0x3f0
[ 1.214157] __sock_sendmsg+0x86/0xb0
[ 1.214160] ____sys_sendmsg+0x1c8/0x220
[ 1.214164] ___sys_sendmsg+0x28f/0x2d0
[ 1.214179] __x64_sys_sendmsg+0xef/0x140
[ 1.214184] do_syscall_64+0xec/0x1d0
[ 1.214190] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 1.214191] RIP: 0033:0x7f2d1b4a7e56
Device setup:
netdevsim0 (down)
^ ^
bond netdevsim1.100@netdevsim1 allmulticast=on (down)
When we enslave the lower device (netdevsim0) which has a vlan, we
propagate vlan's allmuti/promisc flags during ndo_open. This causes
(re)locking on of the real_dev.
Propagate allmulti/promisc on flags change, not on the open. There
is a slight semantics change that vlans that are down now propagate
the flags, but this seems unlikely to result in the real issues.
Reproducer:
echo 0 1 > /sys/bus/netdevsim/new_device
dev_path=$(ls -d /sys/bus/netdevsim/devices/netdevsim0/net/*)
dev=$(echo $dev_path | rev | cut -d/ -f1 | rev)
ip link set dev $dev name netdevsim0
ip link set dev netdevsim0 up
ip link add link netdevsim0 name netdevsim0.100 type vlan id 100
ip link set dev netdevsim0.100 allmulticast on down
ip link add name bond1 type bond mode 802.3ad
ip link set dev netdevsim0 down
ip link set dev netdevsim0 master bond1
ip link set dev bond1 up
ip link show
Reported-by: syzbot+b0c03d76056ef6cd12a6@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/Z9CfXjLMKn6VLG5d@mini-arch/T/#m15ba130f53227c883e79fb969687d69d670337a0
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250313100657.2287455-1-sdf@fomichev.me
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jacob Keller says:
====================
net: ptp: fix egregious supported flag checks
In preparation for adding .supported_extts_flags and
.supported_perout_flags to the ptp_clock_info structure, fix a couple of
places where drivers get existing flag gets grossly incorrect.
The igb driver claims 82580 supports strictly validating PTP_RISING_EDGE
and PTP_FALLING_EDGE, but doesn't actually check the flags. Fix the driver
to require that the request match both edges, as this is implied by the
datasheet description.
The renesas driver also claims to support strict flag checking, but does
not actually check the flags either. I do not have the data sheet for this
device, so I do not know what edge it timestamps. For simplicity, just
reject all requests with PTP_STRICT_FLAGS. This essentially prevents the
PTP_EXTTS_REQUEST2 ioctl from working. Updating to correctly validate the
flags will require someone who has the hardware to confirm the behavior.
The lan743x driver supports (and strictly validates) that the request is
either PTP_RISING_EDGE or PTP_FALLING_EDGE but not both. However, it does
not check the flags are one of the known valid flags. Thus, requests for
PTP_EXT_OFF (and any future flag) will be accepted and misinterpreted. Add
the appropriate check to reject unsupported PTP_EXT_OFF requests and future
proof against new flags.
The broadcom PHY driver checks that PTP_PEROUT_PHASE is not set. This
appears to be an attempt at rejecting unsupported flags. It is not robust
against flag additions such as the PTP_PEROUT_ONE_SHOT, or anything added
in the future. Fix this by instead checking against the negation of the
supported PTP_PEROUT_DUTY_CYCLE instead.
The ptp_ocp driver supports PTP_PEROUT_PHASE and PTP_PEROUT_DUTY_CYCLE, but
does not check unsupported flags. Add the appropriate check to ensure
PTP_PEROUT_ONE_SHOT and any future flags are rejected as unsupported.
These are changes compile-tested, but I do not have hardware to validate the
behavior.
There are a number of other drivers which enable periodic output or
external timestamp requests, but which do not check flags at all. We could
go through each of these drivers one-by-one and meticulously add a flag
check. Instead, these drivers will be covered only by the upcoming
.supported_extts_flags and .supported_perout_flags checks in a net-next
series.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
====================
Link: https://patch.msgid.link/20250312-jk-net-fixes-supported-extts-flags-v2-0-ea930ba82459@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>