Pull networking fixes from Jakub Kicinski:
"Happy Wear a Dress Day.
Fairly standard-sized batch of fixes, accounting for the lack of
sub-tree submissions this week. The mlx5 IRQ fixes are notable, people
were complaining about that. No fires burning.
Current release - regressions:
- eth: mlx5e:
- multiple fixes for dynamic IRQ allocation
- prevent encap offload when neigh update is running
- eth: mana: fix perf regression: remove rx_cqes, tx_cqes counters
Current release - new code bugs:
- eth: mlx5e: DR, add missing mutex init/destroy in pattern manager
Previous releases - always broken:
- tcp: deny tcp_disconnect() when threads are waiting
- sched: prevent ingress Qdiscs from getting installed in random
locations in the hierarchy and moving around
- sched: flower: fix possible OOB write in fl_set_geneve_opt()
- netlink: fix NETLINK_LIST_MEMBERSHIPS length report
- udp6: fix race condition in udp6_sendmsg & connect
- tcp: fix mishandling when the sack compression is deferred
- rtnetlink: validate link attributes set at creation time
- mptcp: fix connect timeout handling
- eth: stmmac: fix call trace when stmmac_xdp_xmit() is invoked
- eth: amd-xgbe: fix the false linkup in xgbe_phy_status
- eth: mlx5e:
- fix corner cases in internal buffer configuration
- drain health before unregistering devlink
- usb: qmi_wwan: set DTR quirk for BroadMobi BM818
Misc:
- tcp: return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if
user_mss set"
* tag 'net-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits)
mptcp: fix active subflow finalization
mptcp: add annotations around sk->sk_shutdown accesses
mptcp: fix data race around msk->first access
mptcp: consolidate passive msk socket initialization
mptcp: add annotations around msk->subflow accesses
mptcp: fix connect timeout handling
rtnetlink: add the missing IFLA_GRO_ tb check in validate_linkmsg
rtnetlink: move IFLA_GSO_ tb check to validate_linkmsg
rtnetlink: call validate_linkmsg in rtnl_create_link
ice: recycle/free all of the fragments from multi-buffer frame
net: phy: mxl-gpy: extend interrupt fix to all impacted variants
net: renesas: rswitch: Fix return value in error path of xmit
net: dsa: mv88e6xxx: Increase wait after reset deactivation
net: ipa: Use correct value for IPA_STATUS_SIZE
tcp: fix mishandling when the sack compression is deferred.
net/sched: flower: fix possible OOB write in fl_set_geneve_opt()
sfc: fix error unwinds in TC offload
net/mlx5: Read embedded cpu after init bit cleared
net/mlx5e: Fix error handling in mlx5e_refresh_tirs
net/mlx5: Ensure af_desc.mask is properly initialized
...
When switching from kthreads to vhost_tasks two bugs were added:
1. The vhost worker tasks's now show up as processes so scripts doing
ps or ps a would not incorrectly detect the vhost task as another
process. 2. kthreads disabled freeze by setting PF_NOFREEZE, but
vhost tasks's didn't disable or add support for them.
To fix both bugs, this switches the vhost task to be thread in the
process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call
get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that
SIGKILL/STOP support is required because CLONE_THREAD requires
CLONE_SIGHAND which requires those 2 signals to be supported.
This is a modified version of the patch written by Mike Christie
<michael.christie@oracle.com> which was a modified version of patch
originally written by Linus.
Much of what depended upon PF_IO_WORKER now depends on PF_USER_WORKER.
Including ignoring signals, setting up the register state, and having
get_signal return instead of calling do_group_exit.
Tidied up the vhost_task abstraction so that the definition of
vhost_task only needs to be visible inside of vhost_task.c. Making
it easier to review the code and tell what needs to be done where.
As part of this the main loop has been moved from vhost_worker into
vhost_task_fn. vhost_worker now returns true if work was done.
The main loop has been updated to call get_signal which handles
SIGSTOP, freezing, and collects the message that tells the thread to
exit as part of process exit. This collection clears
__fatal_signal_pending. This collection is not guaranteed to
clear signal_pending() so clear that explicitly so the schedule()
sleeps.
For now the vhost thread continues to exist and run work until the
last file descriptor is closed and the release function is called as
part of freeing struct file. To avoid hangs in the coredump
rendezvous and when killing threads in a multi-threaded exec. The
coredump code and de_thread have been modified to ignore vhost threads.
Remvoing the special case for exec appears to require teaching
vhost_dev_flush how to directly complete transactions in case
the vhost thread is no longer running.
Removing the special case for coredump rendezvous requires either the
above fix needed for exec or moving the coredump rendezvous into
get_signal.
Fixes: 6e890c5d50 ("vhost: use vhost_tasks for worker threads")
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Co-developed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Saeed Mahameed says:
====================
mlx5 fixes 2023-05-31
This series provides bug fixes to mlx5 driver.
* tag 'mlx5-fixes-2023-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Read embedded cpu after init bit cleared
net/mlx5e: Fix error handling in mlx5e_refresh_tirs
net/mlx5: Ensure af_desc.mask is properly initialized
net/mlx5: Fix setting of irq->map.index for static IRQ case
net/mlx5: Remove rmap also in case dynamic MSIX not supported
====================
Link: https://lore.kernel.org/r/20230601031051.131529-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mat Martineau says:
====================
mptcp: Fixes for connect timeout, access annotations, and subflow init
Patch 1 allows the SO_SNDTIMEO sockopt to correctly change the connect
timeout on MPTCP sockets.
Patches 2-5 add READ_ONCE()/WRITE_ONCE() annotations to fix KCSAN issues.
Patch 6 correctly initializes some subflow fields on outgoing connections.
====================
Link: https://lore.kernel.org/r/20230531-send-net-20230531-v1-0-47750c420571@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Active subflow are inserted into the connection list at creation time.
When the MPJ handshake completes successfully, a new subflow creation
netlink event is generated correctly, but the current code wrongly
avoid initializing a couple of subflow data.
The above will cause misbehavior on a few exceptional events: unneeded
mptcp-level retransmission on msk-level sequence wrap-around and infinite
mapping fallback even when a MPJ socket is present.
Address the issue factoring out the needed initialization in a new helper
and invoking the latter from __mptcp_finish_join() time for passive
subflow and from mptcp_finish_join() for active ones.
Fixes: 0530020a7c ("mptcp: track and update contiguous data status")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The first subflow socket is accessed outside the msk socket lock
by mptcp_subflow_fail(), we need to annotate each write access
with WRITE_ONCE, but a few spots still lacks it.
Fixes: 76a13b3157 ("mptcp: invoke MP_FAIL response when needed")
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the msk socket is cloned at MPC handshake time, a few
fields are initialized in a racy way outside mptcp_sk_clone()
and the msk socket lock.
The above is due historical reasons: before commit a88d0092b2
("mptcp: simplify subflow_syn_recv_sock()") as the first subflow socket
carrying all the needed date was not available yet at msk creation
time
We can now refactor the code moving the missing initialization bit
under the socket lock, removing the init race and avoiding some
code duplication.
This will also simplify the next patch, as all msk->first write
access are now under the msk socket lock.
Fixes: 0397c6d85f ("mptcp: keep unaccepted MPC subflow into join list")
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The MPTCP can access the first subflow socket in a few spots
outside the socket lock scope. That is actually safe, as MPTCP
will delete the socket itself only after the msk sock close().
Still the such accesses causes a few KCSAN splats, as reported
by Christoph. Silence the harmless warning adding a few annotation
around the relevant accesses.
Fixes: 71ba088ce0 ("mptcp: cleanup accept and poll")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/402
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ondrej reported a functional issue WRT timeout handling on connect
with a nice reproducer.
The problem is that the current mptcp connect waits for both the
MPTCP socket level timeout, and the first subflow socket timeout.
The latter is not influenced/touched by the exposed setsockopt().
Overall the above makes the SO_SNDTIMEO a no-op on connect.
Since mptcp_connect is invoked via inet_stream_connect and the
latter properly handle the MPTCP level timeout, we can address the
issue making the nested subflow level connect always unblocking.
This also allow simplifying a bit the code, dropping an ugly hack
to handle the fastopen and custom proto_ops connect.
The issues predates the blamed commit below, but the current resolution
requires the infrastructure introduced there.
Fixes: 54f1944ed6 ("mptcp: factor out mptcp_connect()")
Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Xin Long says:
====================
rtnetlink: a couple of fixes in linkmsg validation
validate_linkmsg() was introduced to do linkmsg validation for existing
links. However, the new created links also need this linkmsg validation.
Add validate_linkmsg() check for link creating in Patch 1, and add more
tb checks into validate_linkmsg() in Patch 2 and 3.
====================
Link: https://lore.kernel.org/r/cover.1685548598.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This fixes the issue that dev gro_max_size and gso_ipv4_max_size
can be set to a huge value:
# ip link add dummy1 type dummy
# ip link set dummy1 gro_max_size 4294967295
# ip -d link show dummy1
dummy addrgenmode eui64 ... gro_max_size 4294967295
Fixes: 0fe79f28bf ("net: allow gro_max_size to exceed 65536")
Fixes: 9eefedd58a ("net: add gso_ipv4_max_size and gro_ipv4_max_size per device")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
These IFLA_GSO_* tb check should also be done for the new created link,
otherwise, they can be set to a huge value when creating links:
# ip link add dummy1 gso_max_size 4294967295 type dummy
# ip -d link show dummy1
dummy addrgenmode eui64 ... gso_max_size 4294967295
Fixes: 46e6b992c2 ("rtnetlink: allow GSO maximums to be set on device creation")
Fixes: 9eefedd58a ("net: add gso_ipv4_max_size and gro_ipv4_max_size per device")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
validate_linkmsg() was introduced by commit 1840bb13c2 ("[RTNL]:
Validate hardware and broadcast address attribute for RTM_NEWLINK")
to validate tb[IFLA_ADDRESS/BROADCAST] for existing links. The same
check should also be done for newly created links.
This patch adds validate_linkmsg() call in rtnl_create_link(), to
avoid the invalid address set when creating some devices like:
# ip link add dummy0 type dummy
# ip link add link dummy0 name mac0 address 01:02 type macsec
Fixes: 0e06877c6f ("[RTNETLINK]: rtnl_link: allow specifying initial device address")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull firewire fix from Takashi Sakamoto:
"A single patch to use a flexible array rather than a zero-length one"
* tag 'firewire-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: Replace zero-length array with flexible-array member
Pull mailbox fix from Jassi Brar:
"Fix missing mutex unlock in mailbox-test"
* tag 'mailbox-fixes-6.4-rc5' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: mailbox-test: fix a locking issue in mbox_test_message_write()
A switch held in reset by default needs to wait longer until we can
reliably detect it.
An issue was observed when testing on the Marvell 88E6393X (Link Street).
The driver failed to detect the switch on some upstarts. Increasing the
wait time after reset deactivation solves this issue.
The updated wait time is now also the same as the wait time in the
mv88e6xxx_hardware_reset function.
Fixes: 7b75e49de4 ("net: dsa: mv88e6xxx: wait after reset deactivation")
Signed-off-by: Andreas Svensson <andreas.svensson@axis.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230530145223.1223993-1-andreas.svensson@axis.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Zero-length and one-element arrays are deprecated, and we are moving
towards adopting C99 flexible-array members, instead.
Address the following warnings found with GCC-13 and
-fstrict-flex-arrays=3 enabled:
sound/firewire/amdtp-stream.c: In function ‘build_it_pkt_header’:
sound/firewire/amdtp-stream.c:694:17: warning: ‘generate_cip_header’ accessing 8 bytes in a region of size 0 [-Wstringop-overflow=]
694 | generate_cip_header(s, cip_header, data_block_counter, syt);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sound/firewire/amdtp-stream.c:694:17: note: referencing argument 2 of type ‘__be32[2]’ {aka ‘unsigned int[2]’}
sound/firewire/amdtp-stream.c:667:13: note: in a call to function ‘generate_cip_header’
667 | static void generate_cip_header(struct amdtp_stream *s, __be32 cip_header[2],
| ^~~~~~~~~~~~~~~~~~~
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].
Link: https://github.com/KSPP/linux/issues/21
Link: https://github.com/KSPP/linux/issues/303
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/ZHT0V3SpvHyxCv5W@work
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Pull HID fixes from Jiri Kosina:
- Regression fix for overlong long timeouts during initialization on
some Logitech Unifying devices (Bastien Nocera)
- error handling and overflow fixes for Wacom driver (Denis Arefev,
Jason Gerecke, Nikita Zhandarovich)
* tag 'for-linus-2023060101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: logitech-hidpp: Handle timeout differently from busy
HID: wacom: Add error check to wacom_parse_and_register()
HID: google: add jewel USB id
HID: wacom: avoid integer overflow in wacom_intuos_inout()
HID: wacom: Check for string overflow from strscpy calls
Pull ata fix from Damien Le Moal:
- Fix ata_find_dev() use of the device number to find a struct
ata_device for a port. This addresses issues with some passthrough
commands with libsas managed devices.
* tag 'ata-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: libata-scsi: Use correct device no in ata_find_dev()
Pull smb server fixes from Steve French:
"Eight server fixes (most also for stable):
- Two fixes for uninitialized pointer reads (rename and link)
- Fix potential UAF in oplock break
- Two fixes for potential out of bound reads in negotiate
- Fix crediting bug
- Two fixes for xfstests (allocation size fix for test 694 and lookup
issue shown by test 464)"
* tag '6.4-rc4-smb3-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: call putname after using the last component
ksmbd: fix incorrect AllocationSize set in smb2_get_info
ksmbd: fix UAF issue from opinfo->conn
ksmbd: fix multiple out-of-bounds read during context decoding
ksmbd: fix slab-out-of-bounds read in smb2_handle_negotiate
ksmbd: fix credit count leakage
ksmbd: fix uninitialized pointer read in smb2_create_link()
ksmbd: fix uninitialized pointer read in ksmbd_vfs_rename()
If we send two TCA_FLOWER_KEY_ENC_OPTS_GENEVE packets and their total
size is 252 bytes(key->enc_opts.len = 252) then
key->enc_opts.len = opt->length = data_len / 4 = 0 when the third
TCA_FLOWER_KEY_ENC_OPTS_GENEVE packet enters fl_set_geneve_opt. This
bypasses the next bounds check and results in an out-of-bounds.
Fixes: 0a6e77784f ("net/sched: allow flower to match tunnel options")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Link: https://lore.kernel.org/r/20230531102805.27090-1-hbh25y@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
During driver load it reads embedded_cpu bit from initialization
segment, but the initialization segment is readable only after
initialization bit is cleared.
Move the call to mlx5_read_embedded_cpu() right after initialization bit
cleared.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Fixes: 591905ba96 ("net/mlx5: Introduce Mellanox SmartNIC and modify page management logic")
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Allocation failure is outside the critical lock section and should
return immediately rather than jumping to the unlock section.
Also unlock as soon as required and remove the now redundant jump label.
Fixes: 80a2a9026b ("net/mlx5e: Add a lock on tir list")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
When dynamic IRQ allocation is not supported all IRQs are allocated up
front in mlx5_irq_table_create() instead of dynamically as part of
mlx5_irq_alloc(). In the latter dynamic case irq->map.index is set
via the mapping returned by pci_msix_alloc_irq_at(). In the static case
and prior to commit 1da438c0ae ("net/mlx5: Fix indexing of mlx5_irq")
irq->map.index was set in mlx5_irq_alloc() twice once initially to 0 and
then to the requested index before storing in the xarray. After this
commit it is only set to 0 which breaks all other IRQ mappings.
Fix this by setting irq->map.index to the requested index together with
irq->map.virq and improve the related comment to make it clearer which
cases it deals with.
Cc: Chuck Lever III <chuck.lever@oracle.com>
Tested-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Fixes: 1da438c0ae ("net/mlx5: Fix indexing of mlx5_irq")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Pull smb client fixes from Steve French:
"Four small smb3 client fixes:
- two small fixes suggested by kernel test robot
- small cleanup fix
- update Paulo's email address in the maintainer file"
* tag '6.4-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: address unused variable warning
smb: delete an unnecessary statement
smb3: missing null check in SMB2_change_notify
smb3: update a reviewer email in MAINTAINERS file
There was a bug where this code forgot to unlock the tdev->mutex if the
kzalloc() failed. Fix this issue, by moving the allocation outside the
lock.
Fixes: 2d1e952a2b ("mailbox: mailbox-test: Fix potential double-free in mbox_test_message_write()")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Lee Jones <lee@kernel.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Pull rdma fixes from Jason Gunthorpe:
- Fix 64K ARM page size support in bnxt_re and efa
- bnxt_re fixes for a memory leak, incorrect error handling and a
remove a bogus FW failure when running on a VF
- Update MAINTAINERS for hns and efa
- Fix two rxe regressions added this merge window in error unwind and
incorrect spinlock primitives
- hns gets a better algorithm for allocating page tables to avoid
running out of resources, and a timeout adjustment
- Fix a text case failure in hns
- Use after free in irdma and fix incorrect construction of a WQE
causing mis-execution
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/irdma: Fix Local Invalidate fencing
RDMA/irdma: Prevent QP use after free
MAINTAINERS: Update maintainer of Amazon EFA driver
RDMA/bnxt_re: Do not enable congestion control on VFs
RDMA/bnxt_re: Fix return value of bnxt_re_process_raw_qp_pkt_rx
RDMA/bnxt_re: Fix a possible memory leak
RDMA/hns: Modify the value of long message loopback slice
RDMA/hns: Fix base address table allocation
RDMA/hns: Fix timeout attr in query qp for HIP08
RDMA/efa: Fix unsupported page sizes in device
RDMA/rxe: Convert spin_{lock_bh,unlock_bh} to spin_{lock_irqsave,unlock_irqrestore}
RDMA/rxe: Fix double unlock in rxe_qp.c
MAINTAINERS: Update maintainers of HiSilicon RoCE
RDMA/bnxt_re: Fix the page_size used during the MR creation
Pull ext4 fixes from Ted Ts'o:
"Fix two regressions in ext4 and a number of issues reported by syzbot"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: enable the lazy init thread when remounting read/write
ext4: fix fsync for non-directories
ext4: add lockdep annotations for i_data_sem for ea_inode's
ext4: disallow ea_inodes with extended attributes
ext4: set lockdep subclass for the ea_inode in ext4_xattr_inode_cache_find()
ext4: add EA_INODE checking to ext4_iget()
If an attempt at contacting a receiver or a device fails because the
receiver or device never responds, don't restart the communication, only
restart it if the receiver or device answers that it's busy, as originally
intended.
This was the behaviour on communication timeout before commit 586e8fede7
("HID: logitech-hidpp: Retry commands when device is busy").
This fixes some overly long waits in a critical path on boot, when
checking whether the device is connected by getting its HID++ version.
Signed-off-by: Bastien Nocera <hadess@hadess.net>
Suggested-by: Mark Lord <mlord@pobox.com>
Fixes: 586e8fede7 ("HID: logitech-hidpp: Retry commands when device is busy")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217412
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Syzkaller got the following report:
BUG: KASAN: use-after-free in sk_setup_caps+0x621/0x690 net/core/sock.c:2018
Read of size 8 at addr ffff888027f82780 by task syz-executor276/3255
The function sk_setup_caps (called by ip6_sk_dst_store_flow->
ip6_dst_store) referenced already freed memory as this memory was
freed by parallel task in udpv6_sendmsg->ip6_sk_dst_lookup_flow->
sk_dst_check.
task1 (connect) task2 (udp6_sendmsg)
sk_setup_caps->sk_dst_set |
| sk_dst_check->
| sk_dst_set
| dst_release
sk_setup_caps references |
to already freed dst_entry|
The reason for this race condition is: sk_setup_caps() keeps using
the dst after transferring the ownership to the dst cache.
Found by Linux Verification Center (linuxtesting.org) with syzkaller.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Vladislav Efanov <VEfanov@ispras.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When use the following command to test:
1)ip link add bond0 type bond
2)ip link set bond0 up
3)tc qdisc add dev bond0 root handle ffff: mq
4)tc qdisc replace dev bond0 parent ffff:fff1 handle ffff: mq
The kernel reports NULL pointer dereference issue. The stack information
is as follows:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Internal error: Oops: 0000000096000006 [#1] SMP
Modules linked in:
pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mq_attach+0x44/0xa0
lr : qdisc_graft+0x20c/0x5cc
sp : ffff80000e2236a0
x29: ffff80000e2236a0 x28: ffff0000c0e59d80 x27: ffff0000c0be19c0
x26: ffff0000cae3e800 x25: 0000000000000010 x24: 00000000fffffff1
x23: 0000000000000000 x22: ffff0000cae3e800 x21: ffff0000c9df4000
x20: ffff0000c9df4000 x19: 0000000000000000 x18: ffff80000a934000
x17: ffff8000f5b56000 x16: ffff80000bb08000 x15: 0000000000000000
x14: 0000000000000000 x13: 6b6b6b6b6b6b6b6b x12: 6b6b6b6b00000001
x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
x8 : ffff0000c0be0730 x7 : bbbbbbbbbbbbbbbb x6 : 0000000000000008
x5 : ffff0000cae3e864 x4 : 0000000000000000 x3 : 0000000000000001
x2 : 0000000000000001 x1 : ffff8000090bc23c x0 : 0000000000000000
Call trace:
mq_attach+0x44/0xa0
qdisc_graft+0x20c/0x5cc
tc_modify_qdisc+0x1c4/0x664
rtnetlink_rcv_msg+0x354/0x440
netlink_rcv_skb+0x64/0x144
rtnetlink_rcv+0x28/0x34
netlink_unicast+0x1e8/0x2a4
netlink_sendmsg+0x308/0x4a0
sock_sendmsg+0x64/0xac
____sys_sendmsg+0x29c/0x358
___sys_sendmsg+0x90/0xd0
__sys_sendmsg+0x7c/0xd0
__arm64_sys_sendmsg+0x2c/0x38
invoke_syscall+0x54/0x114
el0_svc_common.constprop.1+0x90/0x174
do_el0_svc+0x3c/0xb0
el0_svc+0x24/0xec
el0t_64_sync_handler+0x90/0xb4
el0t_64_sync+0x174/0x178
This is because when mq is added for the first time, qdiscs in mq is set
to NULL in mq_attach(). Therefore, when replacing mq after adding mq, we
need to initialize qdiscs in the mq before continuing to graft. Otherwise,
it will couse NULL pointer dereference issue in mq_attach(). And the same
issue will occur in the attach functions of mqprio, taprio and htb.
ffff:fff1 means that the repalce qdisc is ingress. Ingress does not allow
any qdisc to be attached. Therefore, ffff:fff1 is incorrectly used, and
the command should be dropped.
Fixes: 6ec1c69a8f ("net_sched: add classful multiqueue dummy scheduler")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Tested-by: Peilin Ye <peilin.ye@bytedance.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20230527093747.3583502-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, after creating an ingress (or clsact) Qdisc and grafting it
under TC_H_INGRESS (TC_H_CLSACT), it is possible to graft it again under
e.g. a TBF Qdisc:
$ ip link add ifb0 type ifb
$ tc qdisc add dev ifb0 handle 1: root tbf rate 20kbit buffer 1600 limit 3000
$ tc qdisc add dev ifb0 clsact
$ tc qdisc link dev ifb0 handle ffff: parent 1:1
$ tc qdisc show dev ifb0
qdisc tbf 1: root refcnt 2 rate 20Kbit burst 1600b lat 560.0ms
qdisc clsact ffff: parent ffff:fff1 refcnt 2
^^^^^^^^
clsact's refcount has increased: it is now grafted under both
TC_H_CLSACT and 1:1.
ingress and clsact Qdiscs should only be used under TC_H_INGRESS
(TC_H_CLSACT). Prohibit regrafting them.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Fixes: 1f211a1b92 ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently it is possible to add e.g. an HTB Qdisc under ffff:fff1
(TC_H_INGRESS, TC_H_CLSACT):
$ ip link add name ifb0 type ifb
$ tc qdisc add dev ifb0 parent ffff:fff1 htb
$ tc qdisc add dev ifb0 clsact
Error: Exclusivity flag on, cannot modify.
$ drgn
...
>>> ifb0 = netdev_get_by_name(prog, "ifb0")
>>> qdisc = ifb0.ingress_queue.qdisc_sleeping
>>> print(qdisc.ops.id.string_().decode())
htb
>>> qdisc.flags.value_() # TCQ_F_INGRESS
2
Only allow ingress and clsact Qdiscs under ffff:fff1. Return -EINVAL
for everything else. Make TCQ_F_INGRESS a static flag of ingress and
clsact Qdiscs.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Fixes: 1f211a1b92 ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
clsact Qdiscs are only supposed to be created under TC_H_CLSACT (which
equals TC_H_INGRESS). Return -EOPNOTSUPP if 'parent' is not
TC_H_CLSACT.
Fixes: 1f211a1b92 ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull btrfs fixes from David Sterba:
"One bug fix and two build warning fixes:
- call proper end bio callback for metadata RAID0 in a rare case of
an unaligned block
- fix uninitialized variable (reported by gcc 10.2)
- fix warning about potential access beyond array bounds on mips64
with 64k pages (runtime check would not allow that)"
* tag 'for-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix csum_tree_block page iteration to avoid tripping on -Werror=array-bounds
btrfs: fix an uninitialized variable warning in btrfs_log_inode
btrfs: call btrfs_orig_bbio_end_io in btrfs_end_bio_work
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix BPF CO-RE naming convention for checking the availability of
fields on 'union perf_mem_data_src' on the running kernel
- Remove the use of llvm-strip on BPF skel object files, not needed,
fixes a build breakage when the llvm package, that contains it in
most distros, isn't installed
- Fix tools that use both evsel->{bpf_counter_list,bpf_filters},
removing them from a union
- Remove extra "--" from the 'perf ftrace latency' --use-nsec option,
previously it was working only when using the '-n' alternative
- Don't stop building when both binutils-devel and a C++ compiler isn't
available to compile the alternative C++ demangle support code,
disable that feature instead
- Sync the linux/in.h and coresight-pmu.h header copies with the kernel
sources
- Fix relative include path to cs-etm.h
* tag 'perf-tools-fixes-for-v6.4-2-2023-05-30' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf evsel: Separate bpf_counter_list and bpf_filters, can be used at the same time
tools headers UAPI: Sync the linux/in.h with the kernel sources
perf cs-etm: Copy kernel coresight-pmu.h header
perf bpf: Do not use llvm-strip on BPF binary
perf build: Don't compile demangle-cxx.cpp if not necessary
perf arm: Fix include path to cs-etm.h
perf bpf filter: Fix a broken perf sample data naming for BPF CO-RE
perf ftrace latency: Remove unnecessary "--" from --use-nsec option
Pull regmap fixes from Mark Brown:
"The most important fix here is for missing dropping of the RCU read
lock when syncing maple tree register caches, the physical devices I
have that use the code don't do any syncing so I'd only ever tested
this with virtual devices and missed the fact that we need to drop the
lock in order to write to buses that need to sleep.
Otherwise there's a fix for an edge case when splitting up large batch
writes which has been lurking for a long time, a check to make sure
nobody writes new drivers with a bug that was found in several
SoundWire drivers and a tweak to the way the new kunit tests are
enabled to ensure they don't cause regmap to be enabled when it
wouldn't otherwise be"
* tag 'regmap-fix-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: maple: Drop the RCU read lock while syncing registers
regmap: sdw: check for invalid multi-register writes config
regmap: Account for register length when chunking
regmap: REGMAP_KUNIT should not select REGMAP
Pull modules fix from Luis Chamberlain:
"A fix is provided for ia64. Even though ia64 is on life support it
helps to fix issues if we can. Thanks to Linus for doing tons of the
ia64 debugging"
* tag 'modules-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
module: fix module load for ia64