We observed an IO hang and an unrecoverable error in our testing environment.
After investigation, we found that bch_allocator_thread was stuck. Its
call stack is as follows:
[<0>] __switch_to+0xbc/0x108
[<0>] __closure_sync+0x7c/0xbc [bcache]
[<0>] bch_prio_write+0x430/0x448 [bcache]
[<0>] bch_allocator_thread+0xb44/0xb70 [bcache]
[<0>] kthread+0x124/0x130
[<0>] ret_from_fork+0x10/0x18
Moreover, the RESERVE_BTREE type bucket slots are empty and journal_full
occurs at the same time.
When the cache disk is first used, the sb.nJournal_buckets defaults to 0.
So, only 8 RESERVE_BTREE type buckets are reserved. If the RESERVE_BTREE type
buckets are used up, or btree_check_reserve() fails while a request handles a
btree split, the request is retried repeatedly and waits for the alloc thread
to refill them.
After the alloc thread fills the buckets, it calls bch_prio_write().
If journal_full occurs at the same time, journal_reclaim() and
btree_flush_write() are called sequentially, and journal_write cannot
complete.
This is a low-probability event. We believe that reserving more RESERVE_BTREE
buckets can avoid the worst case.
Fixes: 682811b3ce ("bcache: fix for allocator and register thread race")
Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Signed-off-by: Coly Li <colyli@kernel.org>
Link: https://lore.kernel.org/r/20250527051601.74407-4-colyli@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If the first cpu_node = of_cpu_device_node_get() fails then the cleanup.h
code will try to free "state_node" but it hasn't been initialized yet.
Declare the device_nodes where they are initialized to fix this.
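
The failure mode can be illustrated with a small user-space sketch. It
assumes the GCC/Clang cleanup attribute, which the cleanup.h helpers build
on; put_node(), get_node() and probe() are hypothetical stand-ins, not the
driver's code. Each variable is declared at the point where it is
initialized, so an early return only runs the cleanup handler for variables
that already hold a defined value.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts how many real pointers the cleanup handler has released. */
static int released;

/* Cleanup handler: called with a pointer to the annotated variable. */
static void put_node(int **nodep)
{
	if (*nodep) {
		free(*nodep);
		released++;
	}
}

#define auto_put __attribute__((cleanup(put_node)))

/* Hypothetical stand-in for of_cpu_device_node_get(); fails on demand. */
static int *get_node(int fail)
{
	return fail ? NULL : calloc(1, sizeof(int));
}

/*
 * If both variables were declared uninitialized at the top of the
 * function, the first early return would hand an indeterminate
 * state_node to put_node(). Declaring each one where it is initialized
 * avoids that.
 */
static int probe(int fail_first)
{
	int *cpu_node auto_put = get_node(fail_first);
	if (!cpu_node)
		return -1;	/* state_node is not in scope yet */

	int *state_node auto_put = get_node(0);
	if (!state_node)
		return -2;

	return 0;
}
```

With this layout, probe(1) returns without releasing anything, and probe(0)
releases both nodes on exit.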
Fixes: 5836ebeb4a ("cpuidle: psci: Avoid initializing faux device if no DT idle states are present")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/aDVRcfU8O8sez1x7@stanley.mountain
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The error code is not set correctly if kasprintf() fails. On the
first iteration it would return -EINVAL and subsequent iterations
would return success. Set it to -ENOMEM.
In real life, this allocation will not fail, and if it did the system
would not boot, so this change mostly serves to silence static checker
warnings.
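
The pattern can be sketched in user space as follows. This is not the
driver's code: make_name(), register_name() and add_ranges() are
hypothetical stand-ins, with make_name() playing the role of kasprintf().
Without the explicit assignment on the allocation failure path, ret would
keep whatever value it had before: the initial -EINVAL on the first
iteration, or 0 from an earlier successful iteration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for kasprintf(): returns NULL when told to fail. */
static char *make_name(int i, int fail)
{
	char *s;

	if (fail)
		return NULL;
	s = malloc(16);
	if (s)
		snprintf(s, 16, "range%d", i);
	return s;
}

/* Hypothetical registration step; returns 0 on success. */
static int register_name(const char *name)
{
	return name ? 0 : -EINVAL;
}

/* Returns 0 or a negative errno; fail_at < 0 means no forced failure. */
static int add_ranges(int count, int fail_at)
{
	int ret = -EINVAL;

	for (int i = 0; i < count; i++) {
		char *name = make_name(i, i == fail_at);

		if (!name) {
			ret = -ENOMEM;	/* otherwise ret is stale here */
			break;
		}
		ret = register_name(name);
		free(name);
		if (ret)
			break;
	}
	return ret;
}
```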
Fixes: 04f53540f7 ("ACPI: MRRM: Add /sys files to describe memory ranges")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/aDVTfEm-Jch7FuHG@stanley.mountain
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* for-next/sme-fixes: (35 commits)
arm64/fpsimd: Allow CONFIG_ARM64_SME to be selected
arm64/fpsimd: ptrace: Gracefully handle errors
arm64/fpsimd: ptrace: Mandate SVE payload for streaming-mode state
arm64/fpsimd: ptrace: Do not present register data for inactive mode
arm64/fpsimd: ptrace: Save task state before generating SVE header
arm64/fpsimd: ptrace/prctl: Ensure VL changes leave task in a valid state
arm64/fpsimd: ptrace/prctl: Ensure VL changes do not resurrect stale data
arm64/fpsimd: Make clone() compatible with ZA lazy saving
arm64/fpsimd: Clear PSTATE.SM during clone()
arm64/fpsimd: Consistently preserve FPSIMD state during clone()
arm64/fpsimd: Remove redundant task->mm check
arm64/fpsimd: signal: Use SMSTOP behaviour in setup_return()
arm64/fpsimd: Add task_smstop_sm()
arm64/fpsimd: Factor out {sve,sme}_state_size() helpers
arm64/fpsimd: Clarify sve_sync_*() functions
arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVE
arm64/fpsimd: signal: Consistently read FPSIMD context
arm64/fpsimd: signal: Mandate SVE payload for streaming-mode state
arm64/fpsimd: signal: Clear PSTATE.SM when restoring FPSIMD frame only
arm64/fpsimd: Do not discard modified SVE state
...
* for-next/selftests:
kselftest/arm64: Set default OUTPUT path when undefined
kselftest/arm64: fp-ptrace: Adjust to new inactive mode behaviour
kselftest/arm64: fp-ptrace: Adjust to new VL change behaviour
kselftest/arm64: tpidr2: Adjust to new clone() behaviour
kselftest/arm64: fp-ptrace: Fix expected FPMR value when PSTATE.SM is changed
* for-next/mm:
arm64/boot: Disallow BSS exports to startup code
arm64/boot: Move global CPU override variables out of BSS
arm64/boot: Move init_pgdir[] and init_idmap_pgdir[] into __pi_ namespace
arm64: mm: Drop redundant check in pmd_trans_huge()
arm64/mm: Permit lazy_mmu_mode to be nested
arm64/mm: Disable barrier batching in interrupt contexts
arm64/mm: Batch barriers when updating kernel mappings
mm/vmalloc: Enter lazy mmu mode while manipulating vmalloc ptes
arm64/mm: Support huge pte-mapped pages in vmap
mm/vmalloc: Gracefully unmap huge ptes
mm/vmalloc: Warn on improper use of vunmap_range()
arm64/mm: Hoist barriers out of set_ptes_anysz() loop
arm64: hugetlb: Use __set_ptes_anysz() and __ptep_get_and_clear_anysz()
arm64/mm: Refactor __set_ptes() and __ptep_get_and_clear()
mm/page_table_check: Batch-check pmds/puds just like ptes
arm64: hugetlb: Refine tlb maintenance scope
arm64: hugetlb: Cleanup huge_pte size discovery mechanisms
arm64: pageattr: Explicitly bail out when changing permissions for vmalloc_huge mappings
arm64: Support ARM64_VA_BITS=52 when setting ARCH_MMAP_RND_BITS_MAX
arm64/mm: Remove randomization of the linear map
* for-next/misc:
arm64/cpuinfo: only show one cpu's info in c_show()
arm64: Extend pr_crit message on invalid FDT
arm64: Kconfig: remove unnecessary selection of CRC32
arm64: Add missing includes for mem_encrypt
Merge in for-next/fixes, as subsequent improvements to our early PI
code that disallow BSS exports depend on the 'arm64_use_ng_mappings'
fix here.
* for-next/fixes:
arm64: cpufeature: Move arm64_use_ng_mappings to the .data section to prevent wrong idmap generation
arm64: errata: Add missing sentinels to Spectre-BHB MIDR arrays
Fixes a probe failure that occurs when dual SPI controllers are
enabled and INTx interrupts are used. Reduces the minimum required
number of interrupt vectors to 1 and registers a shared ISR when
the allocated vectors are fewer than the number of controllers.
This change ensures that the probe succeeds even with limited
vectors, restoring INTx functionality when multiple SPI
controllers are present.
Signed-off-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Link: https://patch.msgid.link/20250527103244.26861-1-thangaraj.s@microchip.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Driver gets and enables all regulator supplies in probe path
(wcd9335_parse_dt() and wcd9335_power_on_reset()), but does not clean up
in final error paths and in unbind (missing remove() callback). This
leads to leaked memory and unbalanced regulator enable count during
probe errors or unbind.
Fix this by converting the entire code to devm_regulator_bulk_get_enable()
which also greatly simplifies the code.
Fixes: 20aedafdf4 ("ASoC: wcd9335: add support to wcd9335 codec")
Cc: stable@vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250526-b4-b4-asoc-wcd9395-vdd-px-fixes-v1-1-0b8a2993b7d3@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Subbaraya Sundeep says:
====================
octeontx2-pf: Do not detect MACSEC block based on silicon
Out of various silicon variants of CN10K series some have hardware
MACSEC block for offloading MACSEC operations and some do not.
AF driver already has the information of whether MACSEC is present
or not on running silicon. Hence fetch that information from
AF via mailbox message.
====================
Link: https://patch.msgid.link/1747894516-4565-1-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Syzkaller reports the following issue:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:578
__mutex_lock+0x106/0xe80 kernel/locking/mutex.c:746
team_change_rx_flags+0x38/0x220 drivers/net/team/team_core.c:1781
dev_change_rx_flags net/core/dev.c:9145 [inline]
__dev_set_promiscuity+0x3f8/0x590 net/core/dev.c:9189
netif_set_promiscuity+0x50/0xe0 net/core/dev.c:9201
dev_set_promiscuity+0x126/0x260 net/core/dev_api.c:286
packet_dev_mc net/packet/af_packet.c:3698 [inline]
packet_dev_mclist_delete net/packet/af_packet.c:3722 [inline]
packet_notifier+0x292/0xa60 net/packet/af_packet.c:4247
notifier_call_chain+0x1b3/0x3e0 kernel/notifier.c:85
call_netdevice_notifiers_extack net/core/dev.c:2214 [inline]
call_netdevice_notifiers net/core/dev.c:2228 [inline]
unregister_netdevice_many_notify+0x15d8/0x2330 net/core/dev.c:11972
rtnl_delete_link net/core/rtnetlink.c:3522 [inline]
rtnl_dellink+0x488/0x710 net/core/rtnetlink.c:3564
rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6955
netlink_rcv_skb+0x219/0x490 net/netlink/af_netlink.c:2534
Calling `PACKET_ADD_MEMBERSHIP` on an ops-locked device can trigger
the `NETDEV_UNREGISTER` notifier, which may require disabling promiscuous
and/or allmulti mode. Both of these operations require acquiring
the netdev instance lock.
Move the call to `packet_dev_mc` outside of the RCU critical section.
The `mclist` modifications (add, del, flush, unregister) are protected by
the RTNL, not the RCU. The RCU only protects the `sklist` and its
associated `sks`. The delayed operation on the `mclist` entry remains
within the RTNL.
Reported-by: syzbot+b191b5ccad8d7a986286@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b191b5ccad8d7a986286
Fixes: ad7c7b2172 ("net: hold netdev instance lock during sysfs operations")
Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250522031129.3247266-1-stfomichev@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
devm_drm_bridge_alloc() is the new API for allocating DRM bridges.
This driver embeds an array of channels in the main struct, and each
channel embeds a drm_bridge. This prevents dynamic, refcount-based
deallocation of the bridges.
To make the new, dynamic bridge allocation possible:
* change the array of channels into an array of channel pointers
* allocate each channel using devm_drm_bridge_alloc()
* adapt the code wherever the channels are used
* remove the is_available flag, since "ch != NULL" is now equivalent
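
The shape of the change can be sketched in plain C. All names below are
hypothetical, the channel count is an assumption, and calloc() stands in
for devm_drm_bridge_alloc(), which is only available in the kernel. The
point is the data layout: an array of pointers whose NULL entries take over
the role of the old is_available flag.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_CHANNELS 2	/* assumed channel count for this sketch */

struct bridge { int id; };	/* stand-in for struct drm_bridge */

struct channel {
	struct bridge bridge;	/* still embedded in each channel */
	int index;
};

struct combiner {
	/* was: struct channel ch[NUM_CHANNELS]; */
	struct channel *ch[NUM_CHANNELS];
};

/* Allocate only the channels selected in mask; the others stay NULL. */
static int combiner_init(struct combiner *c, unsigned int mask)
{
	for (int i = 0; i < NUM_CHANNELS; i++) {
		c->ch[i] = NULL;
		if (!(mask & (1u << i)))
			continue;
		/* calloc() stands in for devm_drm_bridge_alloc() */
		c->ch[i] = calloc(1, sizeof(*c->ch[i]));
		if (!c->ch[i])
			return -1;
		c->ch[i]->index = i;
	}
	return 0;
}

/* A non-NULL pointer replaces the old is_available flag. */
static int combiner_count_available(const struct combiner *c)
{
	int n = 0;

	for (int i = 0; i < NUM_CHANNELS; i++)
		if (c->ch[i])
			n++;
	return n;
}
```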
Reviewed-by: Liu Ying <victor.liu@nxp.com>
Link: https://lore.kernel.org/r/20250509-drm-bridge-convert-to-alloc-api-v3-18-b8bc1f16d7aa@bootlin.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
There was an issue with SO_LINGER: instead of blocking until all queued
messages for the socket have been successfully sent (or the linger timeout
has been reached), close() would block until packets were handled by the
peer.
Add a test to alert on close() lingering when it should not.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20250522-vsock-linger-v6-5-2ad00b0e447e@rbox.co
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Lingering should be transport-independent in the long run. In preparation
for supporting other transports, as well as the linger on shutdown(), move
code to core.
Generalize by querying vsock_transport::unsent_bytes() and guard against the
callback being unimplemented. Do not pass sk_lingertime explicitly. Pull
the SOCK_LINGER check into vsock_linger().
Flatten the function. Remove the nested block by inverting the condition:
return early on !timeout.
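
The control flow described above can be sketched roughly as follows. This
is a simplified user-space model, not the kernel function: struct
sock_sketch, struct transport and should_linger() are hypothetical, and the
actual wait loop is omitted. It only shows the early returns and the guard
around an optional callback.

```c
#include <assert.h>
#include <stddef.h>

struct transport {
	long (*unsent_bytes)(void *sk);	/* optional, may be NULL */
};

struct sock_sketch {
	int linger_enabled;	/* stand-in for the SOCK_LINGER flag */
	long timeout;		/* stand-in for sk_lingertime */
	long pending;		/* bytes not yet sent */
	const struct transport *transport;
};

/* Example callback reporting the pending byte count. */
static long pending_bytes(void *sk)
{
	return ((struct sock_sketch *)sk)->pending;
}

/* Returns 1 if close() would have to wait, 0 otherwise. */
static int should_linger(struct sock_sketch *sk)
{
	if (!sk->linger_enabled)
		return 0;
	if (!sk->timeout)		/* early return instead of nesting */
		return 0;
	if (!sk->transport->unsent_bytes)	/* callback not implemented */
		return 0;

	return sk->transport->unsent_bytes(sk) > 0;
}
```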
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250522-vsock-linger-v6-2-2ad00b0e447e@rbox.co
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently vsock's lingering effectively boils down to waiting (or timing
out) until packets are consumed or dropped by the peer; be it by receiving
the data, closing or shutting down the connection.
To align with the semantics described in the SO_LINGER section of man
socket(7) and to mimic AF_INET's behaviour more closely, change the logic
of a lingering close(): instead of waiting for all data to be handled,
block until data is considered sent from the vsock's transport point of
view. That is, until the worker picks up the packets for processing and
decrements virtio_vsock_sock::bytes_unsent down to 0.
Note that (some interpretation of) lingering was always limited to
transports that called virtio_transport_wait_close() on transport release.
This does not change, i.e. under Hyper-V and VMCI no lingering would be
observed.
The implementation does not adhere strictly to the man page's interpretation
of SO_LINGER: shutdown() will not trigger the lingering. This follows AF_INET.
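
For reference, this is how an application opts in to a lingering close()
from user space. The sketch uses only the standard SO_LINGER socket option;
set_linger() and get_linger() are hypothetical helper names, and fd can be
any stream socket, including an AF_VSOCK one.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Enable SO_LINGER with the given timeout; returns 0 or -1 (errno set). */
static int set_linger(int fd, int seconds)
{
	struct linger lg = { .l_onoff = 1, .l_linger = seconds };

	return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* Read back the current setting; returns 0 or -1 (errno set). */
static int get_linger(int fd, struct linger *lg)
{
	socklen_t len = sizeof(*lg);

	return getsockopt(fd, SOL_SOCKET, SO_LINGER, lg, &len);
}
```

After set_linger(fd, 5), a close(fd) on a socket with queued data may block
for up to 5 seconds.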
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250522-vsock-linger-v6-1-2ad00b0e447e@rbox.co
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When moving the Sitronix DRM drivers and renaming their Kconfig symbols,
the old symbols were kept, aiming to provide a seamless migration path
when running "make olddefconfig" or "make oldconfig".
However, the old compatibility symbols are not visible. Hence unless
they are selected by another symbol (which they are not), they can never
be enabled, and no backwards compatibility is provided.
Drop the broken mechanism and the old symbols.
Fixes: 9b8f32002c ("drm/sitronix: move tiny Sitronix drivers to their own subdir")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://lore.kernel.org/r/20395b14effe5e2e05a4f0856fdcda51c410329d.1747751592.git.geert+renesas@glider.be
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Add support for the MaxLinear MxL86110 Gigabit Ethernet PHY, a low-power,
cost-optimized transceiver supporting 10/100/1000 Mbps over twisted-pair
copper, compliant with IEEE 802.3.
The driver implements basic features such as:
- Device initialization
- RGMII interface timing configuration
- Wake-on-LAN support
- LED initialization and control via /sys/class/leds
This driver has been tested on multiple Variscite boards, including:
- VAR-SOM-MX93 (i.MX93)
- VAR-SOM-MX8M-PLUS (i.MX8MP)
Example boot log showing driver probe:
[ 7.692101] imx-dwmac 428a0000.ethernet eth0:
PHY [stmmac-0:00] driver [MXL86110 Gigabit Ethernet] (irq=POLL)
Signed-off-by: Stefano Radaelli <stefano.radaelli21@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250521212821.593057-1-stefano.radaelli21@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jason A. Donenfeld says:
====================
wireguard updates for 6.16
This small series contains mostly cleanups and one new feature:
1) Kees' __nonstring annotation comes to wireguard.
2) Two selftest fixes, one to help with compilation on gcc 15, and one
removing stale config options.
3) Adoption of NLA_POLICY_MASK.
4) Jordan has added the ability to run:
# wg set ... peer ... allowed-ips -192.168.1.0/24
Which will remove the allowed IP for that peer. Previously you had to
replace all the IPs non-atomically, or move it to a dummy peer
atomically, which wasn't very clean.
====================
Link: https://patch.msgid.link/20250521212707.1767879-1-Jason@zx2c4.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The current netlink API for WireGuard does not directly support removal
of allowed ips from a peer. A user can remove an allowed ip from a peer
in one of two ways:
1. By using the WGPEER_F_REPLACE_ALLOWEDIPS flag and providing a new
list of allowed ips which omits the allowed ip that is to be removed.
2. By reassigning an allowed ip to a "dummy" peer then removing that
peer with WGPEER_F_REMOVE_ME.
With the first approach, the driver completely rebuilds the allowed ip
list for a peer. If my current configuration is such that a peer has
allowed ips 192.168.0.2 and 192.168.0.3 and I want to remove 192.168.0.2,
the actual transition looks like this:
[192.168.0.2, 192.168.0.3] <-- Initial state
[] <-- Step 1: Allowed ips removed for peer
[192.168.0.3] <-- Step 2: Allowed ips added back for peer
This is true even if the allowed ip list is small and the update does
not need to be batched into multiple WG_CMD_SET_DEVICE requests, as the
removal and subsequent addition of ips is non-atomic within a single
request. Consequently, wg_allowedips_lookup_dst and
wg_allowedips_lookup_src may return NULL while reconfiguring a peer, even
for packets bound for ips a user did not intend to remove, leading to
unintended interruptions in connectivity. This presents in userspace as
failed calls to sendto and sendmsg for UDP sockets. In my case, I ran
netperf while repeatedly reconfiguring the allowed ips for a peer with
wg.
/usr/local/bin/netperf -H 10.102.73.72 -l 10m -t UDP_STREAM -- -R 1 -m 1024
send_data: data send error: No route to host (errno 113)
netperf: send_omni: send_data failed: No route to host
While this may not be of particular concern for environments where peers
and allowed ips are mostly static, systems like Cilium manage peers and
allowed ips in a dynamic environment where peers (i.e. Kubernetes nodes)
and allowed ips (i.e. pods running on those nodes) can change frequently,
making WGPEER_F_REPLACE_ALLOWEDIPS problematic.
The second approach avoids any possible connectivity interruptions
but is hacky and less direct, requiring the creation of a temporary
peer just to dispose of an allowed ip.
Introduce a new flag called WGALLOWEDIP_F_REMOVE_ME. In the same way
that WGPEER_F_REMOVE_ME allows a user to remove a single peer from a
WireGuard device's configuration, the new flag allows a user to remove an
ip from a peer's set of allowed ips. This enables incremental updates to
a device's configuration without any connectivity blips or messy
workarounds.
A corresponding patch for wg extends the existing `wg set` interface to
leverage this feature.
$ wg set wg0 peer <PUBKEY> allowed-ips +192.168.88.0/24,-192.168.0.1/32
When '+' or '-' is prepended to any ip in the list, wg clears
WGPEER_F_REPLACE_ALLOWEDIPS and sets the WGALLOWEDIP_F_REMOVE_ME flag on
any ip prefixed with '-'.
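
The prefix handling described above can be sketched as a small parser. This
is not the wg tool's code: parse_allowed_ips(), struct parsed and the
SKETCH_* flag values are hypothetical and local to the example, standing in
for WGPEER_F_REPLACE_ALLOWEDIPS and WGALLOWEDIP_F_REMOVE_ME.

```c
#include <assert.h>
#include <string.h>

/* Illustrative values only, not copied from the uapi header. */
#define SKETCH_PEER_REPLACE_ALLOWEDIPS	(1u << 0)
#define SKETCH_ALLOWEDIP_REMOVE_ME	(1u << 0)

#define MAX_IPS 8

struct parsed {
	unsigned int peer_flags;
	unsigned int ip_flags[MAX_IPS];
	char ips[MAX_IPS][64];
	int count;
};

/* Split a comma-separated list; returns 0 on success, -1 on overflow. */
static int parse_allowed_ips(const char *list, struct parsed *out)
{
	const char *p = list;

	memset(out, 0, sizeof(*out));
	/* Default: the given list replaces the peer's whole set. */
	out->peer_flags = SKETCH_PEER_REPLACE_ALLOWEDIPS;

	while (*p) {
		size_t len = strcspn(p, ",");
		const char *ip = p;
		size_t iplen = len;
		unsigned int flags = 0;

		if (out->count == MAX_IPS)
			return -1;
		if (iplen && (*ip == '+' || *ip == '-')) {
			/* Any prefix switches to incremental mode. */
			out->peer_flags &= ~SKETCH_PEER_REPLACE_ALLOWEDIPS;
			if (*ip == '-')
				flags = SKETCH_ALLOWEDIP_REMOVE_ME;
			ip++;
			iplen--;
		}
		if (iplen >= sizeof(out->ips[0]))
			return -1;
		memcpy(out->ips[out->count], ip, iplen);
		out->ips[out->count][iplen] = '\0';
		out->ip_flags[out->count++] = flags;

		p += len;
		if (*p == ',')
			p++;
	}
	return 0;
}
```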
Signed-off-by: Jordan Rife <jordan@jrife.io>
[Jason: minor style nits, fixes to selftest, bump of wireguard-tools version]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20250521212707.1767879-5-Jason@zx2c4.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>