Kumar Kartikeya Dwivedi says:
====================
Fix bpf_throw() vs global subprogs interaction
There is a bug where bpf_throw()'s reachability across global subprogs
is missed by the verifier, leading to successful verification when any
kernel resource or lock is held across global subprog call boundary.
Fix this by effect summarization like other related side effects and
propagate exception reachability into callees.
Changelog:
----------
v1 -> v2
v1: https://lore.kernel.org/bpf/20260516022426.2109698-1-memxor@gmail.com
* Reorder might_throw bit to avoid bpf-next conflicts. (Alexei)
====================
Link: https://patch.msgid.link/20260517075530.3461166-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a verifier failure case where the caller holds a reference across a
global subprog call that may throw. The program must be rejected because
the exceptional path would skip the caller's reference release.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260517075530.3461166-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Global subprogs are verified independently and are not descended into
when their callers are symbolically executed. This means a caller can
hold references or locks across a global subprog call that may throw,
while the verifier only checks the non-exceptional return path at the
call site.
Record whether a subprog might throw in the CFG summary pass, alongside
the existing might_sleep and packet-data-changing summaries, and
propagate that effect through reachable callees.
When a global subprog is marked as possibly throwing, push the normal
continuation and validate the exceptional path immediately at the call
site, avoiding a synthetic exception state and associated special case
in the pruning checks.
Fixes: f18b03faba ("bpf: Implement BPF exceptions")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260517075530.3461166-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Building without CONFIG_BPF_EVENTS produces a build-time
warning:
WARN: resolve_btfids: unresolved symbol bpf_session_is_return
The function is actually defined in kernel/trace/bpf_trace.o,
which is built conditionally based on configuration.
Make the reference to this function conditional as well,
as is already done in the bpf verifier for other functions.
Fixes: 8fe4dc4f64 ("bpf: change prototype of bpf_session_{cookie,is_return}")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20260515113242.2706303-1-arnd@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
percpu_array_map_ops.map_meta_equal points to the generic
bpf_map_meta_equal(), which does not compare max_entries. When a
percpu array serves as an inner map, replacing it with one that has
fewer max_entries bypasses the check. Since percpu_array_map_gen_lookup()
inlines the original template's index_mask as a JIT immediate, a lookup
on the replacement map can access pptrs[] out of bounds.
Point percpu_array_map_ops.map_meta_equal to array_map_meta_equal(),
which already enforces the max_entries equality check.
Add a selftest to verify that replacing a percpu array inner map with
a differently-sized one is rejected.
Fixes: db69718b8e ("bpf: inline bpf_map_lookup_elem() for PERCPU_ARRAY maps")
Signed-off-by: Guannan Wang <wgnbuaa@gmail.com>
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260514074454.77491-1-wgnbuaa@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Yazhou Tang says:
====================
bpf: Fix call offset truncation and OOB read in bpf_patch_call_args()
From: Yazhou Tang <tangyazhou518@outlook.com>
This patchset addresses a silent truncation bug in the BPF verifier that
occurs when a bpf-to-bpf call involves a massive relative jump offset.
Additionally, it fixes a pre-existing out-of-bounds (OOB) read issue
in the interpreter fallback path.
Because the BPF instruction set utilizes a 32-bit imm field for bpf-to-bpf
calls, implicitly downcasting it to the 16-bit insn->off in bpf_patch_call_args()
causes incorrect call targets or subprog ID resolution for large BPF programs.
While fixing this by swapping the imm and off fields, it was discovered that
the original code also had a load-time OOB read vulnerability when the stack
depth exceeds MAX_BPF_STACK during JIT fallback.
Patch 1/3 fixes the pre-existing OOB read in bpf_patch_call_args(). It
changes the function to return an int and explicitly rejects the JIT
fallback if the stack depth exceeds MAX_BPF_STACK, preventing a potential
stack buffer overflow.
Patch 2/3 fixes the s16 truncation bug.
1. Keep the original imm field unchanged and use the off field to store
the interpreter function index.
2. Adjust the JMP_CALL_ARGS case in ___bpf_prog_run() accordingly.
3. Restore the legacy xlated dump layout in bpf_insn_prepare_dump().
Patch 3/3 introduces a selftest for this fix.
---
Change log:
v10:
1. Make the error log in patch 1/3 more clear. (Kuohai)
2. Drop bpftool and disasm_helpers.c changes, and instead restore the
legacy xlated dump layout in bpf_insn_prepare_dump(). This avoids
requiring bpftool compatibility handling. (Quentin and Alexei)
v9: https://lore.kernel.org/bpf/20260429171904.107244-1-tangyazhou@zju.edu.cn/
1. Modify the selftest in patch 3/3: use __clobber_all in inline asm.
(Sashiko AI reviewer)
v8: https://lore.kernel.org/bpf/20260429105608.92741-1-tangyazhou@zju.edu.cn/
1. Update cfg_partition_funcs() in bpftool to use insn->imm for call target
calculation. (Sashiko AI reviewer)
2. Modify the selftest in patch 3/3: add a large padding before the call
instruction, preventing the kernel panic on kernel without the fix.
(Sashiko AI reviewer)
3. Modify the selftest in patch 3/3: make it more clear.
v7: https://lore.kernel.org/bpf/20260421144504.823756-1-tangyazhou@zju.edu.cn/
1. Rebase the patchset to the bpf-next tree to resolve the apply conflict.
(Alexei)
2. Add Patch 1/3 to properly fix a pre-existing OOB read in bpf_patch_call_args().
(Sashiko AI reviewer)
v6: https://lore.kernel.org/bpf/20260412170334.716778-1-tangyazhou@zju.edu.cn/
1. Use a different but clearer approach to resolve this issue: keeping
the original imm field unchanged and using the off field to store the
interpreter function index. (Kuohai)
2. Update the related dumper code and remove a previous workaround in the
selftests disasm helpers, which is no longer needed after this fix.
v5: https://lore.kernel.org/bpf/20260326090133.221957-1-tangyazhou@zju.edu.cn/
1. Some minor changes in commit messages. (AI Reviewer)
v4: https://lore.kernel.org/bpf/20260326063329.10031-1-tangyazhou@zju.edu.cn/
1. Remove some redundant commit messages of patch 2/3. (Emil)
2. Change the number of instructions in padding_subprog() from 200,000
to 32,765, which is the minimum number of instructions required to
trigger the verifier failure. (Emil)
v3: https://lore.kernel.org/bpf/20260323122254.98540-1-tangyazhou@zju.edu.cn/
1. Resend to fix a typo in v2 and add "Fixes" tag. The rest of the changes
are identical to v2.
v2 (incorrect): https://lore.kernel.org/bpf/20260323081748.106603-1-tangyazhou@zju.edu.cn/
1. Move the s16 boundary check from fixup_call_args() to bpf_patch_call_args(),
and change the return type of bpf_patch_call_args() to int. (Emil)
2. Add Patch 3/3 to fix the incorrect subprog ID in dumped bpf_pseudo_call
instructions, which is caused by the same truncation issue. (Puranjay)
3. Refine the new selftest for clarity and add detailed comments explaining
the test design. (Emil)
v1: https://lore.kernel.org/bpf/20260316190220.113417-1-tangyazhou@zju.edu.cn/
====================
Link: https://patch.msgid.link/20260506094714.419842-1-tangyazhou@zju.edu.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a selftest to verify the verifier and JIT behavior when handling
bpf-to-bpf calls with relative jump offsets exceeding the s16 boundary.
The test utilizes an inline assembly block with ".rept 32765" to generate
a massive dummy subprogram. By placing this padding between the main
program and the target subprogram, it forces the verifier to process a
bpf-to-bpf call where the imm field exceeds the s16 range.
- When JIT is enabled, it asserts that the program is successfully loaded
and executes correctly to return the expected value. Since the fix
does not change the JIT behavior, the test passes whether the fix is
applied or not.
- When JIT is disabled, it also asserts that the program is successfully
loaded and executes correctly to return the expected value 3.
- Before the fix, the verifier rewrites the call instruction with a
truncated offset (here 32768 -> -32768) and lets it pass. When the
program is executed, the call instruction will go to a wrong target
(the landing pad) instead of the intended subprogram, then return -1
and fail.
- After the fix, the verifier correctly handles the large offset and
allows it to pass. The program then executes correctly to return the
expected value 3.
Co-developed-by: Tianci Cao <ziye@zju.edu.cn>
Signed-off-by: Tianci Cao <ziye@zju.edu.cn>
Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com>
Acked-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20260506094714.419842-4-tangyazhou@zju.edu.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Currently, the BPF instruction set allows bpf-to-bpf calls (or internal
calls, pseudo calls) to use a 32-bit imm field to represent the relative
jump offset.
However, when JIT is disabled or falls back to the interpreter, the
verifier invokes bpf_patch_call_args() to rewrite the call instruction.
In this function, the 32-bit imm is downcast to s16 and stored in the off
field.
void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth)
{
stack_depth = max_t(u32, stack_depth, 1);
insn->off = (s16) insn->imm;
insn->imm = interpreters_args[(round_up(stack_depth, 32) / 32) - 1] -
__bpf_call_base_args;
insn->code = BPF_JMP | BPF_CALL_ARGS;
}
If the original imm exceeds the s16 range (i.e., a jump offset greater
than 32767 instructions), this downcast silently truncates the offset,
resulting in an incorrect call target.
Fix this by:
1. In bpf_patch_call_args(), keeping the imm field unchanged and using the
off field to store the index of the interpreter function.
2. In ___bpf_prog_run() for the JMP_CALL_ARGS case, retrieving the
interpreter function pointer from the interpreters_args array using the
off field as the index, and passing the original imm to calculate the
last argument of the interpreter function.
After these changes, the truncation issue is resolved, and __bpf_call_base_args
is also no longer needed and can be removed, which makes the code cleaner.
Performance: In ___bpf_prog_run() for the JMP_CALL_ARGS case, changing the
retrieval of the interpreter function pointer from pointer addition to
direct array indexing improves performance. The possible reason is that the
latter has better instruction-level parallelism. See the v5 discussion [1]
for more details.
[1] https://lore.kernel.org/bpf/f120c3c4-6999-414a-b514-518bb64b4758@zju.edu.cn/
To avoid requiring bpftool changes, keep the new imm/off encoding internal
and restore the legacy xlated dump layout in bpf_insn_prepare_dump().
For bpf-to-bpf call offsets that do not fit in s16, export off as 0 instead
of a truncated and misleading value.
Fixes: 1ea47e01ad ("bpf: add support for bpf_call to interpreter")
Fixes: 7105e828c0 ("bpf: allow for correlation of maps and helpers in dump")
Suggested-by: Xu Kuohai <xukuohai@huaweicloud.com>
Suggested-by: Puranjay Mohan <puranjay@kernel.org>
Co-developed-by: Tianci Cao <ziye@zju.edu.cn>
Signed-off-by: Tianci Cao <ziye@zju.edu.cn>
Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com>
Link: https://lore.kernel.org/r/20260506094714.419842-3-tangyazhou@zju.edu.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The interpreters_args array only accommodates stack depths up to
MAX_BPF_STACK (512 bytes). However, do_misc_fixups() may allow a larger
stack depth if JIT is requested.
If JIT compilation later fails and falls back to the interpreter, the
verifier invokes bpf_patch_call_args() with this oversized stack depth.
This causes a load-time out-of-bounds (OOB) read when calculating the
interpreter function pointer index.
Fix this by changing bpf_patch_call_args() to return an int and explicitly
rejecting the JIT fallback (returning -EINVAL) if the stack depth exceeds
MAX_BPF_STACK.
Fixes: 1ea47e01ad ("bpf: add support for bpf_call to interpreter")
Co-developed-by: Tianci Cao <ziye@zju.edu.cn>
Signed-off-by: Tianci Cao <ziye@zju.edu.cn>
Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com>
Acked-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20260506094714.419842-2-tangyazhou@zju.edu.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull EDAC fix from Borislav Petkov:
- Fix a string leak in the versalnet driver
* tag 'edac_urgent_for_v7.1_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/versalnet: Fix device name memory leak
The DATA-packet handler in rxrpc_input_call_event() and the RESPONSE
handler in rxrpc_verify_response() copy the skb to a linear one before
calling into the security ops only when skb_cloned() is true. An skb
that is not cloned but still carries externally-owned paged fragments
(e.g. SKBFL_SHARED_FRAG set by splice() into a UDP socket via
__ip_append_data, or a chained skb_has_frag_list()) falls through to
the in-place decryption path, which binds the frag pages directly into
the AEAD/skcipher SGL via skb_to_sgvec().
Extend the gate to also unshare when skb_has_frag_list() or
skb_has_shared_frag() is true. This catches the splice-loopback vector
and other externally-shared frag sources while preserving the
zero-copy fast path for skbs whose frags are kernel-private (e.g. NIC
page_pool RX, GRO). The OOM/trace handling already in place is reused.
Fixes: d0d5c0cd1e ("rxrpc: Use skb_unshare() rather than skb_cow_data()")
Cc: stable@vger.kernel.org
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull clk driver fixes from Stephen Boyd:
- Mark the DDR bus clk critical in the SpaceMiT driver so that
boot doesn't fail
- Fix boot on Mobile EyeQ by creating the auxiliary device for
the ethernet PHY
- Plug an OF node leak in Rockchip rk808 clk driver
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: rk808: fix OF node reference imbalance
MAINTAINERS: add myself as a reviewer for the clk subsystem
reset: eyeq: drop device_set_of_node_from_dev() done by parent
clk: eyeq: add EyeQ5 children auxiliary device for generic PHYs
clk: eyeq: use the auxiliary device creation helper
clk: spacemit: k3: mark top_dclk as CLK_IS_CRITICAL
BPF_MAP_TYPE_ARENA accepts BPF_PSEUDO_MAP_VALUE offsets at exactly
the end of the arena mapping (off == arena_size). The boundary check
in arena_map_direct_value_addr() uses `>` instead of `>=`, which
incorrectly allows a one-past-end pointer to be accepted.
Change the condition to `>=` to correctly reject offsets that fall
outside the valid arena user_vm range.
Fixes: 317460317a ("bpf: Introduce bpf_arena.")
Signed-off-by: Junyoung Jang <graypanda.inzag@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260426172505.1947915-1-graypanda.inzag@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
XSKMAP entries are used as redirect targets for incoming XDP frames.
A TX-only AF_XDP socket lacks an Rx ring and cannot handle redirected
traffic, but xsk_map_update_elem() currently allows such sockets to
be inserted into the map.
Redirecting packets to such a socket on the veth generic-XDP path
causes a kernel crash in xsk_generic_rcv().
This became possible after xsk_is_setup_for_bpf_map() was removed from
the XSKMAP update path, which allowed bound TX-only sockets to be
inserted into the map.
Reject TX-only sockets during XSKMAP updates to avoid the crash.
They remain fully operational for pure Tx purposes outside XSKMAP.
Fixes: 968be23cea ("xsk: Fix possible segfault at xskmap entry insertion")
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Yifan Wu <yifanwucs@gmail.com>
Signed-off-by: Linpu Yu <linpu5433@gmail.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/r/20260508144344.694-1-linpu5433@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Because subprog 0, the main subprog, is considered a global function,
we end up running the arg-tracking dataflow analysis twice on it. That
results in slightly longer verification but mostly in more verbose
verifier logs. This patch fixes it by keeping only the iteration over
global subprogs.
When running over all of Cilium's programs with BPF_LOG_LEVEL2, this
reduces verbosity by ~20% on average.
Fixes: bf0c571f7f ("bpf: introduce forward arg-tracking dataflow analysis")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/e4d7b53d4963ef520541a782f5fc8108a168877c.1778176504.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull fsverity fix from Eric Biggers:
"Fix a regression in overlayfs caused by an fsverity API change"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
ovl: fix verity lazy-load guard broken by fsverity_active() semantic change
Pull Rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:
- Add 'bindgen' target to make UML 32-bit builds work with GCC
- Disable two Clippy warnings ('collapsible_{if,match}')
'pin-init' crate:
- Fix unsoundness issue that created &'static references"
* tag 'rust-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
rust: allow `clippy::collapsible_if` globally
rust: allow `clippy::collapsible_match` globally
rust: pin-init: fix incorrect accessor reference lifetime
rust: pin-init: internal: move alignment check to `make_field_check`
rust: arch: um: Fix building 32-bit UML with GCC
Pull staging driver fixes from Greg KH:
"Here are two small staging driver fixes for 7.1-rc3. They are:
- vme_user root device leak fix
- NULL dereference bugfix in the rtl8723bs driver
Both of these have been in linux-next all this week with no reported
issues"
* tag 'staging-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: rtl8723bs: os_dep: avoid NULL pointer dereference in rtw_cbuf_alloc
staging: vme_user: fix root device leak on init failure
Pull USB driver fixes from Greg KH:
"Here are some small USB driver fixes for 7.1-rc3 to resolve some
reported issues, and a new device id. These are:
- usblp driver heap leak fixes
- ulpi driver memory leak fix
- typec driver fixes
- dwc3 driver fix
- omap dma driver fix
- new option driver device id addition
All of these have been in linux-next for over a week with no reported
issues"
* tag 'usb-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: option: add Telit Cinterion LE910Cx compositions
usb: usblp: fix uninitialized heap leak via LPGETSTATUS ioctl
usb: usblp: fix heap leak in IEEE 1284 device ID via short response
usb: dwc3: Move GUID programming after PHY initialization
usb: typec: tcpm: fix debug accessory mode detection for sink ports
usb: typec: tcpm: reset internal port states on soft reset AMS
usb: ulpi: fix memory leak on ulpi_register() error paths
USB: omap_udc: DMA: Don't enable burst 4 mode
Pull i2c fixes from Wolfram Sang:
- sanitize more input parameters in the core (found by syzkaller)
- usual set of driver fixes (proper completion handling, applying
quirks, correct workqueue selection...)
- ID additions to simplify dependency handling
- new email address for Peter Rosin
* tag 'i2c-for-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: smbus: reject oversized block transfers in the common path
MAINTAINERS: Update mail for Peter Rosin
i2c: stub: Reject I2C block transfers with invalid length
i2c: Compare the return value of gpiod_get_direction against GPIO_LINE_DIRECTION_OUT
i2c: dev: prevent integer overflow in I2C_TIMEOUT ioctl
i2c: acpi: Add ELAN0678 to i2c_acpi_force_100khz_device_ids
dt-bindings: i2c: apple,i2c: Add t8122 compatible
i2c: stm32f7: reinit_completion() per transfer not per msg
dt-bindings: i2c: amlogic: Add compatible for T7 SOC
i2c: testunit: Replace system_long_wq with system_dfl_long_wq
Pull powerpc fixes from Madhavan Srinivasan:
- Fix KASAN sanitization flag for core_$(BITS).o
- Fixes for handling offset values in pseries htmdump
- Fix interrupt mask in cpm1_gpiochip_add16()
- ps3/pasemi fixes to drop redundant result assignment
- Fixes in papr-hvpipe code path
- powerpc/perf: Update check for PERF_SAMPLE_DATA_SRC marked events
Thanks to Aboorva Devarajan, Athira Rajeev, Christophe Leroy (CS GROUP),
Geert Uytterhoeven, Haren Myneni, Krzysztof Kozlowski, Mukesh Kumar
Chaurasiya (IBM), Nathan Chancellor, Ritesh Harjani (IBM), Shivani
Nittor, Sourabh Jain, Thomas Zimmermann, and Venkat Rao Bagalkote.
* tag 'powerpc-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (21 commits)
powerpc/pasemi: Drop redundant res assignment
powerpc/ps3: Drop redundant result assignment
powerpc/vdso: Drop -DCC_USING_PATCHABLE_FUNCTION_ENTRY from 32-bit flags with clang
arch/powerpc: Drop CONFIG_FIRMWARE_EDID from defconfig files
powerpc/perf: Update check for PERF_SAMPLE_DATA_SRC marked events
powerpc/8xx: Fix interrupt mask in cpm1_gpiochip_add16()
powerpc/vmx: avoid KASAN instrumentation in enter_vmx_ops() for kexec
powerpc/kdump: fix KASAN sanitization flag for core_$(BITS).o
pseries/papr-hvpipe: Fix style and checkpatch issues in enable_hvpipe_IRQ()
pseries/papr-hvpipe: Refactor and simplify hvpipe_rtas_recv_msg()
pseries/papr-hvpipe: Kill task_struct pointer from struct hvpipe_source_info
pseries/papr-hvpipe: Simplify spin unlock usage in papr_hvpipe_handle_release()
pseries/papr-hvpipe: Fix the usage of copy_to_user()
pseries/papr-hvpipe: Fix & simplify error handling in papr_hvpipe_init()
pseries/papr-hvpipe: Fix null ptr deref in papr_hvpipe_dev_create_handle()
pseries/papr-hvpipe: Prevent kernel stack memory leak to userspace
pseries/papr-hvpipe: Fix race with interrupt handler
powerpc/pseries/htmdump: Add memory configuration dump support to htmdump module
powerpc/pseries/htmdump: Fix the offset value used in htm status dump
powerpc/pseries/htmdump: Fix the offset value used in processor configuration dump
...
Pull x86 fixes from Ingo Molnar:
- Fix memory map enumeration bug in the Xen e820 parsing code (Juergen
Gross)
- Re-enable e820 BIOS fallback if e820 table is empty (David Gow)
* tag 'x86-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/e820: Re-enable BIOS fallback if e820 table is empty
x86/xen: Fix a potential problem in xen_e820_resolve_conflicts()
Pull timer fix from Ingo Molnar:
"Fix CPU hotplug activation race in the timer migration code, by
Frederic Weisbecker"
* tag 'timers-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers/migration: Fix another hotplug activation race
Pull scheduler fixes from Ingo Molnar:
- Fix spurious failures in rseq self-tests (Mark Brown)
- Fix rseq rseq::cpu_id_start ABI regression due to TCMalloc's creative
use of the supposedly read-only field
The fix is to introduce a new ABI variant based on a new (larger)
rseq area registration size, to keep the TCMalloc use of rseq
backwards compatible on new kernels (Thomas Gleixner)
- Fix wakeup_preempt_fair() for not waking up task (Vincent Guittot)
- Fix s64 mult overflow in vruntime_eligible() (Zhan Xusheng)
* tag 'sched-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix wakeup_preempt_fair() for not waking up task
sched/fair: Fix overflow in vruntime_eligible()
selftests/rseq: Expand for optimized RSEQ ABI v2
rseq: Reenable performance optimizations conditionally
rseq: Implement read only ABI enforcement for optimized RSEQ V2 mode
selftests/rseq: Validate legacy behavior
selftests/rseq: Make registration flexible for legacy and optimized mode
selftests/rseq: Skip tests if time slice extensions are not available
rseq: Revert to historical performance killing behaviour
rseq: Don't advertise time slice extensions if disabled
rseq: Protect rseq_reset() against interrupts
rseq: Set rseq::cpu_id_start to 0 on unregistration
selftests/rseq: Don't run tests with runner scripts outside of the scripts
Pull perf events fixes from Ingo Molnar:
- Fix deadlock in the perf_mmap() failure path (Peter Zijlstra)
- Intel ACR (Auto Counter Reload) fixes (Dapeng Mi):
- Fix validation and configuration of ACR masks
- Fix ACR rescheduling bug causing stale masks
- Disable the PMI on ACR-enabled hardware
- Enable ACR on Panther Cover uarch too
* tag 'perf-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Enable auto counter reload for DMR
perf/x86/intel: Disable PMI for self-reloaded ACR events
perf/x86/intel: Always reprogram ACR events to prevent stale masks
perf/x86/intel: Improve validation and configuration of ACR masks
perf/core: Fix deadlock in perf_mmap() failure path
Pull arm64 fix from Catalin Marinas:
- ptrace(PTRACE_SETREGSET) fix to zero the target's fpsimd_state rather
than the tracer's
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/fpsimd: ptrace: zero target's fpsimd_state, not the tracer's
Pull PCI fixes from Bjorn Helgaas:
- Don't fallback to bus reset after failed slot reset; a bus reset
isn't safe if the .reset_slot() callback is implemented (Keith Busch)
- Update saved_config_space upon resource assignment to fix passthrough
regressions when x86 pcibios_assign_resources() updates BARs (Lukas
Wunner)
- Initialize a temporary pci_dev->dev in sysfs 'new_id' attribute to
fix a lockdep regression after driver_override was moved from PCI to
device core (Samiullah Khawaja)
- Update MAINTAINERS email addresses (Marek Vasut, Hans Zhang)
- Add MAINTAINERS reviewer for PCIe Cadence IP (Aksh Garg)
* tag 'pci-v7.1-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
MAINTAINERS: Add Aksh Garg as PCIe CADENCE reviewer
MAINTAINERS: Update Hans Zhang email for PCIe CIX Sky1
MAINTAINERS: Update Marek Vasut email for PCIe R-Car
PCI: Initialize temporary device in new_id_store()
PCI: Update saved_config_space upon resource assignment
PCI: Don't fallback to bus reset after failed slot reset
When setting new_id of a PCI device driver using sysfs a lockdep splat
occurs. This is because new_id_store() builds a temporary pci_dev for
pci_match_device(), which calls device_match_driver_override(). That
depends on the driver_override.lock added by cb3d1049f4 ("driver core:
generalize driver_override in struct device").
The new driver_override.lock was not initialized in the temporary pci_dev,
resulting in this lockdep splat.
Initialize the temporary pci_dev to fix this.
Repro:
Build with CONFIG_LOCKDEP=y, boot with QEMU, and add a new ID:
# echo "8086 10f5" > /sys/bus/pci/drivers/e1000e/new_id
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 2 UID: 0 PID: 177 Comm: liveupdate-iomm Not tainted 7.0.0+ #9 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x5d/0x80
register_lock_class+0x77e/0x790
lock_acquire+0xbf/0x2e0
pci_match_device+0x24/0x180
new_id_store+0x189/0x1d0
kernfs_fop_write_iter+0x14f/0x210
vfs_write+0x263/0x5e0
ksys_write+0x79/0xf0
do_syscall_64+0x117/0xf80
Fixes: 10a4206a24 ("PCI: use generic driver_override infrastructure")
Fixes: 8895d3bcb8 ("PCI: Fail new_id for vendor/device values already built into driver")
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
[bhelgaas: add commit log details and repro, trim backtrace]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260505234327.716630-1-skhawaja@google.com
Bernd reports passthrough failure of a Digital Devices Cine S2 V6 DVB
adapter plugged into an ASRock X570S PG Riptide board with BIOS version
P5.41 (09/07/2023):
ddbridge 0000:05:00.0: detected Digital Devices Cine S2 V6 DVB adapter
ddbridge 0000:05:00.0: cannot read registers
ddbridge 0000:05:00.0: fail
BIOS assigns an incorrect BAR to the DVB adapter which doesn't fit into the
upstream bridge window. The kernel corrects the BAR assignment:
pci 0000:07:00.0: BAR 0 [mem 0xfffffffffc500000-0xfffffffffc50ffff 64bit]: can't claim; no compatible bridge window
pci 0000:07:00.0: BAR 0 [mem 0xfc500000-0xfc50ffff 64bit]: assigned
Correction of the BAR assignment happens in an x86-specific fs_initcall,
pcibios_assign_resources(), after device enumeration in a subsys_initcall.
This order was introduced at the behest of Linus in 2004:
https://git.kernel.org/tglx/history/c/a06a30144bbc
No other architecture performs such a late BAR correction.
Bernd bisected the issue to commit a2f1e22390 ("PCI/ERR: Ensure error
recoverability at all times"), but it only occurs in the absence of commit
4d4c10f763 ("PCI: Explicitly put devices into D0 when initializing").
This combination exists in stable kernel v6.12.70, but not in mainline,
hence Bernd cannot reproduce the issue with mainline.
Since a2f1e22390, config space is saved on enumeration, prior to BAR
correction. Upon passthrough, the corrected BAR is overwritten with the
incorrect saved value by:
vfio_pci_core_register_device()
vfio_pci_set_power_state()
pci_restore_state()
But only if the device's current_state is PCI_UNKNOWN, as it was prior to
commit 4d4c10f763. Since the commit, it is PCI_D0, which changes the
behavior of vfio_pci_set_power_state() to no longer restore the state
without saving it first.
Alexandre is reporting the same issue as Bernd, but in his case, mainline
is affected as well. The difference is that on Alexandre's system, the
host kernel binds a driver to the device which is unbound prior to
passthrough, whereas on Bernd's system no driver gets bound by the host
kernel.
Unbinding sets current_state to PCI_UNKNOWN in pci_device_remove(), so when
vfio-pci is subsequently bound to the device, pci_restore_state() is once
again called without invoking pci_save_state() first.
To robustly fix the issue, always update saved_config_space upon resource
assignment.
Reported-by: Bernd Schumacher <bernd@bschu.de>
Closes: https://lore.kernel.org/r/acfZrlP0Ua_5D3U4@eldamar.lan/
Reported-by: Alexandre N. <an.tech@mailo.com>
Closes: https://lore.kernel.org/r/dd3c3358-de0f-4a56-9c81-04aceaab4058@mailo.com/
Fixes: a2f1e22390 ("PCI/ERR: Ensure error recoverability at all times")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Bernd Schumacher <bernd@bschu.de>
Tested-by: Alexandre N. <an.tech@mailo.com>
Cc: stable@vger.kernel.org # v6.12+
Link: https://patch.msgid.link/febc3f354e0c1f5a9f5b3ee9ffddaa44caccf651.1776268054.git.lukas@wunner.de
Pull block fixes from Jens Axboe:
- Fix for ublk not doing an actual issue from the task_work fallback
path. Any request hitting that should be canceled automatically
- Fix for uring_cmd prep side handling, for the block side uring_cmd
discard handling
- Fix for missing validation of the io and physical block size shifts
- Fix for a use-after-free in ublk's cancel command handling
* tag 'block-7.1-20260508' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
ublk: fix use-after-free in ublk_cancel_cmd()
ublk: validate physical_bs_shift, io_min_shift and io_opt_shift
block: only read from sqe on initial invocation of blkdev_uring_cmd()
ublk: don't issue uring_cmd from fallback task work
Pull io_uring fixes from Jens Axboe:
- Ensure that the absolute timeouts for both the command side and the
waiting side honor the callers time namespace
- Ensure tracked NAPI entries are cleared at unregistration time, as
the NAPI polling loop checks the list state rather than the general
NAPI state. This can lead to NAPI polling even after unregistration
has been done. If unregistered, all NAPI polling should be disabled
- Fix for eventfd recursive invocation handling
* tag 'io_uring-7.1-20260508' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/wait: honour caller's time namespace for IORING_ENTER_ABS_TIMER
io_uring/timeout: honour caller's time namespace for IORING_TIMEOUT_ABS
io_uring/eventfd: reset deferred signal state
io_uring/napi: clear tracked NAPI entries on unregister
sol_tcp_sockopt() only checks if sk->sk_protocol is IPPROTO_TCP,
but RAW socket can bypass it:
socket(AF_INET, SOCK_RAW, IPPROTO_TCP)
Let's use sk_is_tcp().
Note that initially sol_tcp_sockopt() checked sk->sk_prot->setsockopt.
Fixes: 2ab42c7b87 ("bpf: Check the protocol of a sock to agree the calls to bpf_setsockopt().")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260504210610.180150-7-kuniyu@google.com
bpf_mptcp_sock_from_subflow() only checks if sk->sk_protocol is
IPPROTO_TCP, but RAW socket can bypass it:
socket(AF_INET, SOCK_RAW, IPPROTO_TCP)
In this case, it would NOT be valid to call sk_is_mptcp() which will
assume sk is a pointer to a struct tcp_sock, and wrongly checks for:
tcp_sk(sk)->is_mptcp.
Fixes: 3bc253c2e6 ("bpf: Add bpf_skc_to_mptcp_sock_proto")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260504210610.180150-4-kuniyu@google.com
Pull smb client fixes from Steve French:
- Fix for two ACL issues (security fix to validate dacloffset better
and chmod fix)
- Fix out of bounds reads (in check_wsl_eas and smb2_check_msg for
symlinks)
- Two Kerberos fixes including an important one when AES-256 encryption
chosen
- Fix open_cached_dir problem when directory leases disabled
* tag 'v7.1-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: validate dacloffset before building DACL pointers
smb/client: fix out-of-bounds read in smb2_compound_op()
smb/client: fix out-of-bounds read in symlink_data()
smb: client: Zero-pad short GSS session keys per MS-SMB2
smb: client: Use FullSessionKey for AES-256 encryption key derivation
smb: client: use kzalloc to zero-initialize security descriptor buffer
cifs: abort open_cached_dir if we don't request leases
Pull spi fixes from Mark Brown:
"There's two main series here, fixing issues that came up in the
Microchip QSPI and Freescale i.MX drivers. Both of those could result
in some quite noticable issues if they were encountered in production.
We also have one minor documentation fix in the ch341 driver"
* tag 'spi-fix-v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: ch341: correct company name in MODULE_DESCRIPTION
spi: microchip-core-qspi: remove some inline markings
spi: microchip-core-qspi: don't attempt to transmit during emulated read-only dual/quad operations
spi: microchip-core-qspi: control built-in cs manually
spi: imx: Propagate prepare_transfer() error from spi_imx_setupxfer()
spi: imx: Fix UAF on package-1 prepare failure in spi_imx_dma_data_prepare()
spi: imx: Fix precedence bug in spi_imx_dma_max_wml_find()
Pull regulator fix from Mark Brown:
"A straightforward fix for an incorrect description of one of the
regulators on the Qualcomm PMH0101"
* tag 'regulator-fix-v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: qcom-rpmh: Fix index for pmh0101 ldo16
Pull drm fixes from Dave Airlie:
"Weekly fixes, lots of them but all pretty small, amdgpu and xe are the
usual but then a large amount of fixes all over.
core:
- fix race condition in handle change ioctl
fb-helper:
- fix clipping
rust:
- fix unsound initialization
- fix GEM state cleanup
- fix wrong ARef import
ttm:
- update GPU MM stats on pool shrinking
i915:
- Re-enable ccs modifiers on dg2
nova:
- fix mailing list
xe:
- Add NULL check for media_gt in intel_hdcp_gsc_check_status
- Fix EAGAIN sign in pf_migration_consume
- Fix MMIO access using PF view instead of VF view during migration
- Exclude indirect ring state page from ADS engine state size
amdgpu:
- GFX9 fixes
- Hawaii SMU fixes
- SDMA4 fix
- GART fix
- Userq fixes
amdkfd:
- GPUVM TLB flush fix
- Hotplug fix
radeon:
- Hawaii SMU fixes
bochs:
- fix managed cleanup
bridge:
- tda998x: fix sparse warnings on type correctness
etnaviv:
- schedule armed jobs
exynos:
- managed bridge cleanup
ivpu:
- disallow reexport of GEM buffer objects
noveau:
- revert support for GA100
panel:
- boe-tv101wum-nl16: use correct MIPI_DSI mode
- feyjang-fy07024di26a30d: fix error reporting
- himax-hx83102: use correct MIPI_DSI mode
- himax-hx83121a: fix error checks
- himax-hx83121a: select DRM_DISPLAY_DSC_HELPER
qaic:
- fix RAS message handling
qxl:
- clean up polling
sti:
- managed bridge cleanup
* tag 'drm-fixes-2026-05-08-1' of https://gitlab.freedesktop.org/drm/kernel: (37 commits)
drm: Set old handle to NULL before prime swap in change_handle
drm/bochs: Drop manual put on probe error path
drm/xe/guc: Exclude indirect ring state page from ADS engine state size
drm/xe/pf: Fix MMIO access using PF view instead of VF view during migration
drm/xe/pf: Fix EAGAIN sign in pf_migration_consume()
drm/xe/hdcp: Add NULL check for media_gt in intel_hdcp_gsc_check_status()
drm/exynos: remove bridge when component_add fails
drm/amdgpu: nuke amdgpu_userq_fence_slab v2
drm/amdgpu/userq: fix access to stale wptr mapping
drm/amdkfd: Check if there are kfd porcesses using adev by kfd_processes_count
drm/amdgpu: zero-initialize GART table on allocation
drm/amdgpu/sdma4: replace BUG_ON with WARN_ON in fence emission
drm/radeon: add missing revision check for CI
drm/amdgpu/pm: align Hawaii mclk workaround with radeon
drm/amdgpu/pm: add missing revision check for CI
drm/amdgpu/gfx9: drop unnecessary 64-bit fence flag check in KIQ
drm/amdkfd: Make all TLB-flushes heavy-weight
drm/panel: himax-hx83102: restore MODE_LPM after sending disable cmds
drm/panel: boe-tv101wum-nl6: restore MODE_LPM after sending disable cmds
drm/panel: feiyang-fy07024di26a30d: return display-on error
...
Johan writes:
USB serial device ids for 7.1-rc3
Here are some new modem device ids.
This one has been in linux-next with no reported issues.
* tag 'usb-serial-7.1-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: option: add Telit Cinterion LE910Cx compositions