Commit Graph

1412804 Commits

Author SHA1 Message Date
Nathan Chancellor
b584bfbd7e ACPI: APEI: GHES: Disable KASAN instrumentation when compile testing with clang < 18
After a recent innocuous change to drivers/acpi/apei/ghes.c, building
ARCH=arm64 allmodconfig with clang-17 or older (which has both
CONFIG_KASAN=y and CONFIG_WERROR=y) fails with:

  drivers/acpi/apei/ghes.c:902:13: error: stack frame size (2768) exceeds limit (2048) in 'ghes_do_proc' [-Werror,-Wframe-larger-than]
    902 | static void ghes_do_proc(struct ghes *ghes,
        |             ^

A KASAN pass that removes unneeded stack instrumentation, enabled by
default in clang-18 [1], drastically improves stack usage in this case.

To avoid the warning in the common allmodconfig case when it can break
the build, disable KASAN for ghes.o when compile testing with clang-17
and older. Disabling KASAN outright may hide legitimate runtime issues,
so live with the warning in that case; the user can either increase the
frame warning limit or disable -Werror, which they should probably do
when debugging with KASAN anyways.

Closes: https://github.com/ClangBuiltLinux/linux/issues/2148
Link: 51fbab1345 [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Cc: All applicable <stable@vger.kernel.org>
Link: https://patch.msgid.link/20260114-ghes-avoid-wflt-clang-older-than-18-v1-1-9c8248bfe4f4@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-28 21:35:30 +01:00
Fabio M. De Francesco
95350effc3 ACPI: extlog: Trace CPER CXL Protocol Error Section
When Firmware First is enabled, BIOS handles errors first and then it
makes them available to the kernel via the Common Platform Error Record
(CPER) sections (UEFI 2.11 Appendix N.2.13). Linux parses the CPER
sections via one of two similar paths, either ELOG or GHES. The errors
managed by ELOG are signaled to the BIOS by the I/O Machine Check
Architecture (I/O MCA).

Currently, ELOG and GHES show some inconsistencies in how they report to
userspace via trace events.

Therefore, make the two mentioned paths act similarly by tracing the CPER
CXL Protocol Error Section.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Link: https://patch.msgid.link/20260114101543.85926-6-fabio.m.de.francesco@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:09:34 +01:00
Fabio M. De Francesco
ba8af8e1f1 ACPI: APEI: GHES: Add helper to copy CPER CXL protocol error info to work struct
Make a helper out of cxl_cper_post_prot_err() that checks the CXL agent
type and copy the CPER CXL protocol errors information to a work data
structure.

Export the new symbol for reuse by ELOG.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20260114101543.85926-5-fabio.m.de.francesco@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:09:34 +01:00
Fabio M. De Francesco
7020586968 ACPI: APEI: GHES: Add helper for CPER CXL protocol errors checks
Move the CPER CXL protocol errors validity check out of
cxl_cper_post_prot_err() to new cxl_cper_sec_prot_err_valid() and limit
the serial number check only to CXL agents that are CXL devices (UEFI
v2.10, Appendix N.2.13).

Export the new symbol for reuse by ELOG.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20260114101543.85926-4-fabio.m.de.francesco@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:09:34 +01:00
Fabio M. De Francesco
e778ffefa3 ACPI: extlog: Trace CPER PCI Express Error Section
I/O Machine Check Architecture events may signal failing PCIe components
or links. The AER event contains details on what was happening on the wire
when the error was signaled.

Trace the CPER PCIe Error section (UEFI v2.11, Appendix N.2.7) reported
by the I/O MCA.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Link: https://patch.msgid.link/20260114101543.85926-3-fabio.m.de.francesco@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:09:34 +01:00
Fabio M. De Francesco
a2995f7dab ACPI: extlog: Trace CPER Non-standard Section Body
ghes_do_proc() has a catch-all for unknown or unhandled CPER formats
(UEFI v2.11 Appendix N 2.3), extlog_print() does not. This gap was
noticed by a RAS test that injected CXL protocol errors which were
notified to extlog_print() via the IOMCA (I/O Machine Check
Architecture) mechanism. Bring parity to the extlog_print() path by
including a similar log_non_standard_event().

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Link: https://patch.msgid.link/20260114101543.85926-2-fabio.m.de.francesco@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:09:34 +01:00
Shuai Xue
b73cf7eaa6 ACPI: APEI: GHES: Improve ghes_notify_sea() status check
Performance testing on ARMv8 systems shows significant overhead in error
status handling in SEA error handling.

- ghes_peek_estatus(): 8,138.3 ns (21,160 cycles).
- ghes_clear_estatus(): 2,038.3 ns (5,300 cycles).

Apply the same optimization used in ghes_notify_nmi() to
ghes_notify_sea() by checking for active errors before processing,

Tested-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://patch.msgid.link/20260112032239.30023-4-xueshuai@linux.alibaba.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:05:05 +01:00
Shuai Xue
feb2d38013 ACPI: APEI: GHES: Extract helper functions for error status handling
Refactors the GHES driver by extracting common functionality into
reusable helper functions:

1. ghes_has_active_errors() - Checks if any error sources in a given list
   have active errors
2. ghes_map_error_status() - Maps error status address to virtual address
3. ghes_unmap_error_status() - Unmaps error status virtual address
4. Use `guard(rcu)()` instead of explicit `rcu_read_lock()`/`rcu_read_unlock()`.

These helpers eliminate code duplication in the NMI path and prepare for
similar usage in the SEA path in a subsequent patch.

No functional change intended.

Tested-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://patch.msgid.link/20260112032239.30023-3-xueshuai@linux.alibaba.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:05:05 +01:00
Tony Luck
f2edc1fb9c ACPI: APEI: GHES: Improve ghes_notify_nmi() status check
ghes_notify_nmi() is called for every NMI and must check whether the NMI was
generated because an error was signalled by platform firmware.

This check is very expensive as for each registered GHES NMI source it reads
from the acpi generic address attached to this error source to get the physical
address of the acpi_hest_generic_status block.  It then checks the "block_status"
to see if an error was logged.

The ACPI/APEI code must create virtual mappings for each of those physical
addresses, and tear them down afterwards. On an Icelake system this takes around
15,000 TSC cycles. Enough to disturb efforts to profile system performance.

If that were not bad enough, there are some atomic accesses in the code path
that will cause cache line bounces between CPUs. A problem that gets worse as
the core count increases.

But BIOS changes neither the acpi generic address nor the physical address of
the acpi_hest_generic_status block. So this walk can be done once when the NMI is
registered to save the virtual address (unmapping if the NMI is ever unregistered).
The "block_status" can be checked directly in the NMI handler. This can be done
without any atomic accesses.

Resulting time to check that there is not an error record is around 900 cycles.

Reported-by: Andi Kleen <andi.kleen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://patch.msgid.link/20260112032239.30023-2-xueshuai@linux.alibaba.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:05:05 +01:00
Mauro Carvalho Chehab
55cc6fe571 EFI/CPER: don't dump the entire memory region
The current logic at cper_print_fw_err() doesn't check if the
error record length is big enough to handle offset. On a bad firmware,
if the ofset is above the actual record, length -= offset will
underflow, making it dump the entire memory.

The end result can be:

 - the logic taking a lot of time dumping large regions of memory;
 - data disclosure due to the memory dumps;
 - an OOPS, if it tries to dump an unmapped memory region.

Fix it by checking if the section length is too small before doing
a hex dump.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
[ rjw: Subject tweaks ]
Link: https://patch.msgid.link/1752b5ba63a3e2f148ddee813b36c996cc617e86.1767871950.git.mchehab+huawei@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:04:42 +01:00
Mauro Carvalho Chehab
fa2408a24f APEI/GHES: ensure that won't go past CPER allocated record
The logic at ghes_new() prevents allocating too large records, by
checking if they're bigger than GHES_ESTATUS_MAX_SIZE (currently, 64KB).
Yet, the allocation is done with the actual number of pages from the
CPER bios table location, which can be smaller.

Yet, a bad firmware could send data with a different size, which might
be bigger than the allocated memory, causing an OOPS:

    Unable to handle kernel paging request at virtual address fff00000f9b40000
    Mem abort info:
      ESR = 0x0000000096000007
      EC = 0x25: DABT (current EL), IL = 32 bits
      SET = 0, FnV = 0
      EA = 0, S1PTW = 0
      FSC = 0x07: level 3 translation fault
    Data abort info:
      ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
      CM = 0, WnR = 0, TnD = 0, TagAccess = 0
      GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    swapper pgtable: 4k pages, 52-bit VAs, pgdp=000000008ba16000
    [fff00000f9b40000] pgd=180000013ffff403, p4d=180000013fffe403, pud=180000013f85b403, pmd=180000013f68d403, pte=0000000000000000
    Internal error: Oops: 0000000096000007 [#1]  SMP
    Modules linked in:
    CPU: 0 UID: 0 PID: 303 Comm: kworker/0:1 Not tainted 6.19.0-rc1-00002-gda407d200220 #34 PREEMPT
    Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022
    Workqueue: kacpi_notify acpi_os_execute_deferred
    pstate: 214020c5 (nzCv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
    pc : hex_dump_to_buffer+0x30c/0x4a0
    lr : hex_dump_to_buffer+0x328/0x4a0
    sp : ffff800080e13880
    x29: ffff800080e13880 x28: ffffac9aba86f6a8 x27: 0000000000000083
    x26: fff00000f9b3fffc x25: 0000000000000004 x24: 0000000000000004
    x23: ffff800080e13905 x22: 0000000000000010 x21: 0000000000000083
    x20: 0000000000000001 x19: 0000000000000008 x18: 0000000000000010
    x17: 0000000000000001 x16: 00000007c7f20fec x15: 0000000000000020
    x14: 0000000000000008 x13: 0000000000081020 x12: 0000000000000008
    x11: ffff800080e13905 x10: ffff800080e13988 x9 : 0000000000000000
    x8 : 0000000000000000 x7 : 0000000000000001 x6 : 0000000000000020
    x5 : 0000000000000030 x4 : 00000000fffffffe x3 : 0000000000000000
    x2 : ffffac9aba78c1c8 x1 : ffffac9aba76d0a8 x0 : 0000000000000008
    Call trace:
     hex_dump_to_buffer+0x30c/0x4a0 (P)
     print_hex_dump+0xac/0x170
     cper_estatus_print_section+0x90c/0x968
     cper_estatus_print+0xf0/0x158
     __ghes_print_estatus+0xa0/0x148
     ghes_proc+0x1bc/0x220
     ghes_notify_hed+0x5c/0xb8
     notifier_call_chain+0x78/0x148
     blocking_notifier_call_chain+0x4c/0x80
     acpi_hed_notify+0x28/0x40
     acpi_ev_notify_dispatch+0x50/0x80
     acpi_os_execute_deferred+0x24/0x48
     process_one_work+0x15c/0x3b0
     worker_thread+0x2d0/0x400
     kthread+0x148/0x228
     ret_from_fork+0x10/0x20
    Code: 6b14033f 540001ad a94707e2 f100029f (b8747b44)
    ---[ end trace 0000000000000000 ]---

Prevent that by taking the actual allocated are into account when
checking for CPER length.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
[ rjw: Subject tweaks ]
Link: https://patch.msgid.link/4e70310a816577fabf37d94ed36cde4ad62b1e0a.1767871950.git.mchehab+huawei@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:04:33 +01:00
Mauro Carvalho Chehab
eae21beecb EFI/CPER: don't go past the ARM processor CPER record buffer
There's a logic inside GHES/CPER to detect if the section_length
is too small, but it doesn't detect if it is too big.

Currently, if the firmware receives an ARM processor CPER record
stating that a section length is big, kernel will blindly trust
section_length, producing a very long dump. For instance, a 67
bytes record with ERR_INFO_NUM set 46198 and section length
set to 854918320 would dump a lot of data going a way past the
firmware memory-mapped area.

Fix it by adding a logic to prevent it to go past the buffer
if ERR_INFO_NUM is too big, making it report instead:

	[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
	[Hardware Error]: event severity: recoverable
	[Hardware Error]:  Error 0, type: recoverable
	[Hardware Error]:   section_type: ARM processor error
	[Hardware Error]:   MIDR: 0xff304b2f8476870a
	[Hardware Error]:   section length: 854918320, CPER size: 67
	[Hardware Error]:   section length is too big
	[Hardware Error]:   firmware-generated error record is incorrect
	[Hardware Error]:   ERR_INFO_NUM is 46198

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
[ rjw: Subject and changelog tweaks ]
Link: https://patch.msgid.link/41cd9f6b3ace3cdff7a5e864890849e4b1c58b63.1767871950.git.mchehab+huawei@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:04:21 +01:00
Mauro Carvalho Chehab
87880af2d2 APEI/GHES: ARM processor Error: don't go past allocated memory
If the BIOS generates a very small ARM Processor Error, or
an incomplete one, the current logic will fail to deferrence

	err->section_length
and
	ctx_info->size

Add checks to avoid that. With such changes, such GHESv2
records won't cause OOPSes like this:

[    1.492129] Internal error: Oops: 0000000096000005 [#1]  SMP
[    1.495449] Modules linked in:
[    1.495820] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.18.0-rc1-00017-gabadcc3553dd-dirty #18 PREEMPT
[    1.496125] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022
[    1.496433] Workqueue: kacpi_notify acpi_os_execute_deferred
[    1.496967] pstate: 814000c5 (Nzcv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[    1.497199] pc : log_arm_hw_error+0x5c/0x200
[    1.497380] lr : ghes_handle_arm_hw_error+0x94/0x220

0xffff8000811c5324 is in log_arm_hw_error (../drivers/ras/ras.c:75).
70		err_info = (struct cper_arm_err_info *)(err + 1);
71		ctx_info = (struct cper_arm_ctx_info *)(err_info + err->err_info_num);
72		ctx_err = (u8 *)ctx_info;
73
74		for (n = 0; n < err->context_info_num; n++) {
75			sz = sizeof(struct cper_arm_ctx_info) + ctx_info->size;
76			ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + sz);
77			ctx_len += sz;
78		}
79

and similar ones while trying to access section_length on an
error dump with too small size.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
[ rjw: Subject tweaks ]
Link: https://patch.msgid.link/7fd9f38413be05ee2d7cfdb0dc31ea2274cf1a54.1767871950.git.mchehab+huawei@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14 17:03:29 +01:00
Colin Ian King
cae444e0e2 ACPI: APEI: EINJ: make read-only array non_mmio_desc static const
Don't populate the read-only array non_mmio_desc on the stack at run
time, instead make it static const.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://patch.msgid.link/20251219215900.494211-1-colin.i.king@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-07 20:47:34 +01:00
Linus Torvalds
9ace4753a5 Linux 6.19-rc4 v6.19-rc4 2026-01-04 14:41:55 -08:00
Linus Torvalds
54e82e93ca Merge tag 'core_urgent_for_v6.19_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core entry fix from Borislav Petkov:

 - Make sure clang inlines trivial local_irq_* helpers

* tag 'core_urgent_for_v6.19_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  entry: Always inline local_irq_{enable,disable}_exit_to_user()
2026-01-04 07:21:18 -08:00
Linus Torvalds
aacb0a6d60 Merge tag 'pmdomain-v6.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:

 - mediatek: Fix spinlock recursion fix during probe

 - imx: Fix reference count leak during probe

* tag 'pmdomain-v6.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: imx: Fix reference count leak in imx_gpc_probe()
  pmdomain: mtk-pm-domains: Fix spinlock recursion fix in probe
2026-01-03 09:18:36 -08:00
Linus Torvalds
805f9a0613 Merge tag 'perf-tools-fixes-for-v6.19-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tool fixes and from Namhyung Kim:

 - skip building BPF skeletons if libopenssl is missing

 - a couple of test updates

 - handle error cases of filename__read_build_id()

 - support NVIDIA Olympus for ARM SPE profiling

 - update tool headers to sync with the kernel

* tag 'perf-tools-fixes-for-v6.19-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  tools build: Fix the common set of features test wrt libopenssl
  tools headers: Sync syscall table with kernel sources
  tools headers: Sync linux/socket.h with kernel sources
  tools headers: Sync linux/gfp_types.h with kernel sources
  tools headers: Sync arm64 headers with kernel sources
  tools headers: Sync x86 headers with kernel sources
  tools headers: Sync UAPI sound/asound.h with kernel sources
  tools headers: Sync UAPI linux/mount.h with kernel sources
  tools headers: Sync UAPI linux/fs.h with kernel sources
  tools headers: Sync UAPI linux/fcntl.h with kernel sources
  tools headers: Sync UAPI KVM headers with kernel sources
  tools headers: Sync UAPI drm/drm.h with kernel sources
  perf arm-spe: Add NVIDIA Olympus to neoverse list
  tools headers arm64: Add NVIDIA Olympus part
  perf tests top: Make the test exclusive
  perf tests kvm: Avoid leaving perf.data.guest file around
  perf symbol: Fix ENOENT case for filename__read_build_id
  perf tools: Disable BPF skeleton if no libopenssl found
  tools/build: Add a feature test for libopenssl
2026-01-02 14:24:09 -08:00
Linus Torvalds
bbbc721033 Merge tag 'pm-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
 "Fix a recent regression that affects system suspend testing
  at the 'core' level (Rafael Wysocki)"

* tag 'pm-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: Fix suspend_test() at the TEST_CORE level
2026-01-02 12:35:29 -08:00
Linus Torvalds
dec1ecf2c7 Merge tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fix from Eric Biggers:
 "Fix the kunit_run_irq_test() function (which I recently added for the
  CRC and crypto tests) to be less timing-dependent.

  This fixes flakiness in the polyval kunit test suite"

* tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  kunit: Enforce task execution in {soft,hard}irq contexts
2026-01-02 12:28:24 -08:00
Linus Torvalds
6ce4d44fb0 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:

 - Fix several syzkaller found bugs:
    - Poor parsing of the RDMA_NL_LS_OP_IP_RESOLVE netlink
    - GID entry refcount leaking when CM destruction races with
      multicast establishment
    - Missing refcount put in ib_del_sub_device_and_put()

 - Fixup recently introduced uABI padding for 32 bit consistency

 - Avoid user triggered math overflow in MANA and AFA

 - Reading invalid netdev data during an event

 - kdoc fixes

 - Fix never-working gid copying in ib_get_gids_from_rdma_hdr

 - Typo in bnxt when validating the BAR

 - bnxt mis-parsed IB_SEND_IP_CSUM so it didn't work always

 - bnxt out of bounds access in bnxt related to the counters on new
   devices

 - Allocate the bnxt PDE table with the right sizing

 - Use dma_free_coherent() correctly in bnxt

 - Allow rxe to be unloadable when CONFIG_PROVE_LOCKING by adjusting the
   tracking of the global sockets it uses

 - Missing unlocking on error path in rxe

 - Compute the right number of pages in a MR in rtrs

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/bnxt_re: fix dma_free_coherent() pointer
  RDMA/rtrs: Fix clt_path::max_pages_per_mr calculation
  IB/rxe: Fix missing umem_odp->umem_mutex unlock on error path
  RDMA/bnxt_re: Fix to use correct page size for PDE table
  RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
  RDMA/bnxt_re: Fix IB_SEND_IP_CSUM handling in post_send
  RDMA/core: always drop device refcount in ib_del_sub_device_and_put()
  RDMA/rxe: let rxe_reclassify_recv_socket() call sk_owner_put()
  RDMA/bnxt_re: Fix incorrect BAR check in bnxt_qplib_map_creq_db()
  RDMA/core: Fix logic error in ib_get_gids_from_rdma_hdr()
  RDMA/efa: Remove possible negative shift
  RTRS/rtrs: clean up rtrs headers kernel-doc
  RDMA/irdma: avoid invalid read in irdma_net_event
  RDMA/mana_ib: check cqe length for kernel CQs
  RDMA/irdma: Fix irdma_alloc_ucontext_resp padding
  RDMA/ucma: Fix rdma_ucm_query_ib_service_resp struct padding
  RDMA/cm: Fix leaking the multicast GID table reference
  RDMA/core: Check for the presence of LS_NLA_TYPE_DGID correctly
2026-01-02 12:25:47 -08:00
Linus Torvalds
3d35fa1190 Merge tag 'linux_kselftest-fixes-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:

 - Fix for build failures in tests that use an empty FIXTURE() seen in
   Android's build environment, which uses -D_FORTIFY_SOURCE=3, a build
   failure occurs in tests that use an empty FIXTURE()

 - Fix func_traceonoff_triggers.tc sometimes failures on Kunpeng-920
   board resulting from including transient trace file name in checksum
   compare

 - Fix to remove available_events requirement from toplevel-enable for
   instance as it isn't a valid requirement for this test

* tag 'linux_kselftest-fixes-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kselftest/harness: Use helper to avoid zero-size memset warning
  selftests/ftrace: Test toplevel-enable for instance
  selftests/ftrace: traceonoff_triggers: strip off names
2026-01-02 12:21:34 -08:00
Linus Torvalds
bea82c80a5 Merge tag 'block-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:

 - Scan partition tables asynchronously for ublk, similarly to how nvme
   does it. This avoids potential deadlocks, which is why nvme does it
   that way too. Includes a set of selftests as well.

 - MD pull request via Yu:
     - Fix null-pointer dereference in raid5 sysfs group_thread_cnt
       store (Tuo Li)
     - Fix possible mempool corruption during raid1 raid_disks update
       via sysfs (FengWei Shih)
     - Fix logical_block_size configuration being overwritten during
       super_1_validate() (Li Nan)
     - Fix forward incompatibility with configurable logical block size:
       arrays assembled on new kernels could not be assembled on older
       kernels (v6.18 and before) due to non-zero reserved pad rejection
       (Li Nan)
     - Fix static checker warning about iterator not incremented (Li Nan)

 - Skip CPU offlining notifications on unmapped hardware queues

 - bfq-iosched block stats fix

 - Fix outdated comment in bfq-iosched

* tag 'block-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  block, bfq: update outdated comment
  blk-mq: skip CPU offline notify on unmapped hctx
  selftests/ublk: fix Makefile to rebuild on header changes
  selftests/ublk: add test for async partition scan
  ublk: scan partition in async way
  block,bfq: fix aux stat accumulation destination
  md: Fix forward incompatibility from configurable logical block size
  md: Fix logical_block_size configuration being overwritten
  md: suspend array while updating raid_disks via sysfs
  md/raid5: fix possible null-pointer dereferences in raid5_store_group_thread_cnt()
  md: Fix static checker warning in analyze_sbs
2026-01-02 12:15:59 -08:00
Linus Torvalds
509b5b1152 Merge tag 'io_uring-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:

 - Removed dead argument length for io_uring_validate_mmap_request()

 - Use GFP_NOWAIT for overflow CQEs on legacy ring setups rather than
   GFP_ATOMIC, which makes it play nicer with memcg limits

 - Fix a potential circular locking issue with tctx node removal and
   exec based cancelations

* tag 'io_uring-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/memmap: drop unused sz param in io_uring_validate_mmap_request()
  io_uring/tctx: add separate lock for list of tctx's in ctx
  io_uring: use GFP_NOWAIT for overflow CQEs on legacy rings
2026-01-02 12:07:55 -08:00
Linus Torvalds
71b62ed6ce Merge tag 'x86-urgent-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
 "Fix the AMD microcode Entrysign signature checking code to include
  more models"

* tag 'x86-urgent-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode/AMD: Fix Entrysign revision check for Zen5/Strix Halo
2026-01-02 12:04:51 -08:00
Linus Torvalds
b993744a97 Merge tag 'loongarch-fixes-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
 "Complete CPUCFG registers definition, set correct protection_map[] for
  VM_NONE/VM_SHARED, fix some bugs in the orc stack unwinder, ftrace and
  BPF JIT"

* tag 'loongarch-fixes-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  samples/ftrace: Adjust LoongArch register restore order in direct calls
  LoongArch: BPF: Enhance the bpf_arch_text_poke() function
  LoongArch: BPF: Enable trampoline-based tracing for module functions
  LoongArch: BPF: Adjust the jump offset of tail calls
  LoongArch: BPF: Save return address register ra to t0 before trampoline
  LoongArch: BPF: Zero-extend bpf_tail_call() index
  LoongArch: BPF: Sign extend kfunc call arguments
  LoongArch: Refactor register restoration in ftrace_common_return
  LoongArch: Enable exception fixup for specific ADE subcode
  LoongArch: Remove unnecessary checks for ORC unwinder
  LoongArch: Remove is_entry_func() and kernel_entry_end
  LoongArch: Use UNWIND_HINT_END_OF_STACK for entry points
  LoongArch: Set correct protection_map[] for VM_NONE/VM_SHARED
  LoongArch: Complete CPUCFG registers definition
2026-01-02 11:33:33 -08:00
Linus Torvalds
9b04368044 Merge tag 'drm-fixes-2026-01-02' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Happy New Year, jetlagged fixes from me, still pretty quiet, xe is
  most of this, with i915/nouveau/imagination fixes and some shmem
  cleanups.

  shmem:
   - docs and MODULE_LICENSE fix

  xe:
   - Ensure svm device memory is idle before migration completes
   - Fix a SVM debug printout
   - Use READ_ONCE() / WRITE_ONCE() for g2h_fence

  i915:
   - Fix eb_lookup_vmas() failure path

  nouveau:
   - fix prepare_fb warnings

  imagination:
   - prevent export of protected objects"

* tag 'drm-fixes-2026-01-02' of https://gitlab.freedesktop.org/drm/kernel:
  drm/i915/gem: Zero-initialize the eb.vma array in i915_gem_do_execbuffer
  drm/xe/guc: READ/WRITE_ONCE g2h_fence->done
  drm/pagemap, drm/xe: Ensure that the devmem allocation is idle before use
  drm/xe/svm: Fix a debug printout
  drm/gem-shmem: Fix the MODULE_LICENSE() string
  drm/gem-shmem: Fix typos in documentation
  drm/nouveau/dispnv50: Don't call drm_atomic_get_crtc_state() in prepare_fb
  drm/imagination: Disallow exporting of PM/FW protected objects
2026-01-02 09:53:45 -08:00
Linus Torvalds
e3a97ab1bb Merge tag 'v6.19-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:

 - Fix memory leak

 - Fix two refcount leaks

 - Fix error path in create_smb2_pipe

* tag 'v6.19-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd:
  smb/server: fix refcount leak in smb2_open()
  smb/server: fix refcount leak in parse_durable_handle_context()
  smb/server: call ksmbd_session_rpc_close() on error path in create_smb2_pipe()
  ksmbd: Fix memory leak in get_file_all_info()
2026-01-02 09:24:43 -08:00
Linus Torvalds
047b4e783c Merge tag 'v6.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:

 - Fix array out of bounds error in copy_file_range

 - Add tracepoint to help debug ioctl failures

* tag 'v6.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: fix UBSAN array-index-out-of-bounds in smb2_copychunk_range
  smb3 client: add missing tracepoint for unsupported ioctls
2026-01-02 09:14:13 -08:00
Julia Lawall
69153e8b97 block, bfq: update outdated comment
The function bfq_bfqq_may_idle() was renamed as bfq_better_to_idle()
in commit 277a4a9b56 ("block, bfq: give a better name to
bfq_bfqq_may_idle").  Update the comment accordingly.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-01 08:57:37 -07:00
Caleb Sander Mateos
70eafc7430 io_uring/memmap: drop unused sz param in io_uring_validate_mmap_request()
io_uring_validate_mmap_request() doesn't use its size_t sz argument, so
remove it.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-01 08:16:48 -07:00
Jens Axboe
5623eb1ed0 io_uring/tctx: add separate lock for list of tctx's in ctx
ctx->tcxt_list holds the tasks using this ring, and it's currently
protected by the normal ctx->uring_lock. However, this can cause a
circular locking issue, as reported by syzbot, where cancelations off
exec end up needing to remove an entry from this list:

======================================================
WARNING: possible circular locking dependency detected
syzkaller #0 Tainted: G             L
------------------------------------------------------
syz.0.9999/12287 is trying to acquire lock:
ffff88805851c0a8 (&ctx->uring_lock){+.+.}-{4:4}, at: io_uring_del_tctx_node+0xf0/0x2c0 io_uring/tctx.c:179

but task is already holding lock:
ffff88802db5a2e0 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: prepare_bprm_creds fs/exec.c:1360 [inline]
ffff88802db5a2e0 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: bprm_execve+0xb9/0x1400 fs/exec.c:1733

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&sig->cred_guard_mutex){+.+.}-{4:4}:
       __mutex_lock_common kernel/locking/mutex.c:614 [inline]
       __mutex_lock+0x187/0x1350 kernel/locking/mutex.c:776
       proc_pid_attr_write+0x547/0x630 fs/proc/base.c:2837
       vfs_write+0x27e/0xb30 fs/read_write.c:684
       ksys_write+0x145/0x250 fs/read_write.c:738
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #1 (sb_writers#3){.+.+}-{0:0}:
       percpu_down_read_internal include/linux/percpu-rwsem.h:53 [inline]
       percpu_down_read_freezable include/linux/percpu-rwsem.h:83 [inline]
       __sb_start_write include/linux/fs/super.h:19 [inline]
       sb_start_write+0x4d/0x1c0 include/linux/fs/super.h:125
       mnt_want_write+0x41/0x90 fs/namespace.c:499
       open_last_lookups fs/namei.c:4529 [inline]
       path_openat+0xadd/0x3dd0 fs/namei.c:4784
       do_filp_open+0x1fa/0x410 fs/namei.c:4814
       io_openat2+0x3e0/0x5c0 io_uring/openclose.c:143
       __io_issue_sqe+0x181/0x4b0 io_uring/io_uring.c:1792
       io_issue_sqe+0x165/0x1060 io_uring/io_uring.c:1815
       io_queue_sqe io_uring/io_uring.c:2042 [inline]
       io_submit_sqe io_uring/io_uring.c:2320 [inline]
       io_submit_sqes+0xbf4/0x2140 io_uring/io_uring.c:2434
       __do_sys_io_uring_enter io_uring/io_uring.c:3280 [inline]
       __se_sys_io_uring_enter+0x2e0/0x2b60 io_uring/io_uring.c:3219
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (&ctx->uring_lock){+.+.}-{4:4}:
       check_prev_add kernel/locking/lockdep.c:3165 [inline]
       check_prevs_add kernel/locking/lockdep.c:3284 [inline]
       validate_chain kernel/locking/lockdep.c:3908 [inline]
       __lock_acquire+0x15a6/0x2cf0 kernel/locking/lockdep.c:5237
       lock_acquire+0x107/0x340 kernel/locking/lockdep.c:5868
       __mutex_lock_common kernel/locking/mutex.c:614 [inline]
       __mutex_lock+0x187/0x1350 kernel/locking/mutex.c:776
       io_uring_del_tctx_node+0xf0/0x2c0 io_uring/tctx.c:179
       io_uring_clean_tctx+0xd4/0x1a0 io_uring/tctx.c:195
       io_uring_cancel_generic+0x6ca/0x7d0 io_uring/cancel.c:646
       io_uring_task_cancel include/linux/io_uring.h:24 [inline]
       begin_new_exec+0x10ed/0x2440 fs/exec.c:1131
       load_elf_binary+0x9f8/0x2d70 fs/binfmt_elf.c:1010
       search_binary_handler fs/exec.c:1669 [inline]
       exec_binprm fs/exec.c:1701 [inline]
       bprm_execve+0x92e/0x1400 fs/exec.c:1753
       do_execveat_common+0x510/0x6a0 fs/exec.c:1859
       do_execve fs/exec.c:1933 [inline]
       __do_sys_execve fs/exec.c:2009 [inline]
       __se_sys_execve fs/exec.c:2004 [inline]
       __x64_sys_execve+0x94/0xb0 fs/exec.c:2004
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Chain exists of:
  &ctx->uring_lock --> sb_writers#3 --> &sig->cred_guard_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sig->cred_guard_mutex);
                               lock(sb_writers#3);
                               lock(&sig->cred_guard_mutex);
  lock(&ctx->uring_lock);

 *** DEADLOCK ***

1 lock held by syz.0.9999/12287:
 #0: ffff88802db5a2e0 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: prepare_bprm_creds fs/exec.c:1360 [inline]
 #0: ffff88802db5a2e0 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: bprm_execve+0xb9/0x1400 fs/exec.c:1733

stack backtrace:
CPU: 0 UID: 0 PID: 12287 Comm: syz.0.9999 Tainted: G             L      syzkaller #0 PREEMPT(full)
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 print_circular_bug+0x2e2/0x300 kernel/locking/lockdep.c:2043
 check_noncircular+0x12e/0x150 kernel/locking/lockdep.c:2175
 check_prev_add kernel/locking/lockdep.c:3165 [inline]
 check_prevs_add kernel/locking/lockdep.c:3284 [inline]
 validate_chain kernel/locking/lockdep.c:3908 [inline]
 __lock_acquire+0x15a6/0x2cf0 kernel/locking/lockdep.c:5237
 lock_acquire+0x107/0x340 kernel/locking/lockdep.c:5868
 __mutex_lock_common kernel/locking/mutex.c:614 [inline]
 __mutex_lock+0x187/0x1350 kernel/locking/mutex.c:776
 io_uring_del_tctx_node+0xf0/0x2c0 io_uring/tctx.c:179
 io_uring_clean_tctx+0xd4/0x1a0 io_uring/tctx.c:195
 io_uring_cancel_generic+0x6ca/0x7d0 io_uring/cancel.c:646
 io_uring_task_cancel include/linux/io_uring.h:24 [inline]
 begin_new_exec+0x10ed/0x2440 fs/exec.c:1131
 load_elf_binary+0x9f8/0x2d70 fs/binfmt_elf.c:1010
 search_binary_handler fs/exec.c:1669 [inline]
 exec_binprm fs/exec.c:1701 [inline]
 bprm_execve+0x92e/0x1400 fs/exec.c:1753
 do_execveat_common+0x510/0x6a0 fs/exec.c:1859
 do_execve fs/exec.c:1933 [inline]
 __do_sys_execve fs/exec.c:2009 [inline]
 __se_sys_execve fs/exec.c:2004 [inline]
 __x64_sys_execve+0x94/0xb0 fs/exec.c:2004
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff3a8b8f749
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff3a9a97038 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
RAX: ffffffffffffffda RBX: 00007ff3a8de5fa0 RCX: 00007ff3a8b8f749
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000200000000400
RBP: 00007ff3a8c13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ff3a8de6038 R14: 00007ff3a8de5fa0 R15: 00007ff3a8f0fa28
 </TASK>

Add a separate lock just for the tctx_list, tctx_lock. This can nest
under ->uring_lock, where necessary, and be used separately for list
manipulation. For the cancelation off exec side, this removes the
need to grab ->uring_lock, hence fixing the circular locking
dependency.

Reported-by: syzbot+b0e3b77ffaa8a4067ce5@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-01 08:16:40 -07:00
Dave Airlie
7be19f9327 Merge tag 'drm-intel-fixes-2025-12-31' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
drm/i915 fixes for v6.19-rc4:
- Fix eb_lookup_vmas() failure path

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/4e79f041395bb8bcc9b2a76bb98b5e3df1c1c3eb@intel.com
2026-01-01 16:55:36 +10:00
Dave Airlie
9abfe0b2e0 Merge tag 'drm-misc-fixes-2025-12-29' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.19-rc4:
- Documentation fixes and MODULE_LICENSE fix for shmem helper.
- Fix warnings in nouveau prepare_fb().
- Prevent export of protected objects in imagination driver.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/5506492b-02ca-47bc-8712-51e67f0e4b8b@linux.intel.com
2026-01-01 16:51:31 +10:00
Dave Airlie
1054f19572 Merge tag 'drm-xe-fixes-2025-12-30' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Core Changes:
- Ensure a SVM device memory allocation is idle before migration complete (Thomas)

Driver Changes:
- Fix a SVM debug printout (Thomas)
- Use READ_ONCE() / WRITE_ONCE() for g2h_fence (Jonathan)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/aVOTf6-whmkgrUuq@fedora
2026-01-01 16:39:54 +10:00
Shuah Khan
b69053dd3f wifi: mt76: Remove blank line after mt792x firmware version dmesg
An extra blank line gets printed after printing firmware version
because the build date is null terminated. Remove the "\n" from
dev_info() calls to print firmware version and build date to fix
the problem.

Reported-by: Mario Limonciello <superm1@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-12-31 17:03:35 -08:00
Shuah Khan
af7809f037 Revert "wifi: mt76: Strip whitespace from build ddate"
This reverts commit f804a5895e.

This change introduced the following panic, and mt792x_load_firmware()
fails. wifi is dead on systems with mt792x wireless.

kern  :crit  : kernel BUG at lib/string_helpers.c:1043!
kern  :warn  : Oops: invalid opcode: 0000 [#1] SMP NOPTI
kern  :warn  : CPU: 14 UID: 0 PID: 61 Comm: kworker/14:0 Tainted: G        W
        6.19.0-rc1 #1 PREEMPT(voluntary)
kern  :warn  : Tainted: [W]=WARN
kern  :warn  : Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.16 07/25/2025
kern  :warn  : Workqueue: events mt7921_init_work [mt7921_common]
kern  :warn  : RIP: 0010:__fortify_panic+0xd/0xf
kern  :warn  : Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 40 0f b6 ff e8 c3 55 71 00 <0f> 0b 48 8b 54 24 10 48 8b 74 24 08 4c 89 e9 48 c7 c7 00 a2 d5 a0
kern  :warn  : RSP: 0018:ffffa7a5c03a3d10 EFLAGS: 00010246
kern  :warn  : RAX: ffffffffa0d7aaf2 RBX: 0000000000000000 RCX: ffffffffa0d7aaf2
kern  :warn  : RDX: 0000000000000011 RSI: ffffffffa0d5a170 RDI: ffffffffa128db10
kern  :warn  : RBP: ffff91650ae52060 R08: 0000000000000010 R09: ffffa7a5c31b2000
kern  :warn  : R10: ffffa7a5c03a3bf0 R11: 00000000ffffffff R12: 0000000000000000
kern  :warn  : R13: ffffa7a5c31b2000 R14: 0000000000001000 R15: 0000000000000000
kern  :warn  : FS:  0000000000000000(0000) GS:ffff91743e664000(0000) knlGS:0000000000000000
kern  :warn  : CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kern  :warn  : CR2: 00007f10786c241c CR3: 00000003eca24000 CR4: 0000000000f50ef0
kern  :warn  : PKRU: 55555554
kern  :warn  : Call Trace:
kern  :warn  :  <TASK>
kern  :warn  :  mt76_connac2_load_patch.cold+0x2b/0xa41 [mt76_connac_lib]
kern  :warn  :  ? srso_alias_return_thunk+0x5/0xfbef5
kern  :warn  :  mt792x_load_firmware+0x36/0x150 [mt792x_lib]
kern  :warn  :  mt7921_run_firmware+0x2c/0x4a0 [mt7921_common]
kern  :warn  :  ? srso_alias_return_thunk+0x5/0xfbef5
kern  :warn  :  ? mt7921_rr+0x12/0x30 [mt7921e]
kern  :warn  :  ? srso_alias_return_thunk+0x5/0xfbef5
kern  :warn  :  ? ____mt76_poll_msec+0x75/0xb0 [mt76]
kern  :warn  :  mt7921e_mcu_init+0x4c/0x7a [mt7921e]
kern  :warn  :  mt7921_init_work+0x51/0x190 [mt7921_common]
kern  :warn  :  process_one_work+0x18b/0x340
kern  :warn  :  worker_thread+0x256/0x3a0
kern  :warn  :  ? __pfx_worker_thread+0x10/0x10
kern  :warn  :  kthread+0xfc/0x240
kern  :warn  :  ? __pfx_kthread+0x10/0x10
kern  :warn  :  ? __pfx_kthread+0x10/0x10
kern  :warn  :  ret_from_fork+0x254/0x290
kern  :warn  :  ? __pfx_kthread+0x10/0x10
kern  :warn  :  ret_from_fork_asm+0x1a/0x30
kern  :warn  :  </TASK>

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-12-31 17:03:35 -08:00
Wake Liu
19b8a76cd9 kselftest/harness: Use helper to avoid zero-size memset warning
When building kselftests with a toolchain that enables source
fortification (e.g., Android's build environment, which uses
-D_FORTIFY_SOURCE=3), a build failure occurs in tests that use an
empty FIXTURE().

The root cause is that an empty fixture struct results in
`sizeof(self_private)` evaluating to 0. The compiler's fortification
checks then detect the `memset()` call with a compile-time constant size
of 0, issuing a `-Wuser-defined-warnings` which is promoted to an error
by `-Werror`.

An initial attempt to guard the call with `if (sizeof(self_private) > 0)`
was insufficient. The compiler's static analysis is aggressive enough
to flag the `memset(..., 0)` pattern before evaluating the conditional,
thus still triggering the error.

To resolve this robustly, this change introduces a `static inline`
helper function, `__kselftest_memset_safe()`. This function wraps the
size check and the `memset()` call. By replacing the direct `memset()`
in the `__TEST_F_IMPL` macro with a call to this helper, we create an
abstraction boundary. This prevents the compiler's static analyzer from
"seeing" the problematic pattern at the macro expansion site, resolving
the build failure.

Build Context:
Compiler: Android (14488419, +pgo, +bolt, +lto, +mlgo, based on r584948) clang version 22.0.0 (https://android.googlesource.com/toolchain/llvm-project 2d65e4108033380e6fe8e08b1f1826cd2bfb0c99)
Relevant Options: -O2 -Wall -Werror -D_FORTIFY_SOURCE=3 -target i686-linux-android10000

Test: m kselftest_futex_futex_requeue_pi

Removed Gerrit Change-Id
Shuah Khan <skhan@linuxfoundation.org>

Link: https://lore.kernel.org/r/20251224084120.249417-1-wakel@google.com
Signed-off-by: Wake Liu <wakel@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-12-31 13:27:36 -07:00
Linus Torvalds
9528d5c091 Merge tag 'platform-drivers-x86-v6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:

 - alienware-wmi-wmax: Area-51, x16, and 16X Aurora laptops support

 - asus-armoury:
    - Fix FA507R PPT data
    - Add TDP data for more laptop models

 - asus-nb-wmi: Asus Zenbook 14 display toggle key support

 - dell-lis3lv02d: Dell Latitude 5400 support

 - hp-bioscfg: Fix out-of-bounds array access in ACPI package parsing

 - ibm_rtl: Fix EBDA signature search pointer arithmetic

 - ideapad-laptop: Reassign KEY_CUT to KEY_SELECTIVE_SCREENSHOT

 - intel/pmt:
    - Fix kobject memory leak on init failure
    - Use valid pointers on error handling path

 - intel/vsec: Correct kernel doc comments

 - mellanox: mlxbf-pmc: Fix event names

 - msi-laptop: Add sysfs_remove_group()

 - samsumg-galaxybook: Do not cast pointer to a shorter type

 - think-lmi: WMI certificate thumbprint support for ThinkCenter

 - uniwill: Tuxedo Book BA15 Gen10 support

* tag 'platform-drivers-x86-v6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (22 commits)
  platform/x86: asus-armoury: add support for G835LW
  platform/x86: asus-armoury: fix ppt data for FA507R
  platform/x86/intel/pmt/discovery: use valid device pointer in dev_err_probe
  platform/x86: hp-bioscfg: Fix out-of-bounds array access in ACPI package parsing
  platform/x86: asus-armoury: add support for G615LR
  platform/x86: asus-armoury: add support for FA608UM
  platform/x86: asus-armoury: add support for GA403WR
  platform/x86: asus-armoury: add support for GU605CR
  platform/x86: ideapad-laptop: Reassign KEY_CUT to KEY_SELECTIVE_SCREENSHOT
  platform/x86: samsung-galaxybook: Fix problematic pointer cast
  platform/x86/intel/pmt: Fix kobject memory leak on init failure
  platform/x86/intel/vsec: correct kernel-doc comments
  platform/x86: ibm_rtl: fix EBDA signature search pointer arithmetic
  platform/x86: msi-laptop: add missing sysfs_remove_group()
  platform/x86: think-lmi: Add WMI certificate thumbprint support for ThinkCenter
  platform/x86: dell-lis3lv02d: Add Latitude 5400
  platform/mellanox: mlxbf-pmc: Remove trailing whitespaces from event names
  platform/x86: asus-nb-wmi: Add keymap for display toggle
  platform/x86/uniwill: Add TUXEDO Book BA15 Gen10
  platform/x86: alienware-wmi-wmax: Add support for Alienware 16X Aurora
  ...
2025-12-31 12:25:22 -08:00
Zheng Yejian
0eccd4acd6 selftests/ftrace: Test toplevel-enable for instance
'available_events' is actually not required by
'test.d/event/toplevel-enable.tc' and its Existence has been tested in
'test.d/00basic/basic4.tc'.

So the require of 'available_events' can be dropped and then we can add
'instance' flag to test 'test.d/event/toplevel-enable.tc' for instance.

Test result show as below:
 # ./ftracetest test.d/event/toplevel-enable.tc
 === Ftrace unit tests ===
 [1] event tracing - enable/disable with top level files [PASS]
 [2] (instance)  event tracing - enable/disable with top level files [PASS]

 # of passed:  2
 # of failed:  0
 # of unresolved:  0
 # of untested:  0
 # of unsupported:  0
 # of xfailed:  0
 # of undefined(test bug):  0

Link: https://lore.kernel.org/r/20230509203659.1173917-1-zhengyejian1@huawei.com
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-12-31 12:46:12 -07:00
Yipeng Zou
b889b4fb4c selftests/ftrace: traceonoff_triggers: strip off names
The func_traceonoff_triggers.tc sometimes goes to fail
on my board, Kunpeng-920.

[root@localhost]# ./ftracetest ./test.d/ftrace/func_traceonoff_triggers.tc -l fail.log
=== Ftrace unit tests ===
[1] ftrace - test for function traceon/off triggers     [FAIL]
[2] (instance)  ftrace - test for function traceon/off triggers [UNSUPPORTED]

I look up the log, and it shows that the md5sum is different between csum1 and csum2.

++ cnt=611
++ sleep .1
+++ cnt_trace
+++ grep -v '^#' trace
+++ wc -l
++ cnt2=611
++ '[' 611 -ne 611 ']'
+++ cat tracing_on
++ on=0
++ '[' 0 '!=' 0 ']'
+++ md5sum trace
++ csum1='76896aa74362fff66a6a5f3cf8a8a500  trace'
++ sleep .1
+++ md5sum trace
++ csum2='ee8625a21c058818fc26e45c1ed3f6de  trace'
++ '[' '76896aa74362fff66a6a5f3cf8a8a500  trace' '!=' 'ee8625a21c058818fc26e45c1ed3f6de  trace' ']'
++ fail 'Tracing file is still changing'
++ echo Tracing file is still changing
Tracing file is still changing
++ exit_fail
++ exit 1

So I directly dump the trace file before md5sum, the diff shows that:

[root@localhost]# diff trace_1.log trace_2.log -y --suppress-common-lines
dockerd-12285   [036] d.... 18385.510290: sched_stat | <...>-12285   [036] d.... 18385.510290: sched_stat
dockerd-12285   [036] d.... 18385.510291: sched_swit | <...>-12285   [036] d.... 18385.510291: sched_swit
<...>-740       [044] d.... 18385.602859: sched_stat | kworker/44:1-740 [044] d.... 18385.602859: sched_stat
<...>-740       [044] d.... 18385.602860: sched_swit | kworker/44:1-740 [044] d.... 18385.602860: sched_swit

And we can see that <...> filed be filled with names.

We can strip off the names there to fix that.

After strip off the names:

kworker/u257:0-12 [019] d..2.  2528.758910: sched_stat | -12 [019] d..2.  2528.758910: sched_stat_runtime: comm=k
kworker/u257:0-12 [019] d..2.  2528.758912: sched_swit | -12 [019] d..2.  2528.758912: sched_switch: prev_comm=kw
<idle>-0          [000] d.s5.  2528.762318: sched_waki | -0  [000] d.s5.  2528.762318: sched_waking: comm=sshd pi
<idle>-0          [037] dNh2.  2528.762326: sched_wake | -0  [037] dNh2.  2528.762326: sched_wakeup: comm=sshd pi
<idle>-0          [037] d..2.  2528.762334: sched_swit | -0  [037] d..2.  2528.762334: sched_switch: prev_comm=sw

Link: https://lore.kernel.org/r/20230818013226.2182299-1-zouyipeng@huawei.com
Fixes: d87b29179a ("selftests: ftrace: Use md5sum to take less time of checking logs")
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-12-31 12:45:25 -07:00
Linus Torvalds
349bd28a86 Merge tag 'vfio-v6.19-rc4' of https://github.com/awilliam/linux-vfio
Pull VFIO fixes from Alex Williamson:

 - Restrict ROM access to dword to resolve a regression introduced with
   qword access seen on some Intel NICs. Update VGA region access to the
   same given lack of precedent for 64-bit users (Kevin Tian)

 - Fix missing .get_region_info_caps callback in the xe-vfio-pci variant
   driver due to integration through the DRM tree (Michal Wajdeczko)

 - Add aligned 64-bit access macros to tools/include/linux/types.h,
   allowing removal of uapi/linux/type.h includes from various vfio
   selftest, resolving redefinition warnings for integration with KVM
   selftests (David Matlack)

 - Fix error path memory leak in pds-vfio-pci variant driver (Zilin Guan)

 - Fix error path use-after-free in xe-vfio-pci variant driver (Alper Ak)

* tag 'vfio-v6.19-rc4' of https://github.com/awilliam/linux-vfio:
  vfio/xe: Fix use-after-free in xe_vfio_pci_alloc_file()
  vfio/pds: Fix memory leak in pds_vfio_dirty_enable()
  vfio: selftests: Drop <uapi/linux/types.h> includes
  tools include: Add definitions for __aligned_{l,b}e64
  vfio/xe: Add default handler for .get_region_info_caps
  vfio/pci: Disable qword access to the VGA region
  vfio/pci: Disable qword access to the PCI ROM bar
2025-12-31 10:38:48 -08:00
Jens Axboe
9e193a06e6 Merge tag 'md-6.19-20251231' of gitolite.kernel.org:pub/scm/linux/kernel/git/mdraid/linux into block-6.19
Pull MD fixes from Yu Kuai:

"- Fix null-pointer dereference in raid5 sysfs group_thread_cnt store
   (Tuo Li)
 - Fix possible mempool corruption during raid1 raid_disks update via
   sysfs (FengWei Shih)
 - Fix logical_block_size configuration being overwritten during
   super_1_validate() (Li Nan)
 - Fix forward incompatibility with configurable logical block size:
   arrays assembled on new kernels could not be assembled on kernels
   <=6.18 due to non-zero reserved pad rejection (Li Nan)
 - Fix static checker warning about iterator not incremented (Li Nan)"

* tag 'md-6.19-20251231' of gitolite.kernel.org:pub/scm/linux/kernel/git/mdraid/linux:
  md: Fix forward incompatibility from configurable logical block size
  md: Fix logical_block_size configuration being overwritten
  md: suspend array while updating raid_disks via sysfs
  md/raid5: fix possible null-pointer dereferences in raid5_store_group_thread_cnt()
  md: Fix static checker warning in analyze_sbs
2025-12-31 06:55:07 -07:00
Krzysztof Niemiec
4fe2bd1954 drm/i915/gem: Zero-initialize the eb.vma array in i915_gem_do_execbuffer
Initialize the eb.vma array with values of 0 when the eb structure is
first set up. In particular, this sets the eb->vma[i].vma pointers to
NULL, simplifying cleanup and getting rid of the bug described below.

During the execution of eb_lookup_vmas(), the eb->vma array is
successively filled up with struct eb_vma objects. This process includes
calling eb_add_vma(), which might fail; however, even in the event of
failure, eb->vma[i].vma is set for the currently processed buffer.

If eb_add_vma() fails, eb_lookup_vmas() returns with an error, which
prompts a call to eb_release_vmas() to clean up the mess. Since
eb_lookup_vmas() might fail during processing any (possibly not first)
buffer, eb_release_vmas() checks whether a buffer's vma is NULL to know
at what point did the lookup function fail.

In eb_lookup_vmas(), eb->vma[i].vma is set to NULL if either the helper
function eb_lookup_vma() or eb_validate_vma() fails. eb->vma[i+1].vma is
set to NULL in case i915_gem_object_userptr_submit_init() fails; the
current one needs to be cleaned up by eb_release_vmas() at this point,
so the next one is set. If eb_add_vma() fails, neither the current nor
the next vma is set to NULL, which is a source of a NULL deref bug
described in the issue linked in the Closes tag.

When entering eb_lookup_vmas(), the vma pointers are set to the slab
poison value, instead of NULL. This doesn't matter for the actual
lookup, since it gets overwritten anyway, however the eb_release_vmas()
function only recognizes NULL as the stopping value, hence the pointers
are being set to NULL as they go in case of intermediate failure. This
patch changes the approach to filling them all with NULL at the start
instead, rather than handling that manually during failure.

Reported-by: Gangmin Kim <km.kim1503@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15062
Fixes: 544460c338 ("drm/i915: Multi-BB execbuf")
Cc: stable@vger.kernel.org # 5.16.x
Signed-off-by: Krzysztof Niemiec <krzysztof.niemiec@intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20251216180900.54294-2-krzysztof.niemiec@intel.com
(cherry picked from commit 08889b706d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-31 11:19:47 +02:00
Chenghao Duan
bb85d206be samples/ftrace: Adjust LoongArch register restore order in direct calls
Ensure that in the ftrace direct call logic, the CPU register state
(with ra = parent return address) is restored to the correct state after
the execution of the custom trampoline function and before returning to
the traced function. Additionally, guarantee the correctness of the jump
logic for jr t0 (traced function address).

Cc: stable@vger.kernel.org
Fixes: 9cdc3b6a29 ("LoongArch: ftrace: Add direct call support")
Reported-by: Youling Tang <tangyouling@kylinos.cn>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-31 15:19:25 +08:00
Chenghao Duan
73721d8676 LoongArch: BPF: Enhance the bpf_arch_text_poke() function
Enhance the bpf_arch_text_poke() function to enable accurate location
of BPF program entry points.

When modifying the entry point of a BPF program, skip the "move t0, ra"
instruction to ensure the correct logic and copy of the jump address.

Cc: stable@vger.kernel.org
Fixes: 677e6123e3 ("LoongArch: BPF: Disable trampoline for kernel module function trace")
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-31 15:19:21 +08:00
Chenghao Duan
26138762d9 LoongArch: BPF: Enable trampoline-based tracing for module functions
Remove the previous restrictions that blocked the tracing of kernel
module functions. Fix the issue that previously caused kernel lockups
when attempting to trace module functions.

Before entering the trampoline code, the return address register ra
shall store the address of the next assembly instruction after the
'bl trampoline' instruction, which is the traced function address, and
the register t0 shall store the parent function return address. Refine
the trampoline return logic to ensure that register data remains correct
when returning to both the traced function and the parent function.

Before this patch was applied, the module_attach test in selftests/bpf
encountered a deadlock issue. This was caused by an incorrect jump
address after the trampoline execution, which resulted in an infinite
loop within the module function.

Cc: stable@vger.kernel.org
Fixes: 677e6123e3 ("LoongArch: BPF: Disable trampoline for kernel module function trace")
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-31 15:19:21 +08:00
Chenghao Duan
61319d15a5 LoongArch: BPF: Adjust the jump offset of tail calls
Call the next bpf prog and skip the first instruction of TCC
initialization.

A total of 7 instructions are skipped:
'move t0, ra'			1 inst
'move_imm + jirl'		5 inst
'addid REG_TCC, zero, 0'	1 inst

Relevant test cases: the tailcalls test item in selftests/bpf.

Cc: stable@vger.kernel.org
Fixes: 677e6123e3 ("LoongArch: BPF: Disable trampoline for kernel module function trace")
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-31 15:19:21 +08:00
Chenghao Duan
d314e1f482 LoongArch: BPF: Save return address register ra to t0 before trampoline
Modify the build_prologue() function to ensure the return address
register ra is saved to t0 before entering trampoline operations.
This change ensures the accurate return address handling when a BPF
program calls another BPF program, preventing errors in the BPF-to-BPF
call chain.

Cc: stable@vger.kernel.org
Fixes: 677e6123e3 ("LoongArch: BPF: Disable trampoline for kernel module function trace")
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-31 15:19:20 +08:00
Hengqi Chen
eb71f5c433 LoongArch: BPF: Zero-extend bpf_tail_call() index
The bpf_tail_call() index should be treated as a u32 value. Let's
zero-extend it to avoid calling wrong BPF progs. See similar fixes
for x86 [1]) and arm64 ([2]) for more details.

  [1]: 90caccdd8c
  [2]: 16338a9b3a

Cc: stable@vger.kernel.org
Fixes: 5dc615520c ("LoongArch: Add BPF JIT support")
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-12-31 15:19:20 +08:00