Change scoped_guard() and scoped_cond_guard() macros to make reasoning
about them easier for static analysis tools (smatch, compiler
diagnostics), especially to enable them to tell if the given usage of
scoped_guard() is with a conditional lock class (interruptible-locks,
try-locks) or not (like simple mutex_lock()).
Add compile-time error if scoped_cond_guard() is used for non-conditional
lock class.
Beyond easier tooling and a little shrink reported by bloat-o-meter
this patch enables developer to write code like:
int foo(struct my_drv *adapter)
{
scoped_guard(spinlock, &adapter->some_spinlock)
return adapter->spinlock_protected_var;
}
Current scoped_guard() implementation does not support that,
due to compiler complaining:
error: control reaches end of non-void function [-Werror=return-type]
Technical stuff about the change:
scoped_guard() macro uses common idiom of using "for" statement to declare
a scoped variable. Unfortunately, current logic is too hard for compiler
diagnostics to be sure that there is exactly one loop step; fix that.
To make any loop so trivial that there is no above warning, it must not
depend on any non-const variable to tell if there are more steps. There is
no obvious solution for that in C, but one could use the compound
statement expression with "goto" jumping past the "loop", effectively
leaving only the subscope part of the loop semantics.
More impl details:
one more level of macro indirection is now needed to avoid duplicating
label names;
I didn't spot any other place that is using the
"for (...; goto label) if (0) label: break;" idiom, so it's not packed for
reuse beyond scoped_guard() family, what makes actual macros code cleaner.
There was also a need to introduce const true/false variable per lock
class, it is used to aid compiler diagnostics reasoning about "exactly
1 step" loops (note that converting that to function would undo the whole
benefit).
Big thanks to Andy Shevchenko for help on this patch, both internal and
public, ranging from whitespace/formatting, through commit message
clarifications, general improvements, ending with presenting alternative
approaches - all despite not even liking the idea.
Big thanks to Dmitry Torokhov for the idea of compile-time check for
scoped_cond_guard() (to use it only with conditional locsk), and general
improvements for the patch.
Big thanks to David Lechner for idea to cover also scoped_cond_guard().
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Link: https://lkml.kernel.org/r/20241018113823.171256-1-przemyslaw.kitszel@intel.com
Replace this pattern in osq_unlock():
atomic_cmpxchg(*ptr, old, new) == old
... with the simpler and faster:
atomic_try_cmpxchg(*ptr, &old, new)
The x86 CMPXCHG instruction returns success in the ZF flag,
so this change saves a compare after the CMPXCHG. The code
in the fast path of osq_unlock() improves from:
11b: 31 c9 xor %ecx,%ecx
11d: 8d 50 01 lea 0x1(%rax),%edx
120: 89 d0 mov %edx,%eax
122: f0 0f b1 0f lock cmpxchg %ecx,(%rdi)
126: 39 c2 cmp %eax,%edx
128: 75 05 jne 12f <...>
to:
12b: 31 d2 xor %edx,%edx
12d: 83 c0 01 add $0x1,%eax
130: f0 0f b1 17 lock cmpxchg %edx,(%rdi)
134: 75 05 jne 13b <...>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20241001114606.820277-1-ubizjak@gmail.com
Guard functions in local_lock.h are defined using DEFINE_GUARD() and
DEFINE_LOCK_GUARD_1() macros having lock type defined as pointer in
the percpu address space. The functions, defined by these macros
return value in generic address space, causing:
cleanup.h:157:18: error: return from pointer to non-enclosed address space
and
cleanup.h:214:18: error: return from pointer to non-enclosed address space
when strict percpu checks are enabled.
Add explicit casts to remove address space of the returned pointer.
Found by GCC's named address space checks.
Fixes: e4ab322fba ("cleanup: Add conditional guard support")
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20240819074124.143565-1-ubizjak@gmail.com
Every lock, that becomes a sleeping lock on PREEMPT_RT, starts a RCU read
side critical section. There is no sparse annotation for this and sparse
complains about unbalanced locking.
Add __acquires/ __releases for the RCU lock. This covers all but the
trylock functions. A __cond_acquires() annotation didn't work.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240812104200.2239232-4-bigeasy@linutronix.de
spin_trylock_irqsave() has a __cond_lock() wrapper which points to
__spin_trylock_irqsave(). The function then invokes spin_trylock() which
has another __cond_lock() finally pointing to rt_spin_trylock().
The compiler has no problem to parse this but sparse does not recognise
that users of spin_trylock_irqsave() acquire a conditional lock and
complains.
Remove one layer of __cond_lock() so that sparse recognises conditional
locking.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240812104200.2239232-3-bigeasy@linutronix.de
The sleeping locks on PREEMPT_RT (rt_spin_lock() and friends) lack
sparse annotation. Therefore a missing spin_unlock() won't be spotted by
sparse in a PREEMPT_RT build while it is noticed on a !PREEMPT_RT build.
Add the __acquires/__releases macros to the lock/ unlock functions. The
trylock functions already use the __cond_lock() wrapper.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240812104200.2239232-2-bigeasy@linutronix.de
Syzbot reports a problem that a warning will be triggered while
searching a lock class in look_up_lock_class().
The cause of the issue is that a new name is created and used by
lockdep_set_subclass() instead of using the existing one. This results
in a lock instance has a different name pointer than previous registered
one stored in lock class, and WARN_ONCE() is triggered because of that
in look_up_lock_class().
To fix this, change lockdep_set_subclass() to use the existing name
instead of a new one. Hence, no new name will be created by
lockdep_set_subclass(). Hence, the warning is avoided.
[boqun: Reword the commit log to state the correct issue]
Reported-by: <syzbot+7f4a6f7f7051474e40ad@syzkaller.appspotmail.com>
Fixes: de8f5e4f2d ("lockdep: Introduce wait-type checks")
Cc: stable@vger.kernel.org
Signed-off-by: Ahmed Ehab <bottaawesome633@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/lkml/20240824221031.7751-1-bottaawesome633@gmail.com/
Add a function to check that an offline CPU has left the tracing
infrastructure in a sane state.
Commit 9bb69ba4c1 ("ACPI: processor_idle: use raw_safe_halt() in
acpi_idle_play_dead()") fixed an issue where the acpi_idle_play_dead()
function called safe_halt() instead of raw_safe_halt(), which had the
side-effect of setting the hardirqs_enabled flag for the offline CPU.
On x86 this triggered warnings from lockdep_assert_irqs_disabled() when
the CPU was brought back online again later. These warnings were too
early for the exception to be handled correctly, leading to a
triple-fault.
Add lockdep_cleanup_dead_cpu() to check for this kind of failure mode,
print the events leading up to it, and correct it so that the CPU can
come online again correctly. Re-introducing the original bug now merely
results in this warning instead:
[ 61.556652] smpboot: CPU 1 is now offline
[ 61.556769] CPU 1 left hardirqs enabled!
[ 61.556915] irq event stamp: 128149
[ 61.556965] hardirqs last enabled at (128149): [<ffffffff81720a36>] acpi_idle_play_dead+0x46/0x70
[ 61.557055] hardirqs last disabled at (128148): [<ffffffff81124d50>] do_idle+0x90/0xe0
[ 61.557117] softirqs last enabled at (128078): [<ffffffff81cec74c>] __do_softirq+0x31c/0x423
[ 61.557199] softirqs last disabled at (128065): [<ffffffff810baae1>] __irq_exit_rcu+0x91/0x100
[boqun: Capitalize the title and reword the message a bit]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/f7bd2b3b999051bb3ef4be34526a9262008285f5.camel@infradead.org
When using mutex_acquire_nest() with a nest_lock, lockdep refcounts the
number of acquired lockdep_maps of mutexes of the same class, and also
keeps a pointer to the first acquired lockdep_map of a class. That pointer
is then used for various comparison-, printing- and checking purposes,
but there is no mechanism to actively ensure that lockdep_map stays in
memory. Instead, a warning is printed if the lockdep_map is freed and
there are still held locks of the same lock class, even if the lockdep_map
itself has been released.
In the context of WW/WD transactions that means that if a user unlocks
and frees a ww_mutex from within an ongoing ww transaction, and that
mutex happens to be the first ww_mutex grabbed in the transaction,
such a warning is printed and there might be a risk of a UAF.
Note that this is only problem when lockdep is enabled and affects only
dereferences of struct lockdep_map.
Adjust to this by adding a fake lockdep_map to the acquired context and
make sure it is the first acquired lockdep map of the associated
ww_mutex class. Then hold it for the duration of the WW/WD transaction.
This has the side effect that trying to lock a ww mutex *without* a
ww_acquire_context but where a such context has been acquire, we'd see
a lockdep splat. The test-ww_mutex.c selftest attempts to do that, so
modify that particular test to not acquire a ww_acquire_context if it
is not going to be used.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20241009092031.6356-1-thomas.hellstrom@linux.intel.com
If CONFIG_GENERIC_LOCKBREAK=y and CONFIG_DEBUG_LOCK_ALLOC=n
(e.g. sh/sdk7786_defconfig):
kernel/locking/spinlock.c:68:17: warning: no previous prototype for '__raw_spin_lock' [-Wmissing-prototypes]
kernel/locking/spinlock.c:80:26: warning: no previous prototype for '__raw_spin_lock_irqsave' [-Wmissing-prototypes]
kernel/locking/spinlock.c:98:17: warning: no previous prototype for '__raw_spin_lock_irq' [-Wmissing-prototypes]
kernel/locking/spinlock.c:103:17: warning: no previous prototype for '__raw_spin_lock_bh' [-Wmissing-prototypes]
kernel/locking/spinlock.c:68:17: warning: no previous prototype for '__raw_read_lock' [-Wmissing-prototypes]
kernel/locking/spinlock.c:80:26: warning: no previous prototype for '__raw_read_lock_irqsave' [-Wmissing-prototypes]
kernel/locking/spinlock.c:98:17: warning: no previous prototype for '__raw_read_lock_irq' [-Wmissing-prototypes]
kernel/locking/spinlock.c:103:17: warning: no previous prototype for '__raw_read_lock_bh' [-Wmissing-prototypes]
kernel/locking/spinlock.c:68:17: warning: no previous prototype for '__raw_write_lock' [-Wmissing-prototypes]
kernel/locking/spinlock.c:80:26: warning: no previous prototype for '__raw_write_lock_irqsave' [-Wmissing-prototypes]
kernel/locking/spinlock.c:98:17: warning: no previous prototype for '__raw_write_lock_irq' [-Wmissing-prototypes]
kernel/locking/spinlock.c:103:17: warning: no previous prototype for '__raw_write_lock_bh' [-Wmissing-prototypes]
All __raw_* lock ops are internal functions without external callers.
Hence fix this by making them static.
Note that if CONFIG_GENERIC_LOCKBREAK=y, no lock ops are inlined, as all
of CONFIG_INLINE_*_LOCK* depend on !GENERIC_LOCKBREAK.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/7201d7fb408375c6c4df541270d787b1b4a32354.1727879348.git.geert+renesas@glider.be
Pull Kbuild fixes from Masahiro Yamada:
- Move non-boot built-in DTBs to the .rodata section
- Fix Kconfig bugs
- Fix maint scripts in the linux-image Debian package
- Import some list macros to scripts/include/
* tag 'kbuild-fixes-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: deb-pkg: Remove blank first line from maint scripts
kbuild: fix a typo dt_binding_schema -> dt_binding_schemas
scripts: import more list macros
kconfig: qconf: fix buffer overflow in debug links
kconfig: qconf: move conf_read() before drawing tree pain
kconfig: clear expr::val_is_valid when allocated
kconfig: fix infinite loop in sym_calc_choice()
kbuild: move non-boot built-in DTBs to .rodata section
Pull x86 platform driver fixes from Hans de Goede:
- Intel PMC fix for suspend/resume issues on some Sky and Kaby Lake
laptops
- Intel Diamond Rapids hw-id additions
- Documentation and MAINTAINERS fixes
- Some other small fixes
* tag 'platform-drivers-x86-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: x86-android-tablets: Fix use after free on platform_device_register() errors
platform/x86: wmi: Update WMI driver API documentation
platform/x86: dell-ddv: Fix typo in documentation
platform/x86: dell-sysman: add support for alienware products
platform/x86/intel: power-domains: Add Diamond Rapids support
platform/x86: ISST: Add Diamond Rapids to support list
platform/x86:intel/pmc: Disable ACPI PM Timer disabling on Sky and Kaby Lake
platform/x86: dell-laptop: Do not fail when encountering unsupported batteries
MAINTAINERS: Update Intel In Field Scan(IFS) entry
platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug
Pull kvm fixes from Paolo Bonzini:
"ARM64:
- Fix pKVM error path on init, making sure we do not change critical
system registers as we're about to fail
- Make sure that the host's vector length is at capped by a value
common to all CPUs
- Fix kvm_has_feat*() handling of "negative" features, as the current
code is pretty broken
- Promote Joey to the status of official reviewer, while James steps
down -- hopefully only temporarly
x86:
- Fix compilation with KVM_INTEL=KVM_AMD=n
- Fix disabling KVM_X86_QUIRK_SLOT_ZAP_ALL when shadow MMU is in use
Selftests:
- Fix compilation on non-x86 architectures"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
x86/reboot: emergency callbacks are now registered by common KVM code
KVM: x86: leave kvm.ko out of the build if no vendor module is requested
KVM: x86/mmu: fix KVM_X86_QUIRK_SLOT_ZAP_ALL for shadow MMU
KVM: arm64: Fix kvm_has_feat*() handling of negative features
KVM: selftests: Fix build on architectures other than x86_64
KVM: arm64: Another reviewer reshuffle
KVM: arm64: Constrain the host to the maximum shared SVE VL with pKVM
KVM: arm64: Fix __pkvm_init_vcpu cptr_el2 error path
Pull powerpc fix from Michael Ellerman:
- Allow r30 to be used in vDSO code generation of getrandom
Thanks to Jason A. Donenfeld
* tag 'powerpc-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/vdso: allow r30 in vDSO code generation of getrandom
The blank line causes execve() to fail:
# strace ./postinst
execve("./postinst", ...) = -1 ENOEXEC (Exec format error)
strace: exec: Exec format error
+++ exited with 1 +++
However running the scripts via shell does work (at least with bash)
because the shell attempts to execute the file as a shell script when
execve() fails.
Fixes: b611daae5e ("kbuild: deb-pkg: split image and debug objects staging out into functions")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
If we follow "make help" to "make dt_binding_schema", we will see
below error:
$ make dt_binding_schema
make[1]: *** No rule to make target 'dt_binding_schema'. Stop.
make: *** [Makefile:224: __sub-make] Error 2
It should be a typo. So this will fix it.
Fixes: 604a57ba97 ("dt-bindings: kbuild: Add separate target/dependency for processed-schema.json")
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
If the battery hook encounters a unsupported battery, it will
return an error. This in turn will cause the battery driver to
automatically unregister the battery hook.
On machines with multiple batteries however, this will prevent
the battery hook from handling the primary battery, since it will
always get unregistered upon encountering one of the unsupported
batteries.
Fix this by simply ignoring unsupported batteries.
Reviewed-by: Pali Rohár <pali@kernel.org>
Fixes: ab58016c68 ("platform/x86:dell-laptop: Add knobs to change battery charge settings")
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20241001212835.341788-4-W_Armin@gmx.de
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
KVM/arm64 fixes for 6.12, take #1
- Fix pKVM error path on init, making sure we do not change critical
system registers as we're about to fail
- Make sure that the host's vector length is at capped by a value
common to all CPUs
- Fix kvm_has_feat*() handling of "negative" features, as the current
code is pretty broken
- Promote Joey to the status of official reviewer, while James steps
down -- hopefully only temporarly
Guard them with CONFIG_KVM_X86_COMMON rather than the two vendor modules.
In practice this has no functional change, because CONFIG_KVM_X86_COMMON
is set if and only if at least one vendor-specific module is being built.
However, it is cleaner to specify CONFIG_KVM_X86_COMMON for functions that
are used in kvm.ko.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 590b09b1d8 ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
Fixes: 6d55a94222 ("x86/reboot: Unconditionally define cpu_emergency_virt_cb typedef")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm.ko is nothing but library code shared by kvm-intel.ko and kvm-amd.ko.
It provides no functionality on its own and it is unnecessary unless one
of the vendor-specific module is compiled. In particular, /dev/kvm is
not created until one of kvm-intel.ko or kvm-amd.ko is loaded.
Use CONFIG_KVM to decide if it is built-in or a module, but use the
vendor-specific modules for the actual decision on whether to build it.
This also fixes a build failure when CONFIG_KVM_INTEL and CONFIG_KVM_AMD
are both disabled. The cpu_emergency_register_virt_callback() function
is called from kvm.ko, but it is only defined if at least one of
CONFIG_KVM_INTEL and CONFIG_KVM_AMD is provided.
Fixes: 590b09b1d8 ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull bcachefs fixes from Kent Overstreet:
"A lot of little fixes, bigger ones include:
- bcachefs's __wait_on_freeing_inode() was broken in rc1 due to vfs
changes, now fixed along with another lost wakeup
- fragmentation LRU fixes; fsck now repairs successfully (this is the
data structure copygc uses); along with some nice simplification.
- Rework logged op error handling, so that if logged op replay errors
(due to another filesystem error) we delete the logged op instead
of going into an infinite loop)
- Various small filesystem connectivitity repair fixes"
* tag 'bcachefs-2024-10-05' of git://evilpiepirate.org/bcachefs:
bcachefs: Rework logged op error handling
bcachefs: Add warn param to subvol_get_snapshot, peek_inode
bcachefs: Kill snapshot arg to fsck_write_inode()
bcachefs: Check for unlinked, non-empty dirs in check_inode()
bcachefs: Check for unlinked inodes with dirents
bcachefs: Check for directories with no backpointers
bcachefs: Kill alloc_v4.fragmentation_lru
bcachefs: minor lru fsck fixes
bcachefs: Mark more errors AUTOFIX
bcachefs: Make sure we print error that causes fsck to bail out
bcachefs: bkey errors are only AUTOFIX during read
bcachefs: Create lost+found in correct snapshot
bcachefs: Fix reattach_inode()
bcachefs: Add missing wakeup to bch2_inode_hash_remove()
bcachefs: Fix trans_commit disk accounting revert
bcachefs: Fix bch2_inode_is_open() check
bcachefs: Fix return type of dirent_points_to_inode_nowarn()
bcachefs: Fix bad shift in bch2_read_flag_list()
Pull xen fix from Juergen Gross:
"Fix Xen config issue introduced in the merge window"
* tag 'for-linus-6.12a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Fix config option reference in XEN_PRIVCMD definition
Pull ext4 fixes from Ted Ts'o:
"Fix some ext4 bugs and regressions relating to oneline resize and fast
commits"
* tag 'ext4_for_linus-5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix off by one issue in alloc_flex_gd()
ext4: mark fc as ineligible using an handle in ext4_xattr_set()
ext4: use handle to mark fc as ineligible in __track_dentry_update()
Pull cxl fix from Ira Weiny:
- Fix calculation for SBDF in error injection
* tag 'cxl-fixes-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
EINJ, CXL: Fix CXL device SBDF calculation
Pull i2c fix from Wolfram Sang:
- Fix potential deadlock during runtime suspend and resume (stm32f7)
* tag 'i2c-for-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: stm32f7: Do not prepare/unprepare clock during runtime suspend/resume
Pull spi fixes from Mark Brown:
"A small set of driver specific fixes that came in since the merge
window, about half of which is fixes for correctness in the use of the
runtime PM APIs done as part of a broader cleanup"
* tag 'spi-fix-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: s3c64xx: fix timeout counters in flush_fifo
spi: atmel-quadspi: Fix wrong register value written to MR
spi: spi-cadence: Fix missing spi_controller_is_target() check
spi: spi-cadence: Fix pm_runtime_set_suspended() with runtime pm enabled
spi: spi-imx: Fix pm_runtime_set_suspended() with runtime pm enabled
Pull hardening fixes from Kees Cook:
- gcc plugins: Avoid Kconfig warnings with randstruct (Nathan
Chancellor)
- MAINTAINERS: Add security/Kconfig.hardening to hardening section
(Nathan Chancellor)
- MAINTAINERS: Add unsafe_memcpy() to the FORTIFY review list
* tag 'hardening-v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
MAINTAINERS: Add security/Kconfig.hardening to hardening section
hardening: Adjust dependencies in selection of MODVERSIONS
MAINTAINERS: Add unsafe_memcpy() to the FORTIFY review list
Pull lsm revert from Paul Moore:
"Here is the CONFIG_SECURITY_TOMOYO_LKM revert that we've been
discussing this week. With near unanimous agreement that the original
TOMOYO patches were not the right way to solve the distro problem
Tetsuo is trying the solve, reverting is our best option at this time"
* tag 'lsm-pr-20241004' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
tomoyo: revert CONFIG_SECURITY_TOMOYO_LKM support
Attaching SST PCI device to VM causes "BUG: KASAN: slab-out-of-bounds".
kasan report:
[ 19.411889] ==================================================================
[ 19.413702] BUG: KASAN: slab-out-of-bounds in _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
[ 19.415634] Read of size 8 at addr ffff888829e65200 by task cpuhp/16/113
[ 19.417368]
[ 19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G E 6.9.0 #10
[ 19.420435] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 07/28/2022
[ 19.422687] Call Trace:
[ 19.424091] <TASK>
[ 19.425448] dump_stack_lvl+0x5d/0x80
[ 19.426963] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
[ 19.428694] print_report+0x19d/0x52e
[ 19.430206] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 19.431837] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
[ 19.433539] kasan_report+0xf0/0x170
[ 19.435019] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
[ 19.436709] _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
[ 19.438379] ? __pfx_sched_clock_cpu+0x10/0x10
[ 19.439910] isst_if_cpu_online+0x406/0x58f [isst_if_common]
[ 19.441573] ? __pfx_isst_if_cpu_online+0x10/0x10 [isst_if_common]
[ 19.443263] ? ttwu_queue_wakelist+0x2c1/0x360
[ 19.444797] cpuhp_invoke_callback+0x221/0xec0
[ 19.446337] cpuhp_thread_fun+0x21b/0x610
[ 19.447814] ? __pfx_cpuhp_thread_fun+0x10/0x10
[ 19.449354] smpboot_thread_fn+0x2e7/0x6e0
[ 19.450859] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 19.452405] kthread+0x29c/0x350
[ 19.453817] ? __pfx_kthread+0x10/0x10
[ 19.455253] ret_from_fork+0x31/0x70
[ 19.456685] ? __pfx_kthread+0x10/0x10
[ 19.458114] ret_from_fork_asm+0x1a/0x30
[ 19.459573] </TASK>
[ 19.460853]
[ 19.462055] Allocated by task 1198:
[ 19.463410] kasan_save_stack+0x30/0x50
[ 19.464788] kasan_save_track+0x14/0x30
[ 19.466139] __kasan_kmalloc+0xaa/0xb0
[ 19.467465] __kmalloc+0x1cd/0x470
[ 19.468748] isst_if_cdev_register+0x1da/0x350 [isst_if_common]
[ 19.470233] isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr]
[ 19.471670] do_one_initcall+0xa4/0x380
[ 19.472903] do_init_module+0x238/0x760
[ 19.474105] load_module+0x5239/0x6f00
[ 19.475285] init_module_from_file+0xd1/0x130
[ 19.476506] idempotent_init_module+0x23b/0x650
[ 19.477725] __x64_sys_finit_module+0xbe/0x130
[ 19.476506] idempotent_init_module+0x23b/0x650
[ 19.477725] __x64_sys_finit_module+0xbe/0x130
[ 19.478920] do_syscall_64+0x82/0x160
[ 19.480036] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 19.481292]
[ 19.482205] The buggy address belongs to the object at ffff888829e65000
which belongs to the cache kmalloc-512 of size 512
[ 19.484818] The buggy address is located 0 bytes to the right of
allocated 512-byte region [ffff888829e65000, ffff888829e65200)
[ 19.487447]
[ 19.488328] The buggy address belongs to the physical page:
[ 19.489569] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888829e60c00 pfn:0x829e60
[ 19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 19.492466] anon flags: 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff)
[ 19.493914] page_type: 0xffffffff()
[ 19.494988] raw: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001
[ 19.496451] raw: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000
[ 19.497906] head: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001
[ 19.499379] head: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000
[ 19.500844] head: 0057ffffc0000003 ffffea0020a79801 ffffea0020a79848 00000000ffffffff
[ 19.502316] head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000
[ 19.503784] page dumped because: kasan: bad access detected
[ 19.505058]
[ 19.505970] Memory state around the buggy address:
[ 19.507172] ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 19.508599] ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 19.510014] ^
[ 19.510016] ffff888829e65280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 19.510018] ffff888829e65300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 19.515367] ==================================================================
The reason for this error is physical_package_ids assigned by VMware VMM
are not continuous and have gaps. This will cause value returned by
topology_physical_package_id() to be more than topology_max_packages().
Here the allocation uses topology_max_packages(). The call to
topology_max_packages() returns maximum logical package ID not physical
ID. Hence use topology_logical_package_id() instead of
topology_physical_package_id().
Fixes: 9a1aac8a96 ("platform/x86: ISST: PUNIT device mapping with Sub-NUMA clustering")
Cc: stable@vger.kernel.org
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zach Wade <zachwade.k@gmail.com>
Link: https://lore.kernel.org/r/20240923144508.1764-1-zachwade.k@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>