linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-20 17:59:03 -04:00

Author	SHA1	Message	Date
Linus Torvalds	e812928be2	Merge tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull CXL updates from Dave Jiang: - Introduce cxl_memdev_attach and pave way for soft reserved handling, type2 accelerator enabling, and LSA 2.0 enabling. All these series require the endpoint driver to settle before continuing the memdev driver probe. - Address CXL port error protocol handling and reporting. The large patch series was split into three parts. The first two parts are included here with the final part coming later. The first part consists of a series of code refactoring to PCI AER sub-system that addresses CXL and also CXL RAS code to prepare for port error handling. The second part refactors the CXL code to move management of component registers to cxl_port objects to allow all CXL AER errors to be handled through the cxl_port hierarchy. - Provide AMD Zen5 platform address translation for CXL using ACPI PRMT. This includes a conventions document to explain why this is needed and how it's implemented. - Misc CXL patches of fixes, cleanups, and updates. Including CXL address translation for unaligned MOD3 regions. [ TLA service: CXL is "Compute Express Link" ] * tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (59 commits) cxl: Disable HPA/SPA translation handlers for Normalized Addressing cxl/region: Factor out code into cxl_region_setup_poison() cxl/atl: Lock decoders that need address translation cxl: Enable AMD Zen5 address translation using ACPI PRMT cxl/acpi: Prepare use of EFI runtime services cxl: Introduce callback for HPA address ranges translation cxl/region: Use region data to get the root decoder cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos() cxl/region: Separate region parameter setup and region construction cxl: Simplify cxl_root_ops allocation and handling cxl/region: Store HPA range in struct cxl_region cxl/region: Store root decoder in struct cxl_region cxl/region: Rename misleading variable name @hpa to @hpa_range Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement cxl, doc: Moving conventions in separate files cxl, doc: Remove isonum.txt inclusion cxl/port: Unify endpoint and switch port lookup cxl/port: Move endpoint component register management to cxl_port cxl/port: Map Port RAS registers cxl/port: Move dport RAS setup to dport add time ...	2026-02-12 16:33:05 -08:00
Linus Torvalds	136114e0ab	Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves disk space by teaching ocfs2 to reclaim suballocator block group space (Heming Zhao) - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the ARRAY_END() macro and uses it in various places (Alejandro Colomar) - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size (Pnina Feder) - "kallsyms: Prevent invalid access when showing module buildid" cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces (Petr Mladek) - "Address page fault in ima_restore_measurement_list()" fixes a kexec-related crash that can occur when booting the second-stage kernel on x86 (Harshit Mogalapalli) - "kho: ABI headers and Documentation updates" updates the kexec handover ABI documentation (Mike Rapoport) - "Align atomic storage" adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain) - "kho: clean up page initialization logic" simplifies the page initialization logic in kho_restore_page() (Pratyush Yadav) - "Unload linux/kernel.h" moves several things out of kernel.h and into more appropriate places (Yury Norov) - "don't abuse task_struct.group_leader" removes the usage of ->group_leader when it is "obviously unnecessary" (Oleg Nesterov) - "list private v2 & luo flb" adds some infrastructure improvements to the live update orchestrator (Pasha Tatashin) * tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits) watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency procfs: fix missing RCU protection when reading real_parent in do_task_stat() watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() kcsan, compiler_types: avoid duplicate type issues in BPF Type Format kho: fix doc for kho_restore_pages() tests/liveupdate: add in-kernel liveupdate test liveupdate: luo_flb: introduce File-Lifecycle-Bound global state liveupdate: luo_file: Use private list list: add kunit test for private list primitives list: add primitives for private list manipulations delayacct: fix uapi timespec64 definition panic: add panic_force_cpu= parameter to redirect panic to a specific CPU netclassid: use thread_group_leader(p) in update_classid_task() RDMA/umem: don't abuse current->group_leader drm/pan*: don't abuse current->group_leader drm/amd: kill the outdated "Only the pthreads threading model is supported" checks drm/amdgpu: don't abuse current->group_leader android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() android/binder: don't abuse current->group_leader kho: skip memoryless NUMA nodes when reserving scratch areas ...	2026-02-12 12:13:01 -08:00
Linus Torvalds	4cff5c05e0	Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "powerpc/64s: do not re-activate batched TLB flush" makes arch_{enter\|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev) It adds a generic enter/leave layer and switches architectures to use it. Various hacks were removed in the process. - "zram: introduce compressed data writeback" implements data compression for zram writeback (Richard Chang and Sergey Senozhatsky) - "mm: folio_zero_user: clear page ranges" adds clearing of contiguous page ranges for hugepages. Large improvements during demand faulting are demonstrated (David Hildenbrand) - "memcg cleanups" tidies up some memcg code (Chen Ridong) - "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos stats" improves DAMOS stat's provided information, deterministic control, and readability (SeongJae Park) - "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few issues in the hugetlb cgroup charging selftests (Li Wang) - "Fix va_high_addr_switch.sh test failure - again" addresses several issues in the va_high_addr_switch test (Chunyu Hu) - "mm/damon/tests/core-kunit: extend existing test scenarios" improves the KUnit test coverage for DAMON (Shu Anzai) - "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN (Shivank Garg) - "arch, mm: consolidate hugetlb early reservation" reworks and consolidates a pile of straggly code related to reservation of hugetlb memory from bootmem and creation of CMA areas for hugetlb (Mike Rapoport) - "mm: clean up anon_vma implementation" cleans up the anon_vma implementation in various ways (Lorenzo Stoakes) - "tweaks for __alloc_pages_slowpath()" does a little streamlining of the page allocator's slowpath code (Vlastimil Babka) - "memcg: separate private and public ID namespaces" cleans up the memcg ID code and prevents the internal-only private IDs from being exposed to userspace (Shakeel Butt) - "mm: hugetlb: allocate frozen gigantic folio" cleans up the allocation of frozen folios and avoids some atomic refcount operations (Kefeng Wang) - "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement of memory betewwn the active and inactive LRUs and adds auto-tuning of the ratio-based quotas and of monitoring intervals (SeongJae Park) - "Support page table check on PowerPC" makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan) - "nodemask: align nodes_and{,not} with underlying bitmap ops" makes nodes_and() and nodes_andnot() propagate the return values from the underlying bit operations, enabling some cleanup in calling code (Yury Norov) - "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up some DAMON internal interfaces (SeongJae Park) - "mm/khugepaged: cleanups and scan limit fix" does some cleanup work in khupaged and fixes a scan limit accounting issue (Shivank Garg) - "mm: balloon infrastructure cleanups" goes to town on the balloon infrastructure and its page migration function. Mainly cleanups, also some locking simplification (David Hildenbrand) - "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds additional tracepoints to the page reclaim code (Jiayuan Chen) - "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is part of Marco's kernel-wide migration from the legacy workqueue APIs over to the preferred unbound workqueues (Marco Crivellari) - "Various mm kselftests improvements/fixes" provides various unrelated improvements/fixes for the mm kselftests (Kevin Brodsky) - "mm: accelerate gigantic folio allocation" greatly speeds up gigantic folio allocation, mainly by avoiding unnecessary work in pfn_range_valid_contig() (Kefeng Wang) - "selftests/damon: improve leak detection and wss estimation reliability" improves the reliability of two of the DAMON selftests (SeongJae Park) - "mm/damon: cleanup kdamond, damon_call(), damos filter and DAMON_MIN_REGION" does some cleanup work in the core DAMON code (SeongJae Park) - "Docs/mm/damon: update intro, modules, maintainer profile, and misc" performs maintenance work on the DAMON documentation (SeongJae Park) - "mm: add and use vma_assert_stabilised() helper" refactors and cleans up the core VMA code. The main aim here is to be able to use the mmap write lock's lockdep state to perform various assertions regarding the locking which the VMA code requires (Lorenzo Stoakes) - "mm, swap: swap table phase II: unify swapin use" removes some old swap code (swap cache bypassing and swap synchronization) which wasn't working very well. Various other cleanups and simplifications were made. The end result is a 20% speedup in one benchmark (Kairui Song) - "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM available on 64-bit alpha, loongarch, mips, parisc, and um. Various cleanups were performed along the way (Qi Zheng) * tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits) mm/memory: handle non-split locks correctly in zap_empty_pte_table() mm: move pte table reclaim code to memory.c mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config um: mm: enable MMU_GATHER_RCU_TABLE_FREE parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE mips: mm: enable MMU_GATHER_RCU_TABLE_FREE LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles zsmalloc: make common caches global mm: add SPDX id lines to some mm source files mm/zswap: use %pe to print error pointers mm/vmscan: use %pe to print error pointers mm/readahead: fix typo in comment mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file() mm: refactor vma_map_pages to use vm_insert_pages mm/damon: unify address range representation with damon_addr_range mm/cma: replace snprintf with strscpy in cma_new_area ...	2026-02-12 11:32:37 -08:00
Linus Torvalds	45a1b8cc6c	Merge tag 'x86_misc_for_7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Dave Hansen: "The usual smattering of x86/misc changes. The IPv6 patch in here surprised me in a couple of ways. First, the function it inlines is able to eat a lot more CPU time than I would have expected. Second, the inlining does not seem to bloat the kernel, at least in the configs folks have tested. - Inline x86-specific IPv6 checksum helper - Update IOMMU docs to use stable identifiers - Print unhashed pointers on fatal stack overflows" * tag 'x86_misc_for_7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/traps: Print unhashed pointers on stack overflow Documentation/x86: Update IOMMU spec references to use stable identifiers x86/lib: Inline csum_ipv6_magic()	2026-02-10 19:52:18 -08:00
Linus Torvalds	6f7e6393d1	Merge tag 'x86_entry_for_7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 entry code updates from Dave Hansen: "This is entirely composed of a set of long overdue VDSO cleanups. They makes the VDSO build much more logical and zap quite a bit of old cruft. It also results in a coveted net-code-removal diffstat" * tag 'x86_entry_for_7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry/vdso: Add vdso2c to .gitignore x86/entry/vdso32: Omit '.cfi_offset eflags' for LLVM < 16 MAINTAINERS: Adjust vdso file entry in INTEL SGX x86/entry/vdso/selftest: Update location of vgetrandom-chacha.S x86/entry/vdso: Fix filtering of vdso compiler flags x86/entry/vdso: Update the object paths for "make vdso_install" x86/entry/vdso32: When using int $0x80, use it directly x86/cpufeature: Replace X86_FEATURE_SYSENTER32 with X86_FEATURE_SYSFAST32 x86/vdso: Abstract out vdso system call internals x86/entry/vdso: Include GNU_PROPERTY and GNU_STACK PHDRs x86/entry/vdso32: Remove open-coded DWARF in sigreturn.S x86/entry/vdso32: Remove SYSCALL_ENTER_KERNEL macro in sigreturn.S x86/entry/vdso32: Don't rely on int80_landing_pad for adjusting ip x86/entry/vdso: Refactor the vdso build x86/entry/vdso: Move vdso2c to arch/x86/tools x86/entry/vdso: Rename vdso_image_* to vdso*_image	2026-02-10 19:34:26 -08:00
Linus Torvalds	ca8f421ea0	Merge tag 'x86_sev_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV updates from Borislav Petkov: - Make the SEV internal header really internal and carve out the SVSM-specific code into a separate compilation unit, along with other cleanups and fixups [ TLA translation service: 'SEV' is AMD's 'Secure Encrypted Virtualization' and SVSM is an ETLA ('Enhanced TLA') for 'Secure VM Service Module'. Some of us have trouble keeping track of this all and need all the help we can get ] * tag 'x86_sev_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Don't emit BSS_DECRYPTED section unless it is in use x86/sev: Use kfree_sensitive() when freeing a SNP message descriptor x86/sev: Rename sev_es_ghcb_handle_msr() to __vc_handle_msr() x86/sev: Carve out the SVSM code into a separate compilation unit x86/sev: Add internal header guards x86/sev: Move the internal header	2026-02-10 19:19:06 -08:00
Linus Torvalds	57cb845067	Merge tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt updates from Borislav Petkov: - A nice cleanup to the paravirt code containing a unification of the paravirt clock interface, taming the include hell by splitting the pv_ops structure and removing of a bunch of obsolete code (Juergen Gross) * tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/paravirt: Use XOR r32,r32 to clear register in pv_vcpu_is_preempted() x86/paravirt: Remove trailing semicolons from alternative asm templates x86/pvlocks: Move paravirt spinlock functions into own header x86/paravirt: Specify pv_ops array in paravirt macros x86/paravirt: Allow pv-calls outside paravirt.h objtool: Allow multiple pv_ops arrays x86/xen: Drop xen_mmu_ops x86/xen: Drop xen_cpu_ops x86/xen: Drop xen_irq_ops x86/paravirt: Move pv_native_*() prototypes to paravirt.c x86/paravirt: Introduce new paravirt-base.h header x86/paravirt: Move paravirt_sched_clock() related code into tsc.c x86/paravirt: Use common code for paravirt_steal_clock() riscv/paravirt: Use common code for paravirt_steal_clock() loongarch/paravirt: Use common code for paravirt_steal_clock() arm64/paravirt: Use common code for paravirt_steal_clock() arm/paravirt: Use common code for paravirt_steal_clock() sched: Move clock related paravirt code to kernel/sched paravirt: Remove asm/paravirt_api_clock.h x86/paravirt: Move thunk macros to paravirt_types.h ...	2026-02-10 19:01:45 -08:00
Linus Torvalds	8cbd0d2b61	Merge tag 'x86_microcode_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode loader update from Borislav Petkov: - Since debugging the microcode loader makes sense on baremetal too (it was used in a guest only until now), extend it to be able to do that too * tag 'x86_microcode_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/AMD: Allow loader debugging to be enabled on baremetal too	2026-02-10 18:59:06 -08:00
Linus Torvalds	9fbb481040	Merge tag 'x86_cleanups_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: - The usual set of cleanups and simplifications all over the tree * tag 'x86_cleanups_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/segment: Use MOVL when reading segment registers selftests/x86: Clean up sysret_rip coding style x86/mm: Hide mm_free_global_asid() definition under CONFIG_BROADCAST_TLB_FLUSH x86/crash: Use set_memory_p() instead of __set_memory_prot() x86/CPU/AMD: Simplify the spectral chicken fix x86/platform/olpc: Replace strcpy() with strscpy() in xo15_sci_add() x86/split_lock: Remove dead string when split_lock_detect=fatal	2026-02-10 18:43:03 -08:00
Linus Torvalds	dcb4971018	Merge tag 'x86_cache_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 resource control updates from Borislav Petkov: - Extend the resctrl machinery to support telemetry monitoring on Intel (Tony Luck) The practical usage of this is being able to tell how much energy or how much work can be attributed to a group of tasks tracked under a single idenitifier. Prepend this work with proper refactoring of resctrl domains handling code. * tag 'x86_cache_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) x86,fs/resctrl: Update documentation for telemetry events x86/resctrl: Enable RDT_RESOURCE_PERF_PKG fs/resctrl: Move RMID initialization to first mount x86,fs/resctrl: Compute number of RMIDs as minimum across resources fs/resctrl: Move allocation/free of closid_num_dirty_rmid[] x86/resctrl: Handle number of RMIDs supported by RDT_RESOURCE_PERF_PKG x86/resctrl: Add energy/perf choices to rdt boot option x86,fs/resctrl: Handle domain creation/deletion for RDT_RESOURCE_PERF_PKG fs/resctrl: Refactor rmdir_mondata_subdir_allrdtgrp() fs/resctrl: Refactor mkdir_mondata_subdir() x86/resctrl: Read telemetry events x86/resctrl: Find and enable usable telemetry events x86,fs/resctrl: Add architectural event pointer x86,fs/resctrl: Fill in details of events for performance and energy GUIDs x86/resctrl: Discover hardware telemetry events fs/resctrl: Emphasize that L3 monitoring resource is required for summing domains x86,fs/resctrl: Add and initialize a resource for package scope monitoring x86,fs/resctrl: Add an architectural hook called for first mount x86,fs/resctrl: Support binary fixed point event counters x86,fs/resctrl: Handle events that can be read from any CPU ...	2026-02-10 18:24:56 -08:00
Linus Torvalds	d1953aa3bc	Merge tag 'x86_alternatives_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 alternatives update from Borislav Petkov: - Reorganize the alternatives patching mechanism to patch a single location only once instead of multiple times as it was the case with the two or three alternative options macros * tag 'x86_alternatives_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternative: Patch a single alternative location only once x86/alternative: Use helper functions for patching alternatives	2026-02-10 18:22:04 -08:00
Linus Torvalds	2619c62b7e	Merge tag 'x86-irq-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 irq updates from Thomas Gleixner: "Trivial cleanups for the posted MSI interrupt handling" * tag 'x86-irq-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq_remapping: Sanitize posted_msi_supported() x86/irq: Cleanup posted MSI code	2026-02-10 17:39:08 -08:00
Linus Torvalds	b490d2a83f	Merge tag 'x86-cpu-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu updates from Ingo Molnar: - CPU model updates (Andrew Cooper): - amd: Correct the microcode table for Zenbleed - amd: Use ZEN_MODEL_STEP_UCODE() for erratum_1386_microcode[] - Drop vestigial PBE logic in AMD/Hygon/Centaur/Cyrix - tsx: Set default TSX mode to auto (Nikolay Borisov) - Drop unused Kconfig symbol X86_P6_NOP (Randy Dunlap) * tag 'x86-cpu-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tsx: Set default TSX mode to auto x86/cpu: Drop unused Kconfig symbol X86_P6_NOP x86/cpu: Drop vestigial PBE logic in AMD/Hygon/Centaur/Cyrix x86/cpu/amd: Use ZEN_MODEL_STEP_UCODE() for erratum_1386_microcode[] x86/cpu/amd: Correct the microcode table for Zenbleed	2026-02-10 13:17:58 -08:00
Linus Torvalds	3516cadc70	Merge tag 'x86-apic-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 APIC update from Ingo Molnar: - Inline __x2apic_send_IPI_dest() (Eric Dumazet) * tag 'x86-apic-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Inline __x2apic_send_IPI_dest()	2026-02-10 13:16:09 -08:00
Linus Torvalds	5668a64622	Merge tag 'x86-boot-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/boot updates from Ingo Molnar: - x86/acpi: Add acpi=spcr to use SPCR-provided default console (Shenghao Yang) - x86/acpi/boot: Correct the acpi_is_processor_usable() check again (Yazen Ghannam) - Refresh the x86 memory map (e820 table) handling code, and make the printouts a bit more informative (Ingo Molnar) * tag 'x86-boot-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) x86/acpi: Add acpi=spcr to use SPCR-provided default console x86/boot/e820: Use <linux/sizes.h> symbols for literals x86/boot/e820: Make sure e820_search_gap() finds all gaps x86/boot/e820: Simplify the e820__range_remove() API x86/boot/e820: Remove e820__range_remove()'s unused return parameter x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables x86/boot/e820: Standardize __init/__initdata tag placement x86/boot/e820: Simplify & clarify __e820__range_add() a bit x86/boot/e820: Rename gap_start/gap_size to max_gap_start/max_gap_start in e820_search_gap() et al x86/boot/e820: Change e820_search_gap() to search for the highest-address PCI gap x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit x86/boot/e820: Change struct e820_table::nr_entries type from __u32 to u32 x86/boot/e820: Standardize e820 table index variable types under 'u32' x86/boot/e820: Standardize e820 table index variable names under 'idx' x86/boot/e820: Remove unnecessary header inclusions x86/boot/e820: Clean up __refdata use a bit x86/boot/e820: Clean up __e820__range_add() a bit x86/boot/e820: Improve e820_print_type() messages x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations x86/boot/e820: Remove pointless early_panic() indirection ...	2026-02-10 13:12:54 -08:00
Linus Torvalds	36ae1c45b2	Merge tag 'sched-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Scheduler Kconfig space updates: - Further consolidate configurable preemption modes (Peter Zijlstra) Reduce the number of architectures that are allowed to offer PREEMPT_NONE and PREEMPT_VOLUNTARY, reducing the number of preemption models from four to just two: 'full' and 'lazy' on up-to-date architectures (arm64, loongarch, powerpc, riscv, s390, x86). None and voluntary are only available as legacy features on platforms that don't implement lazy preemption yet, or which don't even support preemption. The goal is to eventually remove cond_resched() and voluntary preemption altogether. RSEQ based 'scheduler time slice extension' support (Thomas Gleixner and Peter Zijlstra): This allows a thread to request a time slice extension when it enters a critical section to avoid contention on a resource when the thread is scheduled out inside of the critical section. - Add fields and constants for time slice extension - Provide static branch for time slice extensions - Add statistics for time slice extensions - Add prctl() to enable time slice extensions - Implement sys_rseq_slice_yield() - Implement syscall entry work for time slice extensions - Implement time slice extension enforcement timer - Reset slice extension when scheduled - Implement rseq_grant_slice_extension() - entry: Hook up rseq time slice extension - selftests: Implement time slice extension test - Allow registering RSEQ with slice extension - Move slice_ext_nsec to debugfs - Lower default slice extension - selftests/rseq: Add rseq slice histogram script Scheduler performance/scalability improvements: - Update rq->avg_idle when a task is moved to an idle CPU, which improves the scalability of various workloads (Shubhang Kaushik) - Reorder fields in 'struct rq' for better caching (Blake Jones) - Fair scheduler SMP NOHZ balancing code speedups (Shrikanth Hegde): - Move checking for nohz cpus after time check - Change likelyhood of nohz.nr_cpus - Remove nohz.nr_cpus and use weight of cpumask instead - Avoid false sharing for sched_clock_irqtime (Wangyang Guo) - Cleanups (Yury Norov): - Drop useless cpumask_empty() in find_energy_efficient_cpu() - Simplify task_numa_find_cpu() - Use cpumask_weight_and() in sched_balance_find_dst_group() DL scheduler updates: - Add a deadline server for sched_ext tasks (by Andrea Righi and Joel Fernandes, with fixes by Peter Zijlstra) RT scheduler updates: - Skip currently executing CPU in rto_next_cpu() (Chen Jinghuang) Entry code updates and performance improvements (Jinjie Ruan) This is part of the scheduler tree in this cycle due to inter- dependencies with the RSEQ based time slice extension work: - Remove unused syscall argument from syscall_trace_enter() - Rework syscall_exit_to_user_mode_work() for architecture reuse - Add arch_ptrace_report_syscall_entry/exit() - Inline syscall_exit_work() and syscall_trace_enter() Scheduler core updates (Peter Zijlstra): - Rework sched_class::wakeup_preempt() and rq_modified_() - Avoid rq->lock bouncing in sched_balance_newidle() - Rename rcu_dereference_check_sched_domain() => rcu_dereference_sched_domain() - <linux/compiler_types.h>: Add the __signed_scalar_typeof() helper Fair scheduler updates/refactoring (Peter Zijlstra and Ingo Molnar): - Fold the sched_avg update - Change rcu_dereference_check_sched_domain() to rcu-sched - Switch to rcu_dereference_all() - Remove superfluous rcu_read_lock() - Limit hrtick work - Join two #ifdef CONFIG_FAIR_GROUP_SCHED blocks - Clean up comments in 'struct cfs_rq' - Separate se->vlag from se->vprot - Rename cfs_rq::avg_load to cfs_rq::sum_weight - Rename cfs_rq::avg_vruntime to ::sum_w_vruntime & helper functions - Introduce and use the vruntime_cmp() and vruntime_op() wrappers for wrapped-signed aritmetics - Sort out 'blocked_load' namespace noise Scheduler debugging code updates: - Export hidden tracepoints to modules (Gabriele Monaco) - Convert copy_from_user() + kstrtouint() to kstrtouint_from_user() (Fushuai Wang) - Add assertions to QUEUE_CLASS (Peter Zijlstra) - hrtimer: Fix tracing oddity (Thomas Gleixner) Misc fixes and cleanups: - Re-evaluate scheduling when migrating queued tasks out of throttled cgroups (Zicheng Qu) - Remove task_struct->faults_disabled_mapping (Christoph Hellwig) - Fix math notation errors in avg_vruntime comment (Zhan Xusheng) - sched/cpufreq: Use %pe format for PTR_ERR() printing (zenghongling)" * tag 'sched-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) sched: Re-evaluate scheduling when migrating queued tasks out of throttled cgroups sched/cpufreq: Use %pe format for PTR_ERR() printing sched/rt: Skip currently executing CPU in rto_next_cpu() sched/clock: Avoid false sharing for sched_clock_irqtime selftests/sched_ext: Add test for DL server total_bw consistency selftests/sched_ext: Add test for sched_ext dl_server sched/debug: Fix dl_server (re)start conditions sched/debug: Add support to change sched_ext server params sched_ext: Add a DL server for sched_ext tasks sched/debug: Stop and start server based on if it was active sched/debug: Fix updating of ppos on server write ops sched/deadline: Clear the defer params entry: Inline syscall_exit_work() and syscall_trace_enter() entry: Add arch_ptrace_report_syscall_entry/exit() entry: Rework syscall_exit_to_user_mode_work() for architecture reuse entry: Remove unused syscall argument from syscall_trace_enter() sched: remove task_struct->faults_disabled_mapping sched: Update rq->avg_idle when a task is moved to an idle CPU selftests/rseq: Add rseq slice histogram script hrtimer: Fix trace oddity ...	2026-02-10 12:50:10 -08:00
Linus Torvalds	4d84667627	Merge tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance event updates from Ingo Molnar: "x86 PMU driver updates: - Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs (Dapeng Mi) Compared to previous iterations of the Intel PMU code, there's been a lot of changes, which center around three main areas: - Introduce the OFF-MODULE RESPONSE (OMR) facility to replace the Off-Core Response (OCR) facility - New PEBS data source encoding layout - Support the new "RDPMC user disable" feature - Likewise, a large series adds uncore PMU support for Intel Diamond Rapids (DMR) CPUs (Zide Chen) This centers around these four main areas: - DMR may have two Integrated I/O and Memory Hub (IMH) dies, separate from the compute tile (CBB) dies. Each CBB and each IMH die has its own discovery domain. - Unlike prior CPUs that retrieve the global discovery table portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON discovery and MSR for CBB PMON discovery. - DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR, PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6. - IIO free-running counters in DMR are MMIO-based, unlike SPR. - Also add support for Add missing PMON units for Intel Panther Lake, and support Nova Lake (NVL), which largely maps to Panther Lake. (Zide Chen) - KVM integration: Add support for mediated vPMUs (by Kan Liang and Sean Christopherson, with fixes and cleanups by Peter Zijlstra, Sandipan Das and Mingwei Zhang) - Add Intel cstate driver to support for Wildcat Lake (WCL) CPUs, which are a low-power variant of Panther Lake (Zide Chen) - Add core, cstate and MSR PMU support for the Airmont NP Intel CPU (aka MaxLinear Lightning Mountain), which maps to the existing Airmont code (Martin Schiller) Performance enhancements: - Speed up kexec shutdown by avoiding unnecessary cross CPU calls (Jan H. Schönherr) - Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim) User-space stack unwinding support: - Various cleanups and refactorings in preparation to generalize the unwinding code for other architectures (Jens Remus) Uprobes updates: - Transition from kmap_atomic to kmap_local_page (Keke Ming) - Fix incorrect lockdep condition in filter_chain() (Breno Leitao) - Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov) Misc fixes and cleanups: - s390: Remove kvm_types.h from Kbuild (Randy Dunlap) - x86/intel/uncore: Convert comma to semicolon (Chen Ni) - x86/uncore: Clean up const mismatch (Greg Kroah-Hartman) - x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)" * tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits) s390: remove kvm_types.h from Kbuild uprobes: Fix incorrect lockdep condition in filter_chain() x86/ibs: Fix typo in dc_l2tlb_miss comment x86/uprobes: Fix XOL allocation failure for 32-bit tasks perf/x86/intel/uncore: Convert comma to semicolon perf/x86/intel: Add support for rdpmc user disable feature perf/x86: Use macros to replace magic numbers in attr_rdpmc perf/x86/intel: Add core PMU support for Novalake perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL perf/x86/intel: Add core PMU support for DMR perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL perf/core: Fix slow perf_event_task_exit() with LBR callstacks perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls uprobes: use kmap_local_page() for temporary page mappings arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() perf/x86/intel/uncore: Add Nova Lake support ...	2026-02-10 12:00:46 -08:00
Linus Torvalds	f17b474e36	Merge tag 'bpf-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Support associating BPF program with struct_ops (Amery Hung) - Switch BPF local storage to rqspinlock and remove recursion detection counters which were causing false positives (Amery Hung) - Fix live registers marking for indirect jumps (Anton Protopopov) - Introduce execution context detection BPF helpers (Changwoo Min) - Improve verifier precision for 32bit sign extension pattern (Cupertino Miranda) - Optimize BTF type lookup by sorting vmlinux BTF and doing binary search (Donglin Peng) - Allow states pruning for misc/invalid slots in iterator loops (Eduard Zingerman) - In preparation for ASAN support in BPF arenas teach libbpf to move global BPF variables to the end of the region and enable arena kfuncs while holding locks (Emil Tsalapatis) - Introduce support for implicit arguments in kfuncs and migrate a number of them to new API. This is a prerequisite for cgroup sub-schedulers in sched-ext (Ihor Solodrai) - Fix incorrect copied_seq calculation in sockmap (Jiayuan Chen) - Fix ORC stack unwind from kprobe_multi (Jiri Olsa) - Speed up fentry attach by using single ftrace direct ops in BPF trampolines (Jiri Olsa) - Require frozen map for calculating map hash (KP Singh) - Fix lock entry creation in TAS fallback in rqspinlock (Kumar Kartikeya Dwivedi) - Allow user space to select cpu in lookup/update operations on per-cpu array and hash maps (Leon Hwang) - Make kfuncs return trusted pointers by default (Matt Bobrowski) - Introduce "fsession" support where single BPF program is executed upon entry and exit from traced kernel function (Menglong Dong) - Allow bpf_timer and bpf_wq use in all programs types (Mykyta Yatsenko, Andrii Nakryiko, Kumar Kartikeya Dwivedi, Alexei Starovoitov) - Make KF_TRUSTED_ARGS the default for all kfuncs and clean up their definition across the tree (Puranjay Mohan) - Allow BPF arena calls from non-sleepable context (Puranjay Mohan) - Improve register id comparison logic in the verifier and extend linked registers with negative offsets (Puranjay Mohan) - In preparation for BPF-OOM introduce kfuncs to access memcg events (Roman Gushchin) - Use CFI compatible destructor kfunc type (Sami Tolvanen) - Add bitwise tracking for BPF_END in the verifier (Tianci Cao) - Add range tracking for BPF_DIV and BPF_MOD in the verifier (Yazhou Tang) - Make BPF selftests work with 64k page size (Yonghong Song) * tag 'bpf-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (268 commits) selftests/bpf: Fix outdated test on storage->smap selftests/bpf: Choose another percpu variable in bpf for btf_dump test selftests/bpf: Remove test_task_storage_map_stress_lookup selftests/bpf: Update task_local_storage/task_storage_nodeadlock test selftests/bpf: Update task_local_storage/recursion test selftests/bpf: Update sk_storage_omem_uncharge test bpf: Switch to bpf_selem_unlink_nofail in bpf_local_storage_{map_free, destroy} bpf: Support lockless unlink when freeing map or local storage bpf: Prepare for bpf_selem_unlink_nofail() bpf: Remove unused percpu counter from bpf_local_storage_map_free bpf: Remove cgroup local storage percpu counter bpf: Remove task local storage percpu counter bpf: Change local_storage->lock and b->lock to rqspinlock bpf: Convert bpf_selem_unlink to failable bpf: Convert bpf_selem_link_map to failable bpf: Convert bpf_selem_unlink_map to failable bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage selftests/xsk: fix number of Tx frags in invalid packet selftests/xsk: properly handle batch ending in the middle of a packet bpf: Prevent reentrance into call_rcu_tasks_trace() ...	2026-02-10 11:26:21 -08:00
Linus Torvalds	0c61526621	Merge tag 'efi-next-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Quirk the broken EFI framebuffer geometry on the Valve Steam Deck - Capture the EDID information of the primary display also on non-x86 EFI systems when booting via the EFI stub. * tag 'efi-next-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Support EDID information sysfb: Move edid_info into sysfb_primary_display sysfb: Pass sysfb_primary_display to devices sysfb: Replace screen_info with sysfb_primary_display sysfb: Add struct sysfb_display_info efi: sysfb_efi: Reduce number of references to global screen_info efi: earlycon: Reduce number of references to global screen_info efi: sysfb_efi: Fix efidrmfb and simpledrmfb on Valve Steam Deck efi: sysfb_efi: Convert swap width and height quirk to a callback efi: sysfb_efi: Fix lfb_linelength calculation when applying quirks efi: sysfb_efi: Replace open coded swap with the macro	2026-02-09 20:49:19 -08:00
Wangyang Guo	505da66893	sched/clock: Avoid false sharing for sched_clock_irqtime Read-mostly sched_clock_irqtime may share the same cacheline with frequently updated nohz struct. Make it as static_key to avoid false sharing issue. The only user of disable_sched_clock_irqtime() is tsc_.*mark_unstable() which may be invoked under atomic context and require a workqueue to disable static_key. But both of them calls clear_sched_clock_stable() just before doing disable_sched_clock_irqtime(). We can reuse "sched_clock_work" to also disable sched_clock_irqtime(). One additional case need to handle is if the tsc is marked unstable before late_initcall() phase, sched_clock_work will not be invoked and sched_clock_irqtime will stay enabled although clock is unstable: tsc_init() enable_sched_clock_irqtime() # irqtime accounting is enabled here ... if (unsynchronized_tsc()) # true mark_tsc_unstable() clear_sched_clock_stable() __sched_clock_stable_early = 0; ... if (static_key_count(&sched_clock_running.key) == 2) # Only happens at sched_clock_init_late() __clear_sched_clock_stable(); # Never executed ... # late_initcall() phase sched_clock_init_late() if (__sched_clock_stable_early) # Already false __set_sched_clock_stable(); # sched_clock is never marked stable # TSC unstable, but sched_clock_work won't run to disable irqtime So we need to disable_sched_clock_irqtime() in sched_clock_init_late() if clock is unstable. Reported-by: Benjamin Lei <benjamin.lei@intel.com> Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Wangyang Guo <wangyang.guo@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Reviewed-by: Tianyou Li <tianyou.li@intel.com> Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://patch.msgid.link/20260127072509.2627346-1-wangyang.guo@intel.com	2026-02-03 12:04:19 +01:00
Ard Biesheuvel	8c89d3ad30	x86/sev: Don't emit BSS_DECRYPTED section unless it is in use The BSS_DECRYPTED section that gets emitted into .bss will be empty if CONFIG_AMD_MEM_ENCRYPT is not defined. However, due to the fact that it is injected into .bss rather than emitted as a separate section, the 2 MiB alignment that it specifies is still taken into account unconditionally, pushing .bss out to the next 2 MiB boundary, leaving a gap that is never freed. So only emit a non-empty BSS_DECRYPTED section if it is going to be used. In that case, it would still be nice to free the padding, but that is left for later. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260108092526.28586-23-ardb@kernel.org	2026-01-31 14:42:53 +01:00
Jiri Olsa	8bc11700e0	x86/fgraph: Fix return_to_handler regs.rsp value The previous change (Fixes commit) messed up the rsp register value, which is wrong because it's already adjusted with FRAME_SIZE, we need the original rsp value. This change does not affect fprobe current kernel unwind, the !perf_hw_regs path perf_callchain_kernel: if (perf_hw_regs(regs)) { if (perf_callchain_store(entry, regs->ip)) return; unwind_start(&state, current, regs, NULL); } else { unwind_start(&state, current, NULL, (void *)regs->sp); } which uses pt_regs.sp as first_frame boundary (FRAME_SIZE shift makes no difference, unwind stil stops at the right frame). This change fixes the other path when we want to unwind directly from pt_regs sp/fp/ip state, which is coming in following change. Fixes: `20a0bc1027` ("x86/fgraph,bpf: Fix stack ORC unwind from kprobe_multi return probe") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20260126211837.472802-2-jolsa@kernel.org	2026-01-30 13:40:08 -08:00
Mike Rapoport (Microsoft)	9fac145b6d	mm, arch: consolidate hugetlb CMA reservation Every architecture that supports hugetlb_cma command line parameter reserves CMA areas for hugetlb during setup_arch(). This obfuscates the ordering of hugetlb CMA initialization with respect to the rest initialization of the core MM. Introduce arch_hugetlb_cma_order() callback to allow architectures report the desired order-per-bit of CMA areas and provide a week implementation of arch_hugetlb_cma_order() for architectures that don't support hugetlb with CMA. Use this callback in hugetlb_cma_reserve() instead if passing the order as parameter and call hugetlb_cma_reserve() from mm_core_init_early() rather than have it spread over architecture specific code. Link: https://lkml.kernel.org/r/20260111082105.290734-28-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alex Shi <alexs@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 20:02:19 -08:00
Mike Rapoport (Microsoft)	6632314fdd	x86: don't reserve hugetlb memory in setup_arch() Commit `665eaf3133` ("x86/setup: call hugetlb_bootmem_alloc early") added an early call to hugetlb_bootmem_alloc() to setup_arch() to allow HVO style pre-initialization of vmemmap on x86. With the ordering of hugetlb reservation vs memory map initialization sorted out in core MM this no longer needs to be an architecture specific quirk. Drop the call to hugetlb_bootmem_alloc() from x86::setup_arch(). Link: https://lkml.kernel.org/r/20260111082105.290734-27-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alex Shi <alexs@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 20:02:19 -08:00
Harshit Mogalapalli	c5489d0433	x86/kexec: add a sanity check on previous kernel's ima kexec buffer When the second-stage kernel is booted via kexec with a limiting command line such as "mem=<size>", the physical range that contains the carried over IMA measurement list may fall outside the truncated RAM leading to a kernel panic. BUG: unable to handle page fault for address: ffff97793ff47000 RIP: ima_restore_measurement_list+0xdc/0x45a #PF: error_code(0x0000) – not-present page Other architectures already validate the range with page_is_ram(), as done in commit `cbf9c4b961` ("of: check previous kernel's ima-kexec-buffer against memory bounds") do a similar check on x86. Without carrying the measurement list across kexec, the attestation would fail. Link: https://lkml.kernel.org/r/20251231061609.907170-4-harshit.m.mogalapalli@oracle.com Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Fixes: `b69a2afd5a` ("x86/kexec: Carry forward IMA measurement log on kexec") Reported-by: Paul Webb <paul.x.webb@oracle.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Cc: Alexander Graf <graf@amazon.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Betkov <bp@alien8.de> Cc: guoweikang <guoweikang.kernel@gmail.com> Cc: Henry Willard <henry.willard@oracle.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Bohac <jbohac@suse.cz> Cc: Joel Granados <joel.granados@kernel.org> Cc: Jonathan McDowell <noodles@fb.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Sohil Mehta <sohil.mehta@intel.com> Cc: Sourabh Jain <sourabhjain@linux.ibm.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Yifei Liu <yifei.l.liu@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:11 -08:00
Eric Dumazet	9bad74127f	x86/apic: Inline __x2apic_send_IPI_dest() Avoid one call/ret in networking RPS/RFS fast path, with no space cost. scripts/bloat-o-meter -t vmlinux.before vmlinux.after add/remove: 0/2 grow/shrink: 2/0 up/down: 96/-97 (-1) Function old new delta x2apic_send_IPI 146 198 +52 __x2apic_send_IPI_mask 326 370 +44 __pfx___x2apic_send_IPI_dest 16 - -16 __x2apic_send_IPI_dest 81 - -81 Total: Before=20861234, After=20861233, chg -0.00% Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20251223052026.4141498-1-edumazet@google.com	2026-01-26 16:25:18 +01:00
Dave Jiang	3f7938b1ae	Merge branch 'for-7.0/cxl-init' into cxl-for-next Merge in patches to support several patch series such as Soft Reserve handling, type2 accelerator enabling, and LSA 2.1 labeling support. Mainly addition of cxl_memdev_attach() to allow the memdev probe to make a decision of proceed/fail depending success of CXL topology enumeration. dax/hmem, e820, resource: Defer Soft Reserved insertion until hmem is ready cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation cxl/mem: Drop @host argument to devm_cxl_add_memdev() cxl/mem: Convert devm_cxl_add_memdev() to scope-based-cleanup cxl/port: Arrange for always synchronous endpoint attach cxl/mem: Arrange for always-synchronous memdev attach cxl/mem: Fix devm_cxl_memdev_edac_release() confusion	2026-01-23 14:13:16 -07:00
Borislav Petkov (AMD)	b81db37644	Merge tag 'v6.19-rc6' into tip-x86-cleanups Pick up upstream work and `d9b40d7262` ("selftests/x86: Add selftests include path for kselftest.h after centralization") especially which is a build fix needed for a selftests cleanup coming ontop of this. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>	2026-01-19 12:03:56 +01:00
Linus Torvalds	e503f539dc	Merge tag 'x86-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Ingo Molnar: - Fix resctrl initialization on Hygon CPUs - Fix resctrl memory bandwidth counters on Hygon CPUs - Fix x86 self-tests build bug * tag 'x86-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/x86: Add selftests include path for kselftest.h after centralization x86/resctrl: Fix memory bandwidth counter width for Hygon x86/resctrl: Add missing resctrl initialization for Hygon	2026-01-18 11:34:11 -08:00
Shenghao Yang	2a11e1479e	x86/acpi: Add acpi=spcr to use SPCR-provided default console The SPCR provided console on x86 is only available as a boot console when earlycon is provided on the kernel command line, and will not be present in /proc/consoles. While it's possible to retain the boot console with the keep_bootcon parameter, that leaves the console using the less efficient 8250_early driver. Users wanting to use the firmware suggested console (to avoid maintaining unique serial console parameters for different server models in large fleets) with the conventional driver have to parse the kernel log for the console parameters and reinsert them. [ 0.005091] ACPI: SPCR 0x000000007FFB5000 000059 (v04 ALASKA A M I 01072009 INTL 20250404) [ 0.073387] ACPI: SPCR: console: uart,io,0x3f8,115200 In commit `0231d00082` ("ACPI: SPCR: Make SPCR available to x86")¹ the SPCR console was only added as an option for earlycon but not as an ordinary console so users don't see console output changes. So users can opt in to an automatic SPCR console, make ACPI init add it if acpi=spcr is set. ¹https://lore.kernel.org/lkml/20180118150951.28964-1-prarit@redhat.com/ [ bp: Touchups. ] Signed-off-by: Shenghao Yang <me@shenghaoyang.info> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260117072827.355360-1-me@shenghaoyang.info	2026-01-17 17:43:21 +01:00
Dan Williams	bc62f5b308	dax/hmem, e820, resource: Defer Soft Reserved insertion until hmem is ready Insert Soft Reserved memory into a dedicated soft_reserve_resource tree instead of the iomem_resource tree at boot. Delay publishing these ranges into the iomem hierarchy until ownership is resolved and the HMEM path is ready to consume them. Publishing Soft Reserved ranges into iomem too early conflicts with CXL hotplug and prevents region assembly when those ranges overlap CXL windows. Follow up patches will reinsert Soft Reserved ranges into iomem after CXL window publication is complete and HMEM is ready to claim the memory. This provides a cleaner handoff between EFI-defined memory ranges and CXL resource management without trimming or deleting resources later. In the meantime "Soft Reserved" resources will no longer appear in /proc/iomem, only their results. I.e. with "memmap=4G%4G+0xefffffff" Before: 100000000-1ffffffff : Soft Reserved 100000000-1ffffffff : dax1.0 100000000-1ffffffff : System RAM (kmem) After: 100000000-1ffffffff : dax1.0 100000000-1ffffffff : System RAM (kmem) The expectation is that this does not lead to a user visible regression because the dax1.0 device is created in both instances. Co-developed-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> [Smita: incorporate feedback from x86 maintainer review] Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Link: https://patch.msgid.link/20251120031925.87762-2-Smita.KoralahalliChannabasappa@amd.com [djbw: cleanups and clarifications] Link: https://lore.kernel.org/69443f707b025_1cee10022@dwillia2-mobl4.notmuch Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-01-16 09:02:36 -07:00
Oleg Nesterov	d55c571e43	x86/uprobes: Fix XOL allocation failure for 32-bit tasks This script #!/usr/bin/bash echo 0 > /proc/sys/kernel/randomize_va_space echo 'void main(void) {}' > TEST.c # -fcf-protection to ensure that the 1st endbr32 insn can't be emulated gcc -m32 -fcf-protection=branch TEST.c -o test bpftrace -e 'uprobe:./test:main {}' -c ./test "hangs", the probed ./test task enters an endless loop. The problem is that with randomize_va_space == 0 get_unmapped_area(TASK_SIZE - PAGE_SIZE) called by xol_add_vma() can not just return the "addr == TASK_SIZE - PAGE_SIZE" hint, this addr is used by the stack vma. arch_get_unmapped_area_topdown() doesn't take TIF_ADDR32 into account and in_32bit_syscall() is false, this leads to info.high_limit > TASK_SIZE. vm_unmapped_area() happily returns the high address > TASK_SIZE and then get_unmapped_area() returns -ENOMEM after the "if (addr > TASK_SIZE - len)" check. handle_swbp() doesn't report this failure (probably it should) and silently restarts the probed insn. Endless loop. I think that the right fix should change the x86 get_unmapped_area() paths to rely on TIF_ADDR32 rather than in_32bit_syscall(). Note also that if CONFIG_X86_X32_ABI=y, in_x32_syscall() falsely returns true in this case because ->orig_ax = -1. But we need a simple fix for -stable, so this patch just sets TS_COMPAT if the probed task is 32-bit to make in_ia32_syscall() true. Fixes: `1b028f784e` ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()") Reported-by: Paulo Andrade <pandrade@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/aV5uldEvV7pb4RA8@redhat.com/ Cc: stable@vger.kernel.org Link: https://patch.msgid.link/aWO7Fdxn39piQnxu@redhat.com	2026-01-16 16:23:54 +01:00
Ryosuke Yasuoka	6b32c93560	x86/traps: Print unhashed pointers on stack overflow When a stack overflow occurs, the kernel prints hashed fault address and the stack range using %p. The actual addresses are required for debugging and hashed pointers provide no useful information in this context. Use %px to print the unhashed, raw addresses. Signed-off-by: Ryosuke Yasuoka <ryasuoka@redhat.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20251224070735.454816-1-ryasuoka@redhat.com	2026-01-14 16:31:53 +01:00
Borislav Petkov (AMD)	ac44a110c1	x86/microcode/AMD: Allow loader debugging to be enabled on baremetal too Debugging the loader on baremetal does make sense, so enable it there too. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260108165028.27417-1-bp@kernel.org	2026-01-14 14:46:44 +01:00
H. Peter Anvin	f49ecf5e11	x86/cpufeature: Replace X86_FEATURE_SYSENTER32 with X86_FEATURE_SYSFAST32 In most cases, the use of "fast 32-bit system call" depends either on X86_FEATURE_SEP or X86_FEATURE_SYSENTER32 \|\| X86_FEATURE_SYSCALL32. However, nearly all the logic for both is identical. Define X86_FEATURE_SYSFAST32 which indicates that either SYSENTER32 or SYSCALL32 should be used, for either 32- or 64-bit kernels. This defaults to SYSENTER; use SYSCALL if the SYSCALL32 bit is also set. As this removes ALL existing uses of X86_FEATURE_SYSENTER32, which is a kernel-only synthetic feature bit, simply remove it and replace it with X86_FEATURE_SYSFAST32. This leaves an unused alternative for a true 32-bit kernel, but that should really not matter in any way. The clearing of X86_FEATURE_SYSCALL32 can be removed once the patches for automatically clearing disabled features has been merged. Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20251216212606.1325678-10-hpa@zytor.com	2026-01-13 16:37:58 -08:00
H. Peter Anvin	884961618e	x86/entry/vdso32: Remove open-coded DWARF in sigreturn.S The vdso32 sigreturn.S contains open-coded DWARF bytecode, which includes a hack for gdb to not try to step back to a previous call instruction when backtracing from a signal handler. Neither of those are necessary anymore: the backtracing issue is handled by ".cfi_entry simple" and ".cfi_signal_frame", both of which have been supported for a very long time now, which allows the remaining frame to be built using regular .cfi annotations. Add a few more register offsets to the signal frame just for good measure. Replace the nop on fallthrough of the system call (which should never, ever happen) with a ud2a trap. Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20251216212606.1325678-7-hpa@zytor.com	2026-01-13 16:37:58 -08:00
H. Peter Anvin	93d73005bf	x86/entry/vdso: Rename vdso_image_* to vdso_image The vdso .so files are named vdso.so. These structures are binary images and descriptions of these files, so it is more consistent for them to have a naming that more directly mirrors the filenames. It is also very slightly more compact (by one character...) and simplifies the Makefile just a little bit. Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20251216212606.1325678-2-hpa@zytor.com	2026-01-13 15:33:20 -08:00
Linus Torvalds	0bb933a9fc	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull x86 kvm fixes from Paolo Bonzini: - Avoid freeing stack-allocated node in kvm_async_pf_queue_task - Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1 * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1 selftests: kvm: try getting XFD and XSAVE state out of sync selftests: kvm: replace numbered sync points with actions x86/fpu: Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1 x86/kvm: Avoid freeing stack-allocated node in kvm_async_pf_queue_task	2026-01-13 09:50:07 -08:00
Xiaochen Shen	7517e899e1	x86/resctrl: Fix memory bandwidth counter width for Hygon The memory bandwidth calculation relies on reading the hardware counter and measuring the delta between samples. To ensure accurate measurement, the software reads the counter frequently enough to prevent it from rolling over twice between reads. The default Memory Bandwidth Monitoring (MBM) counter width is 24 bits. Hygon CPUs provide a 32-bit width counter, but they do not support the MBM capability CPUID leaf (0xF.[ECX=1]:EAX) to report the width offset (from 24 bits). Consequently, the kernel falls back to the 24-bit default counter width, which causes incorrect overflow handling on Hygon CPUs. Fix this by explicitly setting the counter width offset to 8 bits (resulting in a 32-bit total counter width) for Hygon CPUs. Fixes: `d8df126349` ("x86/cpu/hygon: Add missing resctrl_cpu_detect() in bsp_init helper") Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251209062650.1536952-3-shenxiaochen@open-hieco.net	2026-01-13 16:44:26 +01:00
Xiaochen Shen	6ee98aabdc	x86/resctrl: Add missing resctrl initialization for Hygon Hygon CPUs supporting Platform QoS features currently undergo partial resctrl initialization through resctrl_cpu_detect() in the Hygon BSP init helper and AMD/Hygon common initialization code. However, several critical data structures remain uninitialized for Hygon CPUs in the following paths: - get_mem_config()-> __rdt_get_mem_config_amd(): rdt_resource::membw,alloc_capable hw_res::num_closid - rdt_init_res_defs()->rdt_init_res_defs_amd(): rdt_resource::cache hw_res::msr_base,msr_update Add the missing AMD/Hygon common initialization to ensure proper Platform QoS functionality on Hygon CPUs. Fixes: `d8df126349` ("x86/cpu/hygon: Add missing resctrl_cpu_detect() in bsp_init helper") Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251209062650.1536952-2-shenxiaochen@open-hieco.net	2026-01-13 16:20:01 +01:00
Coiby Xu	8a4e92b326	x86/crash: Use set_memory_p() instead of __set_memory_prot() set_memory_p() is available to use outside of its compilation unit since: `030ad7af94` ("x86/mm: Regularize set_memory_p() parameters and make non-static"). There is no use for __set_memory_prot() anymore so drop it too. [ bp: Massage commit message. ] Signed-off-by: Coiby Xu <coxu@redhat.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260106095100.656292-1-coxu@redhat.com	2026-01-13 15:28:59 +01:00
Juergen Gross	b0b449e6fe	x86/pvlocks: Move paravirt spinlock functions into own header Instead of having the pv spinlock function definitions in paravirt.h, move them into the new header paravirt-spinlock.h. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260105110520.21356-22-jgross@suse.com	2026-01-13 14:57:45 +01:00
Juergen Gross	574b3eb843	x86/paravirt: Move pv_native_() prototypes to paravirt.c The only reason the pv_native_() prototypes are needed is the complete definition of those functions via an asm() statement, which makes it impossible to have those functions as static ones. Move the prototypes from paravirt_types.h into paravirt.c, which is the only source referencing the functions. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260105110520.21356-15-jgross@suse.com	2026-01-12 19:14:06 +01:00
Juergen Gross	39965afb11	x86/paravirt: Move paravirt_sched_clock() related code into tsc.c The only user of paravirt_sched_clock() is in tsc.c, so move the code from paravirt.c and paravirt.h to tsc.c. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260105110520.21356-13-jgross@suse.com	2026-01-12 18:47:39 +01:00
Juergen Gross	589f41f2f0	x86/paravirt: Use common code for paravirt_steal_clock() Remove the arch-specific variant of paravirt_steal_clock() and use the common one instead. With all archs supporting Xen now having been switched to the common variant, including paravirt.h can be dropped from drivers/xen/time.c. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260105110520.21356-12-jgross@suse.com	2026-01-12 16:48:26 +01:00
Juergen Gross	e6b2aa6d40	sched: Move clock related paravirt code to kernel/sched Paravirt clock related functions are available in multiple archs. In order to share the common parts, move the common static keys to kernel/sched/ and remove them from the arch specific files. Make a common paravirt_steal_clock() implementation available in kernel/sched/cputime.c, guarding it with a new config option CONFIG_HAVE_PV_STEAL_CLOCK_GEN, which can be selected by an arch in case it wants to use that common variant. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260105110520.21356-7-jgross@suse.com	2026-01-12 15:39:14 +01:00
Juergen Gross	07f2961235	x86/paravirt: Remove not needed includes of paravirt.h In some places asm/paravirt.h is included without really being needed. Remove the related #include statements. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260105110520.21356-2-jgross@suse.com	2026-01-12 11:26:52 +01:00
Thomas Gleixner	2e4b28c48f	treewide: Update email address In a vain attempt to consolidate the email zoo switch everything to the kernel.org account. Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-01-11 06:09:11 -10:00
Tony Luck	4bbfc90122	x86/resctrl: Enable RDT_RESOURCE_PERF_PKG Since telemetry events are enumerated on resctrl mount the RDT_RESOURCE_PERF_PKG resource is not considered "monitoring capable" during early resctrl initialization. This means that the domain list for RDT_RESOURCE_PERF_PKG is not built when the CPU hotplug notifiers are registered and run for the first time right after resctrl initialization. Mark the RDT_RESOURCE_PERF_PKG as "monitoring capable" upon successful telemetry event enumeration to ensure future CPU hotplug events include this resource and initialize its domain list for CPUs that are already online. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com	2026-01-10 11:48:13 +01:00
Tony Luck	0ecc988b02	x86,fs/resctrl: Compute number of RMIDs as minimum across resources resctrl assumes that only the L3 resource supports monitor events, so it simply takes the rdt_resource::num_rmid from RDT_RESOURCE_L3 as the system's number of RMIDs. The addition of telemetry events in a different resource breaks that assumption. Compute the number of available RMIDs as the minimum value across all mon_capable resources (analogous to how the number of CLOSIDs is computed across alloc_capable resources). Note that mount time enumeration of the telemetry resource means that this number can be reduced. If this happens, then some memory will be wasted as the allocations for rdt_l3_mon_domain::mbm_states[] and rdt_l3_mon_domain::rmid_busy_llc created during resctrl initialization will be larger than needed. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com	2026-01-10 11:43:58 +01:00

1 2 3 4 5 ...

21181 Commits