Commit Graph

6972 Commits

Author SHA1 Message Date
Linus Torvalds
334fbe734e Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:

 - "maple_tree: Replace big node with maple copy" (Liam Howlett)

   Mainly prepararatory work for ongoing development but it does reduce
   stack usage and is an improvement.

 - "mm, swap: swap table phase III: remove swap_map" (Kairui Song)

   Offers memory savings by removing the static swap_map. It also yields
   some CPU savings and implements several cleanups.

 - "mm: memfd_luo: preserve file seals" (Pratyush Yadav)

   File seal preservation to LUO's memfd code

 - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan
   Chen)

   Additional userspace stats reportng to zswap

 - "arch, mm: consolidate empty_zero_page" (Mike Rapoport)

   Some cleanups for our handling of ZERO_PAGE() and zero_pfn

 - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu
   Han)

   A robustness improvement and some cleanups in the kmemleak code

 - "Improve khugepaged scan logic" (Vernon Yang)

   Improve khugepaged scan logic and reduce CPU consumption by
   prioritizing scanning tasks that access memory frequently

 - "Make KHO Stateless" (Jason Miu)

   Simplify Kexec Handover by transitioning KHO from an xarray-based
   metadata tracking system with serialization to a radix tree data
   structure that can be passed directly to the next kernel

 - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
   Ballasi and Steven Rostedt)

   Enhance vmscan's tracepointing

 - "mm: arch/shstk: Common shadow stack mapping helper and
   VM_NOHUGEPAGE" (Catalin Marinas)

   Cleanup for the shadow stack code: remove per-arch code in favour of
   a generic implementation

 - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)

   Fix a WARN() which can be emitted the KHO restores a vmalloc area

 - "mm: Remove stray references to pagevec" (Tal Zussman)

   Several cleanups, mainly udpating references to "struct pagevec",
   which became folio_batch three years ago

 - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
   Shutsemau)

   Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
   pages encode their relationship to the head page

 - "mm/damon/core: improve DAMOS quota efficiency for core layer
   filters" (SeongJae Park)

   Improve two problematic behaviors of DAMOS that makes it less
   efficient when core layer filters are used

 - "mm/damon: strictly respect min_nr_regions" (SeongJae Park)

   Improve DAMON usability by extending the treatment of the
   min_nr_regions user-settable parameter

 - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)

   The proper fix for a previously hotfixed SMP=n issue. Code
   simplifications and cleanups ensued

 - "mm: cleanups around unmapping / zapping" (David Hildenbrand)

   A bunch of cleanups around unmapping and zapping. Mostly
   simplifications, code movements, documentation and renaming of
   zapping functions

 - "support batched checking of the young flag for MGLRU" (Baolin Wang)

   Batched checking of the young flag for MGLRU. It's part cleanups; one
   benchmark shows large performance benefits for arm64

 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)

   memcg cleanup and robustness improvements

 - "Allow order zero pages in page reporting" (Yuvraj Sakshith)

   Enhance free page reporting - it is presently and undesirably order-0
   pages when reporting free memory.

 - "mm: vma flag tweaks" (Lorenzo Stoakes)

   Cleanup work following from the recent conversion of the VMA flags to
   a bitmap

 - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae
   Park)

   Add some more developer-facing debug checks into DAMON core

 - "mm/damon: test and document power-of-2 min_region_sz requirement"
   (SeongJae Park)

   An additional DAMON kunit test and makes some adjustments to the
   addr_unit parameter handling

 - "mm/damon/core: make passed_sample_intervals comparisons
   overflow-safe" (SeongJae Park)

   Fix a hard-to-hit time overflow issue in DAMON core

 - "mm/damon: improve/fixup/update ratio calculation, test and
   documentation" (SeongJae Park)

   A batch of misc/minor improvements and fixups for DAMON

 - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David
   Hildenbrand)

   Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code
   movement was required.

 - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)

   A somewhat random mix of fixups, recompression cleanups and
   improvements in the zram code

 - "mm/damon: support multiple goal-based quota tuning algorithms"
   (SeongJae Park)

   Extend DAMOS quotas goal auto-tuning to support multiple tuning
   algorithms that users can select

 - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)

   Fix the khugpaged sysfs handling so we no longer spam the logs with
   reams of junk when starting/stopping khugepaged

 - "mm: improve map count checks" (Lorenzo Stoakes)

   Provide some cleanups and slight fixes in the mremap, mmap and vma
   code

 - "mm/damon: support addr_unit on default monitoring targets for
   modules" (SeongJae Park)

   Extend the use of DAMON core's addr_unit tunable

 - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)

   Cleanups to khugepaged and is a base for Nico's planned khugepaged
   mTHP support

 - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)

   Code movement and cleanups in the memhotplug and sparsemem code

 - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
   CONFIG_MIGRATION" (David Hildenbrand)

   Rationalize some memhotplug Kconfig support

 - "change young flag check functions to return bool" (Baolin Wang)

   Cleanups to change all young flag check functions to return bool

 - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
   Law and SeongJae Park)

   Fix a few potential DAMON bugs

 - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
   Stoakes)

   Convert a lot of the existing use of the legacy vm_flags_t data type
   to the new vma_flags_t type which replaces it. Mainly in the vma
   code.

 - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)

   Expand the mmap_prepare functionality, which is intended to replace
   the deprecated f_op->mmap hook which has been the source of bugs and
   security issues for some time. Cleanups, documentation, extension of
   mmap_prepare into filesystem drivers

 - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)

   Simplify and clean up zap_huge_pmd(). Additional cleanups around
   vm_normal_folio_pmd() and the softleaf functionality are performed.

* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm: fix deferred split queue races during migration
  mm/khugepaged: fix issue with tracking lock
  mm/huge_memory: add and use has_deposited_pgtable()
  mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
  mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
  mm/huge_memory: separate out the folio part of zap_huge_pmd()
  mm/huge_memory: use mm instead of tlb->mm
  mm/huge_memory: remove unnecessary sanity checks
  mm/huge_memory: deduplicate zap deposited table call
  mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
  mm/huge_memory: add a common exit path to zap_huge_pmd()
  mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
  mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
  mm/huge: avoid big else branch in zap_huge_pmd()
  mm/huge_memory: simplify vma_is_specal_huge()
  mm: on remap assert that input range within the proposed VMA
  mm: add mmap_action_map_kernel_pages[_full]()
  uio: replace deprecated mmap hook with mmap_prepare in uio_info
  drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
  mm: allow handling of stacked mmap_prepare hooks in more drivers
  ...
2026-04-15 12:59:16 -07:00
Linus Torvalds
508fed6795 Merge tag 'ras_core_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:

 - Add new AMD MCA bank names and types to the MCA code, preceded by a
   clean up of the relevant places to have them more developer-friendly
   (read: sort them alphanumerically and clean up comments) such that
   adding new banks is easy

* tag 'ras_core_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce, EDAC/mce_amd: Add new SMCA bank types
  x86/mce, EDAC/mce_amd: Update CS bank type naming
  x86/mce, EDAC/mce_amd: Reorder SMCA bank type enums
2026-04-14 15:32:39 -07:00
Linus Torvalds
970216e023 Merge tag 'x86_misc_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 updates from Borislav Petkov:

 - Reference the tip tree maintainer handbook directly from the relevant
   MAINTAINERS file entries (covering timers, IRQ, locking, scheduling,
   perf, x86, and others) so that contributors and tooling can know
   where to look

 - Enable interrupt remapping in defconfig, which is an architectural
   requirement for x2APIC to function correctly on bare metal. Without
   it, x2APIC was effectively enabled but non-functional.

 - Ensure that drivers which register custom restart handlers (such as
   those needed for SoC-based x86 devices like Intel Lightning Mountain)
   are actually invoked during reboot, bringing x86 in line with how
   other architectures handle this.

 - Cleanups

* tag 'x86_misc_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Add references to tip tree handbook
  x86/64/defconfig: Add CONFIG_IRQ_REMAP
  x86/reboot: Execute the kernel restart handler upon machine restart
  x86/mtrr: Use kstrtoul() in parse_mtrr_spare_reg()
2026-04-14 15:15:08 -07:00
Linus Torvalds
cd4cdc53cc Merge tag 'x86_microcode_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode loading updates from Borislav Petkov:
 "The kernel carries a table of Intel CPUs family, model, stepping, etc
  tuples which say what is the latest microcode for that particular CPU.

  Some CPU variants differ only by the platform ID which determines what
  microcode needs to be loaded on them.

  Carve out the platform ID handling from the microcode loader and make it
  available in a more generic place so that the old microcode
  verification machinery can use it"

* tag 'x86_microcode_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode: Add platform mask to Intel microcode "old" list
  x86/cpu: Add platform ID to CPU matching structure
  x86/cpu: Add platform ID to CPU info structure
  x86/microcode: Refactor platform ID enumeration into a helper
2026-04-14 14:57:29 -07:00
Linus Torvalds
e9635f2a73 Merge tag 'x86_fred_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 FRED updates from Borislav Petkov:
 "We made the FRED support an opt-in initially out of fear of it
  breaking machines left and right in the case of a hw bug in the first
  generation of machines supporting it.

  Now that that the FRED code has seen a lot of hammering, flip the
  logic to be opt-out as is the usual case with new hw features"

* tag 'x86_fred_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fred: Remove kernel log message when initializing exceptions
  x86/fred: Enable FRED by default
2026-04-14 14:50:51 -07:00
Linus Torvalds
9f2bb6c7b3 Merge tag 'x86_cpu_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu updates from Dave Hansen:

 - Complete LASS enabling: deal with vsyscall and EFI

   The existing Linear Address Space Separation (LASS) support punted
   on support for common EFI and vsyscall configs. Complete the
   implementation by supporting EFI and vsyscall=xonly.

 - Clean up CPUID usage in newer Intel "avs" audio driver and update the
   x86-cpuid-db file

* tag 'x86_cpu_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools/x86/kcpuid: Update bitfields to x86-cpuid-db v3.0
  ASoC: Intel: avs: Include CPUID header at file scope
  ASoC: Intel: avs: Check maximum valid CPUID leaf
  x86/cpu: Remove LASS restriction on vsyscall emulation
  x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE
  x86/vsyscall: Restore vsyscall=xonly mode under LASS
  x86/traps: Consolidate user fixups in the #GP handler
  x86/vsyscall: Reorganize the page fault emulation code
  x86/cpu: Remove LASS restriction on EFI
  x86/efi: Disable LASS while executing runtime services
  x86/cpu: Defer LASS enabling until userspace comes up
2026-04-14 14:24:45 -07:00
Linus Torvalds
0972ba5605 Merge tag 'x86-platform-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:

 - Remove M486/M486SX/ELAN support, first minimal step (Ingo Molnar)

 - Print AGESA string from DMI additional information entry (Yazen
   Ghannam, Mario Limonciello)

 - Improve and fix the DMI code (Mario Limonciello):
     - Correct an indexing error in <linux/dmi.h>
     - Adjust dmi_decode() to use enums <linux/dmi.h>
     - Add pr_fmt() for dmi_scan.c to fix & standardize the log prefixes

* tag 'x86-platform-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: Print AGESA string from DMI additional information entry
  firmware: dmi: Add pr_fmt() for dmi_scan.c
  firmware: dmi: Adjust dmi_decode() to use enums
  firmware: dmi: Correct an indexing error in dmi.h
  x86/cpu: Remove M486/M486SX/ELAN support
2026-04-14 14:10:44 -07:00
Linus Torvalds
ac633ba77c Merge tag 'x86-cleanups-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:

 - Consolidate AMD and Hygon cases in parse_topology() (Wei Wang)

 - asm constraints cleanups in __iowrite32_copy() (Uros Bizjak)

 - Drop AMD Extended Interrupt LVT macros (Naveen N Rao)

 - Don't use REALLY_SLOW_IO for delays (Juergen Gross)

 - paravirt cleanups (Juergen Gross)

 - FPU code cleanups (Borislav Petkov)

 - split-lock handling code cleanups (Borislav Petkov, Ronan Pigott)

* tag 'x86-cleanups-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Correct the comment explaining what xfeatures_in_use() does
  x86/split_lock: Don't warn about unknown split_lock_detect parameter
  x86/fpu: Correct misspelled xfeaures_to_write local var
  x86/apic: Drop AMD Extended Interrupt LVT macros
  x86/cpu/topology: Consolidate AMD and Hygon cases in parse_topology()
  block/floppy: Don't use REALLY_SLOW_IO for delays
  x86/paravirt: Replace io_delay() hook with a bool
  x86/irqflags: Preemptively move include paravirt.h directive where it belongs
  x86/split_lock: Restructure the unwieldy switch-case in sld_state_show()
  x86/local: Remove trailing semicolon from _ASM_XADD in local_add_return()
  x86/asm: Use inout "+" asm onstraint modifiers in __iowrite32_copy()
2026-04-14 14:03:27 -07:00
Linus Torvalds
db23954eea Merge tag 'irq-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core irq updates from Thomas Gleixner:

 - Invoke add_interrupt_randomness() in handle_percpu_devid_irq() and
   cleanup the workaround in the Hyper-V driver, which would now invoke
   it twice on ARM64. Removing it from the driver requires to add it to
   the x86 system vector entry point

 - Remove the pointles cpu_read_lock() around reading CPU possible mask,
   which is read only after init

 - Add documentation for the interaction between device tree bindings
   and the interrupt type defines in irq.h

 - Delete stale defines in the matrix allocator and the equivalent in
   loongarch

* tag 'irq-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Drivers: hv: Move add_interrupt_randomness() to hypervisor callback sysvec
  genirq/chip: Invoke add_interrupt_randomness() in handle_percpu_devid_irq()
  genirq/affinity: Remove cpus_read_lock() while reading cpu_possible_mask
  genirq/matrix, LoongArch: Delete IRQ_MATRIX_BITS leftovers
  genirq: Document interaction between <linux/irq.h> and DT binding defines
2026-04-14 10:02:41 -07:00
Linus Torvalds
a970ed1881 Merge tag 'bitmap-for-v7.1' of https://github.com/norov/linux
Pull bitmap updates from Yury Norov:

 - new API: bitmap_weight_from() and bitmap_weighted_xor() (Yury)

 - drop unused __find_nth_andnot_bit() (Yury)

 - new tests and test improvements (Andy, Akinobu, Yury)

 - fixes for count_zeroes API (Yury)

 - cleanup bitmap_print_to_pagebuf() mess (Yury)

 - documentation updates (Andy, Kai, Kit).

* tag 'bitmap-for-v7.1' of https://github.com/norov/linux: (24 commits)
  bitops: Update kernel-doc for sign_extendXX()
  powerpc/xive: simplify xive_spapr_debug_show()
  thermal: intel: switch cpumask_get() to using cpumask_print_to_pagebuf()
  coresight: don't use bitmap_print_to_pagebuf()
  lib/prime_numbers: drop temporary buffer in dump_primes()
  drm/xe: switch xe_pagefault_queue_init() to using bitmap_weighted_or()
  ice: use bitmap_empty() in ice_vf_has_no_qs_ena
  ice: use bitmap_weighted_xor() in ice_find_free_recp_res_idx()
  bitmap: introduce bitmap_weighted_xor()
  bitmap: add test_zero_nbits()
  bitmap: exclude nbits == 0 cases from bitmap test
  bitmap: test bitmap_weight() for more
  asm-generic/bitops: Fix a comment typo in instrumented-atomic.h
  bitops: fix kernel-doc parameter name for parity8()
  lib: count_zeros: unify count_{leading,trailing}_zeros()
  lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  lib: crypto: fix comments for count_leading_zeros()
  x86/topology: use bitmap_weight_from()
  bitmap: add bitmap_weight_from()
  lib/find_bit_benchmark: avoid clearing randomly filled bitmap in test_find_first_bit()
  ...
2026-04-14 08:55:18 -07:00
Linus Torvalds
d7c8087a9c Merge tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
 "Once again, cpufreq is the most active development area, mostly
  because of the new feature additions and documentation updates in the
  amd-pstate driver, but there are also changes in the cpufreq core
  related to boost support and other assorted updates elsewhere.

  Next up are power capping changes due to the major cleanup of the
  Intel RAPL driver.

  On the cpuidle front, a new C-states table for Intel Panther Lake is
  added to the intel_idle driver, the stopped tick handling in the menu
  and teo governors is updated, and there are a couple of cleanups.

  Apart from the above, support for Tegra114 is added to devfreq and
  there are assorted cleanups of that code, there are also two updates
  of the operating performance points (OPP) library, two minor updates
  related to hibernation, and cpupower utility man pages updates and
  cleanups.

  Specifics:

   - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)

   - Update cpufreq-dt-platdev blocklist (Faruque Ansari)

   - Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
     Rosen Penev)

   - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)

   - Add support for new features: CPPC performance priority, Dynamic
     EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham
     Shenoy, Mario Limonciello)

   - Fix sysfs files being present when HW missing and broken/outdated
     documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)

   - Pass the policy to cpufreq_driver->adjust_perf() to avoid using
     cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate
     which leads to a scheduling-while-atomic bug (K Prateek Nayak)

   - Clean up dead code in Kconfig for cpufreq (Julian Braha)

   - Remove max_freq_req update for pre-existing cpufreq policy and add
     a boost_freq_req QoS request to save the boost constraint instead
     of overwriting the last scaling_max_freq constraint (Pierre
     Gondois)

   - Embed cpufreq QoS freq_req objects in cpufreq policy so they all
     are allocated in one go along with the policy to simplify lifetime
     rules and avoid error handling issues (Viresh Kumar)

   - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
     scaling driver (Henry Tseng)

   - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
     of cpumask_weight() because the former is more efficient (Yury
     Norov)

   - Use sysfs_emit() in sysfs show functions for cpufreq governor
     attributes (Thorsten Blum)

   - Update intel_pstate to stop returning an error when "off" is
     written to its status sysfs attribute while the driver is already
     off (Fabio De Francesco)

   - Include current frequency in the debug message printed by
     __cpufreq_driver_target() (Pengjie Zhang)

   - Refine stopped tick handling in the menu cpuidle governor and
     rearrange stopped tick handling in the teo cpuidle governor (Rafael
     Wysocki)

   - Add Panther Lake C-states table to the intel_idle driver (Artem
     Bityutskiy)

   - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)

   - Simplify cpuidle_register_device() with guard() (Huisong Li)

   - Use performance level if available to distinguish between rates in
     OPP debugfs (Manivannan Sadhasivam)

   - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)

   - Return -ENODATA if the snapshot image is not loaded (Alberto
     Garcia)

   - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
     Biggers)

   - Clean up and rearrange the intel_rapl power capping driver to make
     the respective interface drivers (TPMI, MSR, and MMOI) hold their
     own settings and primitives and consolidate PL4 and PMU support
     flags into rapl_defaults (Kuppuswamy Sathyanarayanan)

   - Correct kernel-doc function parameter names in the power capping
     core code (Randy Dunlap)

   - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)

   - Use _visible attribute to replace create/remove_sysfs_files() in
     devfreq (Pengjie Zhang)

   - Add Tegra114 support to activity monitor device in tegra30-devfreq
     as a preparation to upcoming EMC controller support (Svyatoslav
     Ryhel)

   - Fix mistakes in cpupower man pages, add the boost and epp options
     to the cpupower-frequency-info man page, and add the perf-bias
     option to the cpupower-info man page (Roberto Ricci)

   - Remove unnecessary extern declarations from getopt.h in arguments
     parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
     cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)"

* tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
  cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
  cpupower: remove extern declarations in cmd functions
  cpuidle: Simplify cpuidle_register_device() with guard()
  PM / devfreq: tegra30-devfreq: add support for Tegra114
  PM / devfreq: use _visible attribute to replace create/remove_sysfs_files()
  PM / devfreq: Remove unneeded casting for HZ_PER_KHZ
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  ...
2026-04-13 19:47:52 -07:00
Ronan Pigott
fbe80bd699 x86/split_lock: Don't warn about unknown split_lock_detect parameter
The split_lock_detect command line parameter is handled in sld_setup() shortly
after cpu_parse_early_param() but still before parse_early_param().

Add a dummy parsing function so that parse_early_param() doesn't later
complain about the "unknown" parameter split_lock_detect=, and pass it along
to init.

  [ bp: Massage commit message. ]

Signed-off-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260405181807.3906-1-ronan@rjp.ie
2026-04-05 23:12:40 +02:00
David Hildenbrand (Arm)
52a9e9cd18 mm: rename zap_vma_ptes() to zap_special_vma_range()
zap_vma_ptes() is the only zapping function we export to modules.

It's essentially a wrapper around zap_vma_range(), however, with some
safety checks:
* That the passed range fits fully into the VMA
* That it's only used for VM_PFNMAP

We will add support for VM_MIXEDMAP next, so use the more-generic term
"special vma", although "special" is a bit overloaded.  Maybe we'll later
just support any VM_SPECIAL flag.

While at it, improve the kerneldoc.

Link: https://lkml.kernel.org/r/20260227200848.114019-16-david@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Leon Romanovsky <leon@kernel.org>	[drivers/infiniband]
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Arve <arve@android.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Carlos Llamas <cmllamas@google.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Daniel Borkman <daniel@iogearbox.net>
Cc: Dave Airlie <airlied@gmail.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jakub Kacinski <kuba@kernel.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Jann Horn <jannh@google.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Namhyung kim <namhyung@kernel.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Todd Kjos <tkjos@android.com>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:15 -07:00
Yazen Ghannam
0422b07bc4 x86/mce/amd: Filter bogus hardware errors on Zen3 clients
Users have been observing multiple L3 cache deferred errors after recent
kernel rework of deferred error handling.¹ ⁴

The errors are bogus due to inconsistent status values. Also, user verified
that bogus MCA_DESTAT values are present on the system even with an older
kernel.²

The errors seem to be garbage values present in the MCA_DESTAT of some L3
cache banks. These were implicitly ignored before the recent kernel rework
because these do not generate a deferred error interrupt.

A later revision of the rework patch was merged for v6.19. This naturally
filtered out most of the bogus error logs. However, a few signatures still
remain.³

Minimize the scope of the filter to the reported CPU
family/model/stepping and only for errors which don't have the Enabled
bit in the MCi status MSR.

¹ https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de
² https://lore.kernel.org/6e1eda7dd55f6fa30405edf7b0f75695cf55b237.camel@web.de
³ https://lore.kernel.org/21ba47fa8893b33b94370c2a42e5084cf0d2e975.camel@web.dehttps://lore.kernel.org/r/CAKFB093B2k3sKsGJ_QNX1jVQsaXVFyy=wNwpzCGLOXa_vSDwXw@mail.gmail.com

  [ bp: Generalize the condition according to which errors are bogus. ]

Fixes: 7cb735d7c0 ("x86/mce: Unify AMD DFR handler with MCA Polling")
Closes: https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de
Reported-by: Bert Karwatzki <spasswolf@web.de>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-By: Bert Karwatzki <spasswolf@web.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de
2026-04-05 12:42:22 +02:00
Michael Kelley
e8be82c2d7 Drivers: hv: Move add_interrupt_randomness() to hypervisor callback sysvec
The Hyper-V ISRs, for normal guests and when running in the hypervisor root
patition, are calling add_interrupt_randomness() as a primary source of
entropy. The call is currently in the ISRs as a common place to handle both
x86/x64 and arm64.

On x86/x64, hypervisor interrupts come through a custom sysvec entry, and
do not go through a generic interrupt handler.

On arm64, hypervisor interrupts come through an emulated GICv3. GICv3 uses
the generic handler handle_percpu_devid_irq(), which does not do
add_interrupt_randomness() -- unlike its counterpart
handle_percpu_irq().

But handle_percpu_devid_irq() is now updated to do the
add_interrupt_randomness(). So add_interrupt_randomness() is now needed
only in Hyper-V's x86/x64 custom sysvec path.

Move add_interrupt_randomness() from the Hyper-V ISRs into the Hyper-V
x86/x64 custom sysvec path, matching the existing STIMER0 sysvec path.

With this change, add_interrupt_randomness() is no longer called from any
device drivers, which is appropriate.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://patch.msgid.link/20260402202400.1707-3-mhklkml@zohomail.com
2026-04-04 21:01:56 +02:00
Rafael J. Wysocki
5cdfedf68e Merge tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Pull amd-pstate new content for 7.1 (2026-04-02) from Mario Limonciello:

"Add support for new features:
  * CPPC performance priority
  * Dynamic EPP
  * Raw EPP
  * New unit tests for new features
 Fixes for:
  * PREEMPT_RT
  * sysfs files being present when HW missing
  * Broken/outdated documentation"

* tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: (22 commits)
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  amd-pstate-ut: Add module parameter to select testcases
  amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()
  amd-pstate: Add sysfs support for floor_freq and floor_count
  amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF
  x86/cpufeatures: Add AMD CPPC Performance Priority feature.
  amd-pstate: Make certain freq_attrs conditionally visible
  ...
2026-04-04 20:55:56 +02:00
Naveen N Rao (AMD)
5635c8bfd3 x86/apic: Drop AMD Extended Interrupt LVT macros
AMD defines Extended Interrupt Local Vector Table (EILVT) registers to allow
for additional interrupt sources. While the APIC registers for those are
unique to AMD, the format of those registers follows the standard LVT
registers. Drop EILVT-specific macros in favor of the standard APIC
LVT macros.

Drop unused APIC_EILVT_NR_AMD_K8 and APIC_EILVT_LVTOFF while at it.

No functional change.

  [ bp: Merge the two cleanup patches into one. ]

Signed-off-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Manali Shukla <manali.shukla@amd.com>
Link: https://patch.msgid.link/b98d69037c0102d2ccd082a941888a689cd214c9.1775019269.git.naveen@kernel.org
2026-04-04 00:56:40 +02:00
Gautham R. Shenoy
172100088f x86/cpufeatures: Add AMD CPPC Performance Priority feature.
Some future AMD processors have feature named "CPPC Performance
Priority" which lets userspace specify different floor performance
levels for different CPUs. The platform firmware takes these different
floor performance levels into consideration while throttling the CPUs
under power/thermal constraints. The presence of this feature is
indicated by bit 16 of the EDX register for CPUID leaf
0x80000007. More details can be found in AMD Publication titled "AMD64
Collaborative Processor Performance Control (CPPC) Performance
Priority" Revision 1.10.

Define a new feature bit named X86_FEATURE_CPPC_PERF_PRIO to map to
CPUID 0x80000007.EDX[16].

Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:21 -05:00
Yazen Ghannam
bc91133e26 x86/CPU/AMD: Print AGESA string from DMI additional information entry
Type 40 entries (Additional Information) are summarized in section 7.41 as
part of the SMBIOS specification.  Generally, these entries aren't interesting
to save.

However on some AMD Zen systems, the AGESA version is stored here. This is
useful to save to the kernel message logs for debugging. It can be used to
cross-reference issues.

Implement an iterator for the Additional Information entries. Use this to find
and print the AGESA string. Do so in AMD code, since the use case is
AMD-specific.

  [ bp: Match only "AGESA". ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Co-developed-by: "Mario Limonciello (AMD)" <superm1@kernel.org>
Signed-off-by: "Mario Limonciello (AMD)" <superm1@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Link: https://patch.msgid.link/20260307141024.819807-6-superm1@kernel.org
2026-04-01 20:54:16 +02:00
H. Peter Anvin
ac66a73be0 x86/fred: Enable FRED by default
When FRED was added to the mainline kernel, it was set up as an explicit
opt-in due to the risk of regressions before hardware was available publicly.

Now, Panther Lake (Core Ultra 300 series) has been released, and benchmarking
by Phoronix has shown that it provides a significant performance benefit on
most workloads:

  https://www.phoronix.com/review/intel-fred-panther-lake

Accordingly, enable FRED by default if the CPU supports it. FRED can of
course still be disabled via the fred=off command line option.

Touch up Kconfig help too.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://patch.msgid.link/20260325230151.1898287-2-hpa@zytor.com
2026-03-27 16:04:47 +01:00
Yury Norov
c6a98dff41 x86/topology: use bitmap_weight_from()
Switch topo_unit_count() to use bitmap_weight_from().

Signed-off-by: Yury Norov <ynorov@nvidia.com>
2026-03-23 13:33:51 -04:00
Wei Wang
6a9fe1ad90 x86/cpu/topology: Consolidate AMD and Hygon cases in parse_topology()
Merge the two separate switch cases for AMD and Hygon as they share the
common cpu_parse_topology_amd().

Also drop the IS_ENABLED(CONFIG_CPU_SUP_AMD/HYGON) guards, because
1) they are dead code: when a vendor's CONFIG_CPU_SUP_* is disabled, its
   vendor detection code (in amd.c / hygon.c) is not compiled, so
   x86_vendor will never be set to X86_VENDOR_AMD / X86_VENDOR_HYGON,
   instead it will default to X86_VENDOR_UNKNOWN and those switch cases
   are unreachable.
2) topology_amd.o is always built (obj-y), so cpu_parse_topology_amd() is
   always available regardless of CPU_SUP_* configuration.

Signed-off-by: Wei Wang <wei.w.wang@hotmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Yongwei Xu <xuyongwei@open-hieco.net>
Link: https://patch.msgid.link/SI2PR01MB4393D6B7E17AB05612AEE925DC4BA@SI2PR01MB4393.apcprd01.prod.exchangelabs.com
2026-03-23 16:25:26 +01:00
Peter Zijlstra
a3e93cac25 x86/cpu: Add comment clarifying CRn pinning
To avoid future confusion on the purpose and design of the CRn pinning code.

Also note that if the attacker controls page-tables, the CRn bits lose much of
the attraction anyway.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260320092521.GG3739106@noisy.programming.kicks-ass.net
2026-03-23 14:25:53 +01:00
Borislav Petkov (AMD)
411df123c0 x86/cpu: Remove X86_CR4_FRED from the CR4 pinned bits mask
Commit in Fixes added the FRED CR4 bit to the CR4 pinned bits mask so
that whenever something else modifies CR4, that bit remains set. Which
in itself is a perfectly fine idea.

However, there's an issue when during boot FRED is initialized: first on
the BSP and later on the APs. Thus, there's a window in time when
exceptions cannot be handled.

This becomes particularly nasty when running as SEV-{ES,SNP} or TDX
guests which, when they manage to trigger exceptions during that short
window described above, triple fault due to FRED MSRs not being set up
yet.

See Link tag below for a much more detailed explanation of the
situation.

So, as a result, the commit in that Link URL tried to address this
shortcoming by temporarily disabling CR4 pinning when an AP is not
online yet.

However, that is a problem in itself because in this case, an attack on
the kernel needs to only modify the online bit - a single bit in RW
memory - and then disable CR4 pinning and then disable SM*P, leading to
more and worse things to happen to the system.

So, instead, remove the FRED bit from the CR4 pinning mask, thus
obviating the need to temporarily disable CR4 pinning.

If someone manages to disable FRED when poking at CR4, then
idt_invalidate() would make sure the system would crash'n'burn on the
first exception triggered, which is a much better outcome security-wise.

Fixes: ff45746fbf ("x86/cpu: Add X86_CR4_FRED macro")
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org> # 6.12+
Link: https://lore.kernel.org/r/177385987098.1647592.3381141860481415647.tip-bot2@tip-bot2
2026-03-23 14:12:03 +01:00
Nikunj A Dadhania
05243d490b x86/cpu: Enable FSGSBASE early in cpu_init_exception_handling()
Move FSGSBASE enablement from identify_cpu() to cpu_init_exception_handling()
to ensure it is enabled before any exceptions can occur on both boot and
secondary CPUs.

== Background ==

Exception entry code (paranoid_entry()) uses ALTERNATIVE patching based on
X86_FEATURE_FSGSBASE to decide whether to use RDGSBASE/WRGSBASE instructions
or the slower RDMSR/SWAPGS sequence for saving/restoring GSBASE.

On boot CPU, ALTERNATIVE patching happens after enabling FSGSBASE in CR4.
When the feature is available, the code is permanently patched to use
RDGSBASE/WRGSBASE, which require CR4.FSGSBASE=1 to execute without triggering

== Boot Sequence ==

Boot CPU (with CR pinning enabled):
  trap_init()
    cpu_init()                   <- Uses unpatched code (RDMSR/SWAPGS)
      x2apic_setup()
  ...
  arch_cpu_finalize_init()
    identify_boot_cpu()
      identify_cpu()
        cr4_set_bits(X86_CR4_FSGSBASE)  # Enables the feature
	# This becomes part of cr4_pinned_bits
    ...
    alternative_instructions()   <- Patches code to use RDGSBASE/WRGSBASE

Secondary CPUs (with CR pinning enabled):
  start_secondary()
    cr4_init()                   <- Code already patched, CR4.FSGSBASE=1
                                    set implicitly via cr4_pinned_bits

    cpu_init()                   <- exceptions work because FSGSBASE is
                                    already enabled

Secondary CPU (with CR pinning disabled):
  start_secondary()
    cr4_init()                   <- Code already patched, CR4.FSGSBASE=0
    cpu_init()
      x2apic_setup()
        rdmsrq(MSR_IA32_APICBASE)  <- Triggers #VC in SNP guests
          exc_vmm_communication()
            paranoid_entry()       <- Uses RDGSBASE with CR4.FSGSBASE=0
                                      (patched code)
    ...
    ap_starting()
      identify_secondary_cpu()
        identify_cpu()
	  cr4_set_bits(X86_CR4_FSGSBASE)  <- Enables the feature, which is
                                             too late

== CR Pinning ==

Currently, for secondary CPUs, CR4.FSGSBASE is set implicitly through
CR-pinning: the boot CPU sets it during identify_cpu(), it becomes part of
cr4_pinned_bits, and cr4_init() applies those pinned bits to secondary CPUs.
This works but creates an undocumented dependency between cr4_init() and the
pinning mechanism.

== Problem ==

Secondary CPUs boot after alternatives have been applied globally. They
execute already-patched paranoid_entry() code that uses RDGSBASE/WRGSBASE
instructions, which require CR4.FSGSBASE=1. Upcoming changes to CR pinning
behavior will break the implicit dependency, causing secondary CPUs to
generate #UD.

This issue manifests itself on AMD SEV-SNP guests, where the rdmsrq() in
x2apic_setup() triggers a #VC exception early during cpu_init(). The #VC
handler (exc_vmm_communication()) executes the patched paranoid_entry() path.
Without CR4.FSGSBASE enabled, RDGSBASE instructions trigger #UD.

== Fix ==

Enable FSGSBASE explicitly in cpu_init_exception_handling() before loading
exception handlers. This makes the dependency explicit and ensures both
boot and secondary CPUs have FSGSBASE enabled before paranoid_entry()
executes.

Fixes: c82965f9e5 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit")
Reported-by: Borislav Petkov <bp@alien8.de>
Suggested-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: <stable@kernel.org>
Link: https://patch.msgid.link/20260318075654.1792916-2-nikunj@amd.com
2026-03-23 13:29:50 +01:00
Linus Torvalds
8d8bd2a5aa Merge tag 'x86-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:

 - Improve Qemu MCE-injection behavior by only using AMD SMCA MSRs if
   the feature bit is set

 - Fix the relative path of gettimeofday.c inclusion in vclock_gettime.c

 - Fix a boot crash on UV clusters when a socket is marked as
   'deconfigured' which are mapped to the SOCK_EMPTY node ID by
   the UV firmware, while Linux APIs expect NUMA_NO_NODE.

   The difference being (0xffff [unsigned short ~0]) vs [int -1]

* tag 'x86-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform/uv: Handle deconfigured sockets
  x86/entry/vdso: Fix path of included gettimeofday.c
  x86/mce/amd: Check SMCA feature bit before accessing SMCA MSRs
2026-03-22 10:54:12 -07:00
Juergen Gross
9eece49856 x86/paravirt: Replace io_delay() hook with a bool
The io_delay() paravirt hook is in no way performance critical and all users
setting it to a different function than native_io_delay() are using an empty
function as replacement.

Allow replacing the hook with a bool indicating whether native_io_delay()
should be called.

  [ bp: Massage commit message. ]

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20260119182632.596369-3-jgross@suse.com
2026-03-22 08:43:05 +01:00
Linus Torvalds
c3d13784d5 Merge tag 'hyperv-fixes-signed-20260319' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V fixes from Wei Liu:

 - Fix ARM64 MSHV support (Anirudh Rayabharam)

 - Fix MSHV driver memory handling issues (Stanislav Kinsburskii)

 - Update maintainers for Hyper-V DRM driver (Saurabh Sengar)

 - Misc clean up in MSHV crashdump code (Ard Biesheuvel, Uros Bizjak)

 - Minor improvements to MSHV code (Mukesh R, Wei Liu)

 - Revert not yet released MSHV scrub partition hypercall (Wei Liu)

* tag 'hyperv-fixes-signed-20260319' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  mshv: Fix error handling in mshv_region_pin
  MAINTAINERS: Update maintainers for Hyper-V DRM driver
  mshv: Fix use-after-free in mshv_map_user_memory error path
  mshv: pass struct mshv_user_mem_region by reference
  x86/hyperv: Use any general-purpose register when saving %cr2 and %cr8
  x86/hyperv: Use current_stack_pointer to avoid asm() in hv_hvcrash_ctxt_save()
  x86/hyperv: Save segment registers directly to memory in hv_hvcrash_ctxt_save()
  x86/hyperv: Use __naked attribute to fix stackless C function
  Revert "mshv: expose the scrub partition hypercall"
  mshv: add arm64 support for doorbell & intercept SINTs
  mshv: refactor synic init and cleanup
  x86/hyperv: print out reserved vectors in hexadecimal
2026-03-20 09:18:22 -07:00
Sohil Mehta
584d752b8a x86/cpu: Remove LASS restriction on vsyscall emulation
Vsyscall emulation has two modes of operation: XONLY and EMULATE. The
default XONLY mode is now supported with a LASS-triggered #GP. OTOH,
LASS is disabled if someone requests the deprecated EMULATE mode via the
vsyscall=emulate command line option. So, remove the restriction on LASS
when the overall vsyscall emulation support is compiled in.

As a result, there is no need for setup_lass() anymore. LASS is enabled
by default through a late_initcall().

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by:
Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Link: https://patch.msgid.link/20260309181029.398498-6-sohil.mehta@intel.com
2026-03-19 15:11:13 -07:00
William Roche
201bc182ad x86/mce/amd: Check SMCA feature bit before accessing SMCA MSRs
People do effort to inject MCEs into guests in order to simulate/test
handling of hardware errors. The real use case behind it is testing the
handling of SIGBUS which the memory failure code sends to the process.

If that process is QEMU, instead of killing the whole guest, the MCE can
be injected into the guest kernel so that latter can attempt proper
handling and kill the user *process*  in the guest, instead, which
caused the MCE. The assumption being here that the whole injection flow
can supply enough information that the guest kernel can pinpoint the
right process. But that's a different topic...

Regardless of virtualization or not, access to SMCA-specific registers
like MCA_DESTAT should only be done after having checked the smca
feature bit. And there are AMD machines like Bulldozer (the one before
Zen1) which do support deferred errors but are not SMCA machines.

Therefore, properly check the feature bit before accessing related MSRs.

  [ bp: Rewrite commit message. ]

Fixes: 7cb735d7c0 ("x86/mce: Unify AMD DFR handler with MCA Polling")
Signed-off-by: William Roche <william.roche@oracle.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20260218163025.1316501-1-william.roche@oracle.com
2026-03-18 23:02:16 +01:00
Borislav Petkov
04e43ec9f0 x86/split_lock: Restructure the unwieldy switch-case in sld_state_show()
Split the handling in two parts:

1. handle the sld_state option first

2. handle X86_FEATURE flag-based printing afterwards

This splits the function nicely into two, separate logical things which
are easier to parse and understand.

Also, zap the printing in the disabled case.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://patch.msgid.link/20260226145033.GAaaBduQ0rWXydOkAm@fat_crate.local
2026-03-13 16:57:02 +01:00
Yazen Ghannam
b90d398138 x86/mce, EDAC/mce_amd: Add new SMCA bank types
Recognize new SMCA bank types and include their short names for sysfs
and long names for decoding.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260307163316.345923-4-yazen.ghannam@amd.com
2026-03-11 13:51:59 +01:00
Yazen Ghannam
b595a00972 x86/mce, EDAC/mce_amd: Update CS bank type naming
Recent documentation updated the "CS" bank type name from "Coherent
Slave" to "Coherent Station".

Apply this change in the kernel also.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260307163316.345923-3-yazen.ghannam@amd.com
2026-03-11 13:51:59 +01:00
Yazen Ghannam
bee9f4178b x86/mce, EDAC/mce_amd: Reorder SMCA bank type enums
Originally, the SMCA bank type enums were ordered based on processor
documentation. However, the ordering became inconsistent after new bank
types were added over time.

Sort the bank type enums alphanumerically in most places.  Sort the
"enum to HWID/McaType" mapping by HWID/McaType. Drop redundant code
comments.

No functional changes.

  [ bp: Sort them alphanumerically. ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260307163316.345923-2-yazen.ghannam@amd.com
2026-03-11 13:51:40 +01:00
Dave Hansen
7989c39341 x86/microcode: Add platform mask to Intel microcode "old" list
Intel sometimes has CPUs with identical family/model/stepping but
which need different microcode. These CPUs are differentiated with the
platform ID.

The Intel "microcode-20250512" release was used to generate the
existing contents of intel-ucode-defs.h. Use that same release and add
the platform mask to the definitions.

This makes the list a few entries longer because some CPUs previously
that shared a definition now need two or more. for example for the
ancient Pentium III there are two CPUs that differ only in their
platform and have two different microcode versions (note:
.driver_data is the microcode version):

	{ ..., .model = 0x05, .steppings = 0x0001, .platform_mask = 0x01, .driver_data = 0x40 },
	{ ..., .model = 0x05, .steppings = 0x0001, .platform_mask = 0x08, .driver_data = 0x45 },

Another example is the state-of-the-art Granite Rapids:

	{ ...,  .model = 0xad, .steppings = 0x0002, .platform_mask = 0x20, .driver_data = 0xa0000d1 },
	{ ...,  .model = 0xad, .steppings = 0x0002, .platform_mask = 0x95, .driver_data = 0x10003a2 },

As you can see, this differentiation with platform ID has been
necessary for a long time and is still relevant today.

Without the platform matching, the old microcode table is incomplete.
For instance, it might lead someone with a Pentium III, platform 0x0,
and microcode 0x40 to think that they should have microcode 0x45,
which is really only for platform 0x4 (.platform_mask==0x08).

In practice, this meant that folks with fully updated microcode were
seeing "Vulnerable" in the "old_microcode" file.

1. https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files

Closes: https://lore.kernel.org/all/38660F8F-499E-48CD-B58B-4822228A5941@nutanix.com/
Fixes: 4e2c719782 ("x86/cpu: Help users notice when running old Intel microcode")
Reported-by: Jon Kohler <jon@nutanix.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/all/3ECBB974-C6F0-47A7-94B6-3646347F1CC2@nutanix.com/
Link: https://patch.msgid.link/20260304181024.76E3F038@davehans-spike.ostc.intel.com
2026-03-05 12:25:45 -08:00
Dave Hansen
fab0c75d50 x86/cpu: Add platform ID to CPU matching structure
The existing x86_match_cpu() infrastructure can be used to match
a bunch of attributes of a CPU: vendor, family, model, steppings
and CPU features.

But, there's one more attribute that's missing and unable to be
matched against: the platform ID, enumerated on Intel CPUs in
MSR_IA32_PLATFORM_ID. It is a little more obscure and is only
queried during microcode loading. This is because Intel sometimes
has CPUs with identical family/model/stepping but which need
different microcode. These CPUs are differentiated with the
platform ID.

Add a field in 'struct x86_cpu_id' for the platform ID. Similar
to the stepping field, make the new field a mask of platform IDs.
Some examples:

	0x01: matches only platform ID 0x0
	0x02: matches only platform ID 0x1
	0x03: matches platform IDs 0x0 or 0x1
	0x80: matches only platform ID 0x7
	0xff: matches all 8 possible platform IDs

Since the mask is only a byte wide, it nestles in next to another
u8 and does not even increase the size of 'struct x86_cpu_id'.

Reserve the all 0's value as the wildcard (X86_PLATFORM_ANY). This
avoids forcing changes to existing 'struct x86_cpu_id' users.  They
can just continue to fill the field with 0's and their matching will
work exactly as before.

Note: If someone is ever looking for space in 'struct x86_cpu_id',
this new field could probably get stuck over in ->driver_data
for the one user that there is.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://patch.msgid.link/20260304181022.058DF07C@davehans-spike.ostc.intel.com
2026-03-05 12:25:38 -08:00
Dave Hansen
d8630b67ca x86/cpu: Add platform ID to CPU info structure
The end goal here is to be able to do x86_match_cpu() and match on a
specific platform ID. While it would be possible to stash this ID
off somewhere or read it dynamically, that approaches would not be
consistent with the other fields which can be matched.

Read the platform ID and store it in cpuinfo_x86.

There are lots of sites to set this new field. Place it near
the place c->microcode is established since the platform ID is
so closely intertwined with microcode updates.

Note: This should not grow the size of 'struct cpuinfo_x86' in
practice since the u8 fits next to another u8 in the structure.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Link: https://patch.msgid.link/20260304181020.8D518228@davehans-spike.ostc.intel.com
2026-03-05 12:25:32 -08:00
Dave Hansen
238be4ba87 x86/microcode: Refactor platform ID enumeration into a helper
Today, the only code that cares about the platform ID is the microcode
update code itself. To facilitate storing the platform ID in a more
generic place and using it outside of the microcode update itself, put
the enumeration into a helper function. Mirror
intel_get_microcode_revision()'s naming and location.

But, moving away from intel_collect_cpu_info() means that the model
and family information in CPUID is not readily available. Just call
CPUID again.

Note that the microcode header is a mask of supported platform IDs.
Only stick the ID part in the helper. Leave the 1<<id part in the
microcode handling.

Also note that the PII is weird. It does not really have a platform
ID because it doesn't even have the MSR. Just consider it to be
platform ID 0. Instead of saying >=PII, say <=PII. The PII is the
real oddball here being the only CPU with Linux microcode updates
but no platform ID. It's worth calling it out by name.

This does subtly change the sig->pf for the PII though from 0x0
to 0x1. Make up for that by ignoring sig->pf when the microcode
update platform mask is 0x0.

[ dhansen: reflow comment for bpetkov ]

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Link: https://patch.msgid.link/20260304181018.EB6404F8@davehans-spike.ostc.intel.com
2026-03-05 12:25:18 -08:00
Tony Luck
59674fc9d0 x86/resctrl: Fix SNC detection
Now that the x86 topology code has a sensible nodes-per-package
measure, that does not depend on the online status of CPUs, use this
to divinate the SNC mode.

Note that when Cluster on Die (CoD) is configured on older systems this
will also show multiple NUMA nodes per package. Intel Resource Director
Technology is incomaptible with CoD. Print a warning and do not use the
fixup MSR_RMID_SNC_CONFIG.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Chen Yu <yu.c.chen@intel.com>
Link: https://patch.msgid.link/aaCxbbgjL6OZ6VMd@agluck-desk3
Link: https://patch.msgid.link/20260303110100.367976706@infradead.org
2026-03-04 16:35:09 +01:00
Peter Zijlstra
ae6730ff42 x86/topo: Add topology_num_nodes_per_package()
Use the MADT and SRAT table data to compute __num_nodes_per_package.

Specifically, SRAT has already been parsed in x86_numa_init(), which is called
before acpi_boot_init() which parses MADT. So both are available in
topology_init_possible_cpus().

This number is useful to divinate the various Intel CoD/SNC and AMD NPS modes,
since the platforms are failing to provide this otherwise.

Doing it this way is independent of the number of online CPUs and
other such shenanigans.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Kyle Meyer <kyle.meyer@hpe.com>
Link: https://patch.msgid.link/20260303110100.004091624@infradead.org
2026-03-04 16:35:08 +01:00
Sohil Mehta
68400c1aaf x86/cpu: Remove LASS restriction on EFI
The initial LASS enabling has been deferred to much later during boot,
and EFI runtime services now run with LASS temporarily disabled. This
removes LASS from the path of all EFI services.

Remove the LASS restriction on EFI config, as the two can now coexist.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Link: https://patch.msgid.link/20260120234730.2215498-4-sohil.mehta@intel.com
2026-03-03 09:49:45 -08:00
Sohil Mehta
b3226af5ad x86/cpu: Defer LASS enabling until userspace comes up
LASS blocks any kernel access to the lower half of the virtual address
space. Unfortunately, some EFI accesses happen during boot with bit 63
cleared, which causes a #GP fault when LASS is enabled.

Notably, the SetVirtualAddressMap() call can only happen in EFI physical
mode. Also, EFI_BOOT_SERVICES_CODE/_DATA could be accessed even after
ExitBootServices(). The boot services memory is truly freed during
efi_free_boot_services() after SVAM has completed.

To prevent EFI from tripping LASS, at a minimum, LASS enabling must be
deferred until EFI has completely finished entering virtual mode
(including freeing boot services memory). Moving setup_lass() to
arch_cpu_finalize_init() would do the trick, but that would make the
implementation very fragile. Something else might come in the future
that would need the LASS enabling to be moved again.

In general, security features such as LASS provide limited value before
userspace is up. They aren't necessary during early boot while only
trusted ring0 code is executing. Introduce a generic late initcall to
defer activating some CPU features until userspace is enabled.

For now, only move the LASS CR4 programming to this initcall. As APs are
already up by the time late initcalls run, some extra steps are needed
to enable LASS on all CPUs. Use a CPU hotplug callback instead of
on_each_cpu() or smp_call_function(). This ensures that LASS is enabled
on every CPU that is currently online as well as any future CPUs that
come online later. Note, even though hotplug callbacks run with
preemption enabled, cr4_set_bits() would disable interrupts while
updating CR4.

Keep the existing logic in place to clear the LASS feature bits early.
setup_clear_cpu_cap() must be called before boot_cpu_data is finalized
and alternatives are patched. Eventually, the entire setup_lass() logic
can go away once the restrictions based on vsyscall emulation and EFI
are removed.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Link: https://patch.msgid.link/20260120234730.2215498-2-sohil.mehta@intel.com
2026-03-03 09:49:44 -08:00
Thorsten Blum
9a4af5a00a x86/mtrr: Use kstrtoul() in parse_mtrr_spare_reg()
Replace the deprecated simple_strtoul()¹ with kstrtoul() for parsing the early
boot parameter mtrr_spare_reg_nr. simple_strtoul() silently sets
nr_mtrr_spare_reg to 0 for invalid input instead of rejecting it and keeping
the default value.

Return kstrtoul()'s retval directly to propagate parsing failures instead of
treating them as success. Also return -EINVAL when '=' is missing from the
boot parameter and 'arg' is NULL.

  ¹ https://www.kernel.org/doc/html/latest/process/deprecated.html#simple-strtol-simple-strtoll-simple-strtoul-simple-strtoull

  [ bp: Massage commit message. ]

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260302135341.3473-2-thorsten.blum@linux.dev
2026-03-02 17:25:31 +01:00
Wei Liu
f69cfd8e8f x86/hyperv: print out reserved vectors in hexadecimal
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-02-24 17:10:44 +00:00
Linus Torvalds
32a92f8c89 Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 20:03:00 -08:00
Linus Torvalds
323bbfcf1e Convert 'alloc_flex' family to use the new default GFP_KERNEL argument
This is the exact same thing as the 'alloc_obj()' version, only much
smaller because there are a lot fewer users of the *alloc_flex()
interface.

As with alloc_obj() version, this was done entirely with mindless brute
force, using the same script, except using 'flex' in the pattern rather
than 'objs*'.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Linus Torvalds
d31558c077 Merge tag 'hyperv-next-signed-20260218' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V updates from Wei Liu:

 - Debugfs support for MSHV statistics (Nuno Das Neves)

 - Support for the integrated scheduler (Stanislav Kinsburskii)

 - Various fixes for MSHV memory management and hypervisor status
   handling (Stanislav Kinsburskii)

 - Expose more capabilities and flags for MSHV partition management
   (Anatol Belski, Muminul Islam, Magnus Kulke)

 - Miscellaneous fixes to improve code quality and stability (Carlos
   López, Ethan Nelson-Moore, Li RongQing, Michael Kelley, Mukesh
   Rathor, Purna Pavan Chandra Aekkaladevi, Stanislav Kinsburskii, Uros
   Bizjak)

 - PREEMPT_RT fixes for vmbus interrupts (Jan Kiszka)

* tag 'hyperv-next-signed-20260218' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (34 commits)
  mshv: Handle insufficient root memory hypervisor statuses
  mshv: Handle insufficient contiguous memory hypervisor status
  mshv: Introduce hv_deposit_memory helper functions
  mshv: Introduce hv_result_needs_memory() helper function
  mshv: Add SMT_ENABLED_GUEST partition creation flag
  mshv: Add nested virtualization creation flag
  Drivers: hv: vmbus: Simplify allocation of vmbus_evt
  mshv: expose the scrub partition hypercall
  mshv: Add support for integrated scheduler
  mshv: Use try_cmpxchg() instead of cmpxchg()
  x86/hyperv: Fix error pointer dereference
  x86/hyperv: Reserve 3 interrupt vectors used exclusively by MSHV
  Drivers: hv: vmbus: Use kthread for vmbus interrupts on PREEMPT_RT
  x86/hyperv: Remove ASM_CALL_CONSTRAINT with VMMCALL insn
  x86/hyperv: Use savesegment() instead of inline asm() to save segment registers
  mshv: fix SRCU protection in irqfd resampler ack handler
  mshv: make field names descriptive in a header struct
  x86/hyperv: Update comment in hyperv_cleanup()
  mshv: clear eventfd counter on irqfd shutdown
  x86/hyperv: Use memremap()/memunmap() instead of ioremap_cache()/iounmap()
  ...
2026-02-20 08:48:31 -08:00
Linus Torvalds
eeccf287a2 Merge tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM  updates from Andrew Morton:

 - "mm/vmscan: fix demotion targets checks in reclaim/demotion" fixes a
   couple of issues in the demotion code - pages were failed demotion
   and were finding themselves demoted into disallowed nodes (Bing Jiao)

 - "Remove XA_ZERO from error recovery of dup_mmap()" fixes a rare
   mapledtree race and performs a number of cleanups (Liam Howlett)

 - "mm: add bitmap VMA flag helpers and convert all mmap_prepare to use
   them" implements a lot of cleanups following on from the conversion
   of the VMA flags into a bitmap (Lorenzo Stoakes)

 - "support batch checking of references and unmapping for large folios"
   implements batching to greatly improve the performance of reclaiming
   clean file-backed large folios (Baolin Wang)

 - "selftests/mm: add memory failure selftests" does as claimed (Miaohe
   Lin)

* tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (36 commits)
  mm/page_alloc: clear page->private in free_pages_prepare()
  selftests/mm: add memory failure dirty pagecache test
  selftests/mm: add memory failure clean pagecache test
  selftests/mm: add memory failure anonymous page test
  mm: rmap: support batched unmapping for file large folios
  arm64: mm: implement the architecture-specific clear_flush_young_ptes()
  arm64: mm: support batch clearing of the young flag for large folios
  arm64: mm: factor out the address and ptep alignment into a new helper
  mm: rmap: support batched checks of the references for large folios
  tools/testing/vma: add VMA userland tests for VMA flag functions
  tools/testing/vma: separate out vma_internal.h into logical headers
  tools/testing/vma: separate VMA userland tests into separate files
  mm: make vm_area_desc utilise vma_flags_t only
  mm: update all remaining mmap_prepare users to use vma_flags_t
  mm: update shmem_[kernel]_file_*() functions to use vma_flags_t
  mm: update secretmem to use VMA flags on mmap_prepare
  mm: update hugetlbfs to use VMA flags on mmap_prepare
  mm: add basic VMA flag operation helper functions
  tools: bitmap: add missing bitmap_[subset(), andnot()]
  mm: add mk_vma_flags() bitmap flag macro helper
  ...
2026-02-18 20:50:32 -08:00