Commit Graph

1154339 Commits

Author SHA1 Message Date
Oliver Upton
073988eb32 Merge branch kvm-arm64/MAINTAINERS into kvmarm/next
* kvm-arm64/MAINTAINERS:
  : KVM/arm64 MAINTAINERS updates
  :
  :  - Drop the old columbia.edu mailing list (you will be missed!)
  :
  :  - Move Oliver up to co-maintainer w/ Marc
  KVM: arm64: Drop Columbia-hosted mailing list
  MAINTAINERS: Add Oliver Upton as co-maintainer of KVM/arm64

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 23:25:50 +00:00
Oliver Upton
52b603628a Merge branch kvm-arm64/parallel-access-faults into kvmarm/next
* kvm-arm64/parallel-access-faults:
  : Parallel stage-2 access fault handling
  :
  : The parallel faults changes that went in to 6.2 covered most stage-2
  : aborts, with the exception of stage-2 access faults. Building on top of
  : the new infrastructure, this series adds support for handling access
  : faults (i.e. updating the access flag) in parallel.
  :
  : This is expected to provide a performance uplift for cores that do not
  : implement FEAT_HAFDBS, such as those from the fruit company.
  KVM: arm64: Condition HW AF updates on config option
  KVM: arm64: Handle access faults behind the read lock
  KVM: arm64: Don't serialize if the access flag isn't set
  KVM: arm64: Return EAGAIN for invalid PTE in attr walker
  KVM: arm64: Ignore EAGAIN for walks outside of a fault
  KVM: arm64: Use KVM's pte type/helpers in handle_access_fault()

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 22:33:10 +00:00
Oliver Upton
e8789ab704 Merge branch kvm-arm64/virtual-cache-geometry into kvmarm/next
* kvm-arm64/virtual-cache-geometry:
  : Virtualized cache geometry for KVM guests, courtesy of Akihiko Odaki.
  :
  : KVM/arm64 has always exposed the host cache geometry directly to the
  : guest, even though non-secure software should never perform CMOs by
  : Set/Way. This was slightly wrong, as the cache geometry was derived from
  : the PE on which the vCPU thread was running and not a sanitized value.
  :
  : All together this leads to issues migrating VMs on heterogeneous
  : systems, as the cache geometry saved/restored could be inconsistent.
  :
  : KVM/arm64 now presents 1 level of cache with 1 set and 1 way. The cache
  : geometry is entirely controlled by userspace, such that migrations from
  : older kernels continue to work.
  KVM: arm64: Mark some VM-scoped allocations as __GFP_ACCOUNT
  KVM: arm64: Normalize cache configuration
  KVM: arm64: Mask FEAT_CCIDX
  KVM: arm64: Always set HCR_TID2
  arm64/cache: Move CLIDR macro definitions
  arm64/sysreg: Add CCSIDR2_EL1
  arm64/sysreg: Convert CCSIDR_EL1 to automatic generation
  arm64: Allow the definition of UNKNOWN system register fields

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 22:32:40 +00:00
Oliver Upton
619cec0085 Merge branch arm64/for-next/sme2 into kvmarm/next
Merge the SME2 branch to fix up a rather annoying conflict due to the
EL2 finalization refactor.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 22:30:17 +00:00
Oliver Upton
92425e058a Merge branch kvm/kvm-hw-enable-refactor into kvmarm/next
Merge the kvm_init() + hardware enable rework to avoid conflicts
with kvmarm.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 22:28:34 +00:00
Oliver Upton
5f623a598d KVM: arm64: Mark some VM-scoped allocations as __GFP_ACCOUNT
Generally speaking, any memory allocations that can be associated with a
particular VM should be charged to the cgroup of its process.
Nonetheless, there are a couple spots in KVM/arm64 that aren't currently
accounted:

 - the ccsidr array containing the virtualized cache hierarchy

 - the cpumask of supported cpus, for use of the vPMU on heterogeneous
   systems

Go ahead and set __GFP_ACCOUNT for these allocations.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20230206235229.4174711-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-07 13:56:18 +00:00
Marc Zyngier
9442d05bba arm64/sme: Fix __finalise_el2 SMEver check
When checking for ID_AA64SMFR0_EL1.SMEver, __check_override assumes
that the ID_AA64SMFR0_EL1 value is in x1, and the intent of the code
is to reuse value read a few lines above.

However, as the comment says at the beginning of the macro, x1 will
be clobbered, and the checks always fails.

The easiest fix is just to reload the id register before checking it.

Fixes: f122576f35 ("arm64/sme: Enable host kernel to access ZT0")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-02-06 16:34:29 +00:00
Mark Brown
b2ab432bcf kselftest/arm64: Remove redundant _start labels from zt-test
The newly added zt-test program copied the pattern from the other FP
stress test programs of having a redundant _start label which is
rejected by clang, as we did in a parallel series for the other tests
remove the label so we can build with clang.

No functional change.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230130-arm64-fix-sme2-clang-v1-1-3ce81d99ea8f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-31 16:02:13 +00:00
Marc Zyngier
960c3028a1 KVM: arm64: Drop Columbia-hosted mailing list
After many years of awesome service, the kvmarm mailing list hosted by
Columbia is being decommissioned, and replaced by kvmarm@lists.linux.dev.

Many thanks to Columbia for having hosted us for so long, and to the
kernel.org folks for giving us a new home.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230113132809.1979119-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-01-26 18:43:14 +00:00
Oliver Upton
58817be467 MAINTAINERS: Add Oliver Upton as co-maintainer of KVM/arm64
Going forward I intend to help Marc with maintaining KVM/arm64. We've
spoken about this quite a bit and he has been a tremendous help in
ramping up to the task (thank you!). We haven't worked out the exact
details of how the process will work, but the goal is to even out the
maintenance responsibilities to give us both ample time for development.

To that end, updating the maintainers entry to reflect the change.

Acked-by: Will Deacon <will@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230123210256.2728218-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-01-26 18:43:13 +00:00
Akihiko Odaki
7af0c2534f KVM: arm64: Normalize cache configuration
Before this change, the cache configuration of the physical CPU was
exposed to vcpus. This is problematic because the cache configuration a
vcpu sees varies when it migrates between vcpus with different cache
configurations.

Fabricate cache configuration from the sanitized value, which holds the
CTR_EL0 value the userspace sees regardless of which physical CPU it
resides on.

CLIDR_EL1 and CCSIDR_EL1 are now writable from the userspace so that
the VMM can restore the values saved with the old kernel.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Link: https://lore.kernel.org/r/20230112023852.42012-8-akihiko.odaki@daynix.com
[ Oliver: Squash Marc's fix for CCSIDR_EL1.LineSize when set from userspace ]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-01-21 18:09:23 +00:00
Mark Brown
3eb1b41fba kselftest/arm64: Add coverage of SME 2 and 2.1 hwcaps
Add the hwcaps defined by SME 2 and 2.1 to the hwcaps test.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-21-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:09 +00:00
Mark Brown
4e1aa1a18f kselftest/arm64: Add coverage of the ZT ptrace regset
Add coverage of the ZT ptrace interface.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-20-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:09 +00:00
Mark Brown
49886aa9ab kselftest/arm64: Add SME2 coverage to syscall-abi
Verify that ZT0 is preserved over syscalls when it is present and
PSTATE.ZA is set.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-19-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:08 +00:00
Mark Brown
18f8729ab3 kselftest/arm64: Add test coverage for ZT register signal frames
We should have a ZT register frame with an expected size when ZA is enabled
and have no ZT frame when ZA is disabled. Since we don't load any data into
ZT we expect the data to all be zeros since the architecture guarantees it
will be set to 0 as ZA is enabled.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-18-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:08 +00:00
Mark Brown
afe6f18275 kselftest/arm64: Teach the generic signal context validation about ZT
Add ZT to the set of signal contexts that the shared code understands and
validates the form of.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-17-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:08 +00:00
Mark Brown
6382937326 kselftest/arm64: Enumerate SME2 in the signal test utility code
Support test cases for SME2 by adding it to the set of features that we
enumerate so test cases can check for it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-16-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:08 +00:00
Mark Brown
f63a9f15b2 kselftest/arm64: Cover ZT in the FP stress test
Hook up the newly added zt-test program in the FPSIMD stress tests, start
a copy per CPU when SME2 is supported.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-15-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:08 +00:00
Mark Brown
1c07425e90 kselftest/arm64: Add a stress test program for ZT0
Following the pattern for the other register sets add a stress test program
for ZT0 which continually loads and verifies patterns in the register in
an effort to discover context switching problems.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-14-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:07 +00:00
Mark Brown
7d5d8601e4 arm64/sme: Add hwcaps for SME 2 and 2.1 features
In order to allow userspace to discover the presence of the new SME features
add hwcaps for them.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-13-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:07 +00:00
Mark Brown
f90b529bcb arm64/sme: Implement ZT0 ptrace support
Implement support for a new note type NT_ARM64_ZT providing access to
ZT0 when implemented. Since ZT0 is a register with constant size this is
much simpler than for other SME state.

As ZT0 is only accessible when PSTATE.ZA is set writes to ZT0 cause
PSTATE.ZA to be set, the main alternative would be to return -EBUSY in
this case but this seemed more constructive. Practical users are also
going to be working with ZA anyway and have some understanding of the
state.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-12-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:07 +00:00
Mark Brown
ee072cf708 arm64/sme: Implement signal handling for ZT
Add a new signal context type for ZT which is present in the signal frame
when ZA is enabled and ZT is supported by the system. In order to account
for the possible addition of further ZT registers in the future we make the
number of registers variable in the ABI, though currently the only possible
number is 1. We could just use a bare list head for the context since the
number of registers can be inferred from the size of the context but for
usability and future extensibility we define a header with the number of
registers and some reserved fields in it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-11-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:06 +00:00
Mark Brown
95fcec7132 arm64/sme: Implement context switching for ZT0
When the system supports SME2 the ZT0 register must be context switched as
part of the floating point state. This register is stored immediately
after ZA in memory and is only accessible when PSTATE.ZA is set so we
handle it in the same functions we use to save and restore ZA.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-10-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:06 +00:00
Mark Brown
d6138b4adc arm64/sme: Provide storage for ZT0
When the system supports SME2 there is an additional register ZT0 which
we must store when the task is using SME. Since ZT0 is accessible only
when PSTATE.ZA is set just like ZA we allocate storage for it along with
ZA, increasing the allocation size for the memory region where we store
ZA and storing the data for ZT after that for ZA.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-9-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:06 +00:00
Mark Brown
d4913eee15 arm64/sme: Add basic enumeration for SME2
Add basic feature detection for SME2, detecting that the feature is present
and disabling traps for ZT0.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-8-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:06 +00:00
Mark Brown
f122576f35 arm64/sme: Enable host kernel to access ZT0
The new register ZT0 introduced by SME2 comes with a new trap, disable it
for the host kernel so that we can implement support for it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-7-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:06 +00:00
Mark Brown
2cdeecdb95 arm64/sme: Manually encode ZT0 load and store instructions
In order to avoid unrealistic toolchain requirements we manually encode the
instructions for loading and storing ZT0.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-6-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:06 +00:00
Mark Brown
8ef55603b8 arm64/esr: Document ISS for ZT0 being disabled
SME2 defines a new ISS code for use when trapping acesses to ZT0, add a
definition for it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-5-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:05 +00:00
Mark Brown
4edc11744e arm64/sme: Document SME 2 and SME 2.1 ABI
As well as a number of simple features which only add new instructions and
require corresponding hwcaps SME2 introduces a new register ZT0 for which
we must define ABI. Fortunately this is a fixed size 512 bits and therefore
much more straightforward than the base SME state, the only wrinkle is that
it is only accessible when ZA is accessible.

While there is only a single register the architecture is written with a
view to exensibility, including a number in the name, so follow this in the
ABI.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-4-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:05 +00:00
Mark Brown
0f3bbe0edf arm64/sysreg: Update system registers for SME 2 and 2.1
FEAT_SME2 and FEAT_SME2P1 introduce several new SME features which can
be enumerated via ID_AA64SMFR0_EL1 and a new register ZT0 access to
which is controlled via SMCR_ELn, add the relevant register description.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-3-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:05 +00:00
Mark Brown
6dabf1fac6 arm64: Document boot requirements for SME 2
SME 2 introduces the new ZT0 register, we require that access to this
reigster is not trapped when we identify that the feature is supported.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-2-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:05 +00:00
Mark Brown
ce514000da arm64/sme: Rename za_state to sme_state
In preparation for adding support for storage for ZT0 to the thread_struct
rename za_state to sme_state. Since ZT0 is accessible when PSTATE.ZA is
set just like ZA itself we will extend the allocation done for ZA to
cover it, avoiding the need to further expand task_struct for non-SME
tasks.

No functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-1-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:05 +00:00
Linus Torvalds
5dc4c995db Linux 6.2-rc4 v6.2-rc4 2023-01-15 09:22:43 -06:00
Linus Torvalds
f0f70ddb8f Merge tag 'x86_urgent_for_v6.2_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Make sure the poking PGD is pinned for Xen PV as it requires it this
   way

 - Fixes for two resctrl races when moving a task or creating a new
   monitoring group

 - Fix SEV-SNP guests running under HyperV where MTRRs are disabled to
   not return a UC- type mapping type on memremap() and thus cause a
   serious slowdown

 - Fix insn mnemonics in bioscall.S now that binutils is starting to fix
   confusing insn suffixes

* tag 'x86_urgent_for_v6.2_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: fix poking_init() for Xen PV guests
  x86/resctrl: Fix event counts regression in reused RMIDs
  x86/resctrl: Fix task CLOSID/RMID update race
  x86/pat: Fix pat_x_mtrr_type() for MTRR disabled case
  x86/boot: Avoid using Intel mnemonics in AT&T syntax asm
2023-01-15 07:17:44 -06:00
Linus Torvalds
8aa9761223 Merge tag 'edac_urgent_for_v6.2_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fixes from Borislav Petkov:

 - Fix the EDAC device's confusion in the polling setting units

 - Fix a memory leak in highbank's probing function

* tag 'edac_urgent_for_v6.2_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/highbank: Fix memory leak in highbank_mc_probe()
  EDAC/device: Fix period calculation in edac_device_reset_delay_period()
2023-01-15 07:12:58 -06:00
Linus Torvalds
b1d63f0c77 Merge tag 'powerpc-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:

 - Fix a build failure with some versions of ld that have an odd version
   string

 - Fix incorrect use of mutex in the IMC PMU driver

Thanks to Kajol Jain, Michael Petlan, Ojaswin Mujoo, Peter Zijlstra, and
Yang Yingliang.

* tag 'powerpc-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s/hash: Make stress_hpt_timer_fn() static
  powerpc/imc-pmu: Fix use of mutex in IRQs disabled section
  powerpc/boot: Fix incorrect version calculation issue in ld_version
2023-01-15 07:09:41 -06:00
Linus Torvalds
7c69844052 Merge tag 'iommu-fixes-v6.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:

 - Core: Fix an iommu-group refcount leak

 - Fix overflow issue in IOVA alloc path

 - ARM-SMMU fixes from Will:
    - Fix VFIO regression on NXP SoCs by reporting IOMMU_CAP_CACHE_COHERENCY
    - Fix SMMU shutdown paths to avoid device unregistration race

 - Error handling fix for Mediatek IOMMU driver

* tag 'iommu-fixes-v6.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe()
  iommu/iova: Fix alloc iova overflows issue
  iommu: Fix refcount leak in iommu_device_claim_dma_owner
  iommu/arm-smmu-v3: Don't unregister on shutdown
  iommu/arm-smmu: Don't unregister on shutdown
  iommu/arm-smmu: Report IOMMU_CAP_CACHE_COHERENCY even betterer
2023-01-14 10:48:15 -06:00
Linus Torvalds
4f43ade45d Merge tag 'fixes-2023-01-14' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock fix from Mike Rapoport:
 "memblock: always release pages to the buddy allocator in
  memblock_free_late()

  If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
  only releases pages to the buddy allocator if they are not in the
  deferred range. This is correct for free pages (as defined by
  for_each_free_mem_pfn_range_in_zone()) because free pages in the
  deferred range will be initialized and released as part of the
  deferred init process.

  memblock_free_pages() is called by memblock_free_late(), which is used
  to free reserved ranges after memblock_free_all() has run. All pages
  in reserved ranges have been initialized at that point, and
  accordingly, those pages are not touched by the deferred init process.

  This means that currently, if the pages that memblock_free_late()
  intends to release are in the deferred range, they will never be
  released to the buddy allocator. They will forever be reserved.

  In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
  which is also correct for free pages but is not correct for reserved
  pages. KMSAN metadata for reserved pages is initialized by
  kmsan_init_shadow(), which runs shortly before memblock_free_all().

  For both of these reasons, memblock_free_pages() should only be called
  for free pages, and memblock_free_late() should call
  __free_pages_core() directly instead.

  One case where this issue can occur in the wild is EFI boot on x86_64.
  The x86 EFI code reserves all EFI boot services memory ranges via
  memblock_reserve() and frees them later via memblock_free_late()
  (efi_reserve_boot_services() and efi_free_boot_services(),
  respectively).

  If any of those ranges happens to fall within the deferred init range,
  the pages will not be released and that memory will be unavailable.

  For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:

    v6.2-rc2:
    Node 0, zone      DMA
          spanned  4095
          present  3999
          managed  3840
    Node 0, zone    DMA32
          spanned  246652
          present  245868
          managed  178867

    v6.2-rc2 + patch:
    Node 0, zone      DMA
          spanned  4095
          present  3999
          managed  3840
    Node 0, zone    DMA32
          spanned  246652
          present  245868
          managed  222816   # +43,949 pages"

* tag 'fixes-2023-01-14' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  mm: Always release pages to the buddy allocator in memblock_free_late().
2023-01-14 10:08:08 -06:00
Linus Torvalds
880ca43e5c Merge tag 'hardening-v6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kernel hardening fixes from Kees Cook:

 - Fix CFI hash randomization with KASAN (Sami Tolvanen)

 - Check size of coreboot table entry and use flex-array

* tag 'hardening-v6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kbuild: Fix CFI hash randomization with KASAN
  firmware: coreboot: Check size of table entry and use flex-array
2023-01-14 10:04:00 -06:00
Linus Torvalds
8b7be52f3f Merge tag 'modules-6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module fix from Luis Chamberlain:
 "Just one fix for modules by Nick"

* tag 'modules-6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  kallsyms: Fix scheduling with interrupts disabled in self-test
2023-01-14 08:17:27 -06:00
Linus Torvalds
b35ad63eec Merge tag '6.2-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:

 - memory leak and double free fix

 - two symlink fixes

 - minor cleanup fix

 - two smb1 fixes

* tag '6.2-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Fix uninitialized memory read for smb311 posix symlink create
  cifs: fix potential memory leaks in session setup
  cifs: do not query ifaces on smb1 mounts
  cifs: fix double free on failed kerberos auth
  cifs: remove redundant assignment to the variable match
  cifs: fix file info setting in cifs_open_file()
  cifs: fix file info setting in cifs_query_path_info()
2023-01-14 08:08:25 -06:00
Linus Torvalds
8e76813085 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Two minor fixes in the hisi_sas driver which only impact enterprise
  style multi-expander and shared disk situations and no core changes"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: hisi_sas: Set a port invalid only if there are no devices attached when refreshing port id
  scsi: hisi_sas: Use abort task set to reset SAS disks when discovered
2023-01-14 07:57:25 -06:00
Linus Torvalds
34cbf89afc Merge tag 'ata-6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ATA fix from Damien Le Moal:
 "A single fix to prevent building the pata_cs5535 driver with user mode
  linux as it uses msr operations that are not defined with UML"

* tag 'ata-6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: pata_cs5535: Don't build on UML
2023-01-14 07:52:11 -06:00
Linus Torvalds
97ec4d559d Merge tag 'block-6.2-2023-01-13' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
 "Nothing major in here, just a collection of NVMe fixes and dropping a
  wrong might_sleep() that static checkers tripped over but which isn't
  valid"

* tag 'block-6.2-2023-01-13' of git://git.kernel.dk/linux:
  MAINTAINERS: stop nvme matching for nvmem files
  nvme: don't allow unprivileged passthrough on partitions
  nvme: replace the "bool vec" arguments with flags in the ioctl path
  nvme: remove __nvme_ioctl
  nvme-pci: fix error handling in nvme_pci_enable()
  nvme-pci: add NVME_QUIRK_IDENTIFY_CNS quirk to Apple T2 controllers
  nvme-apple: add NVME_QUIRK_IDENTIFY_CNS quirk to fix regression
  block: Drop spurious might_sleep() from blk_put_queue()
2023-01-13 17:41:19 -06:00
Linus Torvalds
2ce7592df9 Merge tag 'io_uring-6.2-2023-01-13' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
 "A fix for a regression that happened last week, rest is fixes that
  will be headed to stable as well. In detail:

   - Fix for a regression added with the leak fix from last week (me)

   - In writing a test case for that leak, inadvertently discovered a
     case where we a poll request can race. So fix that up and mark it
     for stable, and also ensure that fdinfo covers both the poll tables
     that we have. The latter was an oversight when the split poll table
     were added (me)

   - Fix for a lockdep reported issue with IOPOLL (Pavel)"

* tag 'io_uring-6.2-2023-01-13' of git://git.kernel.dk/linux:
  io_uring: lock overflowing for IOPOLL
  io_uring/poll: attempt request issue after racy poll wakeup
  io_uring/fdinfo: include locked hash table in fdinfo output
  io_uring/poll: add hash if ready poll request can't complete inline
  io_uring/io-wq: only free worker if it was allocated for creation
2023-01-13 17:37:09 -06:00
Linus Torvalds
9e058c2952 Merge tag 'pci-v6.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci fixes from Bjorn Helgaas:

 - Work around apparent firmware issue that made Linux reject MMCONFIG
   space, which broke PCI extended config space (Bjorn Helgaas)

 - Fix CONFIG_PCIE_BT1 dependency due to mid-air collision between a
   PCI_MSI_IRQ_DOMAIN -> PCI_MSI change and addition of PCIE_BT1 (Lukas
   Bulwahn)

* tag 'pci-v6.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  x86/pci: Treat EfiMemoryMappedIO as reservation of ECAM space
  x86/pci: Simplify is_mmconf_reserved() messages
  PCI: dwc: Adjust to recent removal of PCI_MSI_IRQ_DOMAIN
2023-01-13 17:32:22 -06:00
Sami Tolvanen
42633ed852 kbuild: Fix CFI hash randomization with KASAN
Clang emits a asan.module_ctor constructor to each object file
when KASAN is enabled, and these functions are indirectly called
in do_ctors. With CONFIG_CFI_CLANG, the compiler also emits a CFI
type hash before each address-taken global function so they can
pass indirect call checks.

However, in commit 0c3e806ec0 ("x86/cfi: Add boot time hash
randomization"), x86 implemented boot time hash randomization,
which relies on the .cfi_sites section generated by objtool. As
objtool is run against vmlinux.o instead of individual object
files with X86_KERNEL_IBT (enabled by default), CFI types in
object files that are not part of vmlinux.o end up not being
included in .cfi_sites, and thus won't get randomized and trip
CFI when called.

Only .vmlinux.export.o and init/version-timestamp.o are linked
into vmlinux separately from vmlinux.o. As these files don't
contain any functions, disable KASAN for both of them to avoid
breaking hash randomization.

Link: https://github.com/ClangBuiltLinux/linux/issues/1742
Fixes: 0c3e806ec0 ("x86/cfi: Add boot time hash randomization")
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230112224948.1479453-2-samitolvanen@google.com
2023-01-13 15:22:03 -08:00
Kees Cook
3b293487b8 firmware: coreboot: Check size of table entry and use flex-array
The memcpy() of the data following a coreboot_table_entry couldn't
be evaluated by the compiler under CONFIG_FORTIFY_SOURCE. To make it
easier to reason about, add an explicit flexible array member to struct
coreboot_device so the entire entry can be copied at once. Additionally,
validate the sizes before copying. Avoids this run-time false positive
warning:

  memcpy: detected field-spanning write (size 168) of single field "&device->entry" at drivers/firmware/google/coreboot_table.c:103 (size 8)

Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/all/03ae2704-8c30-f9f0-215b-7cdf4ad35a9a@molgen.mpg.de/
Cc: Jack Rosenthal <jrosenth@chromium.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Julius Werner <jwerner@chromium.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Julius Werner <jwerner@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Link: https://lore.kernel.org/r/20230107031406.gonna.761-kees@kernel.org
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Jack Rosenthal <jrosenth@chromium.org>
Link: https://lore.kernel.org/r/20230112230312.give.446-kees@kernel.org
2023-01-13 15:22:03 -08:00
Nicholas Piggin
da35048f26 kallsyms: Fix scheduling with interrupts disabled in self-test
kallsyms_on_each* may schedule so must not be called with interrupts
disabled. The iteration function could disable interrupts, but this
also changes lookup_symbol() to match the change to the other timing
code.

Reported-by: Erhard F. <erhard_f@mailbox.org>
Link: https://lore.kernel.org/all/bug-216902-206035@https.bugzilla.kernel.org%2F/
Reported-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/oe-lkp/202212251728.8d0872ff-oliver.sang@intel.com
Fixes: 30f3bb0977 ("kallsyms: Add self-test facility")
Tested-by: "Erhard F." <erhard_f@mailbox.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-01-13 15:09:08 -08:00
Peter Foley
22eebaa631 ata: pata_cs5535: Don't build on UML
This driver uses MSR functions that aren't implemented under UML.
Avoid building it to prevent tripping up allyesconfig.

e.g.
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x3a3): undefined reference to `__tracepoint_read_msr'
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x3d2): undefined reference to `__tracepoint_write_msr'
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x457): undefined reference to `__tracepoint_write_msr'
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x481): undefined reference to `do_trace_write_msr'
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x4d5): undefined reference to `do_trace_write_msr'
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x4f5): undefined reference to `do_trace_read_msr'
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x51c): undefined reference to `do_trace_write_msr'

Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2023-01-14 07:38:48 +09:00