Commit Graph

1279778 Commits

Author SHA1 Message Date
Srinivas Pandruvada
8bdab3c8f2 cpufreq: intel_pstate: Update Meteor Lake EPPs
Update the default balance_performance EPP to 64. This gives better
performance and also perf/watt compared to current value of 115.

For example:

Speedometer 2.1
	score: +19%
	Perf/watt: +5.25%

Webxprt 4 score
	score: +12%
	Perf/watt: +6.12%

3DMark Wildlife extreme unlimited score
	score: +3.2%
	Perf/watt: +11.5%

Geekbench6 MT
	score: +2.14%
	Perf/watt: +0.32%

Also update balance_power EPP default to 179. With this change:
	Video Playback power is reduced by 52%
	Team video conference power is reduced by 35%

With Power profile daemon now sets balance_power EPP on DC instead of
balance_performance, updating balance_power EPP will help to extend
battery life.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07 21:05:09 +02:00
Tony Luck
ca8752384c cpufreq: intel_pstate: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07 20:33:46 +02:00
Tony Luck
691fef8ccb cpufreq: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07 20:32:05 +02:00
Linus Torvalds
64c6a36d79 Merge tag 'pm-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix the intel_pstate and amd-pstate cpufreq drivers and the
  cpupower utility.

  Specifics:

   - Fix a recently introduced unchecked HWP MSR access in the
     intel_pstate driver (Srinivas Pandruvada)

   - Add missing conversion from MHz to KHz to amd_pstate_set_boost() to
     address sysfs inteface inconsistency and fix P-state frequency
     reporting on AMD Family 1Ah CPUs in the cpupower utility (Dhananjay
     Ugwekar)

   - Get rid of an excess global header file used by the amd-pstate
     cpufreq driver (Arnd Bergmann)"

* tag 'pm-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: intel_pstate: Fix unchecked HWP MSR access
  cpufreq: amd-pstate: Fix the inconsistency in max frequency units
  cpufreq: amd-pstate: remove global header file
  tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUs
2024-06-05 15:12:35 -07:00
Linus Torvalds
19ca0d8a43 Merge tag 'for-6.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
 "A fix for fast fsync that needs to handle errors during writes after
  some COW failure so it does not lead to an inconsistent state"

* tag 'for-6.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: ensure fast fsync waits for ordered extents after a write failure
2024-06-05 11:28:25 -07:00
Linus Torvalds
e20b269d73 Merge tag 'bcachefs-2024-06-05' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
 "Just a few small fixes"

* tag 'bcachefs-2024-06-05' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: Fix trans->locked assert
  bcachefs: Rereplicate now moves data off of durability=0 devices
  bcachefs: Fix GFP_KERNEL allocation in break_cycle()
2024-06-05 11:25:41 -07:00
Linus Torvalds
558dc49aac Merge tag 'i2c-for-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "This should have been my second pull request during the merge window
  but one dependency in the drm subsystem fell through the cracks and
  was only applied for rc2.

  Now we can finally remove I2C_CLASS_SPD"

* tag 'i2c-for-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: Remove I2C_CLASS_SPD
  i2c: synquacer: Remove a clk reference from struct synquacer_i2c
2024-06-05 10:32:20 -07:00
Linus Torvalds
208d9b65c0 Merge tag 'tpmdd-next-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm fixes from Jarkko Sakkinen:
 "The bug fix for tpm_tis_core_init() is not that critical but still
  makes sense to get into release for the sake of better quality.

  I included the Intel CPU model define change mainly to help Tony just
  a bit, as for this subsystem it cannot realistically speaking cause
  any possible harm"

* tag 'tpmdd-next-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm: Switch to new Intel CPU model defines
  tpm_tis: Do *not* flush uninitialized work
2024-06-05 10:29:13 -07:00
Linus Torvalds
71d7b52cc3 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "This is dominated by a couple large series for ARM and x86
  respectively, but apart from that things are calm.

  ARM:

   - Large set of FP/SVE fixes for pKVM, addressing the fallout from the
     per-CPU data rework and making sure that the host is not involved
     in the FP/SVE switching any more

   - Allow FEAT_BTI to be enabled with NV now that FEAT_PAUTH is
     completely supported

   - Fix for the respective priorities of Failed PAC, Illegal Execution
     state and Instruction Abort exceptions

   - Fix the handling of AArch32 instruction traps failing their
     condition code, which was broken by the introduction of
     ESR_EL2.ISS2

   - Allow vcpus running in AArch32 state to be restored in System mode

   - Fix AArch32 GPR restore that would lose the 64 bit state under some
     conditions

  RISC-V:

   - No need to use mask when hart-index-bits is 0

   - Fix incorrect reg_subtype labels in
     kvm_riscv_vcpu_set_reg_isa_ext()

  x86:

   - Fixes and debugging help for the #VE sanity check.

     Also disable it by default, even for CONFIG_DEBUG_KERNEL, because
     it was found to trigger spuriously (most likely a processor erratum
     as the exact symptoms vary by generation).

   - Avoid WARN() when two NMIs arrive simultaneously during an
     NMI-disabled situation (GIF=0 or interrupt shadow) when the
     processor supports virtual NMI.

     While generally KVM will not request an NMI window when virtual
     NMIs are supported, in this case it *does* have to single-step over
     the interrupt shadow or enable the STGI intercept, in order to
     deliver the latched second NMI.

   - Drop support for hand tuning APIC timer advancement from userspace.

     Since we have adaptive tuning, and it has proved to work well, drop
     the module parameter for manual configuration and with it a few
     stupid bugs that it had"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (32 commits)
  KVM: x86/mmu: Don't save mmu_invalidate_seq after checking private attr
  KVM: arm64: Ensure that SME controls are disabled in protected mode
  KVM: arm64: Refactor CPACR trap bit setting/clearing to use ELx format
  KVM: arm64: Consolidate initializing the host data's fpsimd_state/sve in pKVM
  KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM
  KVM: arm64: Allocate memory mapped at hyp for host sve state in pKVM
  KVM: arm64: Specialize handling of host fpsimd state on trap
  KVM: arm64: Abstract set/clear of CPTR_EL2 bits behind helper
  KVM: arm64: Fix prototype for __sve_save_state/__sve_restore_state
  KVM: arm64: Reintroduce __sve_save_state
  KVM: x86: Drop support for hand tuning APIC timer advancement from userspace
  KVM: SEV-ES: Delegate LBR virtualization to the processor
  KVM: SEV-ES: Disallow SEV-ES guests when X86_FEATURE_LBRV is absent
  KVM: SEV-ES: Prevent MSR access post VMSA encryption
  RISC-V: KVM: Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext function
  RISC-V: KVM: No need to use mask when hart-index-bit is 0
  KVM: arm64: nv: Expose BTI and CSV_frac to a guest hypervisor
  KVM: arm64: nv: Fix relative priorities of exceptions generated by ERETAx
  KVM: arm64: AArch32: Fix spurious trapping of conditional instructions
  KVM: arm64: Allow AArch32 PSTATE.M to be restored as System mode
  ...
2024-06-05 08:43:41 -07:00
Rafael J. Wysocki
9b7e7ff0fe Merge branch 'pm-cpufreq'
Merge cpufreq fixes for 6.10-rc3:

 - Fix a recently introduced unchecked HWP MSR access in the
   intel_pstate driver (Srinivas Pandruvada).

 - Add missing conversion from MHz to KHz to amd_pstate_set_boost()
   to address sysfs inteface inconsistency (Dhananjay Ugwekar).

 - Get rid of an excess global header file used by the amd-pstate
   cpufreq driver (Arnd Bergmann).

* pm-cpufreq:
  cpufreq: intel_pstate: Fix unchecked HWP MSR access
  cpufreq: amd-pstate: Fix the inconsistency in max frequency units
  cpufreq: amd-pstate: remove global header file
2024-06-05 17:11:47 +02:00
Kent Overstreet
319fef29e9 bcachefs: Fix trans->locked assert
in bch2_move_data_btree, we might start with the trans unlocked from a
previous loop iteration - we need a trans_begin() before iter_init().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-05 10:44:08 -04:00
Kent Overstreet
fdccb24352 bcachefs: Rereplicate now moves data off of durability=0 devices
This fixes an issue where setting a device to durability=0 after it's
been used makes it impossible to remove.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-05 10:44:08 -04:00
Kent Overstreet
9a64e1bfd8 bcachefs: Fix GFP_KERNEL allocation in break_cycle()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-05 10:44:08 -04:00
Tao Su
db574f2f96 KVM: x86/mmu: Don't save mmu_invalidate_seq after checking private attr
Drop the second snapshot of mmu_invalidate_seq in kvm_faultin_pfn().
Before checking the mismatch of private vs. shared, mmu_invalidate_seq is
saved to fault->mmu_seq, which can be used to detect an invalidation
related to the gfn occurred, i.e. KVM will not install a mapping in page
table if fault->mmu_seq != mmu_invalidate_seq.

Currently there is a second snapshot of mmu_invalidate_seq, which may not
be same as the first snapshot in kvm_faultin_pfn(), i.e. the gfn attribute
may be changed between the two snapshots, but the gfn may be mapped in
page table without hindrance. Therefore, drop the second snapshot as it
has no obvious benefits.

Fixes: f6adeae81f ("KVM: x86/mmu: Handle no-slot faults at the beginning of kvm_faultin_pfn()")
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Message-ID: <20240528102234.2162763-1-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-05 06:45:06 -04:00
Paolo Bonzini
45ce0314bf Merge tag 'kvmarm-fixes-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.10, take #1

- Large set of FP/SVE fixes for pKVM, addressing the fallout
  from the per-CPU data rework and making sure that the host
  is not involved in the FP/SVE switching any more

- Allow FEAT_BTI to be enabled with NV now that FEAT_PAUTH
  is copletely supported

- Fix for the respective priorities of Failed PAC, Illegal
  Execution state and Instruction Abort exceptions

- Fix the handling of AArch32 instruction traps failing their
  condition code, which was broken by the introduction of
  ESR_EL2.ISS2

- Allow vpcus running in AArch32 state to be restored in
  System mode

- Fix AArch32 GPR restore that would lose the 64 bit state
  under some conditions
2024-06-05 06:32:18 -04:00
Tony Luck
f071d02eca tpm: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Link: https://lore.kernel.org/all/20240520224620.9480-4-tony.luck@intel.com/
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-06-05 04:55:04 +03:00
Jan Beulich
0ea00e249c tpm_tis: Do *not* flush uninitialized work
tpm_tis_core_init() may fail before tpm_tis_probe_irq_single() is
called, in which case tpm_tis_remove() unconditionally calling
flush_work() is triggering a warning for .func still being NULL.

Cc: stable@vger.kernel.org # v6.5+
Fixes: 481c2d1462 ("tpm,tpm_tis: Disable interrupts after 1000 unhandled IRQs")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-06-05 01:18:01 +03:00
Linus Torvalds
51214520ad Merge tag 'devicetree-fixes-for-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:

 - Fix regression in 'interrupt-map' handling affecting Apple M1 mini
   (at least)

 - Fix binding example warning in stm32 st,mlahb binding

 - Fix schema error in Allwinner platform binding causing lots of
   spurious warnings

 - Add missing MODULE_DESCRIPTION() to DT kunit tests

* tag 'devicetree-fixes-for-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: property: Fix fw_devlink handling of interrupt-map
  of/irq: Factor out parsing of interrupt-map parent phandle+args from of_irq_parse_raw()
  dt-bindings: arm: stm32: st,mlahb: Drop spurious "reg" property from example
  dt-bindings: arm: sunxi: Fix incorrect '-' usage
  of: of_test: add MODULE_DESCRIPTION()
2024-06-04 14:08:44 -07:00
Linus Torvalds
32f88d65f0 Merge tag 'linux_kselftest-fixes-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
 "Fixes to build warnings in several tests and fixes to ftrace tests"

* tag 'linux_kselftest-fixes-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/futex: don't pass a const char* to asprintf(3)
  selftests/futex: don't redefine .PHONY targets (all, clean)
  selftests/tracing: Fix event filter test to retry up to 10 times
  selftests/futex: pass _GNU_SOURCE without a value to the compiler
  selftests/overlayfs: Fix build error on ppc64
  selftests/openat2: Fix build warnings on ppc64
  selftests: cachestat: Fix build warnings on ppc64
  tracing/selftests: Fix kprobe event name test for .isra. functions
  selftests/ftrace: Update required config
  selftests/ftrace: Fix to check required event file
  kselftest/alsa: Ensure _GNU_SOURCE is defined
2024-06-04 10:34:13 -07:00
Fuad Tabba
afb91f5f8a KVM: arm64: Ensure that SME controls are disabled in protected mode
KVM (and pKVM) do not support SME guests. Therefore KVM ensures
that the host's SME state is flushed and that SME controls for
enabling access to ZA storage and for streaming are disabled.

pKVM needs to protect against a buggy/malicious host. Ensure that
it wouldn't run a guest when protected mode is enabled should any
of the SME controls be enabled.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-10-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04 15:06:33 +01:00
Fuad Tabba
a69283ae1d KVM: arm64: Refactor CPACR trap bit setting/clearing to use ELx format
When setting/clearing CPACR bits for EL0 and EL1, use the ELx
format of the bits, which covers both. This makes the code
clearer, and reduces the chances of accidentally missing a bit.

No functional change intended.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-9-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04 15:06:33 +01:00
Fuad Tabba
1696fc2174 KVM: arm64: Consolidate initializing the host data's fpsimd_state/sve in pKVM
Now that we have introduced finalize_init_hyp_mode(), lets
consolidate the initializing of the host_data fpsimd_state and
sve state.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240603122852.3923848-8-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04 15:06:33 +01:00
Fuad Tabba
b5b9955617 KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM
When running in protected mode we don't want to leak protected
guest state to the host, including whether a guest has used
fpsimd/sve. Therefore, eagerly restore the host state on guest
exit when running in protected mode, which happens only if the
guest has used fpsimd/sve.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-7-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04 15:06:33 +01:00
Fuad Tabba
66d5b53e20 KVM: arm64: Allocate memory mapped at hyp for host sve state in pKVM
Protected mode needs to maintain (save/restore) the host's sve
state, rather than relying on the host kernel to do that. This is
to avoid leaking information to the host about guests and the
type of operations they are performing.

As a first step towards that, allocate memory mapped at hyp, per
cpu, for the host sve state. The following patch will use this
memory to save/restore the host state.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-6-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04 15:06:33 +01:00
Fuad Tabba
e511e08a9f KVM: arm64: Specialize handling of host fpsimd state on trap
In subsequent patches, n/vhe will diverge on saving the host
fpsimd/sve state when taking a guest fpsimd/sve trap. Add a
specialized helper to handle it.

No functional change intended.

Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04 15:06:33 +01:00
Fuad Tabba
6d8fb3cbf7 KVM: arm64: Abstract set/clear of CPTR_EL2 bits behind helper
The same traps controlled by CPTR_EL2 or CPACR_EL1 need to be
toggled in different parts of the code, but the exact bits and
their polarity differ between these two formats and the mode
(vhe/nvhe/hvhe).

To reduce the amount of duplicated code and the chance of getting
the wrong bit/polarity or missing a field, abstract the set/clear
of CPTR_EL2 bits behind a helper.

Since (h)VHE is the way of the future, use the CPACR_EL1 format,
which is a subset of the VHE CPTR_EL2, as a reference.

No functional change intended.

Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04 15:06:33 +01:00
Fuad Tabba
45f4ea9bcf KVM: arm64: Fix prototype for __sve_save_state/__sve_restore_state
Since the prototypes for __sve_save_state/__sve_restore_state at
hyp were added, the underlying macro has acquired a third
parameter for saving/restoring ffr.

Fix the prototypes to account for the third parameter, and
restore the ffr for the guest since it is saved.

Suggested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240603122852.3923848-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04 15:06:32 +01:00
Fuad Tabba
87bb39ed40 KVM: arm64: Reintroduce __sve_save_state
Now that the hypervisor is handling the host sve state in
protected mode, it needs to be able to save it.

This reverts commit e66425fc9b ("KVM: arm64: Remove unused
__sve_save_state").

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-06-04 15:06:32 +01:00
Linus Torvalds
2ab7951410 Merge tag 'cxl-fixes-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl fixes from Dave Jiang:

 - Compile fix for cxl-test from missing linux/vmalloc.h

 - Fix for memregion leaks in devm_cxl_add_region()

* tag 'cxl-fixes-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/region: Fix memregion leaks in devm_cxl_add_region()
  cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c
2024-06-03 14:42:41 -07:00
Paolo Bonzini
b50788f7cd Merge tag 'kvm-riscv-fixes-6.10-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 6.10, take #1

- No need to use mask when hart-index-bits is 0
- Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext()
2024-06-03 13:18:18 -04:00
Paolo Bonzini
b3233c737e Merge branch 'kvm-fixes-6.10-1' into HEAD
* Fixes and debugging help for the #VE sanity check.  Also disable
  it by default, even for CONFIG_DEBUG_KERNEL, because it was found
  to trigger spuriously (most likely a processor erratum as the
  exact symptoms vary by generation).

* Avoid WARN() when two NMIs arrive simultaneously during an NMI-disabled
  situation (GIF=0 or interrupt shadow) when the processor supports
  virtual NMI.  While generally KVM will not request an NMI window
  when virtual NMIs are supported, in this case it *does* have to
  single-step over the interrupt shadow or enable the STGI intercept,
  in order to deliver the latched second NMI.

* Drop support for hand tuning APIC timer advancement from userspace.
  Since we have adaptive tuning, and it has proved to work well,
  drop the module parameter for manual configuration and with it a
  few stupid bugs that it had.
2024-06-03 13:18:08 -04:00
Sean Christopherson
89a58812c4 KVM: x86: Drop support for hand tuning APIC timer advancement from userspace
Remove support for specifying a static local APIC timer advancement value,
and instead present a read-only boolean parameter to let userspace enable
or disable KVM's dynamic APIC timer advancement.  Realistically, it's all
but impossible for userspace to specify an advancement that is more
precise than what KVM's adaptive tuning can provide.  E.g. a static value
needs to be tuned for the exact hardware and kernel, and if KVM is using
hrtimers, likely requires additional tuning for the exact configuration of
the entire system.

Dropping support for a userspace provided value also fixes several flaws
in the interface.  E.g. KVM interprets a negative value other than -1 as a
large advancement, toggling between a negative and positive value yields
unpredictable behavior as vCPUs will switch from dynamic to static
advancement, changing the advancement in the middle of VM creation can
result in different values for vCPUs within a VM, etc.  Those flaws are
mostly fixable, but there's almost no justification for taking on yet more
complexity (it's minimal complexity, but still non-zero).

The only arguments against using KVM's adaptive tuning is if a setup needs
a higher maximum, or if the adjustments are too reactive, but those are
arguments for letting userspace control the absolute max advancement and
the granularity of each adjustment, e.g. similar to how KVM provides knobs
for halt polling.

Link: https://lore.kernel.org/all/20240520115334.852510-1-zhoushuling@huawei.com
Cc: Shuling Zhou <zhoushuling@huawei.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240522010304.1650603-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-03 13:08:05 -04:00
Ravi Bangoria
b7e4be0a22 KVM: SEV-ES: Delegate LBR virtualization to the processor
As documented in APM[1], LBR Virtualization must be enabled for SEV-ES
guests. Although KVM currently enforces LBRV for SEV-ES guests, there
are multiple issues with it:

o MSR_IA32_DEBUGCTLMSR is still intercepted. Since MSR_IA32_DEBUGCTLMSR
  interception is used to dynamically toggle LBRV for performance reasons,
  this can be fatal for SEV-ES guests. For ex SEV-ES guest on Zen3:

  [guest ~]# wrmsr 0x1d9 0x4
  KVM: entry failed, hardware error 0xffffffff
  EAX=00000004 EBX=00000000 ECX=000001d9 EDX=00000000

  Fix this by never intercepting MSR_IA32_DEBUGCTLMSR for SEV-ES guests.
  No additional save/restore logic is required since MSR_IA32_DEBUGCTLMSR
  is of swap type A.

o KVM will disable LBRV if userspace sets MSR_IA32_DEBUGCTLMSR before the
  VMSA is encrypted. Fix this by moving LBRV enablement code post VMSA
  encryption.

[1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June
     2023, Vol 2, 15.35.2 Enabling SEV-ES.
     https://bugzilla.kernel.org/attachment.cgi?id=304653

Fixes: 376c6d2850 ("KVM: SVM: Provide support for SEV-ES vCPU creation/loading")
Co-developed-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Message-ID: <20240531044644.768-4-ravi.bangoria@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-03 13:07:18 -04:00
Ravi Bangoria
d922056215 KVM: SEV-ES: Disallow SEV-ES guests when X86_FEATURE_LBRV is absent
As documented in APM[1], LBR Virtualization must be enabled for SEV-ES
guests. So, prevent SEV-ES guests when LBRV support is missing.

[1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June
     2023, Vol 2, 15.35.2 Enabling SEV-ES.
     https://bugzilla.kernel.org/attachment.cgi?id=304653

Fixes: 376c6d2850 ("KVM: SVM: Provide support for SEV-ES vCPU creation/loading")
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Message-ID: <20240531044644.768-3-ravi.bangoria@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-03 13:06:48 -04:00
Nikunj A Dadhania
27bd5fdc24 KVM: SEV-ES: Prevent MSR access post VMSA encryption
KVM currently allows userspace to read/write MSRs even after the VMSA is
encrypted. This can cause unintentional issues if MSR access has side-
effects. For ex, while migrating a guest, userspace could attempt to
migrate MSR_IA32_DEBUGCTLMSR and end up unintentionally disabling LBRV on
the target. Fix this by preventing access to those MSRs which are context
switched via the VMSA, once the VMSA is encrypted.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Message-ID: <20240531044644.768-2-ravi.bangoria@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-03 13:06:48 -04:00
Linus Torvalds
f06ce44145 Merge tag 'loongarch-fixes-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
 "Some bootloader interface fixes, a dts fix, and a trivial cleanup"

* tag 'loongarch-fixes-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: Fix GMAC's phy-mode definitions in dts
  LoongArch: Override higher address bits in JUMP_VIRT_ADDR
  LoongArch: Fix entry point in kernel image header
  LoongArch: Add all CPUs enabled by fdt to NUMA node 0
  LoongArch: Fix built-in DTB detection
  LoongArch: Remove CONFIG_ACPI_TABLE_UPGRADE in platform_init()
2024-06-03 09:27:45 -07:00
Srinivas Pandruvada
1e24c31351 cpufreq: intel_pstate: Fix unchecked HWP MSR access
Fix unchecked MSR access error for processors with no HWP support. On
such processors, maximum frequency can be changed by the system firmware
using ACPI event ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED. This results
in accessing HWP MSR 0x771.

Call Trace:
	<TASK>
	generic_exec_single+0x58/0x120
	smp_call_function_single+0xbf/0x110
	rdmsrl_on_cpu+0x46/0x60
	intel_pstate_get_hwp_cap+0x1b/0x70
	intel_pstate_update_limits+0x2a/0x60
	acpi_processor_notify+0xb7/0x140
	acpi_ev_notify_dispatch+0x3b/0x60

HWP MSR 0x771 can be only read on a CPU which supports HWP and enabled.
Hence intel_pstate_get_hwp_cap() can only be called when hwp_active is
true.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Closes: https://lore.kernel.org/linux-pm/20240529155740.Hq2Hw7be@linutronix.de/
Fixes: e8217b4bec ("cpufreq: intel_pstate: Update the maximum CPU frequency consistently")
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-03 18:00:23 +02:00
Huacai Chen
eb36e520f4 LoongArch: Fix GMAC's phy-mode definitions in dts
The GMAC of Loongson chips cannot insert the correct 1.5-2ns delay. So
we need the PHY to insert internal delays for both transmit and receive
data lines from/to the PHY device. Fix this by changing the "phy-mode"
from "rgmii" to "rgmii-id" in dts.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-06-03 15:45:53 +08:00
Jiaxun Yang
1098efd299 LoongArch: Override higher address bits in JUMP_VIRT_ADDR
In JUMP_VIRT_ADDR we are performing an or calculation on address value
directly from pcaddi.

This will only work if we are currently running from direct 1:1 mapping
addresses or firmware's DMW is configured exactly same as kernel. Still,
we should not rely on such assumption.

Fix by overriding higher bits in address comes from pcaddi, so we can
get rid of or operator.

Cc: stable@vger.kernel.org
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-06-03 15:45:53 +08:00
Jiaxun Yang
beb2800074 LoongArch: Fix entry point in kernel image header
Currently kernel entry in head.S is in DMW address range, firmware is
instructed to jump to this address after loading the kernel image.

However kernel should not make any assumption on firmware's DMW
setting, thus the entry point should be a physical address falls into
direct translation region.

Fix by converting entry address to physical and amend entry calculation
logic in libstub accordingly.

BTW, use ABSOLUTE() to calculate variables to make Clang/LLVM happy.

Cc: stable@vger.kernel.org
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-06-03 15:45:53 +08:00
Jiaxun Yang
3de9c42d02 LoongArch: Add all CPUs enabled by fdt to NUMA node 0
NUMA enabled kernel on FDT based machine fails to boot because CPUs
are all in NUMA_NO_NODE and mm subsystem won't accept that.

Fix by adding them to default NUMA node at FDT parsing phase and move
numa_add_cpu(0) to a later point.

Cc: stable@vger.kernel.org
Fixes: 88d4d957ed ("LoongArch: Add FDT booting support from efi system table")
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-06-03 15:45:53 +08:00
Jiaxun Yang
b56f67a6c7 LoongArch: Fix built-in DTB detection
fdt_check_header(__dtb_start) will always success because kernel
provides a dummy dtb, and by coincidence __dtb_start clashed with
entry of this dummy dtb. The consequence is fdt passed from firmware
will never be taken.

Fix by trying to utilise __dtb_start only when CONFIG_BUILTIN_DTB is
enabled.

Cc: stable@vger.kernel.org
Fixes: 7b937cc243 ("of: Create of_root if no dtb provided by firmware")
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-06-03 15:45:53 +08:00
Tiezhu Yang
6c3ca6654a LoongArch: Remove CONFIG_ACPI_TABLE_UPGRADE in platform_init()
Both acpi_table_upgrade() and acpi_boot_table_init() are defined as
empty functions under !CONFIG_ACPI_TABLE_UPGRADE and !CONFIG_ACPI in
include/linux/acpi.h, there are no implicit declaration errors with
various configs.

  #ifdef CONFIG_ACPI_TABLE_UPGRADE
  void acpi_table_upgrade(void);
  #else
  static inline void acpi_table_upgrade(void) { }
  #endif

  #ifdef	CONFIG_ACPI
  ...
  void acpi_boot_table_init (void);
  ...
  #else	/* !CONFIG_ACPI */
  ...
  static inline void acpi_boot_table_init(void)
  {
  }
  ...
  #endif	/* !CONFIG_ACPI */

As Huacai suggested, CONFIG_ACPI_TABLE_UPGRADE is ugly and not necessary
here, just remove it. At the same time, just keep CONFIG_ACPI to prevent
potential build errors in future, and give a signal to indicate the code
is ACPI-specific. For the same reason, we also put acpi_table_upgrade()
under CONFIG_ACPI.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-06-03 15:45:53 +08:00
Wolfram Sang
c4aff1d1ec Merge tag 'i2c-host-6.10-pt2' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
Removed the SPD class of i2c devices from the device core.

Additionally, a cleanup in the Synquacer code removes the pclk
from the global structure, as it is used only in the probe.
Therefore, it is now declared locally.
2024-06-03 08:51:53 +02:00
Linus Torvalds
c3f38fa61a Linux 6.10-rc2 v6.10-rc2 2024-06-02 15:44:56 -07:00
Linus Torvalds
58d89ee81a Merge tag 'ata-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Niklas Cassel:

 - Add a quirk for three different devices that have shown issues with
   LPM (link power management). These devices appear to not implement
   LPM properly, since we see command timeouts when enabling LPM. The
   quirk disables LPM for these problematic devices. (Me)

 - Do not apply the Intel PCS quirk on Alder Lake. The quirk is not
   needed and was originally added by mistake when LPM support was
   enabled for this AHCI controller. Enabling the quirk when not needed
   causes the the controller to not be able to detect the connected
   devices on some platforms.

* tag 'ata-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: libata-core: Add ATA_HORKAGE_NOLPM for Apacer AS340
  ata: libata-core: Add ATA_HORKAGE_NOLPM for AMD Radeon S3 SSD
  ata: libata-core: Add ATA_HORKAGE_NOLPM for Crucial CT240BX500SSD1
  ata: ahci: Do not apply Intel PCS quirk on Intel Alder Lake
2024-06-02 13:30:53 -07:00
Linus Torvalds
a693b9c95a Merge tag 'x86-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Miscellaneous topology parsing fixes:

   - Fix topology parsing regression on older CPUs in the new AMD/Hygon
     parser

   - Fix boot crash on odd Intel Quark and similar CPUs that do not fill
     out cpuinfo_x86::x86_clflush_size and zero out
     cpuinfo_x86::x86_cache_alignment as a result.

     Provide 32 bytes as a general fallback value.

   - Fix topology enumeration on certain rare CPUs where the BIOS locks
     certain CPUID leaves and the kernel unlocked them late, which broke
     with the new topology parsing code. Factor out this unlocking logic
     and move it earlier in the parsing sequence"

* tag 'x86-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/topology/intel: Unlock CPUID before evaluating anything
  x86/cpu: Provide default cache line size if not enumerated
  x86/topology/amd: Evaluate SMT in CPUID leaf 0x8000001e only on family 0x17 and greater
2024-06-02 09:32:34 -07:00
Linus Torvalds
3fca58ffad Merge tag 'sched-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
 "Export a symbol to make life easier for instrumentation/debugging"

* tag 'sched-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/x86: Export 'percpu arch_freq_scale'
2024-06-02 09:23:35 -07:00
Linus Torvalds
efa8f11a7e Merge tag 'perf-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events fix from Ingo Molnar:
 "Add missing MODULE_DESCRIPTION() lines"

* tag 'perf-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Add missing MODULE_DESCRIPTION() lines
  perf/x86/rapl: Add missing MODULE_DESCRIPTION() line
2024-06-02 09:20:37 -07:00
Linus Torvalds
00a8c352dd Merge tag 'hardening-v6.10-rc2-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:

 - scsi: mpt3sas: Avoid possible run-time warning with long manufacturer
   strings

 - mailmap: update entry for Kees Cook

 - kunit/fortify: Remove __kmalloc_node() test

* tag 'hardening-v6.10-rc2-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kunit/fortify: Remove __kmalloc_node() test
  mailmap: update entry for Kees Cook
  scsi: mpt3sas: Avoid possible run-time warning with long manufacturer strings
2024-06-02 09:15:28 -07:00