Commit Graph

1266529 Commits

Author SHA1 Message Date
Uros Bizjak
cd84351c8c perf/x86/amd: Use try_cmpxchg() in events/amd/{un,}core.c
Replace this pattern in events/amd/{un,}core.c:

    cmpxchg(*ptr, old, new) == old

... with the simpler and faster:

    try_cmpxchg(*ptr, &old, new)

The x86 CMPXCHG instruction returns success in the ZF flag, so this change
saves a compare after the CMPXCHG.

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240425101708.5025-1-ubizjak@gmail.com
2024-05-18 11:15:13 +02:00
Ingo Molnar
9d351132ed perf/x86/cstate: Remove unused 'struct perf_cstate_msr'
Use of this structure was removed in:

  8f2a28c585 ("perf/x86/cstate: Use new probe function")

Remove the now stale type as well.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-05-08 09:53:23 +02:00
Dhananjay Ugwekar
626c5acf39 perf/x86/rapl: Rename 'maxdie' to nr_rapl_pmu and 'dieid' to rapl_pmu_idx
AMD CPUs have the scope of RAPL energy-pkg event as package, whereas
Intel Cascade Lake CPUs have the scope as die.

To account for the difference in the energy-pkg event scope between AMD
and Intel CPUs, give more generic and semantically correct names to the
maxdie and dieid variables.

No functional change.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://lore.kernel.org/r/20240502095115.177713-2-Dhananjay.Ugwekar@amd.com
2024-05-02 13:32:21 +02:00
Ingo Molnar
10ed2b1181 Merge branch 'x86/cpu' into perf/core, to pick up dependent commits
We are going to fix perf-events fallout of changes in tip:x86/cpu,
so merge in that branch first.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-05-02 13:31:29 +02:00
Adrian Hunter
690ca3a306 x86/insn: Add support for APX EVEX instructions to the opcode map
To support APX functionality, the EVEX prefix is used to:

 - promote legacy instructions
 - promote VEX instructions
 - add new instructions

Promoted VEX instructions require no extra annotation because the opcodes
do not change and the permissive nature of the instruction decoder already
allows them to have an EVEX prefix.

Promoted legacy instructions and new instructions are placed in map 4 which
has not been used before.

Create a new table for map 4 and add APX instructions.

Annotate SCALABLE instructions with "(es)" - refer to patch "x86/insn: Add
support for APX EVEX to the instruction decoder logic". SCALABLE
instructions must be represented in both no-prefix (NP) and 66 prefix
forms.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-9-adrian.hunter@intel.com
2024-05-02 13:13:46 +02:00
Adrian Hunter
87bbaf1a4b x86/insn: Add support for APX EVEX to the instruction decoder logic
Intel Advanced Performance Extensions (APX) extends the EVEX prefix to
support:

 - extended general purpose registers (EGPRs) i.e. r16 to r31
 - Push-Pop Acceleration (PPX) hints
 - new data destination (NDD) register
 - suppress status flags writes (NF) of common instructions
 - new instructions

Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture
Specification for details.

The extended EVEX prefix does not need amended instruction decoder logic,
except in one area. Some instructions are defined as SCALABLE which means
the EVEX.W bit and EVEX.pp bits are used to determine operand size.
Specifically, if an instruction is SCALABLE and EVEX.W is zero, then
EVEX.pp value 0 (representing no prefix NP) means default operand size,
whereas EVEX.pp value 1 (representing 66 prefix) means operand size
override i.e. 16 bits

Add an attribute (INAT_EVEX_SCALABLE) to identify such instructions, and
amend the logic appropriately.

Amend the awk script that generates the attribute tables from the opcode
map, to recognise "(es)" as attribute INAT_EVEX_SCALABLE.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-8-adrian.hunter@intel.com
2024-05-02 13:13:45 +02:00
Adrian Hunter
159039af8c x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode map
Support for REX2 has been added to the instruction decoder logic and the
awk script that generates the attribute tables from the opcode map.

Add REX2 prefix byte (0xD5) to the opcode map.

Add annotation (!REX2) for map 0/1 opcodes that are reserved under REX2.

Add JMPABS to the opcode map and add annotation (REX2) to identify that it
has a mandatory REX2 prefix. A separate opcode attribute table is not
needed at this time because JMPABS has the same attribute encoding as the
MOV instruction that it shares an opcode with i.e. INAT_MOFFSET.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-7-adrian.hunter@intel.com
2024-05-02 13:13:44 +02:00
Adrian Hunter
eada38d575 x86/insn: Add support for REX2 prefix to the instruction decoder logic
Intel Advanced Performance Extensions (APX) uses a new 2-byte prefix named
REX2 to select extended general purpose registers (EGPRs) i.e. r16 to r31.

The REX2 prefix is effectively an extended version of the REX prefix.

REX2 and EVEX are also used with PUSH/POP instructions to provide a
Push-Pop Acceleration (PPX) hint. With PPX hints, a CPU will attempt to
fast-forward register data between matching PUSH and POP instructions.

REX2 is valid only with opcodes in maps 0 and 1. Similar extension for
other maps is provided by the EVEX prefix, covered in a separate patch.

Some opcodes in maps 0 and 1 are reserved under REX2. One of these is used
for a new 64-bit absolute direct jump instruction JMPABS.

Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture
Specification for details.

Define a code value for the REX2 prefix (INAT_PFX_REX2), and add attribute
flags for opcodes reserved under REX2 (INAT_NO_REX2) and to identify
opcodes (only JMPABS) that require a mandatory REX2 prefix
(INAT_REX2_VARIANT).

Amend logic to read the REX2 prefix and get the opcode attribute for the
map number (0 or 1) encoded in the REX2 prefix.

Amend the awk script that generates the attribute tables from the opcode
map, to recognise "REX2" as attribute INAT_PFX_REX2, and "(!REX2)"
as attribute INAT_NO_REX2, and "(REX2)" as attribute INAT_REX2_VARIANT.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-6-adrian.hunter@intel.com
2024-05-02 13:13:44 +02:00
Adrian Hunter
9dd3612895 x86/insn: Add misc new Intel instructions
The x86 instruction decoder is used not only for decoding kernel
instructions. It is also used by perf uprobes (user space probes) and by
perf tools Intel Processor Trace decoding. Consequently, it needs to
support instructions executed by user space also.

Add instructions documented in Intel Architecture Instruction Set
Extensions and Future Features Programming Reference March 2024
319433-052, that have not been added yet:

	AADD
	AAND
	AOR
	AXOR
	CMPccXADD
	PBNDKB
	RDMSRLIST
	URDMSR
	UWRMSR
	VBCSTNEBF162PS
	VBCSTNESH2PS
	VCVTNEEBF162PS
	VCVTNEEPH2PS
	VCVTNEOBF162PS
	VCVTNEOPH2PS
	VCVTNEPS2BF16
	VPDPB[SU,UU,SS]D[,S]
	VPDPW[SU,US,UU]D[,S]
	VPMADD52HUQ
	VPMADD52LUQ
	VSHA512MSG1
	VSHA512MSG2
	VSHA512RNDS2
	VSM3MSG1
	VSM3MSG2
	VSM3RNDS2
	VSM4KEY4
	VSM4RNDS4
	WRMSRLIST
	TCMMIMFP16PS
	TCMMRLFP16PS
	TDPFP16PS
	PREFETCHIT1
	PREFETCHIT0

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-5-adrian.hunter@intel.com
2024-05-02 13:13:43 +02:00
Adrian Hunter
b800026434 x86/insn: Add VEX versions of VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS
The x86 instruction decoder is used not only for decoding kernel
instructions. It is also used by perf uprobes (user space probes) and by
perf tools Intel Processor Trace decoding. Consequently, it needs to
support instructions executed by user space also.

Intel Architecture Instruction Set Extensions and Future Features manual
number 319433-044 of May 2021, documented VEX versions of instructions
VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS, but the opcode map has them
listed as EVEX only.

Remove EVEX-only (ev) annotation from instructions VPDPBUSD, VPDPBUSDS,
VPDPWSSD and VPDPWSSDS, which allows them to be decoded with either a VEX
or EVEX prefix.

Fixes: 0153d98f2d ("x86/insn: Add misc instructions to x86 instruction decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-4-adrian.hunter@intel.com
2024-05-02 13:13:42 +02:00
Adrian Hunter
59162e0c11 x86/insn: Fix PUSH instruction in x86 instruction decoder opcode map
The x86 instruction decoder is used not only for decoding kernel
instructions. It is also used by perf uprobes (user space probes) and by
perf tools Intel Processor Trace decoding. Consequently, it needs to
support instructions executed by user space also.

Opcode 0x68 PUSH instruction is currently defined as 64-bit operand size
only i.e. (d64). That was based on Intel SDM Opcode Map. However that is
contradicted by the Instruction Set Reference section for PUSH in the
same manual.

Remove 64-bit operand size only annotation from opcode 0x68 PUSH
instruction.

Example:

  $ cat pushw.s
  .global  _start
  .text
  _start:
          pushw   $0x1234
          mov     $0x1,%eax   # system call number (sys_exit)
          int     $0x80
  $ as -o pushw.o pushw.s
  $ ld -s -o pushw pushw.o
  $ objdump -d pushw | tail -4
  0000000000401000 <.text>:
    401000:       66 68 34 12             pushw  $0x1234
    401004:       b8 01 00 00 00          mov    $0x1,%eax
    401009:       cd 80                   int    $0x80
  $ perf record -e intel_pt//u ./pushw
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.014 MB perf.data ]

 Before:

  $ perf script --insn-trace=disasm
  Warning:
  1 instruction trace errors
           pushw   10349 [000] 10586.869237014:            401000 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           pushw $0x1234
           pushw   10349 [000] 10586.869237014:            401006 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           addb %al, (%rax)
           pushw   10349 [000] 10586.869237014:            401008 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           addb %cl, %ch
           pushw   10349 [000] 10586.869237014:            40100a [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           addb $0x2e, (%rax)
   instruction trace error type 1 time 10586.869237224 cpu 0 pid 10349 tid 10349 ip 0x40100d code 6: Trace doesn't match instruction

 After:

  $ perf script --insn-trace=disasm
             pushw   10349 [000] 10586.869237014:            401000 [unknown] (./pushw)           pushw $0x1234
             pushw   10349 [000] 10586.869237014:            401004 [unknown] (./pushw)           movl $1, %eax

Fixes: eb13296cfa ("x86: Instruction decoder API")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-3-adrian.hunter@intel.com
2024-05-02 13:13:41 +02:00
Chang S. Bae
a5dd673ab7 x86/insn: Add Key Locker instructions to the opcode map
The x86 instruction decoder needs to know these new instructions that
are going to be used in the crypto library as well as the x86 core
code. Add the following:

LOADIWKEY:
	Load a CPU-internal wrapping key.

ENCODEKEY128:
	Wrap a 128-bit AES key to a key handle.

ENCODEKEY256:
	Wrap a 256-bit AES key to a key handle.

AESENC128KL:
	Encrypt a 128-bit block of data using a 128-bit AES key
	indicated by a key handle.

AESENC256KL:
	Encrypt a 128-bit block of data using a 256-bit AES key
	indicated by a key handle.

AESDEC128KL:
	Decrypt a 128-bit block of data using a 128-bit AES key
	indicated by a key handle.

AESDEC256KL:
	Decrypt a 128-bit block of data using a 256-bit AES key
	indicated by a key handle.

AESENCWIDE128KL:
	Encrypt 8 128-bit blocks of data using a 128-bit AES key
	indicated by a key handle.

AESENCWIDE256KL:
	Encrypt 8 128-bit blocks of data using a 256-bit AES key
	indicated by a key handle.

AESDECWIDE128KL:
	Decrypt 8 128-bit blocks of data using a 128-bit AES key
	indicated by a key handle.

AESDECWIDE256KL:
	Decrypt 8 128-bit blocks of data using a 256-bit AES key
	indicated by a key handle.

The detail can be found in Intel Software Developer Manual.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20240502105853.5338-2-adrian.hunter@intel.com
2024-05-02 13:13:41 +02:00
Ingo Molnar
ad112b3a75 Merge tag 'v6.9-rc6' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-05-02 13:12:31 +02:00
Tony Luck
2eda374e88 x86/mm: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

[ dhansen: vertically align 0's in invlpg_miss_ids[] ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181518.41946-1-tony.luck%40intel.com
2024-04-29 10:31:36 +02:00
Tony Luck
d9b6886cd7 x86/tsc_msr: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181518.41927-1-tony.luck%40intel.com
2024-04-29 10:31:34 +02:00
Tony Luck
f21b075b67 x86/tsc: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181517.41907-1-tony.luck%40intel.com
2024-04-29 10:31:32 +02:00
Tony Luck
4db64279bc x86/cpu: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

[ dhansen: vertically align macro and remove stray subject / ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181516.41887-1-tony.luck%40intel.com
2024-04-29 10:31:30 +02:00
Tony Luck
db99675e43 x86/resctrl: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

  [ bp: Squash two resctrl patches into one. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181514.41848-1-tony.luck%40intel.com
2024-04-29 10:31:28 +02:00
Tony Luck
375a756448 x86/microcode/intel: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181513.41829-1-tony.luck%40intel.com
2024-04-29 10:31:26 +02:00
Tony Luck
4a5f2dd162 x86/mce: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

  [ bp: Squash *three* mce patches into one, fold in fix:
    https://lore.kernel.org/r/20240429022051.63360-1-tony.luck@intel.com ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181511.41772-1-tony.luck%40intel.com
2024-04-29 10:31:23 +02:00
Tony Luck
c73cd37221 x86/cpu: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181511.41753-1-tony.luck%40intel.com
2024-04-29 10:31:19 +02:00
Tony Luck
fe3edc524d x86/cpu/intel_epb: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181510.41733-1-tony.luck%40intel.com
2024-04-29 10:31:16 +02:00
Tony Luck
d32bc2111c x86/aperfmperf: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181505.41654-1-tony.luck%40intel.com
2024-04-29 10:31:12 +02:00
Tony Luck
8fb5f44e5d x86/apic: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181504.41634-1-tony.luck%40intel.com
2024-04-29 10:31:08 +02:00
Tony Luck
add69475de perf/x86/msr: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181503.41614-1-tony.luck%40intel.com
2024-04-29 10:31:04 +02:00
Tony Luck
9828a1cff4 perf/x86/intel/uncore: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

  [ bp: Squash *three* uncore patches into one. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20240424181501.41557-1-tony.luck%40intel.com
2024-04-29 10:30:39 +02:00
Linus Torvalds
e67572cd22 Linux 6.9-rc6 v6.9-rc6 2024-04-28 13:47:24 -07:00
Linus Torvalds
245c8e8174 Merge tag 'sched-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:

 - Fix EEVDF corner cases

 - Fix two nohz_full= related bugs that can cause boot crashes
   and warnings

* tag 'sched-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/isolation: Fix boot crash when maxcpus < first housekeeping CPU
  sched/isolation: Prevent boot crash when the boot CPU is nohz_full
  sched/eevdf: Prevent vlag from going out of bounds in reweight_eevdf()
  sched/eevdf: Fix miscalculation in reweight_entity() when se is not curr
  sched/eevdf: Always update V if se->on_rq when reweighting
2024-04-28 12:11:26 -07:00
Linus Torvalds
aec147c188 Merge tag 'x86-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:

 - Make the CPU_MITIGATIONS=n interaction with conflicting
   mitigation-enabling boot parameters a bit saner.

 - Re-enable CPU mitigations by default on non-x86

 - Fix TDX shared bit propagation on mprotect()

 - Fix potential show_regs() system hang when PKE initialization
   is not fully finished yet.

 - Add the 0x10-0x1f model IDs to the Zen5 range

 - Harden #VC instruction emulation some more

* tag 'x86-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu: Ignore "mitigations" kernel parameter if CPU_MITIGATIONS=n
  cpu: Re-enable CPU mitigations by default for !X86 architectures
  x86/tdx: Preserve shared bit on mprotect()
  x86/cpu: Fix check for RDPKRU in __show_regs()
  x86/CPU/AMD: Add models 0x10-0x1f to the Zen5 range
  x86/sev: Check for MWAITX and MONITORX opcodes in the #VC handler
2024-04-28 11:58:16 -07:00
Linus Torvalds
8d62e9bf28 Merge tag 'irq-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
 "Fix a double free bug in the init error path of the GICv3 irqchip
  driver"

* tag 'irq-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Prevent double free on error
2024-04-28 11:51:13 -07:00
Oleg Nesterov
257bf89d84 sched/isolation: Fix boot crash when maxcpus < first housekeeping CPU
housekeeping_setup() checks cpumask_intersects(present, online) to ensure
that the kernel will have at least one housekeeping CPU after smp_init(),
but this doesn't work if the maxcpus= kernel parameter limits the number of
processors available after bootup.

For example, a kernel with "maxcpus=2 nohz_full=0-2" parameters crashes at
boot time on a virtual machine with 4 CPUs.

Change housekeeping_setup() to use cpumask_first_and() and check that the
returned CPU number is valid and less than setup_max_cpus.

Another corner case is "nohz_full=0" on a machine with a single CPU or with
the maxcpus=1 kernel argument. In this case non_housekeeping_mask is empty
and tick_nohz_full_setup() makes no sense. And indeed, the kernel hits the
WARN_ON(tick_nohz_full_running) in tick_sched_do_timer().

And how should the kernel interpret the "nohz_full=" parameter? It should
be silently ignored, but currently cpulist_parse() happily returns the
empty cpumask and this leads to the same problem.

Change housekeeping_setup() to check cpumask_empty(non_housekeeping_mask)
and do nothing in this case.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20240413141746.GA10008@redhat.com
2024-04-28 10:08:21 +02:00
Oleg Nesterov
5097cbcb38 sched/isolation: Prevent boot crash when the boot CPU is nohz_full
Documentation/timers/no_hz.rst states that the "nohz_full=" mask must not
include the boot CPU, which is no longer true after:

  08ae95f4fd ("nohz_full: Allow the boot CPU to be nohz_full").

However after:

  aae17ebb53 ("workqueue: Avoid using isolated cpus' timers on queue_delayed_work")

the kernel will crash at boot time in this case; housekeeping_any_cpu()
returns an invalid CPU number until smp_init() brings the first
housekeeping CPU up.

Change housekeeping_any_cpu() to check the result of cpumask_any_and() and
return smp_processor_id() in this case.

This is just the simple and backportable workaround which fixes the
symptom, but smp_processor_id() at boot time should be safe at least for
type == HK_TYPE_TIMER, this more or less matches the tick_do_timer_boot_cpu
logic.

There is no worry about cpu_down(); tick_nohz_cpu_down() will not allow to
offline tick_do_timer_cpu (the 1st online housekeeping CPU).

Fixes: aae17ebb53 ("workqueue: Avoid using isolated cpus' timers on queue_delayed_work")
Reported-by: Chris von Recklinghausen <crecklin@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20240411143905.GA19288@redhat.com
Closes: https://lore.kernel.org/all/20240402105847.GA24832@redhat.com/
2024-04-28 10:07:12 +02:00
Linus Torvalds
2c81593889 Merge tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux
Pull Rust fixes from Miguel Ojeda:

 - Soundness: make internal functions generated by the 'module!' macro
   inaccessible, do not implement 'Zeroable' for 'Infallible' and
   require 'Send' for the 'Module' trait.

 - Build: avoid errors with "empty" files and workaround 'rustdoc' ICE.

 - Kconfig: depend on '!CFI_CLANG' and avoid selecting 'CONSTRUCTORS'.

 - Code docs: remove non-existing key from 'module!' macro example.

 - Docs: trivial rendering fix in arch table.

* tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux:
  rust: remove `params` from `module` macro example
  kbuild: rust: force `alloc` extern to allow "empty" Rust files
  kbuild: rust: remove unneeded `@rustc_cfg` to avoid ICE
  rust: kernel: require `Send` for `Module` implementations
  rust: phy: implement `Send` for `Registration`
  rust: make mutually exclusive with CFI_CLANG
  rust: macros: fix soundness issue in `module!` macro
  rust: init: remove impl Zeroable for Infallible
  docs: rust: fix improper rendering in Arch Support page
  rust: don't select CONSTRUCTORS
2024-04-27 12:11:55 -07:00
Linus Torvalds
57865f3970 Merge tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for TASK_SIZE on rv64/NOMMU, to reflect the lack of user/kernel
   separation

 - A fix to avoid loading rv64/NOMMU kernel past the start of RAM

 - A fix for RISCV_HWPROBE_EXT_ZVFHMIN on ilp32 to avoid signed integer
   overflow in the bitmask

 - The sud_test kselftest has been fixed to properly swizzle the syscall
   number into the return register, which are not the same on RISC-V

 - A fix for a build warning in the perf tools on rv32

 - A fix for the CBO selftests, to avoid non-constants leaking into the
   inline asm

 - A pair of fixes for T-Head PBMT errata probing, which has been
   renamed MAE by the vendor

* tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2
  perf riscv: Fix the warning due to the incompatible type
  riscv: T-Head: Test availability bit before enabling MAE errata
  riscv: thead: Rename T-Head PBMT to MAE
  selftests: sud_test: return correct emulated syscall value on RISC-V
  riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMIN
  riscv: Fix loading 64-bit NOMMU kernels past the start of RAM
  riscv: Fix TASK_SIZE on 64-bit NOMMU
2024-04-27 12:02:55 -07:00
Linus Torvalds
d43df69f38 Merge tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
 "Three smb3 client fixes, all also for stable:

   - two small locking fixes spotted by Coverity

   - FILE_ALL_INFO and network_open_info packing fix"

* tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: fix lock ordering potential deadlock in cifs_sync_mid_result
  smb3: missing lock when picking channel
  smb: client: Fix struct_group() usage in __packed structs
2024-04-27 11:35:40 -07:00
Linus Torvalds
5d12ed4bea Merge tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "Fix a race condition in the at24 eeprom handler, a NULL pointer
  exception in the I2C core for controllers only using target modes,
  drop a MAINTAINERS entry, and fix an incorrect DT binding for at24"

* tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: smbus: fix NULL function pointer dereference
  MAINTAINERS: Drop entry for PCA9541 bus master selector
  eeprom: at24: fix memory corruption race condition
  dt-bindings: eeprom: at24: Fix ST M24C64-D compatible schema
2024-04-27 11:24:53 -07:00
Tetsuo Handa
2e5449f4f2 profiling: Remove create_prof_cpu_mask().
create_prof_cpu_mask() is no longer used after commit 1f44a22577 ("s390:
convert interrupt handling to use generic hardirq").

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-04-27 11:17:48 -07:00
Linus Torvalds
8a5c3ef7db Merge tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire fix from Vinod Koul:

 - Single AMD driver fix for wake interrupt handling in clockstop mode

* tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: amd: fix for wake interrupt handling for clockstop mode
2024-04-27 11:14:32 -07:00
Linus Torvalds
6fba14a7b5 Merge tag 'dmaengine-fix-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:

 - Revert pl330 issue_pending waits until WFP state due to regression
   reported in Bluetooth loading

 - Xilinx driver fixes for synchronization, buffer offsets, locking and
   kdoc

 - idxd fixes for spinlock and preventing the migration of the perf
   context to an invalid target

 - idma driver fix for interrupt handling when powered off

 - Tegra driver residual calculation fix

 - Owl driver register access fix

* tag 'dmaengine-fix-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine: idxd: Fix oops during rmmod on single-CPU platforms
  dmaengine: xilinx: xdma: Clarify kdoc in XDMA driver
  dmaengine: xilinx: xdma: Fix synchronization issue
  dmaengine: xilinx: xdma: Fix wrong offsets in the buffers addresses in dma descriptor
  dma: xilinx_dpdma: Fix locking
  dmaengine: idxd: Convert spinlock to mutex to lock evl workqueue
  idma64: Don't try to serve interrupts when device is powered off
  dmaengine: tegra186: Fix residual calculation
  dmaengine: owl: fix register access functions
  dmaengine: Revert "dmaengine: pl330: issue_pending waits until WFP state"
2024-04-27 11:07:35 -07:00
Linus Torvalds
63407d3081 Merge tag 'phy-fixes-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy fixes from Vinod Koul:

 - static checker (array size, bounds) fix for marvel driver

 - Rockchip rk3588 pcie fixes for bifurcation and mux

 - Qualcomm qmp-compbo fix for VCO, register base and regulator name for
   m31 driver

 - charger det crash fix for ti driver

* tag 'phy-fixes-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
  phy: ti: tusb1210: Resolve charger-det crash if charger psy is unregistered
  phy: qcom: qmp-combo: fix VCO div offset on v5_5nm and v6
  phy: phy-rockchip-samsung-hdptx: Select CONFIG_RATIONAL
  phy: qcom: m31: match requested regulator name with dt schema
  phy: qcom: qmp-combo: Fix register base for QSERDES_DP_PHY_MODE
  phy: qcom: qmp-combo: Fix VCO div offset on v3
  phy: rockchip: naneng-combphy: Fix mux on rk3588
  phy: rockchip-snps-pcie3: fix clearing PHP_GRF_PCIESEL_CON bits
  phy: rockchip-snps-pcie3: fix bifurcation on rk3588
  phy: freescale: imx8m-pcie: fix pcie link-up instability
  phy: marvell: a3700-comphy: Fix hardcoded array size
  phy: marvell: a3700-comphy: Fix out of bounds read
2024-04-27 11:01:12 -07:00
Wolfram Sang
91811a31b6 i2c: smbus: fix NULL function pointer dereference
Baruch reported an OOPS when using the designware controller as target
only. Target-only modes break the assumption of one transfer function
always being available. Fix this by always checking the pointer in
__i2c_transfer.

Reported-by: Baruch Siach <baruch@tkos.co.il>
Closes: https://lore.kernel.org/r/4269631780e5ba789cf1ae391eec1b959def7d99.1712761976.git.baruch@tkos.co.il
Fixes: 4b1acc4333 ("i2c: core changes for slave support")
[wsa: dropped the simplification in core-smbus to avoid theoretical regressions]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Baruch Siach <baruch@tkos.co.il>
2024-04-27 12:57:57 +02:00
Linus Torvalds
5eb4573ea6 Merge tag 'soc-fixes-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "There are a lot of minor DT fixes for Mediatek, Rockchip, Qualcomm and
  Microchip and NXP, addressing both build-time warnings and bugs found
  during runtime testing.

  Most of these changes are machine specific fixups, but there are a few
  notable regressions that affect an entire SoC:

   - The Qualcomm MSI support that was improved for 6.9 ended up being
     wrong on some chips and now gets fixed.

   - The i.MX8MP camera interface broke due to a typo and gets updated
     again.

  The main driver fix is also for Qualcomm platforms, rewriting an
  interface in the QSEECOM firmware support that could lead to crashing
  the kernel from a trusted application.

  The only other code changes are minor fixes for Mediatek SoC drivers"

* tag 'soc-fixes-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (50 commits)
  ARM: dts: imx6ull-tarragon: fix USB over-current polarity
  soc: mediatek: mtk-socinfo: depends on CONFIG_SOC_BUS
  soc: mediatek: mtk-svs: Append "-thermal" to thermal zone names
  arm64: dts: imx8mp: Fix assigned-clocks for second CSI2
  ARM: dts: microchip: at91-sama7g54_curiosity: Replace regulator-suspend-voltage with the valid property
  ARM: dts: microchip: at91-sama7g5ek: Replace regulator-suspend-voltage with the valid property
  arm64: dts: rockchip: Fix USB interface compatible string on kobol-helios64
  arm64: dts: qcom: sc8180x: Fix ss_phy_irq for secondary USB controller
  arm64: dts: qcom: sm8650: Fix the msi-map entries
  arm64: dts: qcom: sm8550: Fix the msi-map entries
  arm64: dts: qcom: sm8450: Fix the msi-map entries
  arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP
  arm64: dts: qcom: x1e80100: Fix the compatible for cluster idle states
  arm64: dts: qcom: Fix type of "wdog" IRQs for remoteprocs
  arm64: dts: rockchip: regulator for sd needs to be always on for BPI-R2Pro
  dt-bindings: rockchip: grf: Add missing type to 'pcie-phy' node
  arm64: dts: rockchip: drop redundant disable-gpios in Lubancat 2
  arm64: dts: rockchip: drop redundant disable-gpios in Lubancat 1
  arm64: dts: rockchip: drop redundant pcie-reset-suspend in Scarlet Dumo
  arm64: dts: rockchip: mark system power controller and fix typo on orangepi-5-plus
  ...
2024-04-26 14:39:45 -07:00
Linus Torvalds
e6ebf01172 Merge tag 'mm-hotfixes-stable-2024-04-26-13-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "11 hotfixes. 8 are cc:stable and the remaining 3 (nice ratio!) address
  post-6.8 issues or aren't considered suitable for backporting.

  All except one of these are for MM. I see no particular theme - it's
  singletons all over"

* tag 'mm-hotfixes-stable-2024-04-26-13-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
  selftests: mm: protection_keys: save/restore nr_hugepages value from launch script
  stackdepot: respect __GFP_NOLOCKDEP allocation flag
  hugetlb: check for anon_vma prior to folio allocation
  mm: zswap: fix shrinker NULL crash with cgroup_disable=memory
  mm: turn folio_test_hugetlb into a PageType
  mm: support page_mapcount() on page_has_type() pages
  mm: create FOLIO_FLAG_FALSE and FOLIO_TYPE_OPS macros
  mm/hugetlb: fix missing hugetlb_lock for resv uncharge
  selftests: mm: fix unused and uninitialized variable warning
  selftests/harness: remove use of LINE_MAX
2024-04-26 13:48:03 -07:00
Linus Torvalds
4630932a55 Merge tag 'mmc-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:

 - moxart: Fix regression for sg_miter for PIO mode

 - sdhci-msm: Avoid hang by preventing access to suspended controller

 - sdhci-of-dwcmshc: Fix SD card tuning error for th1520

* tag 'mmc-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: moxart: fix handling of sgm->consumed, otherwise WARN_ON triggers
  mmc: sdhci-of-dwcmshc: th1520: Increase tuning loop count to 128
  mmc: sdhci-msm: pervent access to suspended controller
2024-04-26 13:17:33 -07:00
Linus Torvalds
c9e35b4aeb Merge tag 'arc-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:

 - Incorrect VIPT aliasing assumption

 - Misc build warning fixes and some typos

* tag 'arc-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: [plat-hsdk]: Remove misplaced interrupt-cells property
  ARC: Fix typos
  ARC: mm: fix new code about cache aliasing
  ARC: Fix -Wmissing-prototypes warnings
2024-04-26 13:11:33 -07:00
Linus Torvalds
bbacf717de Merge tag 'mtd/fixes-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fixes from Miquel Raynal:
 "There has been OTP support improvements in the NVMEM subsystem, and
  later also improvements of OTP support in the NAND subsystem. This
  lead to situations that we currently cannot handle, so better prevent
  this situation from happening in order to avoid canceling device's
  probe.

  In the raw NAND subsystem, two runtime fixes have been shared, one
  fixing two important commands in the Qcom driver since it got reworked
  and a NULL pointer dereference happening on STB chips.

  Arnd also fixed a UBSAN link failure on diskonchip"

* tag 'mtd/fixes-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: limit OTP NVMEM cell parse to non-NAND devices
  mtd: diskonchip: work around ubsan link failure
  mtd: rawnand: qcom: Fix broken OP_RESET_DEVICE command in qcom_misc_cmd_type_exec()
  mtd: rawnand: brcmnand: Fix data access violation for STB chip
2024-04-26 13:05:34 -07:00
Linus Torvalds
3022bf37da Merge tag 'gpio-fixes-for-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - fix a regression in pin access control in gpio-tegra186

 - make data pointer dereference robust in Intel Tangier driver

* tag 'gpio-fixes-for-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: tegra186: Fix tegra186_gpio_is_accessible() check
  gpio: tangier: Use correct type for the IRQ chip data
2024-04-26 11:27:02 -07:00
Linus Torvalds
5b43efa158 Merge tag 'cxl-fixes-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl fix from Dave Jiang:

 - Fix potential payload size confusion in cxl_mem_get_poison()

* tag 'cxl-fixes-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/core: Fix potential payload size confusion in cxl_mem_get_poison()
2024-04-26 11:21:20 -07:00
Linus Torvalds
08f0677dfc Merge tag 'for-6.9/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:

 - Fix 6.9 regression so that DM device removal is performed
   synchronously by default.

   Asynchronous removal has always been possible but it isn't the
   default. It is important that synchronous removal be preserved,
   otherwise it is an interface change that breaks lvm2.

 - Remove errant semicolon in drivers/md/dm-vdo/murmurhash3.c

* tag 'for-6.9/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: restore synchronous close of device mapper block device
  dm vdo murmurhash: remove unneeded semicolon
2024-04-26 11:17:24 -07:00
Linus Torvalds
52034cae02 Merge tag 'vfs-6.9-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
 "This contains a few small fixes for this merge window and the attempt
  to handle the ntfs removal regression that was reported a little while
  ago:

   - After the removal of the legacy ntfs driver we received reports
     about regressions for some people that do mount "ntfs" explicitly
     and expect the driver to be available. Since ntfs3 is a drop-in for
     legacy ntfs we alias legacy ntfs to ntfs3 just like ext3 is aliased
     to ext4.

     We also enforce legacy ntfs is always mounted read-only and give it
     custom file operations to ensure that ioctl()'s can't be abused to
     perform write operations.

   - Fix an unbalanced module_get() in bdev_open().

   - Two smaller fixes for the netfs work done earlier in this cycle.

   - Fix the errno returned from the new FS_IOC_GETUUID and
     FS_IOC_GETFSSYSFSPATH ioctls. Both commands just pull information
     out of the superblock so there's no need to call into the actual
     ioctl handlers.

     So instead of returning ENOIOCTLCMD to indicate to fallback we just
     return ENOTTY directly avoiding that indirection"

* tag 'vfs-6.9-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  netfs: Fix the pre-flush when appending to a file in writethrough mode
  netfs: Fix writethrough-mode error handling
  ntfs3: add legacy ntfs file operations
  ntfs3: enforce read-only when used as legacy ntfs driver
  ntfs3: serve as alias for the legacy ntfs driver
  block: fix module reference leakage from bdev_open_by_dev error path
  fs: Return ENOTTY directly if FS_IOC_GETUUID or FS_IOC_GETFSSYSFSPATH fail
2024-04-26 11:01:28 -07:00