Commit Graph

1294267 Commits

Author SHA1 Message Date
Linus Torvalds
c9f33436d8 Merge tag 'riscv-for-linus-6.11-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:

 - Support for NUMA (via SRAT and SLIT), console output (via SPCR), and
   cache info (via PPTT) on ACPI-based systems.

 - The trap entry/exit code no longer breaks the return address stack
   predictor on many systems, which results in an improvement to trap
   latency.

 - Support for HAVE_ARCH_STACKLEAK.

 - The sv39 linear map has been extended to support 128GiB mappings.

 - The frequency of the mtime CSR is now visible via hwprobe.

* tag 'riscv-for-linus-6.11-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits)
  RISC-V: Provide the frequency of time CSR via hwprobe
  riscv: Extend sv39 linear mapping max size to 128G
  riscv: enable HAVE_ARCH_STACKLEAK
  riscv: signal: Remove unlikely() from WARN_ON() condition
  riscv: Improve exception and system call latency
  RISC-V: Select ACPI PPTT drivers
  riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT
  riscv: cacheinfo: remove the useless input parameter (node) of ci_leaf_init()
  RISC-V: ACPI: Enable SPCR table for console output on RISC-V
  riscv: boot: remove duplicated targets line
  trace: riscv: Remove deprecated kprobe on ftrace support
  riscv: cpufeature: Extract common elements from extension checking
  riscv: Introduce vendor variants of extension helpers
  riscv: Add vendor extensions to /proc/cpuinfo
  riscv: Extend cpufeature.c to detect vendor extensions
  RISC-V: run savedefconfig for defconfig
  RISC-V: hwprobe: sort EXT_KEY()s in hwprobe_isa_ext0() alphabetically
  ACPI: NUMA: replace pr_info with pr_debug in arch_acpi_numa_init
  ACPI: NUMA: change the ACPI_NUMA to a hidden option
  ACPI: NUMA: Add handler for SRAT RINTC affinity structure
  ...
2024-07-27 10:14:34 -07:00
Linus Torvalds
c17f1224b8 Merge tag 'for-linus-6.11-rc1a-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
 "Two fixes for issues introduced in this merge window:

   - fix enhanced debugging in the Xen multicall handling

   - two patches fixing a boot failure when running as dom0 in PVH mode"

* tag 'for-linus-6.11-rc1a-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: fix memblock_reserve() usage on PVH
  x86/xen: move xen_reserve_extra_memory()
  xen: fix multicall debug data referencing
2024-07-27 09:58:24 -07:00
Linus Torvalds
3a7e02c040 minmax: avoid overly complicated constant expressions in VM code
The minmax infrastructure is overkill for simple constants, and can
cause huge expansions because those simple constants are then used by
other things.

For example, 'pageblock_order' is a core VM constant, but because it was
implemented using 'min_t()' and all the type-checking that involves, it
actually expanded to something like 2.5kB of preprocessor noise.

And when that simple constant was then used inside other expansions:

  #define pageblock_nr_pages      (1UL << pageblock_order)
  #define pageblock_start_pfn(pfn)  ALIGN_DOWN((pfn), pageblock_nr_pages)

and we then use that inside a 'max()' macro:

	case ISOLATE_SUCCESS:
		update_cached = false;
		last_migrated_pfn = max(cc->zone->zone_start_pfn,
			pageblock_start_pfn(cc->migrate_pfn - 1));

the end result was that one statement expanding to 253kB in size.

There are probably other cases of this, but this one case certainly
stood out.

I've added 'MIN_T()' and 'MAX_T()' macros for this kind of "core simple
constant with specific type" use.  These macros skip the type checking,
and as such need to be very sparingly used only for obvious cases that
have active issues like this.

Reported-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Link: https://lore.kernel.org/all/36aa2cad-1db1-4abf-8dd2-fb20484aabc3@lucifer.local/
Cc: David Laight <David.Laight@aculab.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-26 15:32:27 -07:00
Linus Torvalds
e8432ac802 minmax: avoid overly complex min()/max() macro arguments in xen
We have some very fancy min/max macros that have tons of sanity checking
to warn about mixed signedness etc.

This is all things that a sane compiler should warn about, but there are
no sane compiler interfaces for this, and '-Wsign-compare' is broken [1]
and not useful.

So then we compensate (some would say over-compensate) by doing the
checks manually with some truly horrid macro games.

And no, we can't just use __builtin_types_compatible_p(), because the
whole question of "does it make sense to compare these two values" is a
lot more complicated than that.

For example, it makes a ton of sense to compare unsigned values with
simple constants like "5", even if that is indeed a signed type.  So we
have these very strange macros to try to make sensible type checking
decisions on the arguments to 'min()' and 'max()'.

But that can cause enormous code expansion if the min()/max() macros are
used with complicated expressions, and particularly if you nest these
things so that you get the first big expansion then expanded again.

The xen setup.c file ended up ballooning to over 50MB of preprocessed
noise that takes 15s to compile (obviously depending on the build host),
largely due to one single line.

So let's split that one single line to just be simpler.  I think it ends
up being more legible to humans too at the same time.  Now that single
file compiles in under a second.

Reported-and-reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Link: https://lore.kernel.org/all/c83c17bb-be75-4c67-979d-54eee38774c6@lucifer.local/
Link: https://staticthinking.wordpress.com/2023/07/25/wsign-compare-is-garbage/ [1]
Cc: David Laight <David.Laight@aculab.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-26 15:09:07 -07:00
Linus Torvalds
2f8c4f5062 Merge tag 'auxdisplay-for-v6.11-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull auxdisplay updates from Geert Uytterhoeven:

  - add support for configuring the boot message on line displays

  - miscellaneous fixes and improvements

* tag 'auxdisplay-for-v6.11-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  auxdisplay: ht16k33: Drop reference after LED registration
  auxdisplay: Use sizeof(*pointer) instead of sizeof(type)
  auxdisplay: hd44780: add missing MODULE_DESCRIPTION() macro
  auxdisplay: linedisp: add missing MODULE_DESCRIPTION() macro
  auxdisplay: linedisp: Support configuring the boot message
  auxdisplay: charlcd: Provide a forward declaration
2024-07-26 11:04:28 -07:00
Linus Torvalds
eb966e0c5f Merge tag 'sound-fix-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "A collection of fixes gathered since the previous pull.

  We see a bit large LOCs at a HD-audio quirk, but that's only bulk COEF
  data, hence it's safe to take. In addition to that, there were two
  minor fixes for MIDI 2.0 handling for ALSA core, and the rest are all
  rather random small and device-specific fixes"

* tag 'sound-fix-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: fsl-asoc-card: Dynamically allocate memory for snd_soc_dai_link_components
  ASoC: amd: yc: Support mic on Lenovo Thinkpad E16 Gen 2
  ALSA: hda/realtek: Implement sound init sequence for Samsung Galaxy Book3 Pro 360
  ALSA: hda/realtek: cs35l41: Fixup remaining asus strix models
  ASoC: SOF: ipc4-topology: Preserve the DMA Link ID for ChainDMA on unprepare
  ASoC: SOF: ipc4-topology: Only handle dai_config with HW_PARAMS for ChainDMA
  ALSA: ump: Force 1 Group for MIDI1 FBs
  ALSA: ump: Don't update FB name for static blocks
  ALSA: usb-audio: Add a quirk for Sonix HD USB Camera
  ASoC: TAS2781: Fix tasdev_load_calibrated_data()
  ASoC: tegra: select CONFIG_SND_SIMPLE_CARD_UTILS
  ASoC: Intel: use soc_intel_is_byt_cr() only when IOSF_MBI is reachable
  ALSA: usb-audio: Move HD Webcam quirk to the right place
  ALSA: hda: tas2781: mark const variables as __maybe_unused
  ALSA: usb-audio: Fix microphone sound on HD webcam.
  ASoC: sof: amd: fix for firmware reload failure in Vangogh platform
  ASoC: Intel: Fix RT5650 SSP lookup
  ASOC: SOF: Intel: hda-loader: only wait for HDaudio IOC for IPC4 devices
  ASoC: SOF: imx8m: Fix DSP control regmap retrieval
2024-07-26 11:01:31 -07:00
Linus Torvalds
0ba9b15511 Merge tag 'drm-next-2024-07-26' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Fixes for rc1, mostly amdgpu, i915 and xe, with some other misc ones,
  doesn't seem to be anything too serious.

  amdgpu:
   - Bump driver version for GFX12 DCC
   - DC documention warning fixes
   - VCN unified queue power fix
   - SMU fix
   - RAS fix
   - Display corruption fix
   - SDMA 5.2 workaround
   - GFX12 fixes
   - Uninitialized variable fix
   - VCN/JPEG 4.0.3 fixes
   - Misc display fixes
   - RAS fixes
   - VCN4/5 harvest fix
   - GPU reset fix

  i915:
   - Reset intel_dp->link_trained before retraining the link
   - Don't switch the LTTPR mode on an active link
   - Do not consider preemption during execlists_dequeue for gen8
   - Allow NULL memory region

  xe:
   - xe_exec ioctl minor fix on sync entry cleanup upon error
   - SRIOV: limit VF LMEM provisioning
   - Wedge mode fixes

  v3d:
   - fix indirect dispatch on newer v3d revs

  panel:
   - fix panel backlight bindings"

* tag 'drm-next-2024-07-26' of https://gitlab.freedesktop.org/drm/kernel: (39 commits)
  drm/amdgpu: reset vm state machine after gpu reset(vram lost)
  drm/amdgpu: add missed harvest check for VCN IP v4/v5
  drm/amdgpu: Fix eeprom max record count
  drm/amdgpu: fix ras UE error injection failure issue
  drm/amd/display: Remove ASSERT if significance is zero in math_ceil2
  drm/amd/display: Check for NULL pointer
  drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF
  drm/amdgpu: Add empty HDP flush function to VCN v4.0.3
  drm/amdgpu: Add empty HDP flush function to JPEG v4.0.3
  drm/amd/amdgpu: Fix uninitialized variable warnings
  drm/amdgpu: Fix atomics on GFX12
  drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell
  drm/i915: Allow NULL memory region
  drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8
  dt-bindings: display: panel: samsung,atna33xc20: Document ATNA45AF01
  drm/xe: Don't suspend device upon wedge
  drm/xe: Wedge the entire device
  drm/xe/pf: Limit fair VF LMEM provisioning
  drm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup
  drm/amd/display: fix corruption with high refresh rates on DCN 3.0
  ...
2024-07-26 10:57:07 -07:00
Linus Torvalds
65ad409e63 Merge tag 's390-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Vasily Gorbik:

 - Fix KMSAN build breakage caused by the conflict between s390 and
   mm-stable trees

 - Add KMSAN page markers for ptdump

 - Add runtime constant support

 - Fix __pa/__va for modules under non-GPL licenses by exporting
   necessary vm_layout struct with EXPORT_SYMBOL to prevent linkage
   problems

 - Fix an endless loop in the CF_DIAG event stop in the CPU Measurement
   Counter Facility code when the counter set size is zero

 - Remove the PROTECTED_VIRTUALIZATION_GUEST config option and enable
   its functionality by default

 - Support allocation of multiple MSI interrupts per device and improve
   logging of architecture-specific limitations

 - Add support for lowcore relocation as a debugging feature to catch
   all null ptr dereferences in the kernel address space, improving
   detection beyond the current implementation's limited write access
   protection

 - Clean up and rework CPU alternatives to allow for callbacks and early
   patching for the lowcore relocation

* tag 's390-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (39 commits)
  s390: Remove protvirt and kvm config guards for uv code
  s390/boot: Add cmdline option to relocate lowcore
  s390/kdump: Make kdump ready for lowcore relocation
  s390/entry: Make system_call() ready for lowcore relocation
  s390/entry: Make ret_from_fork() ready for lowcore relocation
  s390/entry: Make __switch_to() ready for lowcore relocation
  s390/entry: Make restart_int_handler() ready for lowcore relocation
  s390/entry: Make mchk_int_handler() ready for lowcore relocation
  s390/entry: Make int handlers ready for lowcore relocation
  s390/entry: Make pgm_check_handler() ready for lowcore relocation
  s390/entry: Add base register to CHECK_VMAP_STACK/CHECK_STACK macro
  s390/entry: Add base register to SIEEXIT macro
  s390/entry: Add base register to MBEAR macro
  s390/entry: Make __sie64a() ready for lowcore relocation
  s390/head64: Make startup code ready for lowcore relocation
  s390: Add infrastructure to patch lowcore accesses
  s390/atomic_ops: Disable flag outputs constraint for GCC versions below 14.2.0
  s390/entry: Move SIE indicator flag to thread info
  s390/nmi: Simplify ptregs setup
  s390/alternatives: Remove alternative facility list
  ...
2024-07-26 10:47:53 -07:00
Linus Torvalds
a6294b5b1f Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
 "The usual summary below, but the main fix is for the fast GUP lockless
  page-table walk when we have a combination of compile-time and
  run-time folding of the p4d and the pud respectively.

   - Remove some redundant Kconfig conditionals

   - Fix string output in ptrace selftest

   - Fix fast GUP crashes in some page-table configurations

   - Remove obsolete linker option when building the vDSO

   - Fix some sysreg field definitions for the GIC"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: Fix lockless walks with static and dynamic page-table folding
  arm64/sysreg: Correct the values for GICv4.1
  arm64/vdso: Remove --hash-style=sysv
  kselftest: missing arg in ptrace.c
  arm64/Kconfig: Remove redundant 'if HAVE_FUNCTION_GRAPH_TRACER'
  arm64: remove redundant 'if HAVE_ARCH_KASAN' in Kconfig
2024-07-26 10:39:10 -07:00
Linus Torvalds
6467dfdfc9 Merge tag 'ceph-for-6.11-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
 "A small patchset to address bogus I/O errors and ultimately an
  assertion failure in the face of watch errors with -o exclusive
  mappings in RBD marked for stable and some assorted CephFS fixes"

* tag 'ceph-for-6.11-rc1' of https://github.com/ceph/ceph-client:
  rbd: don't assume rbd_is_lock_owner() for exclusive mappings
  rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings
  rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
  ceph: fix incorrect kmalloc size of pagevec mempool
  ceph: periodically flush the cap releases
  ceph: convert comma to semicolon in __ceph_dentry_dir_lease_touch()
  ceph: use cap_wait_list only if debugfs is enabled
2024-07-26 10:34:42 -07:00
Linus Torvalds
732c275394 Merge tag 'erofs-for-6.11-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull more erofs updates from Gao Xiang:

 - Support STATX_DIOALIGN and FS_IOC_GETFSSYSFSPATH

 - Fix a race of LZ4 decompression due to recent refactoring

 - Another multi-page folio adaption in erofs_bread()

* tag 'erofs-for-6.11-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: convert comma to semicolon
  erofs: support multi-page folios for erofs_bread()
  erofs: add support for FS_IOC_GETFSSYSFSPATH
  erofs: fix race in z_erofs_get_gbuf()
  erofs: support STATX_DIOALIGN
2024-07-26 10:31:03 -07:00
Linus Torvalds
dd90ad50cb Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull struct file leak fixes from Al Viro:
 "a couple of leaks on failure exits missing fdput()"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  lirc: rc_dev_get_from_fd(): fix file leak
  powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()
2024-07-26 10:26:33 -07:00
Linus Torvalds
4c7be57f27 arm64: allow installing compressed image by default
On arm64 we build compressed images, but "make install" by default will
install the old non-compressed one.  To actually get the compressed
image install, you need to use "make zinstall", which is not the usual
way to install a kernel.

Which may not sound like much of an issue, but when you deal with
multiple architectures (and years of your fingers knowing the regular
"make install" incantation), this inconsistency is pretty annoying.

But as Will Deacon says:
 "Sadly, bootloaders being as top quality as you might expect, I don't
  think we're in a position to rely on decompressor support across the
  board. Our Image.gz is literally just that -- we don't have a built-in
  decompressor (nor do I think we want to rush into that again after the
  fun we had on arm32) and the recent EFI zboot support solves that
  problem for platforms using EFI.

  Changing the default 'install' target terrifies me. There are bound to
  be folks with embedded boards who've scripted this and we could really
  ruin their day if we quietly give them a compressed kernel that their
  bootloader doesn't know how to handle :/"

So make this conditional on a new "COMPRESSED_INSTALL" option.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-26 10:07:56 -07:00
Linus Torvalds
51c4767503 Merge tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux
Pull bitmap updates from Yury Norov:
 "Random fixes"

* tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux:
  riscv: Remove unnecessary int cast in variable_fls()
  radix tree test suite: put definition of bitmap_clear() into lib/bitmap.c
  bitops: Add a comment explaining the double underscore macros
  lib: bitmap: add missing MODULE_DESCRIPTION() macros
  cpumask: introduce assign_cpu() macro
2024-07-26 09:50:36 -07:00
Palmer Dabbelt
52420e483d RISC-V: Provide the frequency of time CSR via hwprobe
The RISC-V architecture makes a real time counter CSR (via RDTIME
instruction) available for applications in U-mode but there is no
architected mechanism for an application to discover the frequency
the counter is running at. Some applications (e.g., DPDK) use the
time counter for basic performance analysis as well as fine grained
time-keeping.

Add support to the hwprobe system call to export the time CSR
frequency to code running in U-mode.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Punit Agrawal <punit.agrawal@bytedance.com>
Link: https://lore.kernel.org/r/20240702033731.71955-2-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:51 -07:00
Stuart Menefy
5c8405d763 riscv: Extend sv39 linear mapping max size to 128G
This harmonizes all virtual addressing modes which can now all map
(PGDIR_SIZE * PTRS_PER_PGD) / 4 of physical memory.

The RISCV implementation of KASAN requires that the boundary between
shallow mappings are aligned on an 8G boundary. In this case we need
VMALLOC_START to be 8G aligned. So although we only need to move the
start of the linear mapping down by 4GiB to allow 128GiB to be mapped,
we actually move it down by 8GiB (creating a 4GiB hole between the
linear mapping and KASAN shadow space) to maintain the alignment
requirement.

Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240630110550.1731929-1-stuart.menefy@codasip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:50 -07:00
Palmer Dabbelt
3aa1a7d013 Merge patch series "RISC-V: Select ACPI PPTT drivers"
This series adds support for ACPI PPTT via cacheinfo.

* b4-shazam-merge:
  RISC-V: Select ACPI PPTT drivers
  riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT
  riscv: cacheinfo: remove the useless input parameter (node) of ci_leaf_init()

Link: https://lore.kernel.org/r/20240617131425.7526-1-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:49 -07:00
Palmer Dabbelt
ec1dc56b54 Merge patch "Enable SPCR table for console output on RISC-V"
Sia Jee Heng <jeeheng.sia@starfivetech.com> says:

The ACPI SPCR code has been used to enable console output for ARM64 and
X86. The same code can be reused for RISC-V. Furthermore, SPCR table is
mandated for headless system as outlined in the RISC-V BRS
Specification, chapter 6.

* b4-shazam-merge:
  RISC-V: ACPI: Enable SPCR table for console output on RISC-V

Link: https://lore.kernel.org/r/20240502073751.102093-1-jeeheng.sia@starfivetech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:48 -07:00
Jisheng Zhang
b5db73fb18 riscv: enable HAVE_ARCH_STACKLEAK
Add support for the stackleak feature. Whenever the kernel returns to user
space the kernel stack is filled with a poison value.

At the same time, disables the plugin in EFI stub code because EFI stub
is out of scope for the protection.

Tested on qemu and milkv duo:
/ # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
[   38.675575] lkdtm: Performing direct entry STACKLEAK_ERASING
[   38.678448] lkdtm: stackleak stack usage:
[   38.678448]   high offset: 288 bytes
[   38.678448]   current:     496 bytes
[   38.678448]   lowest:      1328 bytes
[   38.678448]   tracked:     1328 bytes
[   38.678448]   untracked:   448 bytes
[   38.678448]   poisoned:    14312 bytes
[   38.678448]   low offset:  8 bytes
[   38.689887] lkdtm: OK: the rest of the thread stack is properly erased

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240623235316.2010-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:47 -07:00
Zhongqiu Han
1d20e5d437 riscv: signal: Remove unlikely() from WARN_ON() condition
"WARN_ON(unlikely(x))" is excessive. WARN_ON() already uses unlikely()
internally.

Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
Reviewed-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Link: https://lore.kernel.org/r/20240620033434.3778156-1-quic_zhonhan@quicinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:46 -07:00
Anton Blanchard
5d5fc33ce5 riscv: Improve exception and system call latency
Many CPUs implement return address branch prediction as a stack. The
RISCV architecture refers to this as a return address stack (RAS). If
this gets corrupted then the CPU will mispredict at least one but
potentally many function returns.

There are two issues with the current RISCV exception code:

- We are using the alternate link stack (x5/t0) for the indirect branch
  which makes the hardware think this is a function return. This will
  corrupt the RAS.

- We modify the return address of handle_exception to point to
  ret_from_exception. This will also corrupt the RAS.

Testing the null system call latency before and after the patch:

Visionfive2 (StarFive JH7110 / U74)
baseline: 189.87 ns
patched:  176.76 ns

Lichee pi 4a (T-Head TH1520 / C910)
baseline: 666.58 ns
patched:  636.90 ns

Just over 7% on the U74 and just over 4% on the C910.

Signed-off-by: Anton Blanchard <antonb@tenstorrent.com>
Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com>
Tested-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://lore.kernel.org/r/20240607061335.2197383-1-cyrilbur@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:45 -07:00
Chen Ni
14e9283fb2 erofs: convert comma to semicolon
Replace a comma between expression statements by a semicolon.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Link: https://lore.kernel.org/r/20240724020721.2389738-1-nichen@iscas.ac.cn
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-07-26 18:48:12 +08:00
Gao Xiang
5d3bb77e5f erofs: support multi-page folios for erofs_bread()
If the requested page is part of the previous multi-page folio, there
is no need to call read_mapping_folio() again.

Also, get rid of the remaining one of page->index [1] in our codebase.

[1] https://lore.kernel.org/r/Zp8fgUSIBGQ1TN0D@casper.infradead.org

Cc: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240723073024.875290-1-hsiangkao@linux.alibaba.com
2024-07-26 18:47:57 +08:00
Huang Xiaojia
684b290abc erofs: add support for FS_IOC_GETFSSYSFSPATH
FS_IOC_GETFSSYSFSPATH ioctl exposes /sys/fs path of a given filesystem,
potentially standarizing sysfs reporting. This patch add support for
FS_IOC_GETFSSYSFSPATH for erofs, "erofs/<dev>" will be outputted for bdev
cases, "erofs/[domain_id,]<fs_id>" will be outputted for fscache cases.

Signed-off-by: Huang Xiaojia <huangxiaojia2@huawei.com>
Link: https://lore.kernel.org/r/20240720082335.441563-1-huangxiaojia2@huawei.com
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-07-26 18:47:46 +08:00
Gao Xiang
7dc5537c3f erofs: fix race in z_erofs_get_gbuf()
In z_erofs_get_gbuf(), the current task may be migrated to another
CPU between `z_erofs_gbuf_id()` and `spin_lock(&gbuf->lock)`.

Therefore, z_erofs_put_gbuf() will trigger the following issue
which was found by stress test:

<2>[772156.434168] kernel BUG at fs/erofs/zutil.c:58!
..
<4>[772156.435007]
<4>[772156.439237] CPU: 0 PID: 3078 Comm: stress Kdump: loaded Tainted: G            E      6.10.0-rc7+ #2
<4>[772156.439239] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 1.0.0 01/01/2017
<4>[772156.439241] pstate: 83400005 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
<4>[772156.439243] pc : z_erofs_put_gbuf+0x64/0x70 [erofs]
<4>[772156.439252] lr : z_erofs_lz4_decompress+0x600/0x6a0 [erofs]
..
<6>[772156.445958] stress (3127): drop_caches: 1
<4>[772156.446120] Call trace:
<4>[772156.446121]  z_erofs_put_gbuf+0x64/0x70 [erofs]
<4>[772156.446761]  z_erofs_lz4_decompress+0x600/0x6a0 [erofs]
<4>[772156.446897]  z_erofs_decompress_queue+0x740/0xa10 [erofs]
<4>[772156.447036]  z_erofs_runqueue+0x428/0x8c0 [erofs]
<4>[772156.447160]  z_erofs_readahead+0x224/0x390 [erofs]
..

Fixes: f36f3010f6 ("erofs: rename per-CPU buffers to global buffer pool and make it configurable")
Cc: <stable@vger.kernel.org> # 6.10+
Reviewed-by: Chunhai Guo <guochunhai@vivo.com>
Reviewed-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240722035110.3456740-1-hsiangkao@linux.alibaba.com
2024-07-26 18:47:33 +08:00
Hongbo Li
9c421ef3f6 erofs: support STATX_DIOALIGN
Add support for STATX_DIOALIGN to EROFS, so that direct I/O
alignment restrictions are exposed to userspace in a generic
way.

[Before]
```
./statx_test /mnt/erofs/testfile
statx(/mnt/erofs/testfile) = 0
dio mem align:0
dio offset align:0
```

[After]
```
./statx_test /mnt/erofs/testfile
statx(/mnt/erofs/testfile) = 0
dio mem align:512
dio offset align:512
```

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240718083243.2485437-1-hsiangkao@linux.alibaba.com
2024-07-26 18:47:22 +08:00
Dave Airlie
d4ef5d2b7e Merge tag 'amd-drm-fixes-6.11-2024-07-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-fixes-6.11-2024-07-25:

amdgpu:
- SDMA 5.2 workaround
- GFX12 fixes
- Uninitialized variable fix
- VCN/JPEG 4.0.3 fixes
- Misc display fixes
- RAS fixes
- VCN4/5 harvest fix
- GPU reset fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240725202900.2155572-1-alexander.deucher@amd.com
2024-07-26 09:52:15 +10:00
Dave Airlie
86f259cb7c Merge tag 'drm-misc-next-fixes-2024-07-25' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
A single fix for a panel compatible

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240725-frisky-wren-of-tact-f5f504@houat
2024-07-26 09:51:14 +10:00
Dave Airlie
a37cd98cd5 Merge tag 'drm-intel-next-fixes-2024-07-25' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
- Do not consider preemption during execlists_dequeue for gen8 [gt] (Nitin Gote)
- Allow NULL memory region (Jonathan Cavitt)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZqICQzyzm/6hDWy4@linux
2024-07-26 06:41:03 +10:00
Linus Torvalds
1722389b0d Merge tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf and netfilter.

  A lot of networking people were at a conference last week, busy
  catching COVID, so relatively short PR.

  Current release - regressions:

   - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP

  Current release - new code bugs:

   - l2tp: protect session IDR and tunnel session list with one lock,
     make sure the state is coherent to avoid a warning

   - eth: bnxt_en: update xdp_rxq_info in queue restart logic

   - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field

  Previous releases - regressions:

   - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len,
     the field reuses previously un-validated pad

  Previous releases - always broken:

   - tap/tun: drop short frames to prevent crashes later in the stack

   - eth: ice: add a per-VF limit on number of FDIR filters

   - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash"

* tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits)
  tun: add missing verification for short frame
  tap: add missing verification for short frame
  mISDN: Fix a use after free in hfcmulti_tx()
  gve: Fix an edge case for TSO skb validity check
  bnxt_en: update xdp_rxq_info in queue restart logic
  tcp: process the 3rd ACK with sk_socket for TFO/MPTCP
  selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
  xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
  bpf: Fix a segment issue when downgrading gso_size
  net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling
  MAINTAINERS: make Breno the netconsole maintainer
  MAINTAINERS: Update bonding entry
  net: nexthop: Initialize all fields in dumped nexthops
  net: stmmac: Correct byte order of perfect_match
  selftests: forwarding: skip if kernel not support setting bridge fdb learning limit
  tipc: Return non-zero value from tipc_udp_addr2str() on error
  netfilter: nft_set_pipapo_avx2: disable softinterrupts
  ice: Fix recipe read procedure
  ice: Add a per-VF limit on number of FDIR filters
  net: bonding: correctly annotate RCU in bond_should_notify_peers()
  ...
2024-07-25 13:32:25 -07:00
Linus Torvalds
8bf100092d Merge tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:

 - trivial printk changes

The bigger "real" printk work is still being discussed.

* tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  vsprintf: add missing MODULE_DESCRIPTION() macro
  printk: Rename console_replay_all() and update context
2024-07-25 13:18:41 -07:00
Linus Torvalds
b485625078 Merge tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl constification from Joel Granados:
 "Treewide constification of the ctl_table argument of proc_handlers
  using a coccinelle script and some manual code formatting fixups.

  This is a prerequisite to moving the static ctl_table structs into
  read-only data section which will ensure that proc_handler function
  pointers cannot be modified"

* tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: treewide: constify the ctl_table argument of proc_handlers
2024-07-25 12:58:36 -07:00
Linus Torvalds
bba959f477 Merge tag 'efi-fixes-for-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:

 - Wipe screen_info after allocating it from the heap - used by arm32
   and EFI zboot, other EFI architectures allocate it statically

 - Revert to allocating boot_params from the heap on x86 when entering
   via the native PE entrypoint, to work around a regression on older
   Dell hardware

* tag 'efi-fixes-for-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  x86/efistub: Revert to heap allocated boot_params for PE entrypoint
  efi/libstub: Zero initialize heap allocated struct screen_info
2024-07-25 12:55:21 -07:00
Linus Torvalds
9b21993654 Merge tag 'kgdb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
Pull kgdb updates from Daniel Thompson:
 "Three small changes this cycle:

   - Clean up an architecture abstraction that is no longer needed
     because all the architectures have converged.

   - Actually use the prompt argument to kdb_position_cursor() instead
     of ignoring it (functionally this fix is a nop but that was due to
     luck rather than good judgement)

   - Fix a -Wformat-security warning"

* tag 'kgdb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
  kdb: Get rid of redundant kdb_curr_task()
  kdb: Use the passed prompt in kdb_position_cursor()
  kdb: address -Wformat-security warnings
2024-07-25 12:48:42 -07:00
Linus Torvalds
28e7241cb8 Merge tag 'mips_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:

 - Use improved timer sync for Loongson64

 - Fix address of GCR_ACCESS register

 - Add missing MODULE_DESCRIPTION

* tag 'mips_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  mips: sibyte: add missing MODULE_DESCRIPTION() macro
  MIPS: SMP-CPS: Fix address for GCR_ACCESS register for CM3 and later
  MIPS: Loongson64: Switch to SYNC_R4K
2024-07-25 12:41:53 -07:00
Linus Torvalds
f646429524 Merge tag 'parisc-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "The gettimeofday() and clock_gettime() syscalls are now available as
  vDSO functions, and Dave added a patch which allows to use NVMe cards
  in the PCI slots as fast and easy alternative to SCSI discs.

  Summary:

   - add gettimeofday() and clock_gettime() vDSO functions

   - enable PCI_MSI_ARCH_FALLBACKS to allow PCI to PCIe bridge adaptor
     with PCIe NVME card to function in parisc machines

   - allow users to reduce kernel unaligned runtime warnings

   - minor code cleanups"

* tag 'parisc-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Add support for CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN
  parisc: Use max() to calculate parisc_tlb_flush_threshold
  parisc: Fix warning at drivers/pci/msi/msi.h:121
  parisc: Add 64-bit gettimeofday() and clock_gettime() vDSO functions
  parisc: Add 32-bit gettimeofday() and clock_gettime() vDSO functions
  parisc: Clean up unistd.h file
2024-07-25 12:37:42 -07:00
Linus Torvalds
f9bcc61ad1 Merge tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull UML updates from Richard Weinberger:

 - Support for preemption

 - i386 Rust support

 - Huge cleanup by Benjamin Berg

 - UBSAN support

 - Removal of dead code

* tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (41 commits)
  um: vector: always reset vp->opened
  um: vector: remove vp->lock
  um: register power-off handler
  um: line: always fill *error_out in setup_one_line()
  um: remove pcap driver from documentation
  um: Enable preemption in UML
  um: refactor TLB update handling
  um: simplify and consolidate TLB updates
  um: remove force_flush_all from fork_handler
  um: Do not flush MM in flush_thread
  um: Delay flushing syscalls until the thread is restarted
  um: remove copy_context_skas0
  um: remove LDT support
  um: compress memory related stub syscalls while adding them
  um: Rework syscall handling
  um: Add generic stub_syscall6 function
  um: Create signal stack memory assignment in stub_data
  um: Remove stub-data.h include from common-offsets.h
  um: time-travel: fix signal blocking race/hang
  um: time-travel: remove time_exit()
  ...
2024-07-25 12:33:08 -07:00
Linus Torvalds
c2a96b7f18 Merge tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
 "Here is the big set of driver core changes for 6.11-rc1.

  Lots of stuff in here, with not a huge diffstat, but apis are evolving
  which required lots of files to be touched. Highlights of the changes
  in here are:

   - platform remove callback api final fixups (Uwe took many releases
     to get here, finally!)

   - Rust bindings for basic firmware apis and initial driver-core
     interactions.

     It's not all that useful for a "write a whole driver in rust" type
     of thing, but the firmware bindings do help out the phy rust
     drivers, and the driver core bindings give a solid base on which
     others can start their work.

     There is still a long way to go here before we have a multitude of
     rust drivers being added, but it's a great first step.

   - driver core const api changes.

     This reached across all bus types, and there are some fix-ups for
     some not-common bus types that linux-next and 0-day testing shook
     out.

     This work is being done to help make the rust bindings more safe,
     as well as the C code, moving toward the end-goal of allowing us to
     put driver structures into read-only memory. We aren't there yet,
     but are getting closer.

   - minor devres cleanups and fixes found by code inspection

   - arch_topology minor changes

   - other minor driver core cleanups

  All of these have been in linux-next for a very long time with no
  reported problems"

* tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
  ARM: sa1100: make match function take a const pointer
  sysfs/cpu: Make crash_hotplug attribute world-readable
  dio: Have dio_bus_match() callback take a const *
  zorro: make match function take a const pointer
  driver core: module: make module_[add|remove]_driver take a const *
  driver core: make driver_find_device() take a const *
  driver core: make driver_[create|remove]_file take a const *
  firmware_loader: fix soundness issue in `request_internal`
  firmware_loader: annotate doctests as `no_run`
  devres: Correct code style for functions that return a pointer type
  devres: Initialize an uninitialized struct member
  devres: Fix memory leakage caused by driver API devm_free_percpu()
  devres: Fix devm_krealloc() wasting memory
  driver core: platform: Switch to use kmemdup_array()
  driver core: have match() callback in struct bus_type take a const *
  MAINTAINERS: add Rust device abstractions to DRIVER CORE
  device: rust: improve safety comments
  MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer
  MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER
  firmware: rust: improve safety comments
  ...
2024-07-25 10:42:22 -07:00
Linus Torvalds
b2eed73360 Merge tag 'linux-watchdog-6.11-rc1' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:

 - make watchdog_class const

 - rework of the rzg2l_wdt driver

 - other small fixes and improvements

* tag 'linux-watchdog-6.11-rc1' of git://www.linux-watchdog.org/linux-watchdog:
  dt-bindings: watchdog: dlg,da9062-watchdog: Drop blank space
  watchdog: rzn1: Convert comma to semicolon
  watchdog: lenovo_se10_wdt: Convert comma to semicolon
  dt-bindings: watchdog: renesas,wdt: Document RZ/G3S support
  watchdog: rzg2l_wdt: Add suspend/resume support
  watchdog: rzg2l_wdt: Rely on the reset driver for doing proper reset
  watchdog: rzg2l_wdt: Remove comparison with zero
  watchdog: rzg2l_wdt: Remove reset de-assert from probe
  watchdog: rzg2l_wdt: Check return status of pm_runtime_put()
  watchdog: rzg2l_wdt: Use pm_runtime_resume_and_get()
  watchdog: rzg2l_wdt: Make the driver depend on PM
  watchdog: rzg2l_wdt: Restrict the driver to ARCH_RZG2L and ARCH_R9A09G011
  watchdog: imx7ulp_wdt: keep already running watchdog enabled
  watchdog: starfive: Add missing clk_disable_unprepare()
  watchdog: Make watchdog_class const
2024-07-25 10:18:35 -07:00
Linus Torvalds
9cf601e865 Merge tag 'dma-mapping-6.11-2024-07-24' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:

 - fix the order of actions in dmam_free_coherent (Lance Richardson)

* tag 'dma-mapping-6.11-2024-07-24' of git://git.infradead.org/users/hch/dma-mapping:
  dma: fix call order in dmam_free_coherent
2024-07-25 10:10:34 -07:00
Takashi Iwai
e8b96a66ae Merge tag 'asoc-fix-v6.11-merge-window' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.11

A selection of routine fixes and quirks that came in since the merge
window.  The fsl-asoc-card change is a fix for systems with multiple
cards where updating templates in place leaks data from one card to
another.
2024-07-25 18:04:55 +02:00
Jakub Kicinski
af65ea42bd Merge branch 'tap-tun-harden-by-dropping-short-frame'
Dongli Zhang says:

====================
tap/tun: harden by dropping short frame

This is to harden all of tap/tun to avoid any short frame smaller than the
Ethernet header (ETH_HLEN).

While the xen-netback already rejects short frame smaller than ETH_HLEN ...

 914 static void xenvif_tx_build_gops(struct xenvif_queue *queue,
 915                                      int budget,
 916                                      unsigned *copy_ops,
 917                                      unsigned *map_ops)
 918 {
... ...
1007                 if (unlikely(txreq.size < ETH_HLEN)) {
1008                         netdev_dbg(queue->vif->dev,
1009                                    "Bad packet size: %d\n", txreq.size);
1010                         xenvif_tx_err(queue, &txreq, extra_count, idx);
1011                         break;
1012                 }

... the short frame may not be dropped by vhost-net/tap/tun.

This fixes CVE-2024-41090 and CVE-2024-41091.
====================

Link: https://patch.msgid.link/20240724170452.16837-1-dongli.zhang@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-25 08:07:07 -07:00
Dongli Zhang
049584807f tun: add missing verification for short frame
The cited commit missed to check against the validity of the frame length
in the tun_xdp_one() path, which could cause a corrupted skb to be sent
downstack. Even before the skb is transmitted, the
tun_xdp_one-->eth_type_trans() may access the Ethernet header although it
can be less than ETH_HLEN. Once transmitted, this could either cause
out-of-bound access beyond the actual length, or confuse the underlayer
with incorrect or inconsistent header length in the skb metadata.

In the alternative path, tun_get_user() already prohibits short frame which
has the length less than Ethernet header size from being transmitted for
IFF_TAP.

This is to drop any frame shorter than the Ethernet header size just like
how tun_get_user() does.

CVE: CVE-2024-41091
Inspired-by: https://lore.kernel.org/netdev/1717026141-25716-1-git-send-email-si-wei.liu@oracle.com/
Fixes: 043d222f93 ("tuntap: accept an array of XDP buffs through sendmsg()")
Cc: stable@vger.kernel.org
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20240724170452.16837-3-dongli.zhang@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-25 08:07:05 -07:00
Si-Wei Liu
ed7f2afdd0 tap: add missing verification for short frame
The cited commit missed to check against the validity of the frame length
in the tap_get_user_xdp() path, which could cause a corrupted skb to be
sent downstack. Even before the skb is transmitted, the
tap_get_user_xdp()-->skb_set_network_header() may assume the size is more
than ETH_HLEN. Once transmitted, this could either cause out-of-bound
access beyond the actual length, or confuse the underlayer with incorrect
or inconsistent header length in the skb metadata.

In the alternative path, tap_get_user() already prohibits short frame which
has the length less than Ethernet header size from being transmitted.

This is to drop any frame shorter than the Ethernet header size just like
how tap_get_user() does.

CVE: CVE-2024-41090
Link: https://lore.kernel.org/netdev/1717026141-25716-1-git-send-email-si-wei.liu@oracle.com/
Fixes: 0efac27791 ("tap: accept an array of XDP buffs through sendmsg()")
Cc: stable@vger.kernel.org
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20240724170452.16837-2-dongli.zhang@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-25 08:07:05 -07:00
Dan Carpenter
61ab751451 mISDN: Fix a use after free in hfcmulti_tx()
Don't dereference *sp after calling dev_kfree_skb(*sp).

Fixes: af69fb3a8f ("Add mISDN HFC multiport driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/8be65f5a-c2dd-4ba0-8a10-bfe5980b8cfb@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-25 08:05:05 -07:00
Bailey Forrest
36e3b949e3 gve: Fix an edge case for TSO skb validity check
The NIC requires each TSO segment to not span more than 10
descriptors. NIC further requires each descriptor to not exceed
16KB - 1 (GVE_TX_MAX_BUF_SIZE_DQO).

The descriptors for an skb are generated by
gve_tx_add_skb_no_copy_dqo() for DQO RDA queue format.
gve_tx_add_skb_no_copy_dqo() loops through each skb frag and
generates a descriptor for the entire frag if the frag size is
not greater than GVE_TX_MAX_BUF_SIZE_DQO. If the frag size is
greater than GVE_TX_MAX_BUF_SIZE_DQO, it is split into descriptor(s)
of size GVE_TX_MAX_BUF_SIZE_DQO and a descriptor is generated for
the remainder (frag size % GVE_TX_MAX_BUF_SIZE_DQO).

gve_can_send_tso() checks if the descriptors thus generated for an
skb would meet the requirement that each TSO-segment not span more
than 10 descriptors. However, the current code misses an edge case
when a TSO segment spans multiple descriptors within a large frag.
This change fixes the edge case.

gve_can_send_tso() relies on the assumption that max gso size (9728)
is less than GVE_TX_MAX_BUF_SIZE_DQO and therefore within an skb
fragment a TSO segment can never span more than 2 descriptors.

Fixes: a57e5de476 ("gve: DQO: Add TX path")
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
Cc: stable@vger.kernel.org
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240724143431.3343722-1-pkaligineedi@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-25 07:58:07 -07:00
Taehee Yoo
b537633ce5 bnxt_en: update xdp_rxq_info in queue restart logic
When the netdev_rx_queue_restart() restarts queues, the bnxt_en driver
updates(creates and deletes) a page_pool.
But it doesn't update xdp_rxq_info, so the xdp_rxq_info is still
connected to an old page_pool.
So, bnxt_rx_ring_info->page_pool indicates a new page_pool, but
bnxt_rx_ring_info->xdp_rxq is still connected to an old page_pool.

An old page_pool is no longer used so it is supposed to be
deleted by page_pool_destroy() but it isn't.
Because the xdp_rxq_info is holding the reference count for it and the
xdp_rxq_info is not updated, an old page_pool will not be deleted in
the queue restart logic.

Before restarting 1 queue:
./tools/net/ynl/samples/page-pool
enp10s0f1np1[6] page pools: 4 (zombies: 0)
	refs: 8192 bytes: 33554432 (refs: 0 bytes: 0)
	recycling: 0.0% (alloc: 128:8048 recycle: 0:0)

After restarting 1 queue:
./tools/net/ynl/samples/page-pool
enp10s0f1np1[6] page pools: 5 (zombies: 0)
	refs: 10240 bytes: 41943040 (refs: 0 bytes: 0)
	recycling: 20.0% (alloc: 160:10080 recycle: 1920:128)

Before restarting queues, an interface has 4 page_pools.
After restarting one queue, an interface has 5 page_pools, but it
should be 4, not 5.
The reason is that queue restarting logic creates a new page_pool and
an old page_pool is not deleted due to the absence of an update of
xdp_rxq_info logic.

Fixes: 2d694c27d3 ("bnxt_en: implement netdev_queue_mgmt_ops")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: David Wei <dw@davidwei.uk>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Link: https://patch.msgid.link/20240721053554.1233549-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-25 07:42:48 -07:00
Jakub Kicinski
f7578df913 Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2024-07-25

We've added 14 non-merge commits during the last 8 day(s) which contain
a total of 19 files changed, 177 insertions(+), 70 deletions(-).

The main changes are:

1) Fix af_unix to disable MSG_OOB handling for sockets in BPF sockmap and
   BPF sockhash. Also add test coverage for this case, from Michal Luczaj.

2) Fix a segmentation issue when downgrading gso_size in the BPF helper
   bpf_skb_adjust_room(), from Fred Li.

3) Fix a compiler warning in resolve_btfids due to a missing type cast,
   from Liwei Song.

4) Fix stack allocation for arm64 to align the stack pointer at a 16 byte
   boundary in the fexit_sleep BPF selftest, from Puranjay Mohan.

5) Fix a xsk regression to require a flag when actuating tx_metadata_len,
   from Stanislav Fomichev.

6) Fix function prototype BTF dumping in libbpf for prototypes that have
   no input arguments, from Andrii Nakryiko.

7) Fix stacktrace symbol resolution in perf script for BPF programs
   containing subprograms, from Hou Tao.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
  xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
  bpf: Fix a segment issue when downgrading gso_size
  tools/resolve_btfids: Fix comparison of distinct pointer types warning in resolve_btfids
  bpf, events: Use prog to emit ksymbol event for main program
  selftests/bpf: Test sockmap redirect for AF_UNIX MSG_OOB
  selftests/bpf: Parametrize AF_UNIX redir functions to accept send() flags
  selftests/bpf: Support SOCK_STREAM in unix_inet_redir_to_connected()
  af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash
  bpftool: Fix typo in usage help
  libbpf: Fix no-args func prototype BTF dumping syntax
  MAINTAINERS: Update powerpc BPF JIT maintainers
  MAINTAINERS: Update email address of Naveen
  selftests/bpf: fexit_sleep: Fix stack allocation for arm64
====================

Link: https://patch.msgid.link/20240725114312.32197-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-25 07:40:25 -07:00
Shengjiu Wang
ab53dfdcdd ASoC: fsl-asoc-card: Dynamically allocate memory for snd_soc_dai_link_components
The static snd_soc_dai_link_components cause conflict for multiple
instances of this generic driver. For example, when there is
wm8962 and SPDIF case enabled together, the contaminated
snd_soc_dai_link_components will cause another device probe fail.

Fixes: 6d174cc4f2 ("ASoC: fsl-asoc-card: merge spdif support from imx-spdif.c")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/1721877773-5229-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2024-07-25 14:20:30 +01:00
Will Deacon
36639013b3 arm64: mm: Fix lockless walks with static and dynamic page-table folding
Lina reports random oopsen originating from the fast GUP code when
16K pages are used with 4-level page-tables, the fourth level being
folded at runtime due to lack of LPA2.

In this configuration, the generic implementation of
p4d_offset_lockless() will return a 'p4d_t *' corresponding to the
'pgd_t' allocated on the stack of the caller, gup_fast_pgd_range().
This is normally fine, but when the fourth level of page-table is folded
at runtime, pud_offset_lockless() will offset from the address of the
'p4d_t' to calculate the address of the PUD in the same page-table page.
This results in a stray stack read when the 'p4d_t' has been allocated
on the stack and can send the walker into the weeds.

Fix the problem by providing our own definition of p4d_offset_lockless()
when CONFIG_PGTABLE_LEVELS <= 4 which returns the real page-table
pointer rather than the address of the local stack variable.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/50360968-13fb-4e6f-8f52-1725b3177215@asahilina.net
Fixes: 0dd4f60a2c ("arm64: mm: Add support for folding PUDs at runtime")
Reported-by: Asahi Lina <lina@asahilina.net>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240725090345.28461-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-25 13:20:55 +01:00