Commit Graph

102362 Commits

Author SHA1 Message Date
Linus Torvalds
402e262d77 Merge tag 'efi-next-for-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:

 - Expose the OVMF firmware debug log via sysfs

 - Lower the default log level for the EFI stub to avoid corrupting any
   splash screens with unimportant diagnostic output

* tag 'efi-next-for-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: add API doc entry for ovmf_debug_log
  efistub: Lower default log level
  efi: add ovmf debug log driver
2025-08-09 18:10:01 +03:00
Linus Torvalds
2988dfed8a Merge tag 'block-6.17-20250808' of git://git.kernel.dk/linux
Pull more block updates from Jens Axboe:

 - MD pull request via Yu:
      - mddev null-ptr-dereference fix, by Erkun
      - md-cluster fail to remove the faulty disk regression fix, by
        Heming
      - minor cleanup, by Li Nan and Jinchao
      - mdadm lifetime regression fix reported by syzkaller, by Yu Kuai

 - MD pull request via Christoph
      - add support for getting the FDP featuee in fabrics passthru path
        (Nitesh Shetty)
      - add capability to connect to an administrative controller
        (Kamaljit Singh)
      - fix a leak on sgl setup error (Keith Busch)
      - initialize discovery subsys after debugfs is initialized
        (Mohamed Khalfella)
      - fix various comment typos (Bjorn Helgaas)
      - remove unneeded semicolons (Jiapeng Chong)

 - nvmet debugfs ordering issue fix

 - Fix UAF in the tag_set in zloop

 - Ensure sbitmap shallow depth covers entire set

 - Reduce lock roundtrips in io context lookup

 - Move scheduler tags alloc/free out of elevator and freeze lock, to
   fix some lockdep found issues

 - Improve robustness of queue limits checking

 - Fix a regression with IO priorities, if no io context exists

* tag 'block-6.17-20250808' of git://git.kernel.dk/linux: (26 commits)
  lib/sbitmap: make sbitmap_get_shallow() internal
  lib/sbitmap: convert shallow_depth from one word to the whole sbitmap
  nvmet: exit debugfs after discovery subsystem exits
  block, bfq: Reorder struct bfq_iocq_bfqq_data
  md: make rdev_addable usable for rcu mode
  md/raid1: remove struct pool_info and related code
  md/raid1: change r1conf->r1bio_pool to a pointer type
  block: ensure discard_granularity is zero when discard is not supported
  zloop: fix KASAN use-after-free of tag set
  block: Fix default IO priority if there is no IO context
  nvme: fix various comment typos
  nvme-auth: remove unneeded semicolon
  nvme-pci: fix leak on sgl setup error
  nvmet: initialize discovery subsys after debugfs is initialized
  nvme: add capability to connect to an administrative controller
  nvmet: add support for FDP in fabrics passthru path
  md: rename recovery_cp to resync_offset
  md/md-cluster: handle REMOVE message earlier
  md: fix create on open mddev lifetime regression
  block: fix potential deadlock while running nr_hw_queue update
  ...
2025-08-09 08:47:28 +03:00
Linus Torvalds
0227b49b50 Merge tag 'gpio-updates-for-v6.17-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio updates from Bartosz Golaszewski:
 "As discussed: there's a small commit that removes the legacy GPIO line
  value setter callbacks as they're no longer used and a big, treewide
  commit that renames the new ones to the old names across all GPIO
  drivers at once.

  While at it: there are also two fixes that I picked up over the course
  of the merge window:

   - remove unused, legacy GPIO line value setters from struct gpio_chip

   - rename the new set callbacks back to the original names treewide

   - fix interrupt handling in gpio-mlxbf2

   - revert a buggy immutable irqchip conversion"

* tag 'gpio-updates-for-v6.17-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  treewide: rename GPIO set callbacks back to their original names
  gpio: remove legacy GPIO line value setter callbacks
  gpio: mlxbf2: use platform_get_irq_optional()
  Revert "gpio: pxa: Make irq_chip immutable"
2025-08-09 08:15:43 +03:00
Linus Torvalds
ccc1ead23c Merge tag 'nfs-for-6.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable fixes:
   - don't inherit NFS filesystem capabilities when crossing from one
     filesystem to another

  Bugfixes:
   - NFS wakeup of __nfs_lookup_revalidate() needs memory barriers
   - NFS improve bounds checking in nfs_fh_to_dentry()
   - NFS Fix allocation errors when writing to a NFS file backed
     loopback device
   - NFSv4: More listxattr fixes
   - SUNRPC: fix client handling of TLS alerts
   - pNFS block/scsi layout fix for an uninitialised pointer
     dereference
   - pNFS block/scsi layout fixes for the extent encoding, stripe
     mapping, and disk offset overflows
   - pNFS layoutcommit work around for RPC size limitations
   - pNFS/flexfiles avoid looping when handling fatal errors after
     layoutget
   - localio: fix various race conditions

  Features and cleanups:
   - Add NFSv4 support for retrieving the btime
   - NFS: Allow folio migration for the case of mode == MIGRATE_SYNC
   - NFS: Support using a kernel keyring to store TLS certificates
   - NFSv4: Speed up delegation lookup using a hash table
   - Assorted cleanups to remove unused variables and struct fields
   - Assorted new tracepoints to improve debugging"

* tag 'nfs-for-6.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (44 commits)
  NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file
  NFS/localio: nfs_uuid_put() fix races with nfs_open/close_local_fh()
  NFS/localio: nfs_close_local_fh() fix check for file closed
  NFSv4: Remove duplicate lookups, capability probes and fsinfo calls
  NFS: Fix the setting of capabilities when automounting a new filesystem
  sunrpc: fix client side handling of tls alerts
  nfs/localio: use read_seqbegin() rather than read_seqbegin_or_lock()
  NFS: Fixup allocation flags for nfsiod's __GFP_NORETRY
  NFSv4.2: another fix for listxattr
  NFS: Fix filehandle bounds checking in nfs_fh_to_dentry()
  SUNRPC: Silence warnings about parameters not being described
  NFS: Clean up pnfs_put_layout_hdr()/pnfs_destroy_layout_final()
  NFS: Fix wakeup of __nfs_lookup_revalidate() in unblock_revalidate()
  NFS: use a hash table for delegation lookup
  NFS: track active delegations per-server
  NFS: move the delegation_watermark module parameter
  NFS: cleanup nfs_inode_reclaim_delegation
  NFS: cleanup error handling in nfs4_server_common_setup
  pNFS/flexfiles: don't attempt pnfs on fatal DS errors
  NFS: drop __exit from nfs_exit_keyring
  ...
2025-08-09 07:20:44 +03:00
Linus Torvalds
3781648824 Merge tag 'net-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
  Previous releases - regressions:

   - netlink: avoid infinite retry looping in netlink_unicast()

  Previous releases - always broken:

   - packet: fix a race in packet_set_ring() and packet_notifier()

   - ipv6: reject malicious packets in ipv6_gso_segment()

   - sched: mqprio: fix stack out-of-bounds write in tc entry parsing

   - net: drop UFO packets (injected via virtio) in udp_rcv_segment()

   - eth: mlx5: correctly set gso_segs when LRO is used, avoid false
     positive checksum validation errors

   - netpoll: prevent hanging NAPI when netcons gets enabled

   - phy: mscc: fix parsing of unicast frames for PTP timestamping

   - a number of device tree / OF reference leak fixes"

* tag 'net-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits)
  pptp: fix pptp_xmit() error path
  net: ti: icssg-prueth: Fix skb handling for XDP_PASS
  net: Update threaded state in napi config in netif_set_threaded
  selftests: netdevsim: Xfail nexthop test on slow machines
  eth: fbnic: Lock the tx_dropped update
  eth: fbnic: Fix tx_dropped reporting
  eth: fbnic: remove the debugging trick of super high page bias
  net: ftgmac100: fix potential NULL pointer access in ftgmac100_phy_disconnect
  dt-bindings: net: Replace bouncing Alexandru Tachici emails
  dpll: zl3073x: ZL3073X_I2C and ZL3073X_SPI should depend on NET
  net/sched: mqprio: fix stack out-of-bounds write in tc entry parsing
  Revert "net: mdio_bus: Use devm for getting reset GPIO"
  selftests: net: packetdrill: xfail all problems on slow machines
  net/packet: fix a race in packet_set_ring() and packet_notifier()
  benet: fix BUG when creating VFs
  net: airoha: npu: Add missing MODULE_FIRMWARE macros
  net: devmem: fix DMA direction on unmapping
  ipa: fix compile-testing with qcom-mdt=m
  eth: fbnic: unlink NAPIs from queues on error to open
  net: Add locking to protect skb->dev access in ip_output
  ...
2025-08-08 07:03:25 +03:00
Yu Kuai
45fa9f97e6 lib/sbitmap: make sbitmap_get_shallow() internal
Because it's only used in sbitmap.c

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20250807032413.1469456-3-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-07 06:30:17 -06:00
Yu Kuai
42e6c6ce03 lib/sbitmap: convert shallow_depth from one word to the whole sbitmap
Currently elevators will record internal 'async_depth' to throttle
asynchronous requests, and they both calculate shallow_dpeth based on
sb->shift, with the respect that sb->shift is the available tags in one
word.

However, sb->shift is not the availbale tags in the last word, see
__map_depth:

if (index == sb->map_nr - 1)
  return sb->depth - (index << sb->shift);

For consequence, if the last word is used, more tags can be get than
expected, for example, assume nr_requests=256 and there are four words,
in the worst case if user set nr_requests=32, then the first word is
the last word, and still use bits per word, which is 64, to calculate
async_depth is wrong.

One the ohter hand, due to cgroup qos, bfq can allow only one request
to be allocated, and set shallow_dpeth=1 will still allow the number
of words request to be allocated.

Fix this problems by using shallow_depth to the whole sbitmap instead
of per word, also change kyber, mq-deadline and bfq to follow this,
a new helper __map_depth_with_shallow() is introduced to calculate
available bits in each word.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20250807032413.1469456-2-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-07 06:30:17 -06:00
Bartosz Golaszewski
d9d87d90cc treewide: rename GPIO set callbacks back to their original names
The conversion of all GPIO drivers to using the .set_rv() and
.set_multiple_rv() callbacks from struct gpio_chip (which - unlike their
predecessors - return an integer and allow the controller drivers to
indicate failures to users) is now complete and the legacy ones have
been removed. Rename the new callbacks back to their original names in
one sweeping change.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-08-07 10:07:06 +02:00
Bartosz Golaszewski
397a46c9aa gpio: remove legacy GPIO line value setter callbacks
With no more users of the legacy GPIO line value setters - .set() and
.set_multiple() - we can now remove them from the kernel.

Link: https://lore.kernel.org/r/20250725074651.14002-1-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-08-07 09:56:43 +02:00
Linus Torvalds
6e64f45803 Merge tag 'input-for-v6.17-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:

 - updates to several drivers consuming GPIO APIs to use setters
   returning error codes

 - an infrastructure allowing to define "overlays" for touchscreens
   carving out regions implementing buttons and other elements from a
   bigger sensors and a corresponding update to st1232 driver

 - an update to AT/PS2 keyboard driver to map F13-F24 by default

 - Samsung keypad driver got a facelift

 - evdev input handler will now bind to all devices using EV_SYN event
   instead of abusing id->driver_info

 - two new sub-drivers implementing 1A (capacitive buttons) and 21
   (forcepad button) functions in Synaptics RMI driver

 - support for polling mode in Goodix touchscreen driver

 - support for support for FocalTech FT8716 in edt-ft5x06 driver

 - support for MT6359 in mtk-pmic-keys driver

 - removal of pcf50633-input driver since platform it was used on is
   gone

 - new definitions for game controller "grip" buttons (BTN_GRIP*) and
   corresponding changes to xpad and hid-steam controller drivers

 - a new definition for "performance" key

* tag 'input-for-v6.17-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (38 commits)
  HID: hid-steam: Use new BTN_GRIP* buttons
  Input: add keycode for performance mode key
  Input: max77693 - convert to atomic pwm operation
  Input: st1232 - add touch-overlay handling
  dt-bindings: input: touchscreen: st1232: add touch-overlay example
  Input: touch-overlay - add touchscreen overlay handling
  dt-bindings: touchscreen: add touch-overlay property
  Input: atkbd - correctly map F13 - F24
  Input: xpad - use new BTN_GRIP* buttons
  Input: Add and document BTN_GRIP*
  Input: xpad - change buttons the D-Pad gets mapped as to BTN_DPAD_*
  Documentation: Fix capitalization of XBox -> Xbox
  Input: synaptics-rmi4 - add support for F1A
  dt-bindings: input: syna,rmi4: Document F1A function
  Input: synaptics-rmi4 - add support for Forcepads (F21)
  Input: mtk-pmic-keys - add support for MT6359 PMIC keys
  Input: remove special handling of id->driver_info when matching
  Input: evdev - switch matching to EV_SYN
  Input: samsung-keypad - use BIT() and GENMASK() where appropriate
  Input: samsung-keypad - use per-chip parameters
  ...
2025-08-07 07:40:01 +03:00
Linus Torvalds
e8214ed59b Merge tag 'vfio-v6.17-rc1-v2' of https://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:

 - Fix imbalance where the no-iommu/cdev device path skips too much on
   open, failing to increment a reference, but still decrements the
   reference on close. Add bounds checking to prevent such underflows
   (Jacob Pan)

 - Fill missing detach_ioas op for pds_vfio_pci, fixing probe failure
   when used with IOMMUFD (Brett Creeley)

 - Split SR-IOV VFs to separate dev_set, avoiding unnecessary
   serialization between VFs that appear on the same bus (Alex
   Williamson)

 - Fix a theoretical integer overflow is the mlx5-vfio-pci variant
   driver (Artem Sadovnikov)

 - Implement missing VF token checking support via vfio cdev/IOMMUFD
   interface (Jason Gunthorpe)

 - Update QAT vfio-pci variant driver to claim latest VF devices
   (Małgorzata Mielnik)

 - Add a cond_resched() call to avoid holding the CPU too long during
   DMA mapping operations (Keith Busch)

* tag 'vfio-v6.17-rc1-v2' of https://github.com/awilliam/linux-vfio:
  vfio/type1: conditional rescheduling while pinning
  vfio/qat: add support for intel QAT 6xxx virtual functions
  vfio/qat: Remove myself from VFIO QAT PCI driver maintainers
  vfio/pci: Do vf_token checks for VFIO_DEVICE_BIND_IOMMUFD
  vfio/mlx5: fix possible overflow in tracking max message size
  vfio/pci: Separate SR-IOV VF dev_set
  vfio/pds: Fix missing detach_ioas op
  vfio: Prevent open_count decrement to negative
  vfio: Fix unbalanced vfio_df_close call in no-iommu mode
2025-08-07 07:32:50 +03:00
Linus Torvalds
479058002c Merge tag 'ata-6.17-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Damien Le Moal:

 - Cleanup whitespace in messages in libata-core and the pata_pdc2027x,
   pata_macio drivers (Colin)

 - Fix ata_to_sense_error() to avoid seeing nonsensical sense data for
   rare cases where we fail to get sense data from the drive. The
   complementary fix to this is to ensure that we always return the
   generic "ABORTED COMMAND" sense data for a failed command for which
   we have no status or error fields

 - The recent changes to link power management (LPM) which now prevent
   the user from attempting to set an LPM policy through the
   link_power_management_policy caused some regressions in test
   environments because of the error that is now returned when writing
   to that attribute when LPM is not supported. To allow users to not
   trip on this, introduce the new link_power_management_supported
   attribute to allow simple testing of a port/device LPM support (me)

* tag 'ata-6.17-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: pata_pdc2027x: Remove space before newline and abbreviations
  ata: pata_macio: Remove space before newline
  ata: libata-core: Remove space before newline
  ata: libata-sata: Add link_power_management_supported sysfs attribute
  ata: libata-scsi: Return aborted command when missing sense and result TF
  ata: libata-scsi: Fix ata_to_sense_error() status handling
2025-08-06 08:52:59 +03:00
Linus Torvalds
a530a36bb5 Merge tag 'kbuild-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
 "This is the last pull request from me.

  I'm grateful to have been able to continue as a maintainer for eight
  years. From the next cycle, Nathan and Nicolas will maintain Kbuild.

   - Fix a shortcut key issue in menuconfig

   - Fix missing rebuild of kheaders

   - Sort the symbol dump generated by gendwarfsyms

   - Support zboot extraction in scripts/extract-vmlinux

   - Migrate gconfig to GTK 3

   - Add TAR variable to allow overriding the default tar command

   - Hand over Kbuild maintainership"

* tag 'kbuild-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (92 commits)
  MAINTAINERS: hand over Kbuild maintenance
  kheaders: make it possible to override TAR
  kbuild: userprogs: use correct linker when mixing clang and GNU ld
  kconfig: lxdialog: replace strcpy() with strncpy() in inputbox.c
  kconfig: lxdialog: replace strcpy with snprintf in print_autowrap
  kconfig: gconf: refactor text_insert_help()
  kconfig: gconf: remove unneeded variable in text_insert_msg
  kconfig: gconf: use hyphens in signals
  kconfig: gconf: replace GtkImageMenuItem with GtkMenuItem
  kconfig: gconf: Fix Back button behavior
  kconfig: gconf: fix single view to display dependent symbols correctly
  scripts: add zboot support to extract-vmlinux
  gendwarfksyms: order -T symtypes output by name
  gendwarfksyms: use preferred form of sizeof for allocation
  kconfig: qconf: confine {begin,end}Group to constructor and destructor
  kconfig: qconf: fix ConfigList::updateListAllforAll()
  kconfig: add a function to dump all menu entries in a tree-like format
  kconfig: gconf: show GTK version in About dialog
  kconfig: gconf: replace GtkHPaned and GtkVPaned with GtkPaned
  kconfig: gconf: replace GdkColor with GdkRGBA
  ...
2025-08-06 07:32:52 +03:00
Jason Gunthorpe
86624ba3b5 vfio/pci: Do vf_token checks for VFIO_DEVICE_BIND_IOMMUFD
This was missed during the initial implementation. The VFIO PCI encodes
the vf_token inside the device name when opening the device from the group
FD, something like:

  "0000:04:10.0 vf_token=bd8d9d2b-5a5f-4f5a-a211-f591514ba1f3"

This is used to control access to a VF unless there is co-ordination with
the owner of the PF.

Since we no longer have a device name in the cdev path, pass the token
directly through VFIO_DEVICE_BIND_IOMMUFD using an optional field
indicated by VFIO_DEVICE_BIND_FLAG_TOKEN.

Fixes: 5fcc26969a ("vfio: Add VFIO_DEVICE_BIND_IOMMUFD")
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/0-v3-bdd8716e85fe+3978a-vfio_token_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-05 15:41:14 -06:00
Linus Torvalds
da23ea194d Merge tag 'mm-stable-2025-08-03-12-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
 "Significant patch series in this pull request:

   - "mseal cleanups" (Lorenzo Stoakes)

     Some mseal cleaning with no intended functional change.

   - "Optimizations for khugepaged" (David Hildenbrand)

     Improve khugepaged throughput by batching PTE operations for large
     folios. This gain is mainly for arm64.

   - "x86: enable EXECMEM_ROX_CACHE for ftrace and kprobes" (Mike Rapoport)

     A bugfix, additional debug code and cleanups to the execmem code.

   - "mm/shmem, swap: bugfix and improvement of mTHP swap in" (Kairui Song)

     Bugfixes, cleanups and performance improvememnts to the mTHP swapin
     code"

* tag 'mm-stable-2025-08-03-12-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (38 commits)
  mm: mempool: fix crash in mempool_free() for zero-minimum pools
  mm: correct type for vmalloc vm_flags fields
  mm/shmem, swap: fix major fault counting
  mm/shmem, swap: rework swap entry and index calculation for large swapin
  mm/shmem, swap: simplify swapin path and result handling
  mm/shmem, swap: never use swap cache and readahead for SWP_SYNCHRONOUS_IO
  mm/shmem, swap: tidy up swap entry splitting
  mm/shmem, swap: tidy up THP swapin checks
  mm/shmem, swap: avoid redundant Xarray lookup during swapin
  x86/ftrace: enable EXECMEM_ROX_CACHE for ftrace allocations
  x86/kprobes: enable EXECMEM_ROX_CACHE for kprobes allocations
  execmem: drop writable parameter from execmem_fill_trapping_insns()
  execmem: add fallback for failures in vmalloc(VM_ALLOW_HUGE_VMAP)
  execmem: move execmem_force_rw() and execmem_restore_rox() before use
  execmem: rework execmem_cache_free()
  execmem: introduce execmem_alloc_rw()
  execmem: drop unused execmem_update_copy()
  mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
  mm/rmap: add anon_vma lifetime debug check
  mm: remove mm/io-mapping.c
  ...
2025-08-05 16:02:07 +03:00
Linus Torvalds
0974f486f3 Merge tag 'f2fs-for-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
 "Three main updates: folio conversion by Matthew, switch to a new mount
  API by Hongbo and Eric, and several sysfs entries to tune GCs for ZUFS
  with finer granularity by Daeho.

  There are also patches to address bugs and issues in the existing
  features such as GCs, file pinning, write-while-dio-read, contingous
  block allocation, and memory access violations.

  Enhancements:
   - switch to new mount API and folio conversion
   - add sysfs nodes to controle F2FS GCs for ZUFS
   - improve performance on the nat entry cache
   - drop inode from the donation list when the last file is closed
   - avoid splitting bio when reading multiple pages

  Bug fixes:
   - fix to trigger foreground gc during f2fs_map_blocks() in lfs mode
   - make sure zoned device GC to use FG_GC in shortage of free section
   - fix to calculate dirty data during has_not_enough_free_secs()
   - fix to update upper_p in __get_secs_required() correctly
   - wait for inflight dio completion, excluding pinned files read using dio
   - don't break allocation when crossing contiguous sections
   - vm_unmap_ram() may be called from an invalid context
   - fix to avoid out-of-boundary access in dnode page
   - fix to avoid panic in f2fs_evict_inode
   - fix to avoid UAF in f2fs_sync_inode_meta()
   - fix to use f2fs_is_valid_blkaddr_raw() in do_write_page()
   - fix UAF of f2fs_inode_info in f2fs_free_dic
   - fix to avoid invalid wait context issue
   - fix bio memleak when committing super block
   - handle nat.blkaddr corruption in f2fs_get_node_info()

  In addition, there are also clean-ups and minor bug fixes"

* tag 'f2fs-for-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (109 commits)
  f2fs: drop inode from the donation list when the last file is closed
  f2fs: add gc_boost_gc_greedy sysfs node
  f2fs: add gc_boost_gc_multiple sysfs node
  f2fs: fix to trigger foreground gc during f2fs_map_blocks() in lfs mode
  f2fs: fix to calculate dirty data during has_not_enough_free_secs()
  f2fs: fix to update upper_p in __get_secs_required() correctly
  f2fs: directly add newly allocated pre-dirty nat entry to dirty set list
  f2fs: avoid redundant clean nat entry move in lru list
  f2fs: zone: wait for inflight dio completion, excluding pinned files read using dio
  f2fs: ignore valid ratio when free section count is low
  f2fs: don't break allocation when crossing contiguous sections
  f2fs: remove unnecessary tracepoint enabled check
  f2fs: merge the two conditions to avoid code duplication
  f2fs: vm_unmap_ram() may be called from an invalid context
  f2fs: fix to avoid out-of-boundary access in dnode page
  f2fs: switch to the new mount api
  f2fs: introduce fs_context_operation structure
  f2fs: separate the options parsing and options checking
  f2fs: Add f2fs_fs_context to record the mount options
  f2fs: Allow sbi to be NULL in f2fs_printk
  ...
2025-08-04 16:27:21 -07:00
Linus Torvalds
35a813e010 Merge tag 'printk-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:

 - Add new "hash_pointers=[auto|always|never]" boot parameter to force
   the hashing even with "slab_debug" enabled

 - Allow to stop CPU, after losing nbcon console ownership during
   panic(), even without proper NMI

 - Allow to use the printk kthread immediately even for the 1st
   registered nbcon

 - Compiler warning removal

* tag 'printk-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: nbcon: Allow reacquire during panic
  printk: Allow to use the printk kthread immediately even for 1st nbcon
  slab: Decouple slab_debug and no_hash_pointers
  vsprintf: Use __diag macros to disable '-Wsuggest-attribute=format'
  compiler-gcc.h: Introduce __diag_GCC_all
2025-08-04 10:54:36 -07:00
Petr Mladek
731ae3ad96 Merge branch 'for-6.17-hash_pointers' into for-linus 2025-08-04 14:16:21 +02:00
Petr Mladek
3bfd34ed36 Merge branch 'for-6.15-printf-attribute' into for-linus 2025-08-04 14:15:14 +02:00
Dmitry Torokhov
a7bee4e7f7 Merge tag 'ib-mfd-gpio-input-pwm-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next
Merge an immutable branch between MFD, GPIO, Input and PWM to resolve
conflicts for the merge window pull request.
2025-08-03 23:28:48 -07:00
Linus Torvalds
d2eedaa390 Merge tag 'rtc-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
 "Support for a new RTC in an existing driver and all the drivers
  exposing clocks using the common clock framework have been converted
  to determine_rate(). Summary:

  Subsystem:
   - Convert drivers exposing a clock from round_rate() to determine_rate()

  Drivers:
   - ds1307: oscillator stop flag handling for ds1341
   - pcf85063: add support for RV8063"

* tag 'rtc-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (34 commits)
  rtc: ds1685: Update Joshua Kinard's email address.
  rtc: rv3032: convert from round_rate() to determine_rate()
  rtc: rv3028: convert from round_rate() to determine_rate()
  rtc: pcf8563: convert from round_rate() to determine_rate()
  rtc: pcf85063: convert from round_rate() to determine_rate()
  rtc: nct3018y: convert from round_rate() to determine_rate()
  rtc: max31335: convert from round_rate() to determine_rate()
  rtc: m41t80: convert from round_rate() to determine_rate()
  rtc: hym8563: convert from round_rate() to determine_rate()
  rtc: ds1307: convert from round_rate() to determine_rate()
  rtc: rv3028: fix incorrect maximum clock rate handling
  rtc: pcf8563: fix incorrect maximum clock rate handling
  rtc: pcf85063: fix incorrect maximum clock rate handling
  rtc: nct3018y: fix incorrect maximum clock rate handling
  rtc: hym8563: fix incorrect maximum clock rate handling
  rtc: ds1307: fix incorrect maximum clock rate handling
  rtc: pcf85063: scope pcf85063_config structures
  rtc: Optimize calculations in rtc_time64_to_tm()
  dt-bindings: rtc: amlogic,a4-rtc: Add compatible string for C3
  rtc: ds1307: handle oscillator stop flag (OSF) for ds1341
  ...
2025-08-03 20:17:34 -07:00
Linus Torvalds
e991acf1bc Merge tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
 "Significant patch series in this pull request:

   - "squashfs: Remove page->mapping references" (Matthew Wilcox) gets
     us closer to being able to remove page->mapping

   - "relayfs: misc changes" (Jason Xing) does some maintenance and
     minor feature addition work in relayfs

   - "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches
     us from static preallocation of the kdump crashkernel's working
     memory over to dynamic allocation. So the difficulty of a-priori
     estimation of the second kernel's needs is removed and the first
     kernel obtains extra memory

   - "generalize panic_print's dump function to be used by other
     kernel parts" (Feng Tang) implements some consolidation and
     rationalization of the various ways in which a failing kernel
     splats information at the operator

* tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits)
  tools/getdelays: add backward compatibility for taskstats version
  kho: add test for kexec handover
  delaytop: enhance error logging and add PSI feature description
  samples: Kconfig: fix spelling mistake "instancess" -> "instances"
  fat: fix too many log in fat_chain_add()
  scripts/spelling.txt: add notifer||notifier to spelling.txt
  xen/xenbus: fix typo "notifer"
  net: mvneta: fix typo "notifer"
  drm/xe: fix typo "notifer"
  cxl: mce: fix typo "notifer"
  KVM: x86: fix typo "notifer"
  MAINTAINERS: add maintainers for delaytop
  ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below()
  ucount: fix atomic_long_inc_below() argument type
  kexec: enable CMA based contiguous allocation
  stackdepot: make max number of pools boot-time configurable
  lib/xxhash: remove unused functions
  init/Kconfig: restore CONFIG_BROKEN help text
  lib/raid6: update recov_rvv.c zero page usage
  docs: update docs after introducing delaytop
  ...
2025-08-03 16:23:09 -07:00
Linus Torvalds
3c4a063b1f Merge tag 'trace-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull more tracing updates from Steven Rostedt:

 - Remove unneeded goto out statements

   Over time, the logic was restructured but left a "goto out" where the
   out label simply did a "return ret;". Instead of jumping to this out
   label, simply return immediately and remove the out label.

 - Add guard(ring_buffer_nest)

   Some calls to the tracing ring buffer can happen when the ring buffer
   is already being written to at the same context (for example, a
   trace_printk() in between a ring_buffer_lock_reserve() and a
   ring_buffer_unlock_commit()).

   In order to not trigger the recursion detection, these functions use
   ring_buffer_nest_start() and ring_buffer_nest_end(). Create a guard()
   for these functions so that their use cases can be simplified and not
   need to use goto for the release.

 - Clean up the tracing code with guard() and __free() logic

   There were several locations that were prime candidates for using
   guard() and __free() helpers. Switch them over to use them.

 - Fix output of function argument traces for unsigned int values

   The function tracer with "func-args" option set will record up to 6
   argument registers and then use BTF to format them for human
   consumption when the trace file is read. There are several arguments
   that are "unsigned long" and even "unsigned int" that are either and
   address or a mask. It is easier to understand if they were printed
   using hexadecimal instead of decimal. The old method just printed all
   non-pointer values as signed integers, which made it even worse for
   unsigned integers.

   For instance, instead of:

     __local_bh_disable_ip(ip=-2127311112, cnt=256) <-handle_softirqs

   show:

     __local_bh_disable_ip(ip=0xffffffff8133cef8, cnt=0x100) <-handle_softirqs"

* tag 'trace-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Have unsigned int function args displayed as hexadecimal
  ring-buffer: Convert ring_buffer_write() to use guard(preempt_notrace)
  tracing: Use __free(kfree) in trace.c to remove gotos
  tracing: Add guard() around locks and mutexes in trace.c
  tracing: Add guard(ring_buffer_nest)
  tracing: Remove unneeded goto out logic
2025-08-03 15:03:04 -07:00
Linus Torvalds
8877fcb70f Merge tag 'modules-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull module updates from Daniel Gomez:
 "This is a small set of changes for modules, primarily to extend module
  users to use the module data structures in combination with the
  already no-op stub module functions, even when support for modules is
  disabled in the kernel configuration. This change follows the kernel's
  coding style for conditional compilation and allows kunit code to drop
  all CONFIG_MODULES ifdefs, which is also part of the changes. This
  should allow others part of the kernel to do the same cleanup.

  The remaining changes include a fix for module name length handling
  which could potentially lead to the removal of an incorrect module,
  and various cleanups"

* tag 'modules-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux:
  module: Rename MAX_PARAM_PREFIX_LEN to __MODULE_NAME_LEN
  tracing: Replace MAX_PARAM_PREFIX_LEN with MODULE_NAME_LEN
  module: Restore the moduleparam prefix length check
  module: Remove unnecessary +1 from last_unloaded_module::name size
  module: Prevent silent truncation of module name in delete_module(2)
  kunit: test: Drop CONFIG_MODULE ifdeffery
  module: make structure definitions always visible
  module: move 'struct module_use' to internal.h
2025-08-03 14:16:52 -07:00
Linus Torvalds
546b0ad6a8 Merge tag 'i3c/for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Pull i3c updates from Alexandre Belloni:
 "New driver:
   - Renesas I3C controller

  Subsystem:
   - use adapter timeout value for I2C transfers
   - don't fail if GETHDRCAP is unsupported
   - replace ENOTSUPP with SUSV4-compliant EOPNOTSUPP

  Drivers:
   - svc: Fix npcm845 FIFO_EMPTY quirk"

* tag 'i3c/for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux: (25 commits)
  i3c: add missing include to internal header
  i3c: dw: Remove redundant pm_runtime_mark_last_busy() calls
  i3c: master: svc: Remove redundant pm_runtime_mark_last_busy() calls
  i3c: master: svc: Fix npcm845 FIFO_EMPTY quirk
  i3c: master: Add basic driver for the Renesas I3C controller
  dt-bindings: i3c: Add Renesas I3C controller
  i3c: Add more parameters for controllers to the header
  i3c: Standardize defines for specification parameters
  i3c: fix module_i3c_i2c_driver() with I3C=n
  i3c: master: cdns: Simplify handling clocks in probe()
  i3c: Fix i3c_device_do_priv_xfers() kernel-doc indentation
  i3c: master: dw: Use i3c_writel_fifo() and i3c_readl_fifo()
  i3c: master: cdns: Use i3c_writel_fifo() and i3c_readl_fifo()
  i3c: master: Add inline i3c_readl_fifo() and i3c_writel_fifo()
  i3c: prefix hexadecimal entries in sysfs
  i3c: master: cdns: replace ENOTSUPP with SUSV4-compliant EOPNOTSUPP
  i3c: dw: replace ENOTSUPP with SUSV4-compliant EOPNOTSUPP
  i3c: master: replace ENOTSUPP with SUSV4-compliant EOPNOTSUPP
  i3c: don't fail if GETHDRCAP is unsupported
  i3c: add patchwork entry to MAINTAINERS
  ...
2025-08-03 14:12:02 -07:00
Joshua Kinard
bb5b0b4317 rtc: ds1685: Update Joshua Kinard's email address.
I am switching my address to a personal domain, so need to update the
driver's files and the entry in MAINTAINERS.

Signed-off-by: Joshua Kinard <kumba@gentoo.org>
Link: https://lore.kernel.org/r/20250721170051.32407-1-kumba@gentoo.org
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2025-08-03 03:28:52 +02:00
Linus Torvalds
186f3edfdd Merge tag 'pinctrl-v6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
 "Nothing stands out, apart from maybe the interesting Eswin EIC7700, a
  RISC-V SoC I've never seen before.

  Core changes:

   - Open code PINCTRL_FUNCTION_DESC() instead of defining a complex
     macro only used in one place

   - Add pinmux_generic_add_pinfunction() helper and use this in a few
     drivers

  New drivers:

   - Amlogic S7, S7D and S6 pin control support

   - Eswin EIC7700 pin control support

   - Qualcomm PMIV0104, PM7550 and Milos pin control support

     Because of unhelpful numbering schemes, the Qualcomm driver now
     needs to start to rely on SoC codenames

   - STM32 HDP pin control support

   - Mediatek MT8189 pin control support

  Improvements:

   - Switch remaining pin control drivers over to the new GPIO set
     callback that provides a return value

   - Support RSVD (reserved) pins in the STM32 driver

   - Move many fixed assignments over to pinctrl_desc definitions

   - Handle multiple TLMM regions in the Qualcomm driver"

* tag 'pinctrl-v6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (105 commits)
  pinctrl: mediatek: Add pinctrl driver for mt8189
  dt-bindings: pinctrl: mediatek: Add support for mt8189
  pinctrl: aspeed-g6: Add PCIe RC PERST pin group
  pinctrl: ingenic: use pinmux_generic_add_pinfunction()
  pinctrl: keembay: use pinmux_generic_add_pinfunction()
  pinctrl: mediatek: moore: use pinmux_generic_add_pinfunction()
  pinctrl: airoha: use pinmux_generic_add_pinfunction()
  pinctrl: equilibrium: use pinmux_generic_add_pinfunction()
  pinctrl: provide pinmux_generic_add_pinfunction()
  pinctrl: pinmux: open-code PINCTRL_FUNCTION_DESC()
  pinctrl: ma35: use new GPIO line value setter callbacks
  MAINTAINERS: add Clément Le Goffic as STM32 HDP maintainer
  pinctrl: stm32: Introduce HDP driver
  dt-bindings: pinctrl: stm32: Introduce HDP
  pinctrl: qcom: Add Milos pinctrl driver
  dt-bindings: pinctrl: document the Milos Top Level Mode Multiplexer
  pinctrl: qcom: spmi: Add PM7550
  dt-bindings: pinctrl: qcom,pmic-gpio: Add PM7550 support
  pinctrl: qcom: spmi: Add PMIV0104
  dt-bindings: pinctrl: qcom,pmic-gpio: Add PMIV0104 support
  ...
2025-08-02 12:07:09 -07:00
Mike Rapoport (Microsoft)
ab674b6871 execmem: drop writable parameter from execmem_fill_trapping_insns()
After update of execmem_cache_free() that made memory writable before
updating it, there is no need to update read only memory, so the writable
parameter to execmem_fill_trapping_insns() is not needed.  Drop it.

Link: https://lkml.kernel.org/r/20250713071730.4117334-7-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:12 -07:00
Mike Rapoport (Microsoft)
838955f64a execmem: introduce execmem_alloc_rw()
Some callers of execmem_alloc() require the memory to be temporarily
writable even when it is allocated from ROX cache.  These callers use
execemem_make_temp_rw() right after the call to execmem_alloc().

Wrap this sequence in execmem_alloc_rw() API.

Link: https://lkml.kernel.org/r/20250713071730.4117334-3-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Daniel Gomez <da.gomez@samsung.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:11 -07:00
Mike Rapoport (Microsoft)
fcd90ad31e execmem: drop unused execmem_update_copy()
Patch series "x86: enable EXECMEM_ROX_CACHE for ftrace and kprobes", v3.

These patches enable use of EXECMEM_ROX_CACHE for ftrace and kprobes
allocations on x86.

They also include some ground work in execmem.

Since the execmem model for caching large ROX pages changed from the
initial assumption that the memory that is allocated from ROX cache is
always ROX to the current state where memory can be temporarily made RW
and then restored to ROX, we can stop using text poking to update it. 
This also saves the hassle of trying lock text_mutex in
execmem_cache_free() when kprobes already hold that mutex.


This patch (of 8):

The execmem_update_copy() that used text poking was required when memory
allocated from ROX cache was always read-only.  Since now its permissions
can be switched to read-write there is no need in a function that updates
memory with text poking.

Remove it.

Link: https://lkml.kernel.org/r/20250713071730.4117334-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20250713071730.4117334-2-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:11 -07:00
Suren Baghdasaryan
9bbffee67f mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
By inducing delays in the right places, Jann Horn created a reproducer for
a hard to hit UAF issue that became possible after VMAs were allowed to be
recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.

Race description is borrowed from Jann's discovery report:
lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under
rcu_read_lock().  At that point, the VMA may be concurrently freed, and it
can be recycled by another process.  vma_start_read() then increments the
vma->vm_refcnt (if it is in an acceptable range), and if this succeeds,
vma_start_read() can return a recycled VMA.

In this scenario where the VMA has been recycled, lock_vma_under_rcu()
will then detect the mismatching ->vm_mm pointer and drop the VMA through
vma_end_read(), which calls vma_refcount_put().  vma_refcount_put() drops
the refcount and then calls rcuwait_wake_up() using a copy of vma->vm_mm. 
This is wrong: It implicitly assumes that the caller is keeping the VMA's
mm alive, but in this scenario the caller has no relation to the VMA's mm,
so the rcuwait_wake_up() can cause UAF.

The diagram depicting the race:
T1         T2         T3
==         ==         ==
lock_vma_under_rcu
  mas_walk
          <VMA gets removed from mm>
                      mmap
                        <the same VMA is reallocated>
  vma_start_read
    __refcount_inc_not_zero_limited_acquire
                      munmap
                        __vma_enter_locked
                          refcount_add_not_zero
  vma_end_read
    vma_refcount_put
      __refcount_dec_and_test
                          rcuwait_wait_event
                            <finish operation>
      rcuwait_wake_up [UAF]

Note that rcuwait_wait_event() in T3 does not block because refcount was
already dropped by T1.  At this point T3 can exit and free the mm causing
UAF in T1.

To avoid this we move vma->vm_mm verification into vma_start_read() and
grab vma->vm_mm to stabilize it before vma_refcount_put() operation.

[surenb@google.com: v3]
  Link: https://lkml.kernel.org/r/20250729145709.2731370-1-surenb@google.com
Link: https://lkml.kernel.org/r/20250728175355.2282375-1-surenb@google.com
Fixes: 3104138517 ("mm: make vma cache SLAB_TYPESAFE_BY_RCU")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: Jann Horn <jannh@google.com>
Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m2tA@mail.gmail.com/
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:11 -07:00
Jann Horn
a222439e1e mm/rmap: add anon_vma lifetime debug check
If an anon folio is mapped into userspace, its anon_vma must be alive,
otherwise rmap walks can hit UAF.

There have been syzkaller reports a few months ago[1][2] of UAF in rmap
walks that seems to indicate that there can be pages with elevated
mapcount whose anon_vma has already been freed, but I think we never
figured out what the cause is; and syzkaller only hit these UAFs when
memory pressure randomly caused reclaim to rmap-walk the affected pages,
so it of course didn't manage to create a reproducer.

Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios
to hopefully catch such issues more reliably.

[1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
[2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com

Link: https://lkml.kernel.org/r/20250725-anonvma-uaf-debug-v2-1-bc3c7e5ba5b1@google.com
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Harry Yoo <harry.yoo@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:11 -07:00
Lorenzo Stoakes
9a4f90e246 mm: remove mm/io-mapping.c
This is dead code, which was used from commit b739f125e4 ("i915: use
io_mapping_map_user") but reverted a month later by commit 0e4fe0c9f2
("Revert "i915: use io_mapping_map_user"") back in 2021.

Since then nobody has used it, so remove it.

[akpm@linux-foundation.org: update Documentation/core-api/mm-api.rst, per Vlastimil]
Link: https://lkml.kernel.org/r/20250725142901.81502-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:10 -07:00
David Hildenbrand
3dfde97800 mm: add get_and_clear_ptes() and clear_ptes()
Patch series "Optimizations for khugepaged", v4.

If the underlying folio mapped by the ptes is large, we can process those
ptes in a batch using folio_pte_batch().

For arm64 specifically, this results in a 16x reduction in the number of
ptep_get() calls, since on a contig block, ptep_get() on arm64 will
iterate through all 16 entries to collect a/d bits.  Next, ptep_clear()
will cause a TLBI for every contig block in the range via
contpte_try_unfold().  Instead, use clear_ptes() to only do the TLBI at
the first and last contig block of the range.

For split folios, there will be no pte batching; the batch size returned
by folio_pte_batch() will be 1.  For pagetable split folios, the ptes will
still point to the same large folio; for arm64, this results in the
optimization described above, and for other arches, a minor improvement is
expected due to a reduction in the number of function calls and batching
atomic operations.


This patch (of 3):

Let's add variants to be used where "full" does not apply -- which will
be the majority of cases in the future. "full" really only applies if
we are about to tear down a full MM.

Use get_and_clear_ptes() in existing code, clear_ptes() users will
be added next.

Link: https://lkml.kernel.org/r/20250724052301.23844-2-dev.jain@arm.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:10 -07:00
Lorenzo Stoakes
f225b34f1e mm/mseal: always define VM_SEALED
Patch series "mseal cleanups", v4.

Perform a number of cleanups to the mseal logic.  Firstly, VM_SEALED is
treated differently from every other VMA flag, it really doesn't make
sense to do this, so we start by making this consistent with everything
else.

Next we place the madvise logic where it belongs - in mm/madvise.c.  It
really makes no sense to abstract this elsewhere.  In doing so, we go to
great lengths to explain very clearly the previously very confusing logic
as to what sealed mappings are impacted here.

In doing so, we retain existing logic regarding treatment of madvise()
discard operations for a sealed, read-only MAP_PRIVATE file-backed
mapping.  This is something we likely need to revisit.

We then abstract out and explain the 'are there are any gaps in this range
in the mm?' check being performed as a prerequisite to mseal being
performed.

Finally, we simplify the actual mseal logic which is really quite
straightforward.

No functional change is intended.


This patch (of 4):

There is no reason to treat VM_SEALED in a special way, in each other case
in which a VMA flag is unavailable due to configuration, we simply assign
that flag to VM_NONE, so make VM_SEALED consistent with all other VMA
flags in this respect.

Additionally, use the next available bit for VM_SEALED, 42, rather than
arbitrarily putting it at 63 and update the declaration to match all other
VMA flags.

No functional change intended.

Link: https://lkml.kernel.org/r/cover.1753431105.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/aeb398a77029b6e7377cd944328bc9bbc3c90537.1753431105.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: Kees Cook <kees@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:09 -07:00
Joanne Koong
d171b10b2d mm/page-flags: remove folio_start_writeback_keepwrite()
Commit cd57b77197 ("ext4: Convert ext4_bio_write_page() to use a folio)
removed set_page_writeback_keepwrite() which was the last/only caller of
folio_start_writeback_keepwrite().

Link: https://lkml.kernel.org/r/20250722182230.2114587-1-joannelkoong@gmail.com
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:06:08 -07:00
Alexander Graf
07d2490297 kexec: enable CMA based contiguous allocation
When booting a new kernel with kexec_file, the kernel picks a target
location that the kernel should live at, then allocates random pages,
checks whether any of those patches magically happens to coincide with a
target address range and if so, uses them for that range.

For every page allocated this way, it then creates a page list that the
relocation code - code that executes while all CPUs are off and we are
just about to jump into the new kernel - copies to their final memory
location.  We can not put them there before, because chances are pretty
good that at least some page in the target range is already in use by the
currently running Linux environment.  Copying is happening from a single
CPU at RAM rate, which takes around 4-50 ms per 100 MiB.

All of this is inefficient and error prone.

To successfully kexec, we need to quiesce all devices of the outgoing
kernel so they don't scribble over the new kernel's memory.  We have seen
cases where that does not happen properly (*cough* GIC *cough*) and hence
the new kernel was corrupted.  This started a month long journey to root
cause failing kexecs to eventually see memory corruption, because the new
kernel was corrupted severely enough that it could not emit output to tell
us about the fact that it was corrupted.  By allocating memory for the
next kernel from a memory range that is guaranteed scribbling free, we can
boot the next kernel up to a point where it is at least able to detect
corruption and maybe even stop it before it becomes severe.  This
increases the chance for successful kexecs.

Since kexec got introduced, Linux has gained the CMA framework which can
perform physically contiguous memory mappings, while keeping that memory
available for movable memory when it is not needed for contiguous
allocations.  The default CMA allocator is for DMA allocations.

This patch adds logic to the kexec file loader to attempt to place the
target payload at a location allocated from CMA.  If successful, it uses
that memory range directly instead of creating copy instructions during
the hot phase.  To ensure that there is a safety net in case anything goes
wrong with the CMA allocation, it also adds a flag for user space to force
disable CMA allocations.

Using CMA allocations has two advantages:

  1) Faster by 4-50 ms per 100 MiB. There is no more need to copy in the
     hot phase.
  2) More robust. Even if by accident some page is still in use for DMA,
     the new kernel image will be safe from that access because it resides
     in a memory region that is considered allocated in the old kernel and
     has a chance to reinitialize that component.

Link: https://lkml.kernel.org/r/20250610085327.51817-1-graf@amazon.com
Signed-off-by: Alexander Graf <graf@amazon.com>
Acked-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Zhongkun He <hezhongkun.hzk@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:01:38 -07:00
Dr. David Alan Gilbert
6c6d8f8ba7 lib/xxhash: remove unused functions
xxh32_digest() and xxh32_update() were added in 2017 in the original
xxhash commit, but have remained unused.

Remove them.

Link: https://lkml.kernel.org/r/20250716133245.243363-1-linux@treblig.org
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Dave Gilbert <linux@treblig.org>
Cc: Nick Terrell <terrelln@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:01:37 -07:00
Linus Torvalds
7061835997 Merge tag 'firewire-updates-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire updates from Takashi Sakamoto:
 "This update replaces the remaining tasklet usage in the FireWire
  subsystem with workqueue for asynchronous packet transmission. With
  this change, tasklets are now fully eliminated from the subsystem.

  Asynchronous packet transmission is used for serial bus topology
  management as well as for the operation of the SBP-2 protocol driver
  (firewire-sbp2). To ensure reliability during low-memory conditions,
  the associated workqueue is created with the WQ_MEM_RECLAIM flag,
  allowing it to participate in memory reclaim paths. Other attributes
  are aligned with those used for isochronous packet handling, which was
  migrated to workqueues in v6.12.

  The workqueues are sleepable and support preemptible work items,
  making them more suitable for real-time workloads that benefit from
  timely task preemption at the system level.

  There remains an issue where 'schedule()' may be called within an RCU
  read-side critical section, due to a direct replacement of
  'tasklet_disable_in_atomic()' with 'disable_work_sync()'. A proposed
  fix for this has been posted[1], and is currently under review and
  testing. It is expected to be sent upstream later"

Link: https://lore.kernel.org/lkml/20250728015125.17825-1-o-takashi@sakamocchi.jp/ [1]

* tag 'firewire-updates-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  firewire: ohci: reduce the size of common context structure by extracting members into AT structure
  firewire: core: minor code refactoring to localize table of gap count
  firewire: ohci: use workqueue to handle events of AT request/response contexts
  firewire: ohci: use workqueue to handle events of AR request/response contexts
  firewire: core: allocate workqueue for AR/AT request/response contexts
  firewire: core: use from_work() macro to expand parent structure of work_struct
  firewire: ohci: use from_work() macro to expand parent structure of work_struct
  firewire: ohci: correct code comments about bus_reset tasklet
2025-08-02 09:52:53 -07:00
Linus Torvalds
a6923c06a3 Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:

 - Fix kCFI failures in JITed BPF code on arm64 (Sami Tolvanen, Puranjay
   Mohan, Mark Rutland, Maxwell Bland)

 - Disallow tail calls between BPF programs that use different cgroup
   local storage maps to prevent out-of-bounds access (Daniel Borkmann)

 - Fix unaligned access in flow_dissector and netfilter BPF programs
   (Paul Chaignon)

 - Avoid possible use of uninitialized mod_len in libbpf (Achill
   Gilgenast)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Test for unaligned flow_dissector ctx access
  bpf: Improve ctx access verifier error message
  bpf: Check netfilter ctx accesses are aligned
  bpf: Check flow_dissector ctx accesses are aligned
  arm64/cfi,bpf: Support kCFI + BPF on arm64
  cfi: Move BPF CFI types and helpers to generic code
  cfi: add C CFI type macro
  libbpf: Avoid possible use of uninitialized mod_len
  bpf: Fix oob access in cgroup local storage
  bpf: Move cgroup iterator helpers to bpf.h
  bpf: Move bpf map owner out of common struct
  bpf: Add cookie object to bpf maps
2025-08-01 17:13:26 -07:00
Linus Torvalds
d41e5839d8 Merge tag 'cxl-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL updates from Dave Jiang:
 "The most significant changes in this pull request is the series that
  introduces ACQUIRE() and ACQUIRE_ERR() macros to replace conditional
  locking and ease the pain points of scoped_cond_guard().

  The series also includes follow on changes that refactor the CXL
  sub-system to utilize the new macros.

  Detail summary:

   - Add documentation template for CXL conventions to document CXL
     platform quirks

   - Replace mutex_lock_io() with mutex_lock() for mailbox

   - Add location limit for fake CFMWS range for cxl_test, ARM platform
     enabling

   - CXL documentation typo and clarity fixes

   - Use correct format specifier for function cxl_set_ecs_threshold()

   - Make cxl_bus_type constant

   - Introduce new helper cxl_resource_contains_addr() to check address
     availability

   - Fix wrong DPA checking for PPR operation

   - Remove core/acpi.c and CXL core dependency on ACPI

   - Introduce ACQUIRE() and ACQUIRE_ERR() for conditional locks

   - Add CXL updates utilizing ACQUIRE() macro to remove gotos and
     improve readability

   - Add return for the dummy version of cxl_decoder_detach() without
     CONFIG_CXL_REGION

   - CXL events updates for spec r3.2

   - Fix return of __cxl_decoder_detach() error path

   - CXL debugfs documentation fix"

* tag 'cxl-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (28 commits)
  Documentation/ABI/testing/debugfs-cxl: Add 'cxl' to clear_poison path
  cxl/region: Fix an ERR_PTR() vs NULL bug
  cxl/events: Trace Memory Sparing Event Record
  cxl/events: Add extra validity checks for CVME count in DRAM Event Record
  cxl/events: Add extra validity checks for corrected memory error count in General Media Event Record
  cxl/events: Update Common Event Record to CXL spec rev 3.2
  cxl: Fix -Werror=return-type in cxl_decoder_detach()
  cleanup: Fix documentation build error for ACQUIRE updates
  cxl: Convert to ACQUIRE() for conditional rwsem locking
  cxl/region: Consolidate cxl_decoder_kill_region() and cxl_region_detach()
  cxl/region: Move ready-to-probe state check to a helper
  cxl/region: Split commit_store() into __commit() and queue_reset() helpers
  cxl/decoder: Drop pointless locking
  cxl/decoder: Move decoder register programming to a helper
  cxl/mbox: Convert poison list mutex to ACQUIRE()
  cleanup: Introduce ACQUIRE() and ACQUIRE_ERR() for conditional locks
  cxl: Remove core/acpi.c and cxl core dependency on ACPI
  cxl/core: Using cxl_resource_contains_addr() to check address availability
  cxl/edac: Fix wrong dpa checking for PPR operation
  cxl/core: Introduce a new helper cxl_resource_contains_addr()
  ...
2025-08-01 15:47:06 -07:00
Eric Dumazet
d45cf1e7d7 ipv6: reject malicious packets in ipv6_gso_segment()
syzbot was able to craft a packet with very long IPv6 extension headers
leading to an overflow of skb->transport_header.

This 16bit field has a limited range.

Add skb_reset_transport_header_careful() helper and use it
from ipv6_gso_segment()

WARNING: CPU: 0 PID: 5871 at ./include/linux/skbuff.h:3032 skb_reset_transport_header include/linux/skbuff.h:3032 [inline]
WARNING: CPU: 0 PID: 5871 at ./include/linux/skbuff.h:3032 ipv6_gso_segment+0x15e2/0x21e0 net/ipv6/ip6_offload.c:151
Modules linked in:
CPU: 0 UID: 0 PID: 5871 Comm: syz-executor211 Not tainted 6.16.0-rc6-syzkaller-g7abc678e3084 #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
 RIP: 0010:skb_reset_transport_header include/linux/skbuff.h:3032 [inline]
 RIP: 0010:ipv6_gso_segment+0x15e2/0x21e0 net/ipv6/ip6_offload.c:151
Call Trace:
 <TASK>
  skb_mac_gso_segment+0x31c/0x640 net/core/gso.c:53
  nsh_gso_segment+0x54a/0xe10 net/nsh/nsh.c:110
  skb_mac_gso_segment+0x31c/0x640 net/core/gso.c:53
  __skb_gso_segment+0x342/0x510 net/core/gso.c:124
  skb_gso_segment include/net/gso.h:83 [inline]
  validate_xmit_skb+0x857/0x11b0 net/core/dev.c:3950
  validate_xmit_skb_list+0x84/0x120 net/core/dev.c:4000
  sch_direct_xmit+0xd3/0x4b0 net/sched/sch_generic.c:329
  __dev_xmit_skb net/core/dev.c:4102 [inline]
  __dev_queue_xmit+0x17b6/0x3a70 net/core/dev.c:4679

Fixes: d1da932ed4 ("ipv6: Separate ipv6 offload support")
Reported-by: syzbot+af43e647fd835acc02df@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/688a1a05.050a0220.5d226.0008.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250730131738.3385939-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-01 14:40:53 -07:00
Linus Torvalds
821c9e515d Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:

 - vhost can now support legacy threading if enabled in Kconfig

 - vsock memory allocation strategies for large buffers have been
   improved, reducing pressure on kmalloc

 - vhost now supports the in-order feature. guest bits missed the merge
   window.

 - fixes, cleanups all over the place

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (30 commits)
  vsock/virtio: Allocate nonlinear SKBs for handling large transmit buffers
  vsock/virtio: Rename virtio_vsock_skb_rx_put()
  vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers
  vsock/virtio: Move SKB allocation lower-bound check to callers
  vsock/virtio: Rename virtio_vsock_alloc_skb()
  vsock/virtio: Resize receive buffers so that each SKB fits in a 4K page
  vsock/virtio: Move length check to callers of virtio_vsock_skb_rx_put()
  vsock/virtio: Validate length in packet header before skb_put()
  vhost/vsock: Avoid allocating arbitrarily-sized SKBs
  vhost_net: basic in_order support
  vhost: basic in order support
  vhost: fail early when __vhost_add_used() fails
  vhost: Reintroduce kthread API and add mode selection
  vdpa: Fix IDR memory leak in VDUSE module exit
  vdpa/mlx5: Fix release of uninitialized resources on error path
  vhost-scsi: Fix check for inline_sg_cnt exceeding preallocated limit
  virtio: virtio_dma_buf: fix missing parameter documentation
  vhost: Fix typos
  vhost: vringh: Remove unused functions
  vhost: vringh: Remove unused iotlb functions
  ...
2025-08-01 14:17:48 -07:00
Linus Torvalds
0bd0a41a51 Merge tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Allow built-in drivers, not just modular drivers, to use async
     initial probing (Lukas Wunner)

   - Support Immediate Readiness even on devices with no PM Capability
     (Sean Christopherson)

   - Consolidate definition of PCIE_RESET_CONFIG_WAIT_MS (100ms), the
     required delay between a reset and sending config requests to a
     device (Niklas Cassel)

   - Add pci_is_display() to check for "Display" base class and use it
     in ALSA hda, vfio, vga_switcheroo, vt-d (Mario Limonciello)

   - Allow 'isolated PCI functions' (multi-function devices without a
     function 0) for LoongArch, similar to s390 and jailhouse (Huacai
     Chen)

  Power control:

   - Add ability to enable optional slot clock for cases where the PCIe
     host controller and the slot are supplied by different clocks
     (Marek Vasut)

  PCIe native device hotplug:

   - Fix runtime PM ref imbalance on Hot-Plug Capable ports caused by
     misinterpreting a config read failure after a device has been
     removed (Lukas Wunner)

   - Avoid creating a useless PCIe port service device for pciehp if the
     slot is handled by the ACPI hotplug driver (Lukas Wunner)

   - Ignore ACPI hotplug slots when calculating depth of pciehp hotplug
     ports (Lukas Wunner)

  Virtualization:

   - Save VF resizable BAR state and restore it after reset (Michał
     Winiarski)

   - Allow IOV resources (VF BARs) to be resized (Michał Winiarski)

   - Add pci_iov_vf_bar_set_size() so drivers can control VF BAR size
     (Michał Winiarski)

  Endpoint framework:

   - Add RC-to-EP doorbell support using platform MSI controller,
     including a test case (Frank Li)

   - Allow BAR assignment via configfs so platforms have flexibility in
     determining BAR usage (Jerome Brunet)

  Native PCIe controller drivers:

   - Convert amazon,al-alpine-v[23]-pcie, apm,xgene-pcie,
     axis,artpec6-pcie, marvell,armada-3700-pcie, st,spear1340-pcie to
     DT schema format (Rob Herring)

   - Use dev_fwnode() instead of of_fwnode_handle() to remove OF
     dependency in altera (fixes an unused variable), designware-host,
     mediatek, mediatek-gen3, mobiveil, plda, xilinx, xilinx-dma,
     xilinx-nwl (Jiri Slaby, Arnd Bergmann)

   - Convert aardvark, altera, brcmstb, designware-host, iproc,
     mediatek, mediatek-gen3, mobiveil, plda, rcar-host, vmd, xilinx,
     xilinx-dma, xilinx-nwl from using pci_msi_create_irq_domain() to
     using msi_create_parent_irq_domain() instead; this makes the
     interrupt controller per-PCI device, allows dynamic allocation of
     vectors after initialization, and allows support of IMS (Nam Cao)

  APM X-Gene PCIe controller driver:

   - Rewrite MSI handling to MSI CPU affinity, drop useless CPU hotplug
     bits, use device-managed memory allocations, and clean things up
     (Marc Zyngier)

   - Probe xgene-msi as a standard platform driver rather than a
     subsys_initcall (Marc Zyngier)

  Broadcom STB PCIe controller driver:

   - Add optional DT 'num-lanes' property and if present, use it to
     override the Maximum Link Width advertised in Link Capabilities
     (Jim Quinlan)

  Cadence PCIe controller driver:

   - Use PCIe Message routing types from the PCI core rather than
     defining private ones (Hans Zhang)

  Freescale i.MX6 PCIe controller driver:

   - Add IMX8MQ_EP third 64-bit BAR in epc_features (Richard Zhu)

   - Add IMX8MM_EP and IMX8MP_EP fixed 256-byte BAR 4 in epc_features
     (Richard Zhu)

   - Configure LUT for MSI/IOMMU in Endpoint mode so Root Complex can
     trigger doorbel on Endpoint (Frank Li)

   - Remove apps_reset (LTSSM_EN) from
     imx_pcie_{assert,deassert}_core_reset(), which fixes a hotplug
     regression on i.MX8MM (Richard Zhu)

   - Delay Endpoint link start until configfs 'start' written (Richard
     Zhu)

  Intel VMD host bridge driver:

   - Add Intel Panther Lake (PTL)-H/P/U Vendor ID (George D Sworo)

  Qualcomm PCIe controller driver:

   - Add DT binding and driver support for SA8255p, which supports ECAM
     for Configuration Space access (Mayank Rana)

   - Update DT binding and driver to describe PHYs and per-Root Port
     resets in a Root Port stanza and deprecate describing them in the
     host bridge; this makes it possible to support multiple Root Ports
     in the future (Krishna Chaitanya Chundru)

   - Add Qualcomm QCS615 to SM8150 DT binding (Ziyue Zhang)

   - Add Qualcomm QCS8300 to SA8775p DT binding (Ziyue Zhang)

   - Drop TBU and ref clocks from Qualcomm SM8150 and SC8180x DT
     bindings (Konrad Dybcio)

   - Document 'link_down' reset in Qualcomm SA8775P DT binding (Ziyue
     Zhang)

   - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ
     (Niklas Cassel)

  Rockchip PCIe controller driver:

   - Drop unused PCIe Message routing and code definitions (Hans Zhang)

   - Remove several unused header includes (Hans Zhang)

   - Use standard PCIe config register definitions instead of
     rockchip-specific redefinitions (Geraldo Nascimento)

   - Set Target Link Speed to 5.0 GT/s before retraining so we have a
     chance to train at a higher speed (Geraldo Nascimento)

  Rockchip DesignWare PCIe controller driver:

   - Prevent race between link training and register update via DBI by
     inhibiting link training after hot reset and link down (Wilfred
     Mallawa)

   - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ
     (Niklas Cassel)

  Sophgo PCIe controller driver:

   - Add DT binding and driver for Sophgo SG2044 PCIe controller driver
     in Root Complex mode (Inochi Amaoto)

  Synopsys DesignWare PCIe controller driver:

   - Add required PCIE_RESET_CONFIG_WAIT_MS after waiting for Link up on
     Ports that support > 5.0 GT/s. Slower Ports still rely on the
     not-quite-correct PCIE_LINK_WAIT_SLEEP_MS 90ms default delay while
     waiting for the Link (Niklas Cassel)"

* tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (116 commits)
  dt-bindings: PCI: qcom,pcie-sa8775p: Document 'link_down' reset
  dt-bindings: PCI: Remove 83xx-512x-pci.txt
  dt-bindings: PCI: Convert amazon,al-alpine-v[23]-pcie to DT schema
  dt-bindings: PCI: Convert marvell,armada-3700-pcie to DT schema
  dt-bindings: PCI: Convert apm,xgene-pcie to DT schema
  dt-bindings: PCI: Convert axis,artpec6-pcie to DT schema
  dt-bindings: PCI: Convert st,spear1340-pcie to DT schema
  PCI: Move is_pciehp check out of pciehp_is_native()
  PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge
  PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge
  PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports
  selftests: pci_endpoint: Add doorbell test case
  misc: pci_endpoint_test: Add doorbell test case
  PCI: endpoint: pci-epf-test: Add doorbell test support
  PCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment
  PCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability
  PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller
  PCI: dwc: Add Sophgo SG2044 PCIe controller driver in Root Complex mode
  PCI: vmd: Switch to msi_create_parent_irq_domain()
  PCI: vmd: Convert to lock guards
  ...
2025-08-01 13:59:07 -07:00
Steven Rostedt
788fa4b47c tracing: Add guard(ring_buffer_nest)
Some calls to the tracing ring buffer can happen when the ring buffer is
already being written to by the same context (for example, a
trace_printk() in between a ring_buffer_lock_reserve() and a
ring_buffer_unlock_commit()).

In order to not trigger the recursion detection, these functions use
ring_buffer_nest_start() and ring_buffer_nest_end(). Create a guard() for
these functions so that their use cases can be simplified and not need to
use goto for the release.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/20250801203857.710501021@kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-08-01 16:49:15 -04:00
Linus Torvalds
877d94c74e Merge tag 'linux-watchdog-6.17-rc1' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:

 - sbsa: Adjust keepalive timeout to avoid MediaTek WS0 race condition

 - Various improvements and fixes

* tag 'linux-watchdog-6.17-rc1' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: sbsa: Adjust keepalive timeout to avoid MediaTek WS0 race condition
  watchdog: dw_wdt: Fix default timeout
  watchdog: Don't use "proxy" headers
  watchdog: it87_wdt: Don't use "proxy" headers
  watchdog: renesas_wdt: Convert to DEFINE_SIMPLE_DEV_PM_OPS()
  watchdog: iTCO_wdt: Report error if timeout configuration fails
  watchdog: rti_wdt: Use of_reserved_mem_region_to_resource() for "memory-region"
  dt-bindings: watchdog: nxp,pnx4008-wdt: allow clocks property
  watchdog: ziirave_wdt: check record length in ziirave_firm_verify()
2025-08-01 13:32:43 -07:00
Linus Torvalds
8582976acc Merge tag 'phy-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
 "New Support:

   - Qualcomm Milos Synopsys eUSB2 PHY, SM8750 QMP phy support, M31
     eUSB2 PHY driver

   - Samsung Exynos990 usbdrd phy, Exynos7870 MIPI phy support

   - Renesas RZ/V2N usb2-phy support

  Updates:

   - Bulk Yaml binding conversion By Rob H (too many to be listed)

   - cadence: Sierra PCIe, USB PHY multilink configuration support

   - Qualcomm refactoring of UFS PHY reset and UFS driver support for
     phy calibrate API"

* tag 'phy-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (74 commits)
  phy: qcom: phy-qcom-m31: Update IPQ5332 M31 USB phy initialization sequence
  dt-bindings: phy: Convert brcm,sr-usb-combo-phy to DT schema
  dt-bindings: phy: Convert ti,da830-usb-phy to DT schema
  dt-bindings: phy: marvell,mmp2-usb-phy: Drop status from the example
  dt-bindings: phy: mixel, mipi-dsi-phy: Allow assigned-clock* properties
  phy: exynos-mipi-video: correct cam0 sysreg property name for exynos7870
  phy: qcom: phy-qcom-snps-eusb2: Update init sequence per HPG 1.0.2
  phy: qcom: phy-qcom-snps-eusb2: Add missing write from init sequence
  dt-bindings: phy: qcom,snps-eusb2: document the Milos Synopsys eUSB2 PHY
  dt-bindings: usb: qcom,snps-dwc3: Add Milos compatible
  phy: rockchip-pcie: Properly disable TEST_WRITE strobe signal
  phy: rockchip-pcie: Enable all four lanes if required
  dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Update pcie phy bindings for QCS615
  phy: qcom: qmp-combo: Add missing PLL (VCO) configuration on SM8750
  phy: qcom: m31-eusb2: drop registration printk
  phy: qcom: m31-eusb2: fix match data santity check
  phy: qcom: qmp-pcie: Update PHY settings for QCS8300 & SA8775P
  phy: qualcomm: phy-qcom-eusb2-repeater: Don't zero-out registers
  dt-bindings: phy: qcom,snps-eusb2-repeater: Remove default tuning values
  phy: mediatek: tphy: Cleanup and document slew calibration
  ...
2025-08-01 12:31:50 -07:00
Linus Torvalds
6fac1139d9 Merge tag 'soundwire-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire updates from Vinod Koul:
 "A couple of small core changes and driver updates:

   - Core: handling of nesting irqs to outside the lock, stream
     parameters handing on port prep failures.

   - AMD driver support for ACP 7.2 platforms and improved handing of
     slave alerts and resume sequences

   - Qualcomm updating driver debug spew

   - Intel BPT message length limitations, rt721 codec as wake capable
     etc"

* tag 'soundwire-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: amd: Add support for acp7.2 platform
  soundwire: stream: restore params when prepare ports fail
  soundwire: debugfs: move debug statement outside of error handling
  soundwire: amd: add check for status update registers
  soundwire: intel_auxdevice: add rt721 codec to wake_capable_list
  soundwire: Correct some property names
  soundwire: update Intel BPT message length limitation
  soundwire: intel_ace2.x: Use str_read_write() helper
  soundwire: amd: cancel pending slave status handling workqueue during remove sequence
  soundwire: amd: serialize amd manager resume sequence during pm_prepare
  soundwire: qcom: demote probe registration printk
  ASoC: cs42l43: Remove unnecessary work functions
  soundwire: Move handle_nested_irq outside of sdw_dev_lock
  MAINTAINERS: Remove Sanyog Kale as reviewer on SoundWire
2025-08-01 11:09:27 -07:00
Linus Torvalds
d6f38c1239 Merge tag 'trace-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:

 - Deprecate auto-mounting tracefs to /sys/kernel/debug/tracing

   When tracefs was first introduced back in 2014, the directory
   /sys/kernel/tracing was added and is the designated location to mount
   tracefs. To keep backward compatibility, tracefs was auto-mounted in
   /sys/kernel/debug/tracing as well.

   All distros now mount tracefs on /sys/kernel/tracing. Having it seen
   in two different locations has lead to various issues and
   inconsistencies.

   The VFS folks have to also maintain debugfs_create_automount() for
   this single user.

   It's been over 10 years. Tooling and scripts should start replacing
   the debugfs location with the tracefs one. The reason tracefs was
   created in the first place was to allow access to the tracing
   facilities without the need to configure debugfs into the kernel.
   Using tracefs should now be more robust.

   A new config is created: CONFIG_TRACEFS_AUTOMOUNT_DEPRECATED which is
   default y, so that the kernel is still built with the automount. This
   config allows those that want to remove the automount from debugfs to
   do so.

   When tracefs is accessed from /sys/kernel/debug/tracing, the
   following printk is triggerd:

     pr_warn("NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030\n");

   This gives users another 5 years to fix their scripts.

 - Use queue_rcu_work() instead of call_rcu() for freeing event filters

   The number of filters to be free can be many depending on the number
   of events within an event system. Freeing them from softirq context
   can potentially cause undesired latency. Use the RCU workqueue to
   free them instead.

 - Remove pointless memory barriers in latency code

   Memory barriers were added to some of the latency code a long time
   ago with the idea of "making them visible", but that's not what
   memory barriers are for. They are to synchronize access between
   different variables. There was no synchronization here making them
   pointless.

 - Remove "__attribute__()" from the type field of event format

   When LLVM is used to compile the kernel with CONFIG_DEBUG_INFO_BTF=y
   and PAHOLE_HAS_BTF_TAG=y, some of the format fields get expanded with
   the following:

     field:const char * filename;      offset:24;      size:8; signed:0;

   Turns into:

     field:const char __attribute__((btf_type_tag("user"))) * filename;      offset:24;      size:8; signed:0;

   This confuses parsers. Add code to strip these tags from the strings.

 - Add eprobe config option CONFIG_EPROBE_EVENTS

   Eprobes were added back in 5.15 but were only enabled when another
   probe was enabled (kprobe, fprobe, uprobe, etc). The eprobes had no
   config option of their own. Add one as they should be a separate
   entity.

   It's default y to keep with the old kernels but still has
   dependencies on TRACING and HAVE_REGS_AND_STACK_ACCESS_API.

 - Add eprobe documentation

   When eprobes were added back in 5.15 no documentation was added to
   describe them. This needs to be rectified.

 - Replace open coded cpumask_next_wrap() in move_to_next_cpu()

 - Have preemptirq_delay_run() use off-stack CPU mask

 - Remove obsolete comment about pelt_cfs event

   DECLARE_TRACE() appends "_tp" to trace events now, but the comment
   above pelt_cfs still mentioned appending it manually.

 - Remove EVENT_FILE_FL_SOFT_MODE flag

   The SOFT_MODE flag was required when the soft enabling and disabling
   of trace events was first introduced. But there was a bug with this
   approach as it only worked for a single instance. When multiple users
   required soft disabling and disabling the code was changed to have a
   ref count. The SOFT_MODE flag is now set iff the ref count is non
   zero. This is redundant and just reading the ref count is good
   enough.

 - Fix typo in comment

* tag 'trace-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  Documentation: tracing: Add documentation about eprobes
  tracing: Have eprobes have their own config option
  tracing: Remove "__attribute__()" from the type field of event format
  tracing: Deprecate auto-mounting tracefs in debugfs
  tracing: Fix comment in trace_module_remove_events()
  tracing: Remove EVENT_FILE_FL_SOFT_MODE flag
  tracing: Remove pointless memory barriers
  tracing/sched: Remove obsolete comment on suffixes
  kernel: trace: preemptirq_delay_test: use offstack cpu mask
  tracing: Use queue_rcu_work() to free filters
  tracing: Replace opencoded cpumask_next_wrap() in move_to_next_cpu()
2025-08-01 10:29:36 -07:00
Linus Torvalds
c6439bfaab Merge tag 'trace-deferred-unwind-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull initial deferred unwind infrastructure from Steven Rostedt:
 "This is the core infrastructure for the deferred unwinder that is
  required for sframes[1]. Several other patch series are based on this
  work although those patch series are not dependent on each other. In
  order to simplify the development, having this core series upstream
  will allow the other series to be worked on in parallel. The other
  series are:

    - The two patches to implement x86 support [2] [3]

    - The s390 work [4]

    - The perf work [5]

    - The ftrace work [6]

    - The sframe work [7]

  And more is on the way.

  The core infrastructure adds the following in kernel APIs:

    - int unwind_user_faultable(struct unwind_stacktrace *trace);

        Performs a user space stack trace that may fault user pages in.

    - int unwind_deferred_init(struct unwind_work *work, unwind_callback_t func);

        Allows a tracer to register with the unwind deferred
        infrastructure.

    - int unwind_deferred_request(struct unwind_work *work, u64 *cookie);

        Used when a tracer request a deferred trace. Can be called from
        interrupt or NMI context.

    - void unwind_deferred_cancel(struct unwind_work *work);

        Called by a tracer to unregister from the deferred unwind
        infrastructure.

    - void unwind_deferred_task_exit(struct task_struct *task);

        Called by task exit code to flush any pending unwind requests.

    - void unwind_task_init(struct task_struct *task);

        Called by do_fork() to initialize the task struct for the
        deferred unwinder.

    - void unwind_task_free(struct task_struct *task);

        Called by do_exit() to free up any resources used by the
        deferred unwinder.

    None of the above is actually compiled unless an architecture enables it,
    which none currently do"

Link: https://sourceware.org/binutils/wiki/sframe [1]
Link: https://lore.kernel.org/linux-trace-kernel/20250717004958.260781923@kernel.org/ [2]
Link: https://lore.kernel.org/linux-trace-kernel/20250717004958.432327787@kernel.org/ [3]
Link: https://lore.kernel.org/linux-trace-kernel/20250710163522.3195293-1-jremus@linux.ibm.com/ [4]
Link: https://lore.kernel.org/linux-trace-kernel/20250718164119.089692174@kernel.org/ [5]
Link: https://lore.kernel.org/linux-trace-kernel/20250424192612.505622711@goodmis.org/ [6]
Link: https://lore.kernel.org/linux-trace-kernel/20250717012848.927473176@kernel.org/ [7]

* tag 'trace-deferred-unwind-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  unwind: Finish up unwind when a task exits
  unwind deferred: Use SRCU unwind_deferred_task_work()
  unwind: Add USED bit to only have one conditional on way back to user space
  unwind deferred: Add unwind_completed mask to stop spurious callbacks
  unwind deferred: Use bitmask to determine which callbacks to call
  unwind_user/deferred: Make unwind deferral requests NMI-safe
  unwind_user/deferred: Add deferred unwinding interface
  unwind_user/deferred: Add unwind cache
  unwind_user/deferred: Add unwind_user_faultable()
  unwind_user: Add user space unwinding API with frame pointer support
2025-08-01 09:46:24 -07:00