Commit Graph

1294501 Commits

Author SHA1 Message Date
Alexander Mikhalitsyn
d561254fb7 fuse: support idmapped FUSE_EXT_GROUPS
We don't need to remap parent_gid, but have to adjust
group membership checks and take idmapping into account.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-09-04 16:48:22 +02:00
Alexander Mikhalitsyn
10dc721836 fuse: add an idmap argument to fuse_simple_request
If idmap == NULL *and* filesystem daemon declared idmapped mounts
support, then uid/gid values in a fuse header will be -1.

No functional changes intended.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-09-04 16:48:22 +02:00
Alexander Mikhalitsyn
aa16880d9f fuse: add basic infrastructure to support idmappings
Add some preparational changes in fuse_get_req/fuse_force_creds
to handle idmappings.

Miklos suggested [1], [2] to change the meaning of in.h.uid/in.h.gid
fields when daemon declares support for idmapped mounts. In a new semantic,
we fill uid/gid values in fuse header with a id-mapped caller uid/gid (for
requests which create new inodes), for all the rest cases we just send -1
to userspace.

No functional changes intended.

Link: https://lore.kernel.org/all/CAJfpegsVY97_5mHSc06mSw79FehFWtoXT=hhTUK_E-Yhr7OAuQ@mail.gmail.com/ [1]
Link: https://lore.kernel.org/all/CAJfpegtHQsEUuFq1k4ZbTD3E1h-GsrN3PWyv7X8cg6sfU_W2Yw@mail.gmail.com/ [2]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-09-04 16:48:22 +02:00
Alexander Mikhalitsyn
2097154a10 namespace: introduce SB_I_NOIDMAP flag
Right now we determine if filesystem support vfs idmappings or not basing
on the FS_ALLOW_IDMAP flag presence. This "static" way works perfecly well
for local filesystems like ext4, xfs, btrfs, etc. But for network-like
filesystems like fuse, cephfs this approach is not ideal, because sometimes
proper support of vfs idmaps requires some extensions for the on-wire
protocol, which implies that changes have to be made not only in the Linux
kernel code but also in the 3rd party components like libfuse, cephfs MDS
server and so on.

We have seen that issue during our work on cephfs idmapped mounts [1] with
Christian, but right now I'm working on the idmapped mounts support for
fuse/virtiofs and I think that it is a right time for this extension.

[1] 5ccd8530dd ("ceph: handle idmapped mounts in create_request_message()")

Suggested-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-09-04 16:47:49 +02:00
Aurelien Aptel
506b21c945 fuse: use correct name fuse_conn_list in docstring
fuse_mount_list doesn't exist, use fuse_conn_list.

Signed-off-by: Aurelien Aptel <aaptel@nvidia.com>
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-29 11:43:13 +02:00
Josef Bacik
396b209e40 fuse: add simple request tracepoints
I've been timing various fuse operations and it's quite annoying to do
with kprobes.  Add two tracepoints for sending and ending fuse requests
to make it easier to debug and time various operations.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-29 11:43:13 +02:00
Joanne Koong
0acad9289b fuse: refactor out shared logic in fuse_writepages_fill() and fuse_writepage_locked()
This change refactors the shared logic in fuse_writepages_fill() and
fuse_writepages_locked() into two separate helper functions,
fuse_writepage_args_page_fill() and fuse_writepage_args_setup().

No functional changes added.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-29 11:43:12 +02:00
Joanne Koong
4046d3adcc fuse: move fuse file initialization to wpa allocation time
Before this change, wpa->ia.ff is initialized with an acquired reference
on the fuse file right before it submits the writeback request. If there
are auxiliary writebacks, then the initialization and reference
acquisition needs to also be set before we submit the auxiliary writeback
request.

To make the logic simpler and to pave the way for a subsequent
refactoring of fuse_writepages_fill() and fuse_writepage_locked(), this
change initializes and acquires wpa->ia.ff when the wpa is allocated.

No functional changes added.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-29 11:43:12 +02:00
Joanne Koong
9a8ebcf5e0 fuse: convert fuse_writepages_fill() to use a folio for its tmp page
To pave the way for refactoring out the shared logic in
fuse_writepages_fill() and fuse_writepage_locked(), this change converts
the temporary page in fuse_writepages_fill() to use the folio API.

This is similar to the change in commit e0887e095a ("fuse: Convert
fuse_writepage_locked to take a folio"), which converted the tmp page in
fuse_writepage_locked() to use the folio API.

inc_node_page_state() is intentionally preserved here instead of
converting to node_stat_add_folio() since it is updating the stat of the
underlying page and to better maintain API symmetry with
dec_node_page_stat() in fuse_writepage_finish_stat().

No functional changes added.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-29 11:43:12 +02:00
Joanne Koong
672c3b7457 fuse: move initialization of fuse_file to fuse_writepages() instead of in callback
Prior to this change, data->ff is checked and if not initialized then
initialized in the fuse_writepages_fill() callback, which gets called
for every dirty page in the address space mapping.

This logic is better placed in the main fuse_writepages() caller where
data.ff is initialized before walking the dirty pages.

No functional changes added.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-29 11:43:12 +02:00
Joanne Koong
c04e3b2118 fuse: refactor finished writeback stats updates into helper function
Move the logic for updating the bdi and page stats for a finished
writeback into a separate helper function, where it can be called from
both fuse_writepage_finish() and fuse_writepage_add() (in the case
where there is already an auxiliary write request for the page).

No functional changes added.

Suggested by: Jingbo Xu <jefflexu@linux.alibaba.com>

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-29 11:43:12 +02:00
Joanne Koong
509a6458b4 fuse: drop unused fuse_mount arg in fuse_writepage_finish()
Drop the unused "struct fuse_mount *fm" arg in
fuse_writepage_finish().

No functional changes added.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-29 11:43:12 +02:00
yangyun
ac5cffec53 fuse: add fast path for fuse_range_is_writeback
In some cases, the fi->writepages may be empty. And there is no need
to check fi->writepages with spin_lock, which may have an impact on
performance due to lock contention. For example, in scenarios where
multiple readers read the same file without any writers, or where
the page cache is not enabled.

Also remove the outdated comment since commit 6b2fb79963 ("fuse:
optimize writepages search") has optimize the situation by replacing
list with rb-tree.

Signed-off-by: yangyun <yangyun50@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-29 11:43:12 +02:00
Miklos Szeredi
5de8acb41c fuse: cleanup request queuing towards virtiofs
Virtiofs has its own queuing mechanism, but still requests are first queued
on fiq->pending to be immediately dequeued and queued onto the virtio
queue.

The queuing on fiq->pending is unnecessary and might even have some
performance impact due to being a contention point.

Forget requests are handled similarly.

Move the queuing of requests and forgets into the fiq->ops->*.
fuse_iqueue_ops are renamed to reflect the new semantics.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Fixed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Tested-by: Peter-Jan Gootzen <pgootzen@nvidia.com>
Reviewed-by: Peter-Jan Gootzen <pgootzen@nvidia.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-29 11:43:12 +02:00
Bernd Schubert
3ab394b363 fuse: disable the combination of passthrough and writeback cache
Current design and handling of passthrough is without fuse
caching and with that FUSE_WRITEBACK_CACHE is conflicting.

Fixes: 7dc4e97a4f ("fuse: introduce FUSE_PASSTHROUGH capability")
Cc: stable@kernel.org # v6.9
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Acked-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-29 11:43:01 +02:00
Joanne Koong
f7790d6778 fuse: update stats for pages in dropped aux writeback list
In the case where the aux writeback list is dropped (e.g. the pages
have been truncated or the connection is broken), the stats for
its pages and backing device info need to be updated as well.

Fixes: e2653bd53a ("fuse: fix leaked aux requests")
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Cc: <stable@vger.kernel.org> # v5.1
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-28 18:10:29 +02:00
Miklos Szeredi
76a51ac00c fuse: clear PG_uptodate when using a stolen page
Originally when a stolen page was inserted into fuse's page cache by
fuse_try_move_page(), it would be marked uptodate.  Then
fuse_readpages_end() would call SetPageUptodate() again on the already
uptodate page.

Commit 413e8f014c ("fuse: Convert fuse_readpages_end() to use
folio_end_read()") changed that by replacing the SetPageUptodate() +
unlock_page() combination with folio_end_read(), which does mostly the
same, except it sets the uptodate flag with an xor operation, which in the
above scenario resulted in the uptodate flag being cleared, which in turn
resulted in EIO being returned on the read.

Fix by clearing PG_uptodate instead of setting it in fuse_try_move_page(),
conforming to the expectation of folio_end_read().

Reported-by: Jürg Billeter <j@bitron.ch>
Debugged-by: Matthew Wilcox <willy@infradead.org>
Fixes: 413e8f014c ("fuse: Convert fuse_readpages_end() to use folio_end_read()")
Cc: <stable@vger.kernel.org> # v6.10
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-28 18:10:29 +02:00
yangyun
3002240d16 fuse: fix memory leak in fuse_create_open
The memory of struct fuse_file is allocated but not freed
when get_create_ext return error.

Fixes: 3e2b6fdbdc ("fuse: send security context of inode on file")
Cc: stable@vger.kernel.org # v5.17
Signed-off-by: yangyun <yangyun50@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-28 18:10:29 +02:00
Joanne Koong
97f30876c9 fuse: check aborted connection before adding requests to pending list for resending
There is a race condition where inflight requests will not be aborted if
they are in the middle of being re-sent when the connection is aborted.

If fuse_resend has already moved all the requests in the fpq->processing
lists to its private queue ("to_queue") and then the connection starts
and finishes aborting, these requests will be added to the pending queue
and remain on it indefinitely.

Fixes: 760eac73f9 ("fuse: Introduce a new notification type for resend pending requests")
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Cc: <stable@vger.kernel.org> # v6.9
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-28 18:10:29 +02:00
Jann Horn
b18915248a fuse: use unsigned type for getxattr/listxattr size truncation
The existing code uses min_t(ssize_t, outarg.size, XATTR_LIST_MAX) when
parsing the FUSE daemon's response to a zero-length getxattr/listxattr
request.
On 32-bit kernels, where ssize_t and outarg.size are the same size, this is
wrong: The min_t() will pass through any size values that are negative when
interpreted as signed.
fuse_listxattr() will then return this userspace-supplied negative value,
which callers will treat as an error value.

This kind of bug pattern can lead to fairly bad security bugs because of
how error codes are used in the Linux kernel. If a caller were to convert
the numeric error into an error pointer, like so:

    struct foo *func(...) {
      int len = fuse_getxattr(..., NULL, 0);
      if (len < 0)
        return ERR_PTR(len);
      ...
    }

then it would end up returning this userspace-supplied negative value cast
to a pointer - but the caller of this function wouldn't recognize it as an
error pointer (IS_ERR_VALUE() only detects values in the narrow range in
which legitimate errno values are), and so it would just be treated as a
kernel pointer.

I think there is at least one theoretical codepath where this could happen,
but that path would involve virtio-fs with submounts plus some weird
SELinux configuration, so I think it's probably not a concern in practice.

Cc: stable@vger.kernel.org # v4.9
Fixes: 63401ccdb2 ("fuse: limit xattr returned size")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-08-28 18:10:29 +02:00
Linus Torvalds
8400291e28 Linux 6.11-rc1 v6.11-rc1 2024-07-28 14:19:55 -07:00
Linus Torvalds
a0c04bd55a Merge tag 'kbuild-fixes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:

 - Fix RPM package build error caused by an incorrect locale setup

 - Mark modules.weakdep as ghost in RPM package

 - Fix the odd combination of -S and -c in stack protector scripts,
   which is an error with the latest Clang

* tag 'kbuild-fixes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: Fix '-S -c' in x86 stack protector scripts
  kbuild: rpm-pkg: ghost modules.weakdep file
  kbuild: rpm-pkg: Fix C locale setup
2024-07-28 14:02:48 -07:00
Linus Torvalds
017fa3e891 minmax: simplify and clarify min_t()/max_t() implementation
This simplifies the min_t() and max_t() macros by no longer making them
work in the context of a C constant expression.

That means that you can no longer use them for static initializers or
for array sizes in type definitions, but there were only a couple of
such uses, and all of them were converted (famous last words) to use
MIN_T/MAX_T instead.

Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-28 13:50:01 -07:00
Linus Torvalds
4477b39c32 minmax: add a few more MIN_T/MAX_T users
Commit 3a7e02c040 ("minmax: avoid overly complicated constant
expressions in VM code") added the simpler MIN_T/MAX_T macros in order
to avoid some excessive expansion from the rather complicated regular
min/max macros.

The complexity of those macros stems from two issues:

 (a) trying to use them in situations that require a C constant
     expression (in static initializers and for array sizes)

 (b) the type sanity checking

and MIN_T/MAX_T avoids both of these issues.

Now, in the whole (long) discussion about all this, it was pointed out
that the whole type sanity checking is entirely unnecessary for
min_t/max_t which get a fixed type that the comparison is done in.

But that still leaves min_t/max_t unnecessarily complicated due to
worries about the C constant expression case.

However, it turns out that there really aren't very many cases that use
min_t/max_t for this, and we can just force-convert those.

This does exactly that.

Which in turn will then allow for much simpler implementations of
min_t()/max_t().  All the usual "macros in all upper case will evaluate
the arguments multiple times" rules apply.

We should do all the same things for the regular min/max() vs MIN/MAX()
cases, but that has the added complexity of various drivers defining
their own local versions of MIN/MAX, so that needs another level of
fixes first.

Link: https://lore.kernel.org/all/b47fad1d0cf8449886ad148f8c013dae@AcuMS.aculab.com/
Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-28 13:41:14 -07:00
Linus Torvalds
7e2d0ba732 Merge tag 'ubifs-for-linus-6.11-rc1-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs
Pull UBI and UBIFS updates from Richard Weinberger:

 - Many fixes for power-cut issues by Zhihao Cheng

 - Another ubiblock error path fix

 - ubiblock section mismatch fix

 - Misc fixes all over the place

* tag 'ubifs-for-linus-6.11-rc1-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  ubi: Fix ubi_init() ubiblock_exit() section mismatch
  ubifs: add check for crypto_shash_tfm_digest
  ubifs: Fix inconsistent inode size when powercut happens during appendant writing
  ubi: block: fix null-pointer-dereference in ubiblock_create()
  ubifs: fix kernel-doc warnings
  ubifs: correct UBIFS_DFS_DIR_LEN macro definition and improve code clarity
  mtd: ubi: Restore missing cleanup on ubi_init() failure path
  ubifs: dbg_orphan_check: Fix missed key type checking
  ubifs: Fix unattached inode when powercut happens in creating
  ubifs: Fix space leak when powercut happens in linking tmpfile
  ubifs: Move ui->data initialization after initializing security
  ubifs: Fix adding orphan entry twice for the same inode
  ubifs: Remove insert_dead_orphan from replaying orphan process
  Revert "ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path"
  ubifs: Don't add xattr inode into orphan area
  ubifs: Fix unattached xattr inode if powercut happens after deleting
  mtd: ubi: avoid expensive do_div() on 32-bit machines
  mtd: ubi: make ubi_class constant
  ubi: eba: properly rollback inside self_check_eba
2024-07-28 11:51:51 -07:00
Nathan Chancellor
3415b10a03 kbuild: Fix '-S -c' in x86 stack protector scripts
After a recent change in clang to stop consuming all instances of '-S'
and '-c' [1], the stack protector scripts break due to the kernel's use
of -Werror=unused-command-line-argument to catch cases where flags are
not being properly consumed by the compiler driver:

  $ echo | clang -o - -x c - -S -c -Werror=unused-command-line-argument
  clang: error: argument unused during compilation: '-c' [-Werror,-Wunused-command-line-argument]

This results in CONFIG_STACKPROTECTOR getting disabled because
CONFIG_CC_HAS_SANE_STACKPROTECTOR is no longer set.

'-c' and '-S' both instruct the compiler to stop at different stages of
the pipeline ('-S' after compiling, '-c' after assembling), so having
them present together in the same command makes little sense. In this
case, the test wants to stop before assembling because it is looking at
the textual assembly output of the compiler for either '%fs' or '%gs',
so remove '-c' from the list of arguments to resolve the error.

All versions of GCC continue to work after this change, along with
versions of clang that do or do not contain the change mentioned above.

Cc: stable@vger.kernel.org
Fixes: 4f7fd4d7a7 ("[PATCH] Add the -fstack-protector option to the CFLAGS")
Fixes: 60a5317ff0 ("x86: implement x86_32 stack protector")
Link: 6461e53781 [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-29 03:47:00 +09:00
Richard Weinberger
92a286e902 ubi: Fix ubi_init() ubiblock_exit() section mismatch
Since ubiblock_exit() is now called from an init function,
the __exit section no longer makes sense.

Cc: Ben Hutchings <bwh@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407131403.wZJpd8n2-lkp@intel.com/
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
2024-07-28 20:08:25 +02:00
Linus Torvalds
e172f1e906 Merge tag 'v6.11-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown:

 - Enable turbostat extensions to add both perf and PMT (Intel
   Platform Monitoring Technology) counters via the cmdline

 - Demonstrate PMT access with built-in support for Meteor Lake's
   Die C6 counter

* tag 'v6.11-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: version 2024.07.26
  tools/power turbostat: Include umask=%x in perf counter's config
  tools/power turbostat: Document PMT in turbostat.8
  tools/power turbostat: Add MTL's PMT DC6 builtin counter
  tools/power turbostat: Add early support for PMT counters
  tools/power turbostat: Add selftests for added perf counters
  tools/power turbostat: Add selftests for SMI, APERF and MPERF counters
  tools/power turbostat: Move verbose counter messages to level 2
  tools/power turbostat: Move debug prints from stdout to stderr
  tools/power turbostat: Fix typo in turbostat.8
  tools/power turbostat: Add perf added counter example to turbostat.8
  tools/power turbostat: Fix formatting in turbostat.8
  tools/power turbostat: Extend --add option with perf counters
  tools/power turbostat: Group SMI counter with APERF and MPERF
  tools/power turbostat: Add ZERO_ARRAY for zero initializing builtin array
  tools/power turbostat: Replace enum rapl_source and cstate_source with counter_source
  tools/power turbostat: Remove anonymous union from rapl_counter_info_t
  tools/power/turbostat: Switch to new Intel CPU model defines
2024-07-28 10:52:15 -07:00
Linus Torvalds
e62f81bbd2 Merge tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL updates from Dave Jiang:
 "Core:

   - A CXL maturity map has been added to the documentation to detail
     the current state of CXL enabling.

     It provides the status of the current state of various CXL features
     to inform current and future contributors of where things are and
     which areas need contribution.

   - A notifier handler has been added in order for a newly created CXL
     memory region to trigger the abstract distance metrics calculation.

     This should bring parity for CXL memory to the same level vs
     hotplugged DRAM for NUMA abstract distance calculation. The
     abstract distance reflects relative performance used for memory
     tiering handling.

   - An addition for XOR math has been added to address the CXL DPA to
     SPA translation.

     CXL address translation did not support address interleave math
     with XOR prior to this change.

  Fixes:

   - Fix to address race condition in the CXL memory hotplug notifier

   - Add missing MODULE_DESCRIPTION() for CXL modules

   - Fix incorrect vendor debug UUID define

  Misc:

   - A warning has been added to inform users of an unsupported
     configuration when mixing CXL VH and RCH/RCD hierarchies

   - The ENXIO error code has been replaced with EBUSY for inject poison
     limit reached via debugfs and cxl-test support

   - Moving the PCI config read in cxl_dvsec_rr_decode() to avoid
     unnecessary PCI config reads

   - A refactor to a common struct for DRAM and general media CXL
     events"

* tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/core/pci: Move reading of control register to immediately before usage
  cxl: Remove defunct code calculating host bridge target positions
  cxl/region: Verify target positions using the ordered target list
  cxl: Restore XOR'd position bits during address translation
  cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()
  cxl/test: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/acpi: Warn on mixed CXL VH and RCH/RCD Hierarchy
  cxl/core: Fix incorrect vendor debug UUID define
  Documentation: CXL Maturity Map
  cxl/region: Simplify cxl_region_nid()
  cxl/region: Support to calculate memory tier abstract distance
  cxl/region: Fix a race condition in memory hotplug notifier
  cxl: add missing MODULE_DESCRIPTION() macros
  cxl/events: Use a common struct for DRAM and General Media events
2024-07-28 09:33:28 -07:00
Linus Torvalds
7b5d481889 Merge tag 'unicode-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode
Pull unicode update from Gabriel Krisman Bertazi:
 "Two small fixes to silence the compiler and static analyzers tools
  from Ben Dooks and Jeff Johnson"

* tag 'unicode-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode:
  unicode: add MODULE_DESCRIPTION() macros
  unicode: make utf8 test count static
2024-07-28 09:14:11 -07:00
Jose Ignacio Tornos Martinez
d01c14074b kbuild: rpm-pkg: ghost modules.weakdep file
In the same way as for other similar files, mark as ghost the new file
generated by depmod for configured weak dependencies for modules,
modules.weakdep, so that although it is not included in the package,
claim the ownership on it.

Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-28 17:07:03 +09:00
Linus Torvalds
5437f30d34 Merge tag '6.11-rc-smb-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull more smb client updates from Steve French:

 - fix for potential null pointer use in init cifs

 - additional dynamic trace points to improve debugging of some common
   scenarios

 - two SMB1 fixes (one addressing reconnect with POSIX extensions, one a
   mount parsing error)

* tag '6.11-rc-smb-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: add dynamic trace point for session setup key expired failures
  smb3: add four dynamic tracepoints for copy_file_range and reflink
  smb3: add dynamic tracepoint for reflink errors
  cifs: mount with "unix" mount option for SMB1 incorrectly handled
  cifs: fix reconnect with SMB1 UNIX Extensions
  cifs: fix potential null pointer use in destroy_workqueue in init_cifs error path
2024-07-27 20:08:07 -07:00
Linus Torvalds
6342649c33 Merge tag 'block-6.11-20240726' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:

 - NVMe pull request via Keith:
     - Fix request without payloads cleanup  (Leon)
     - Use new protection information format (Francis)
     - Improved debug message for lost pci link (Bart)
     - Another apst quirk (Wang)
     - Use appropriate sysfs api for printing chars (Markus)

 - ublk async device deletion fix (Ming)

 - drbd kerneldoc fixups (Simon)

 - Fix deadlock between sd removal and release (Yang)

* tag 'block-6.11-20240726' of git://git.kernel.dk/linux:
  nvme-pci: add missing condition check for existence of mapped data
  ublk: fix UBLK_CMD_DEL_DEV_ASYNC handling
  block: fix deadlock between sd_remove & sd_release
  drbd: Add peer_device to Kernel doc
  nvme-core: choose PIF from QPIF if QPIFS supports and PIF is QTYPE
  nvme-pci: Fix the instructions for disabling power management
  nvme: remove redundant bdev local variable
  nvme-fabrics: Use seq_putc() in __nvmf_concat_opt_tokens()
  nvme/pci: Add APST quirk for Lenovo N60z laptop
2024-07-27 15:28:53 -07:00
Linus Torvalds
8c93074743 Merge tag 'io_uring-6.11-20240726' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:

 - Fix a syzbot issue for the msg ring cache added in this release. No
   ill effects from this one, but it did make KMSAN unhappy (me)

 - Sanitize the NAPI timeout handling, by unifying the value handling
   into all ktime_t rather than converting back and forth (Pavel)

 - Fail NAPI registration for IOPOLL rings, it's not supported (Pavel)

 - Fix a theoretical issue with ring polling and cancelations (Pavel)

 - Various little cleanups and fixes (Pavel)

* tag 'io_uring-6.11-20240726' of git://git.kernel.dk/linux:
  io_uring/napi: pass ktime to io_napi_adjust_timeout
  io_uring/napi: use ktime in busy polling
  io_uring/msg_ring: fix uninitialized use of target_req->flags
  io_uring: align iowq and task request error handling
  io_uring: kill REQ_F_CANCEL_SEQ
  io_uring: simplify io_uring_cmd return
  io_uring: fix io_match_task must_hold
  io_uring: don't allow netpolling with SETUP_IOPOLL
  io_uring: tighten task exit cancellations
2024-07-27 15:22:33 -07:00
Linus Torvalds
bc4eee85ca Merge tag 'vfs-6.11-rc1.fixes.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
 "This contains two fixes for this merge window:

  VFS:

   - I noticed that it is possible for a privileged user to mount most
     filesystems with a non-initial user namespace in sb->s_user_ns.

     When fsopen() is called in a non-init namespace the caller's
     namespace is recorded in fs_context->user_ns. If the returned file
     descriptor is then passed to a process privileged in init_user_ns,
     that process can call fsconfig(fd_fs, FSCONFIG_CMD_CREATE*),
     creating a new superblock with sb->s_user_ns set to the namespace
     of the process which called fsopen().

     This is problematic as only filesystems that raise FS_USERNS_MOUNT
     are known to be able to support a non-initial s_user_ns. Others may
     suffer security issues, on-disk corruption or outright crash the
     kernel. Prevent that by restricting such delegation to filesystems
     that allow FS_USERNS_MOUNT.

     Note, that this delegation requires a privileged process to
     actually create the superblock so either the privileged process is
     cooperaing or someone must have tricked a privileged process into
     operating on a fscontext file descriptor whose origin it doesn't
     know (a stupid idea).

     The bug dates back to about 5 years afaict.

  Misc:

   - Fix hostfs parsing when the mount request comes in via the legacy
     mount api.

     In the legacy mount api hostfs allows to specify the host directory
     mount without any key.

     Restore that behavior"

* tag 'vfs-6.11-rc1.fixes.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  hostfs: fix the host directory parse when mounting.
  fs: don't allow non-init s_user_ns for filesystems without FS_USERNS_MOUNT
2024-07-27 15:11:59 -07:00
Linus Torvalds
910bfc26d1 Merge tag 'rust-6.11' of https://github.com/Rust-for-Linux/linux
Pull Rust updates from Miguel Ojeda:
 "The highlight is the establishment of a minimum version for the Rust
  toolchain, including 'rustc' (and bundled tools) and 'bindgen'.

  The initial minimum will be the pinned version we currently have, i.e.
  we are just widening the allowed versions. That covers three stable
  Rust releases: 1.78.0, 1.79.0, 1.80.0 (getting released tomorrow),
  plus beta, plus nightly.

  This should already be enough for kernel developers in distributions
  that provide recent Rust compiler versions routinely, such as Arch
  Linux, Debian Unstable (outside the freeze period), Fedora Linux,
  Gentoo Linux (especially the testing channel), Nix (unstable) and
  openSUSE Slowroll and Tumbleweed.

  In addition, the kernel is now being built-tested by Rust's pre-merge
  CI. That is, every change that is attempting to land into the Rust
  compiler is tested against the kernel, and it is merged only if it
  passes. Similarly, the bindgen tool has agreed to build the kernel in
  their CI too.

  Thus, with the pre-merge CI in place, both projects hope to avoid
  unintentional changes to Rust that break the kernel. This means that,
  in general, apart from intentional changes on their side (that we will
  need to workaround conditionally on our side), the upcoming Rust
  compiler versions should generally work.

  In addition, the Rust project has proposed getting the kernel into
  stable Rust (at least solving the main blockers) as one of its three
  flagship goals for 2024H2 [1].

  I would like to thank Niko, Sid, Emilio et al. for their help
  promoting the collaboration between Rust and the kernel.

  Toolchain and infrastructure:

   - Support several Rust toolchain versions.

   - Support several bindgen versions.

   - Remove 'cargo' requirement and simplify 'rusttest', thanks to
     'alloc' having been dropped last cycle.

   - Provide proper error reporting for the 'rust-analyzer' target.

  'kernel' crate:

   - Add 'uaccess' module with a safe userspace pointers abstraction.

   - Add 'page' module with a 'struct page' abstraction.

   - Support more complex generics in workqueue's 'impl_has_work!'
     macro.

  'macros' crate:

   - Add 'firmware' field support to the 'module!' macro.

   - Improve 'module!' macro documentation.

  Documentation:

   - Provide instructions on what packages should be installed to build
     the kernel in some popular Linux distributions.

   - Introduce the new kernel.org LLVM+Rust toolchains.

   - Explain '#[no_std]'.

  And a few other small bits"

Link: https://rust-lang.github.io/rust-project-goals/2024h2/index.html#flagship-goals [1]

* tag 'rust-6.11' of https://github.com/Rust-for-Linux/linux: (26 commits)
  docs: rust: quick-start: add section on Linux distributions
  rust: warn about `bindgen` versions 0.66.0 and 0.66.1
  rust: start supporting several `bindgen` versions
  rust: work around `bindgen` 0.69.0 issue
  rust: avoid assuming a particular `bindgen` build
  rust: start supporting several compiler versions
  rust: simplify Clippy warning flags set
  rust: relax most deny-level lints to warnings
  rust: allow `dead_code` for never constructed bindings
  rust: init: simplify from `map_err` to `inspect_err`
  rust: macros: indent list item in `paste!`'s docs
  rust: add abstraction for `struct page`
  rust: uaccess: add typed accessors for userspace pointers
  uaccess: always export _copy_[from|to]_user with CONFIG_RUST
  rust: uaccess: add userspace pointers
  kbuild: rust-analyzer: improve comment documentation
  kbuild: rust-analyzer: better error handling
  docs: rust: no_std is used
  rust: alloc: add __GFP_HIGHMEM flag
  rust: alloc: fix typo in docs for GFP_NOWAIT
  ...
2024-07-27 13:44:54 -07:00
Linus Torvalds
ff30564411 Merge tag 'apparmor-pr-2024-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor updates from John Johansen:
 "Cleanups
   - optimization: try to avoid refing the label in apparmor_file_open
   - remove useless static inline function is_deleted
   - use kvfree_sensitive to free data->data
   - fix typo in kernel doc

  Bug fixes:
   - unpack transition table if dfa is not present
   - test: add MODULE_DESCRIPTION()
   - take nosymfollow flag into account
   - fix possible NULL pointer dereference
   - fix null pointer deref when receiving skb during sock creation"

* tag 'apparmor-pr-2024-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
  apparmor: unpack transition table if dfa is not present
  apparmor: try to avoid refing the label in apparmor_file_open
  apparmor: test: add MODULE_DESCRIPTION()
  apparmor: take nosymfollow flag into account
  apparmor: fix possible NULL pointer dereference
  apparmor: fix typo in kernel doc
  apparmor: remove useless static inline function is_deleted
  apparmor: use kvfree_sensitive to free data->data
  apparmor: Fix null pointer deref when receiving skb during sock creation
2024-07-27 13:28:39 -07:00
Linus Torvalds
86b405ad8d Merge tag 'landlock-6.11-rc1-houdini-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock fix from Mickaël Salaün:
 "Jann Horn reported a sandbox bypass for Landlock. This includes the
  fix and new tests. This should be backported"

* tag 'landlock-6.11-rc1-houdini-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  selftests/landlock: Add cred_transfer test
  landlock: Don't lose track of restrictions on cred_transfer
2024-07-27 13:16:53 -07:00
Linus Torvalds
8e333791d4 Merge tag 'gpio-fixes-for-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fix from Bartosz Golaszewski:

 - don't use sprintf() with non-constant format string

* tag 'gpio-fixes-for-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: virtuser: avoid non-constant format string
2024-07-27 12:54:06 -07:00
Linus Torvalds
bf80f1391a Merge tag 'devicetree-fixes-for-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull more devicetree updates from Rob Herring:
 "Most of this is a treewide change to of_property_for_each_u32() which
  was small enough to do in one go before rc1 and avoids the need to
  create of_property_for_each_u32_some_new_name().

   - Treewide conversion of of_property_for_each_u32() to drop internal
     arguments making struct property opaque

   - Add binding for Amlogic A4 SoC watchdog

   - Fix constraints for AD7192 'single-channel' property"

* tag 'devicetree-fixes-for-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: iio: adc: ad7192: Fix 'single-channel' constraints
  of: remove internal arguments from of_property_for_each_u32()
  dt-bindings: watchdog: add support for Amlogic A4 SoCs
2024-07-27 12:46:16 -07:00
Linus Torvalds
b465ed28f7 Merge tag 'iommu-fixes-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Will Deacon:
 "We're still resolving a regression with the handling of unexpected
  page faults on SMMUv3, but we're not quite there with a fix yet.

   - Fix NULL dereference when freeing domain in Unisoc SPRD driver

   - Separate assignment statements with semicolons in AMD page-table
     code

   - Fix Tegra erratum workaround when the CPU is using 16KiB pages"

* tag 'iommu-fixes-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
  iommu: arm-smmu: Fix Tegra workaround for PAGE_SIZE mappings
  iommu/amd: Convert comma to semicolon
  iommu: sprd: Avoid NULL deref in sprd_iommu_hw_en
2024-07-27 12:39:55 -07:00
Linus Torvalds
0421621158 Merge tag 'firewire-fixes-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fixes from Takashi Sakamoto:
 "The recent integration of compiler collections introduced the
  technology to check flexible array length at runtime by providing
  proper annotations. In v6.10 kernel, a patch was merged into firewire
  subsystem to utilize it, however the annotation was inadequate.

  There is also the related change for the flexible array in sound
  subsystem, but it causes a regression where the data in the payload of
  isochronous packet is incorrect for some devices. These bugs are now
  fixed"

* tag 'firewire-fixes-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  ALSA: firewire-lib: fix wrong value as length of header for CIP_NO_HEADER case
  Revert "firewire: Annotate struct fw_iso_packet with __counted_by()"
2024-07-27 12:35:12 -07:00
Linus Torvalds
ab11658f26 Merge tag 'spi-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "The bulk of this is a series of fixes for the microchip-core driver
  mostly originating from one of their customers, I also applied an
  additional patch adding support for controlling the word size which
  came along with it since it's still the merge window and clearly had a
  bunch of fairly thorough testing.

  We also have a fix for the compatible used to bind spidev to the
  BH2228FV"

* tag 'spi-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spidev: add correct compatible for Rohm BH2228FV
  dt-bindings: trivial-devices: fix Rohm BH2228FV compatible string
  spi: microchip-core: add support for word sizes of 1 to 32 bits
  spi: microchip-core: ensure TX and RX FIFOs are empty at start of a transfer
  spi: microchip-core: fix init function not setting the master and motorola modes
  spi: microchip-core: only disable SPI controller when register value change requires it
  spi: microchip-core: defer asserting chip select until just before write to TX FIFO
  spi: microchip-core: fix the issues in the isr
2024-07-27 12:29:10 -07:00
Linus Torvalds
560e805047 Merge tag 'regulator-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
 "These two commits clean up the excessively loose dependencies for the
  RZG2L USB VBCTRL regulator driver, ensuring it shouldn't prompt for
  people who can't use it"

* tag 'regulator-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: Further restrict RZG2L USB VBCTRL regulator dependencies
  regulator: renesas-usb-vbus-regulator: Update the default
2024-07-27 12:27:52 -07:00
Linus Torvalds
8f3f7598cb Merge tag 'regmap-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
 "Arnd sent a workaround for a false positive warning which was showing
  up with GCC 14.1"

* tag 'regmap-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: maple: work around gcc-14.1 false-positive warning
2024-07-27 12:26:09 -07:00
Linus Torvalds
de5f4fbe7b Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
 "A few clk driver fixes for the merge window to fix the build and boot
  on some SoCs.

   - Initialize struct clk_init_data in the TI da8xx-cfgchip driver so
     that stack contents aren't used for things like clk flags leading
     to unexpected behavior

   - Don't leak stack contents in a debug print in the new Sophgo clk
     driver

   - Disable the new T-Head clk driver on 32-bit targets to fix the
     build due to a division

   - Fix Samsung Exynos4 fin_pll wreckage from the clkdev rework done
     last cycle by using a struct clk_hw directly instead of a struct
     clk consumer"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: samsung: fix getting Exynos4 fin_pll rate from external clocks
  clk: T-Head: Disable on 32-bit Targets
  clk: sophgo: clk-sg2042-pll: Fix uninitialized variable in debug output
  clk: davinci: da8xx-cfgchip: Initialize clk_init_data before use
2024-07-27 12:07:18 -07:00
Linus Torvalds
c85e1497dd Merge tag 'i3c/for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Pull i3c updates from Alexandre Belloni:
 "This cycle, there are new features for the Designware controller and
  fixes for the other IPs:

   - dw: optional apb clock and power management support, IBI handling
     fixes

   - mipi-i3c-hci: IBI handling fixes

   - svc: a few fixes"

* tag 'i3c/for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
  dt-bindings: i3c: add header for generic I3C flags
  i3c: master: svc: Fix error code in svc_i3c_master_do_daa_locked()
  i3c: master: Enhance i3c_bus_type visibility for device searching & event monitoring
  i3c: dw: Add power management support
  i3c: dw: Add some functions for reusability
  i3c: dw: Save timing registers and other values
  i3c: master: svc: Improve DAA STOP handle code logic
  i3c: dw: Add optional apb clock
  i3c: dw: Use new *_enabled clk API
  dt-bindings: i3c: dw: Add apb clock binding
  i3c: master: svc: Convert comma to semicolon
  i3c: mipi-i3c-hci: Round IBI data chunk size to HW supported value
  i3c: mipi-i3c-hci: Error out instead on BUG_ON() in IBI DMA setup
  i3c: mipi-i3c-hci: Set IBI Status and Data Ring base addresses
  i3c: mipi-i3c-hci: Switch to lower_32_bits()/upper_32_bits() helpers
  i3c: dw: Remove ibi_capable property
  i3c: dw: Fix IBI intr programming
  i3c: dw: Fix clearing queue thld
  i3c: mipi-i3c-hci: Fix number of DAT/DCT entries for HCI versions < 1.1
  i3c: master: svc: resend target address when get NACK
2024-07-27 10:53:06 -07:00
Linus Torvalds
1fcaa5db40 Merge tag 'thermal-6.11-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
 "Prevent the thermal core from flooding the kernel log with useless
  messages if thermal zone temperature can never be determined (or its
  sensor has failed permanently) and make it finally give up and disable
  defective thermal zones (Rafael Wysocki)"

* tag 'thermal-6.11-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: core: Back off when polling thermal zones on errors
  thermal: trip: Split thermal_zone_device_set_mode()
2024-07-27 10:44:49 -07:00
Linus Torvalds
7b0acd911c Merge tag 'mm-hotfixes-stable-2024-07-26-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
 "11 hotfixes, 7 of which are cc:stable.  7 are MM, 4 are other"

* tag 'mm-hotfixes-stable-2024-07-26-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  nilfs2: handle inconsistent state in nilfs_btnode_create_block()
  selftests/mm: skip test for non-LPA2 and non-LVA systems
  mm/page_alloc: fix pcp->count race between drain_pages_zone() vs __rmqueue_pcplist()
  mm: memcg: add cacheline padding after lruvec in mem_cgroup_per_node
  alloc_tag: outline and export free_reserved_page()
  decompress_bunzip2: fix rare decompression failure
  mm/huge_memory: avoid PMD-size page cache if needed
  mm: huge_memory: use !CONFIG_64BIT to relax huge page alignment on 32 bit machines
  mm: fix old/young bit handling in the faulting path
  dt-bindings: arm: update James Clark's email address
  MAINTAINERS: mailmap: update James Clark's email address
2024-07-27 10:26:41 -07:00
Linus Torvalds
5256184b61 Merge tag 'timers-urgent-2024-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer migration updates from Thomas Gleixner:
 "Fixes and minor updates for the timer migration code:

   - Stop testing the group->parent pointer as it is not guaranteed to
     be stable over a chain of operations by design.

     This includes a warning which would be nice to have but it produces
     false positives due to the racy nature of the check.

   - Plug a race between CPUs going in and out of idle and a CPU hotplug
     operation. The latter can create and connect a new hierarchy level
     which is missed in the concurrent updates of CPUs which go into
     idle. As a result the events of such a CPU might not be processed
     and timers go stale.

     Cure it by splitting the hotplug operation into a prepare and
     online callback. The prepare callback is guaranteed to run on an
     online and therefore active CPU. This CPU updates the hierarchy and
     being online ensures that there is always at least one migrator
     active which handles the modified hierarchy correctly when going
     idle. The online callback which runs on the incoming CPU then just
     marks the CPU active and brings it into operation.

   - Improve tracing and polish the code further so it is more obvious
     what's going on"

* tag 'timers-urgent-2024-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timers/migration: Fix grammar in comment
  timers/migration: Spare write when nothing changed
  timers/migration: Rename childmask by groupmask to make naming more obvious
  timers/migration: Read childmask and parent pointer in a single place
  timers/migration: Use a single struct for hierarchy walk data
  timers/migration: Improve tracing
  timers/migration: Move hierarchy setup into cpuhotplug prepare callback
  timers/migration: Do not rely always on group->parent
2024-07-27 10:19:55 -07:00