16147 Commits

Author SHA1 Message Date
Miri Korenblit
1f1101c29e wifi: nl80211: add support for NAN stations
There are 2 types of logical links with a NAN peer:
- management (NMI), which is used for Tx/Rx of NAN management frames.
- data (NDI), which is used for Tx/Rx of data frames, or non-NAN
  management frames.

The NMI station has two roles:
- representation of the NAN peer - for example, the peer's schedule
  and the HT, VHT, HE capabilities - belong to the NMI station, and not to
  the NDI ones.
- Tx/Rx of NAN management frames to/from the peer.

The NDI station is used for Tx/Rx data frames of a specific NDP that was
established with the NAN peer.

Note that a peer can choose to reuse its NMI address as the NDI address.
In that case, it is expected that two stations will be added even though
they will have the same address.

- An NDI station can only be added after the corresponding NMI station
  was configured with capabilities.
- All the NDI stations will be removed before the NDI interface is brought
  down.
- All NMI stations will be removed before NAN is stopped.
- Before NMI sta removal, all corresponding NDI stations will be removed

Add support for adding, removing, and changing NMI and NDI stations.

Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260219114327.d280936ee832.I6d859eee759bb5824a9ffd2984410faf879ba00e@changeid
Link: https://patch.msgid.link/20260318123926.206536-6-miriam.rachel.korenblit@intel.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-25 20:56:54 +01:00
Miri Korenblit
bd11c96604 wifi: cfg80211: separately store HT, VHT and HE capabilities for NAN
In NAN, unlike in other modes, there is only one set of (HT, VHT, HE)
capabilities that is used for all channels (and bands) used in the NAN
data path.

This set of capabilities will have to be a special one, for example - have
the minimum of (HT-for-5 GHz, HT-for-2.4 GHz), careful handling of the
bits that have a different meaning for each band, etc.

While we could use the exiting sband/iftype capabilities, and require
identical capabilities for all bands (makes no sense since this means
that we will have VHT capabilities in the 2.4 GHz slot),
or require that only one of the sbands will be set,
or have logic to extract the minimum and handle the conflicting bits -
it seems simpler to add a dedicated set of capabilities which is special
for NAN, and is band agnostic, to be populated by the driver.

That way we also let the driver decide how it wants to handle the
conflicting bits.

Add this special set of these capabilities to wiphy:nan_capabilities, to be
populated by the driver.
Send it to user space.

Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260219114327.4b6f3e4a81b4.I45422adc0df3ad4101d857a92e83f0de5cf241e1@changeid
Link: https://patch.msgid.link/20260318123926.206536-5-miriam.rachel.korenblit@intel.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-25 20:56:54 +01:00
Miri Korenblit
0e8ec738a7 wifi: cfg80211: add support for NAN data interface
This new interface type represents a NAN data interface (NDI).
It is used for data communication with NAN peers.

Note that the existing NL80211_IFTYPE_NAN interface, which is the NAN
Management Interface (NMI), is used for management communication.

An NDI interface is started when a new NAN data path is about to
be established, and is stopped after the NAN data path is terminated.

- An NDI interface can only be started if the NMI is running, and NAN is
  started.
- Before the NMI is stopped, the NDI interfaces will be stopped.

Add the new interface type, handle add/remove operations for it,
and makes sure of the conditions above.

Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260219114327.0d681335c2e2.I92973483e927820ae2297853c141842fdb262747@changeid
Link: https://patch.msgid.link/20260318123926.206536-4-miriam.rachel.korenblit@intel.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-25 20:56:53 +01:00
Miri Korenblit
6e78b70c9a wifi: cfg80211: Add an API to configure local NAN schedule
Add an nl80211 API to allow user space to configure the local NAN
schedule.
The local schedule consists of a list of channel definitions and a schedule
map, in which each element covers a time slot and indicates on what
channel the device should be in that time slot.

Channels can be added to schedule even without being scheduled, for
reservation purposes.

A schedule can be configured either immedietally or be deferred, in case
there are already connected peers.
When the deferred flag is set, the command is a request from the device
to perform an announced schedule update: send the updated NAN
Availability - as set in this command - to the peers, and do the
actual switch to the new schedule on the right time (i.e. at the end of
the slot after the slot in which the update was sent to the peers).
In addition, a notification will be sent to indicate a deferred update
completion.

Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260219114327.ecca178a2de0.Ic977ab08b4ed5cf9b849e55d3a59b01ad3fbd08e@changeid
Link: https://patch.msgid.link/20260318123926.206536-2-miriam.rachel.korenblit@intel.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-25 20:56:05 +01:00
Max Zhen
1f513a3ec3 accel/amdxdna: Add per-process BO memory usage query support
Add support for querying per-process buffer object (BO) memory
usage through the amdxdna GET_ARRAY UAPI.

Introduce a new query type, DRM_AMDXDNA_BO_USAGE, along with
struct amdxdna_drm_bo_usage to report BO memory usage statistics,
including heap, total, and internal usage.

Track BO memory usage on a per-client basis by maintaining counters
in GEM open/close and heap allocation/free paths. This ensures the
reported statistics reflect the current memory footprint of each
process.

Wire the new query into the GET_ARRAY implementation to expose
the usage information to userspace.

Link: 0546f2aaad
Signed-off-by: Max Zhen <max.zhen@amd.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260324163159.2425461-1-lizhi.hou@amd.com
2026-03-25 11:41:24 -07:00
Steven Rostedt
dc1d9408c9 Merge commit 'f35dbac6942171dc4ce9398d1d216a59224590a9' into trace/ring-buffer/core
The commit f35dbac694 ("ring-buffer: Fix to update per-subbuf entries of
persistent ring buffer") was a fix and merged upstream. It is needed for
some other work in the ring buffer. The current branch has the remote
buffer code that is shared with the Arm64 subsystem and can't be rebased.

Merge in the upstream commit to allow continuing of the ring buffer work.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-24 22:19:21 -04:00
Thomas Weißschuh
f9909cf0a2 module: Move 'struct module_signature' to UAPI
This structure definition is used outside the kernel proper.
For example in kmod and the kernel build environment.

To allow reuse, move it to a new UAPI header.

While it is not a true UAPI, it is a common practice to have
non-UAPI interface definitions in the kernel's UAPI headers.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2026-03-24 21:42:37 +00:00
Randy Dunlap
fd4cb4511b tracing: trace_mmap.h: fix a kernel-doc warning
Add a description of struct reader to resolve a kernel-doc warning:

Warning: include/uapi/linux/trace_mmap.h:43 struct member 'reader' not described in 'trace_buffer_meta'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-24 12:19:57 -04:00
Marcin Slusarz
1dfa2fd98f drm/panthor: extend timestamp query with flags
Flags now control which data user space wants to query,
there is more information sources, and there's ability
to query duration of multiple timestamp reads.

New sources:
- CPU's monotonic,
- CPU's monotonic raw,
- GPU's cycle count

These changes should make the implementation of
VK_KHR_calibrated_timestamps more accurate and much simpler.

Signed-off-by: Marcin Slusarz <marcin.slusarz@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patch.msgid.link/20260324132557.1707286-1-marcin.slusarz@arm.com
2026-03-24 16:04:40 +00:00
Nicolin Chen
9dcef98dbe iommu/tegra241-cmdqv: Update uAPI to clarify HYP_OWN requirement
>From hardware implementation perspective, a guest tegra241-cmdqv hardware
is different than the host hardware:
 - Host HW is backed by a VINTF (HYP_OWN=1)
 - Guest HW is backed by a VINTF (HYP_OWN=0)

The kernel driver has an implementation requirement of the HYP_OWN bit in
the VM. So, VMM must follow that to allow the same copy of Linux to work.

Add this requirement to the uAPI, which is currently missing.

Fixes: 4dc0d12474 ("iommu/tegra241-cmdqv: Add user-space use support")
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2026-03-24 14:04:35 +00:00
Emanuele Rocca
701f7f4fba pidfds: add coredump_code field to pidfd_info
The struct pidfd_info currently exposes in a field called coredump_signal the
signal number (si_signo) that triggered the dump (for example, 11 for SIGSEGV).
However, it is also valuable to understand the reason why that signal was sent.
This additional context is provided by the signal code (si_code), such as 2 for
SEGV_ACCERR.

Add a new field to struct pidfd_info called coredump_code with the value of
si_code for the benefit of sysadmins who pipe core dumps to user-space programs
for later analysis. The following snippet illustrates a simplified C program
that consumes coredump_signal and coredump_code, and then logs core dump
signals and codes to a file:

    int pidfd = (int)atoi(argv[1]);

    struct pidfd_info info = {
        .mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP,
    };

    if (ioctl(pidfd, PIDFD_GET_INFO, &info) == 0)
        if (info.mask & PIDFD_INFO_COREDUMP)
            fprintf(f, "PID=%d, si_signo: %d si_code: %d\n",
                info.pid, info.coredump_signal, info.coredump_code);

Assuming the program is installed under /usr/local/bin/core-logger, core dump
processing can be enabled by setting /proc/sys/kernel/core_pattern to
'|/usr/local/bin/dumpstuff %F'.

systemd-coredump(8) already uses pidfds to process core dumps, and it could be
extended to include the values of coredump_code too.

Signed-off-by: Emanuele Rocca <emanuele.rocca@arm.com>
Link: https://patch.msgid.link/acE52HIFivNZN3nE@NH27D9T0LF
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-23 16:29:15 +01:00
Tejas Upadhyay
4f39a194d4 drm/xe/xe3p_lpg: Restrict UAPI to enable L2 flush optimization
When set, starting xe3p_lpg, the L2 flush optimization
feature will control whether L2 is in Persistent or
Transient mode through monitoring of media activity.

To enable L2 flush optimization include new feature flag
GUC_CTL_ENABLE_L2FLUSH_OPT for Novalake platforms when
media type is detected.

Tighten UAPI validation to restrict userptr, svm and
dmabuf mappings to be either 2WAY or XA+1WAY

V5(Thomas): logic correction
V4(MattA): Modify uapi doc and commit
V3(MattA): check valid op and pat_index value
V2(MattA): validate dma-buf bos and madvise pat-index

Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Carl Zhang <carl.zhang@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-9-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
2026-03-23 15:24:14 +05:30
Greg Kroah-Hartman
6872c84dc6 Merge 7.0-rc5 into tty-next
We need the tty/serial fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-23 09:59:16 +01:00
Dave Airlie
74919e2797 Merge tag 'drm-misc-next-2026-03-20' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v7.1:

UAPI Changes:

math:
- provide __KERNEL_DIV_ROUND_CLOSEST() in UAPI

mode:
- provide DRM_ARGB_GET*() macros for reading color components

Cross-subsystem Changes:

math:
- implement DIV_ROUND_CLOSEST() with __KERNEL_DIV_ROUND_CLOSEST()

Core Changes:

atomic:
- fix handling of colorop state in atomic updates
- provide CRTC background color

ttm:
- improve tests and doumentation

Driver Changes:

amdxdna:
- allow forcing DMA through IOMMU IOVA
- improve debugging

bridge:
- Support Lontium LT8713SX DP MST bridge plus DT bindings

imx:
- support planes behind the primary plane
- fix bus-format selection

ivpu:
- perform engine reset on TDR error

panel:
- novatek-nt36672a: Use mipi_dsi_*_multi() functions
- panel-edp: Support BOE NV153WUM-N42, CMN N153JCA-ELK, CSW MNF307QS3-2

renesas:
- rz-du: clean up

rockchip:
- support CRTC background color

sun4i:
- fix leak in init code
- clean up

tildc
- clean up

v3d:
- improve handling of struct v3d_stats
- improve error handling
- clean up

vkms:
- support CRTC background color

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260320082604.GA17867@linux.fritz.box
2026-03-23 14:41:56 +10:00
Randy Dunlap
1ccc861dbb um: time-travel: clean up kernel-doc warnings
Repair all kernel-doc warnings in um_timetravel.h:
- add one enum description
- mark "reserve" as private
- use a leading '@' on current_time

Warning: include/uapi/linux/um_timetravel.h:59 Enum value
 'UM_TIMETRAVEL_SHARED_MAX_FDS' not described in enum
 'um_timetravel_shared_mem_fds'
Warning: include/uapi/linux/um_timetravel.h:245 union member 'reserve'
 not described in 'um_timetravel_schedshm_client'
Warning: include/uapi/linux/um_timetravel.h:288 struct member
 'current_time' not described in 'um_timetravel_schedshm'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260226221112.1042008-1-rdunlap@infradead.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-21 10:42:09 +01:00
Max Zhen
d76856beb4 accel/amdxdna: Refactor GEM BO handling and add helper APIs for address retrieval
Refactor amdxdna GEM buffer object (BO) handling to simplify address
management and unify BO type semantics.

Introduce helper APIs to retrieve commonly used BO addresses:
- User virtual address (UVA)
- Kernel virtual address (KVA)
- Device address (IOVA/PA)

These helpers centralize address lookup logic and avoid duplicating
BO-specific handling across submission and execution paths. This also
improves readability and reduces the risk of inconsistent address
handling in future changes.

As part of the refactor:
- Rename SHMEM BO type to SHARE to better reflect its usage.
- Merge CMD BO handling into SHARE, removing special-case logic for
  command buffers.
- Consolidate BO type handling paths to reduce code duplication and
  simplify maintenance.

No functional change is intended. The refactor prepares the driver for
future enhancements by providing a cleaner abstraction for BO address
management.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Max Zhen <max.zhen@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260320210615.1973016-1-lizhi.hou@amd.com
2026-03-20 22:12:49 -07:00
Jakub Kicinski
edab1ca5ec Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-7.0-rc5).

net/netfilter/nft_set_rbtree.c
  598adea720 ("netfilter: revert nft_set_rbtree: validate open interval overlap")
  3aea466a43 ("netfilter: nft_set_rbtree: don't disable bh when acquiring tree lock")
https://lore.kernel.org/abgaQBpeGstdN4oq@sirena.org.uk

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 14:16:00 -07:00
Yishai Hadas
d7140b5dde vfio: Define uAPI for re-init initial bytes during the PRE_COPY phase
As currently defined, initial_bytes is monotonically decreasing and
precedes dirty_bytes when reading from the saving file descriptor.
The transition from initial_bytes to dirty_bytes is unidirectional and
irreversible.

The initial_bytes are considered as critical data that is highly
recommended to be transferred to the target as part of PRE_COPY, without
this data, the PRE_COPY phase would be ineffective.

We come to solve the case when a new chunk of critical data is
introduced during the PRE_COPY phase and the driver would like to report
an entirely new value for the initial_bytes.

For that, we extend the VFIO_MIG_GET_PRECOPY_INFO ioctl with an output
flag named VFIO_PRECOPY_INFO_REINIT to allow drivers reporting a new
initial_bytes value during the PRE_COPY phase.

Currently, existing VFIO_MIG_GET_PRECOPY_INFO implementations don't
assign info.flags before copy_to_user(), this effectively echoes
userspace-provided flags back as output, preventing the field from being
used to report new reliable data from the drivers.

Reliable use of the new VFIO_PRECOPY_INFO_REINIT flag requires userspace
to explicitly opt in by enabling the
VFIO_DEVICE_FEATURE_MIG_PRECOPY_INFOv2 device feature.

When the caller opts in, the driver may report an entirely new
value for initial_bytes. It may be larger, it may be smaller, it may
include the previous unread initial_bytes, it may discard the previous
unread initial_bytes, up to the driver logic and state.
The presence of the VFIO_PRECOPY_INFO_REINIT output flag set by the
driver indicates that new initial data is present on the stream.

Once the caller sees this flag, the initial_bytes value should be
re-evaluated relative to the readiness state for transition to
STOP_COPY.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20260317161753.18964-2-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2026-03-19 12:32:08 -06:00
Joseph Salisbury
02256acf1e vfio: uapi: fix comment typo
The file contains a spelling error in a source comment (succes).

Typos in comments reduce readability and make text searches less reliable
for developers and maintainers.

Replace 'succes' with 'success' in the affected comment. This is a
comment-only cleanup and does not change behavior.

Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com>
Link: https://lore.kernel.org/r/20260316185617.166414-1-joseph.salisbury@oracle.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2026-03-19 12:32:07 -06:00
Sascha Bischoff
c547c51ff4 KVM: arm64: gic-v5: Add ARM_VGIC_V5 device to KVM headers
This is the base GICv5 device which is to be used with the
KVM_CREATE_DEVICE ioctl to create a GICv5-based vgic.

Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-9-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-19 18:21:27 +00:00
Yang Xiuwei
7da9261bab bsg: add bsg_uring_cmd uapi structure
Add the bsg_uring_cmd structure to the BSG UAPI header to support
io_uring-based SCSI passthrough operations via IORING_OP_URING_CMD.

Signed-off-by: Yang Xiuwei <yangxiuwei@kylinos.cn>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260317072226.2598233-2-yangxiuwei@kylinos.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-03-19 11:38:24 -06:00
Paolo Abeni
9ac76f3d0b Merge tag 'wireless-next-2026-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:

====================
Aside from various small improvements/cleanups, not much:
 - cfg80211/mac80211: S1G and UHR improvements
 - hwsim: incumbent signal report test support

* tag 'wireless-next-2026-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (31 commits)
  qtnfmac: use alloc_netdev macro for single queue devices
  wifi: libertas: don't kill URBs in interrupt context
  wifi: libertas: use USB anchors for tracking in-flight URBs
  wifi: nl80211: use int for band coming from netlink
  wifi: rsi_91x_usb: do not pause rfkill polling when stopping mac80211
  wifi: mac80211: fix STA link removal during link removal
  wifi: nl80211: reject S1G/60G with HT chantype
  wifi: ieee80211: fix definition of EHT-MCS 15 in MRU
  wifi: cfg80211: check non-S1G width with S1G chandef
  wifi: cfg80211: restrict cfg80211_chandef_create() to only HT-based bands
  wifi: mac80211: don't use cfg80211_chandef_create() for default chandef
  wifi: mac80211: Remove deleted sta links in ieee80211_ml_reconf_work()
  wifi: b43: use register definitions in nphy_op_software_rfkill
  wifi: cfg80211: split control freq check from chandef check
  wifi: mac80211: always use full chanctx compatible check
  wifi: mac80211: refactor chandef tracing macros
  wifi: mac80211: validate HE 6 GHz operation when EHT is used
  wifi: nl80211: split out UHR operation information
  wifi: mwifiex: drop redundant device reference
  wifi: rt2x00: drop redundant device reference
  ...
====================

Link: https://patch.msgid.link/20260319082439.79875-3-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 15:30:20 +01:00
Paolo Abeni
0c45064487 Merge tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next
Antonio Quartulli says:

====================
Included features:
* use bitops.h API when possible
* send netlink notification in case of client float event
* implement support for asymmetric peer IDs
* consolidate memory allocations during crypto operations
* add netlink notification check in selftests
* add FW mark check in selftest

* tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next:
  ovpn: consolidate crypto allocations in one chunk
  selftests: ovpn: add test for the FW mark feature
  selftests: ovpn: check asymmetric peer-id
  ovpn: add support for asymmetric peer IDs
  selftests: ovpn: add notification parsing and matching
  ovpn: notify userspace on client float event
  ovpn: pktid: use bitops.h API
  ovpn: use correct array size to parse nested attributes in ovpn_nl_key_swap_doit
  selftests: ovpn: allow compiling ovpn-cli.c with mbedtls3
====================

Link: https://patch.msgid.link/20260317104023.192548-1-antonio@openvpn.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 12:50:42 +01:00
Thomas Hellström
9e63413827 Merge drm/drm-next into drm-xe-next
Bring in series "drm/{i915,xe}: sort out step enums between the drivers"
that was merged through i915.

Link: https://lore.kernel.org/all/cover.1772635152.git.jani.nikula@intel.com
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2026-03-19 10:01:55 +01:00
Haiyang Zhang
dc3d720e12 net: ethtool: add ethtool COALESCE_RX_CQE_FRAMES/NSECS
Add two parameters for drivers supporting Rx CQE coalescing /
descriptor writeback.

ETHTOOL_A_COALESCE_RX_CQE_FRAMES:
Maximum number of frames that can be coalesced into a CQE or
writeback.

ETHTOOL_A_COALESCE_RX_CQE_NSECS:
Max time in nanoseconds after the first packet arrival in a
coalesced CQE or writeback to be sent.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260317191826.1346111-2-haiyangz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 20:01:10 -07:00
Cristian Ciocaltea
4c684596cd drm: Add CRTC background color property
Some display controllers can be hardware programmed to show non-black
colors for pixels that are either not covered by any plane or are
exposed through transparent regions of higher planes.  This feature can
help reduce memory bandwidth usage, e.g. in compositors managing a UI
with a solid background color while using smaller planes to render the
remaining content.

To support this capability, introduce the BACKGROUND_COLOR standard DRM
mode property, which can be attached to a CRTC through the
drm_crtc_attach_background_color_property() helper function.

Additionally, define a 64-bit ARGB format value to be built with the
help of a couple of dedicated DRM_ARGB64_PREP*() helpers.  Individual
color components can be extracted with desired precision using the
corresponding DRM_ARGB64_GET*() macros.

Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Tested-by: Diederik de Haas <diederik@cknow-tech.com>
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Link: https://patch.msgid.link/20260303-rk3588-bgcolor-v8-2-fee377037ad1@collabora.com
Signed-off-by: Daniel Stone <daniels@collabora.com>
2026-03-18 09:59:57 +00:00
Cristian Ciocaltea
de9e2b3d88 uapi: Provide DIV_ROUND_CLOSEST()
Currently DIV_ROUND_CLOSEST() is only available for the kernel via
include/linux/math.h.

Expose it to userland as well by adding __KERNEL_DIV_ROUND_CLOSEST() as
a common definition in uapi.

Additionally, ensure it allows building ISO C applications by switching
from the 'typeof' GNU extension to the ISO-friendly __typeof__.

Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Tested-by: Diederik de Haas <diederik@cknow-tech.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Link: https://patch.msgid.link/20260303-rk3588-bgcolor-v8-1-fee377037ad1@collabora.com
Signed-off-by: Daniel Stone <daniels@collabora.com>
2026-03-18 09:59:57 +00:00
Shameer Kolothum
a11661a58c iommufd: Report ATS not supported status via IOMMU_GET_HW_INFO
If the IOMMU driver reports that ATS is not supported for a device, set
the IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED flag in the returned hardware
capabilities.

This uses a negative flag for UAPI compatibility. Existing userspace
assumes ATS is supported if no flag is present. This also ensures that
new userspace works correctly on both old and new kernels, where a
zero value implies ATS support.

When this flag is set, ATS cannot be used for the device. When it is
clear, ATS may be enabled when an appropriate HWPT is attached.

Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-03-17 14:05:05 +01:00
Michael Margolin
5aeb6e0399 RDMA/efa: Rename alloc_ucontext comp_mask to supported_caps
Following discussion [1], rename the comp_mask field in
efa_ibv_alloc_ucontext_cmd to supported_caps to reflect its actual
usage as a capabilities handshake mechanism rather than a standard
comp_mask. Rename related constants and align function and macro names.

[1] https://lore.kernel.org/linux-rdma/20260312120858.GH1448102@nvidia.com/

Signed-off-by: Michael Margolin <mrgolin@amazon.com>
Link: https://patch.msgid.link/20260316180846.30273-1-mrgolin@amazon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-17 07:04:03 -04:00
Ralf Lici
2e570a5140 ovpn: add support for asymmetric peer IDs
In order to support the multipeer architecture, upon connection setup
each side of a tunnel advertises a unique ID that the other side must
include in packets sent to them. Therefore when transmitting a packet, a
peer inserts the recipient's advertised ID for that specific tunnel into
the peer ID field. When receiving a packet, a peer expects to find its
own unique receive ID for that specific tunnel in the peer ID field.

Add support for the TX peer ID and embed it into transmitting packets.
If no TX peer ID is specified, fallback to using the same peer ID both
for RX and TX in order to be compatible with the non-multipeer compliant
peers.

Cc: horms@kernel.org
Cc: donald.hunter@gmail.com
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
2026-03-17 11:09:05 +01:00
Ralf Lici
c841b676da ovpn: notify userspace on client float event
Send a netlink notification when a client updates its remote UDP
endpoint. The notification includes the new IP address, port, and scope
ID (for IPv6).

Cc: linux-kselftest@vger.kernel.org
Cc: horms@kernel.org
Cc: shuah@kernel.org
Cc: donald.hunter@gmail.com
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
2026-03-17 11:08:55 +01:00
Ingo Molnar
786244f703 Merge tag 'v7.0-rc4' into sched/core, to pick up scheduler fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2026-03-17 07:14:42 +01:00
Dave Airlie
3f071d00fc Merge tag 'drm-xe-next-2026-03-12' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
UAPI Changes:
- add VM_BIND DECOMPRESS support and on-demand decompression (Nitin)
- Allow per queue programming of COMMON_SLICE_CHICKEN3 bit13 (Lionel)

Cross-subsystem Changes:
- Introduce the DRM RAS infrastructure over generic netlink (Riana, Rodrigo)

Core Changes:
- Two-pass MMU interval notifiers (Thomas)

Driver Changes:
- Merge drm/drm-next into drm-xe-next (Brost)
- Fix overflow in guc_ct_snapshot_capture (Mika, Fixes)
- Extract gt_pta_entry (Gustavo)
- Extra enabling patches for NVL-P (Gustavo)
- Add Wa_14026578760 (Varun)
- Add type-specific GT loop iterator (Roper)
- Refactor xe_migrate_prepare_vm (Raag)
- Don't disable GuCRC in suspend path (Vinay, Fixes)
- Add missing kernel docs in xe_exec_queue.c (Niranjana)
- Change TEST_VRAM to work with 32-bit resource_size_t (Wajdeczko)
- Fix memory leak in xe_vm_madvise_ioctl (Varun, Fixes)
- Skip access counter queue init for unsupported platforms (Himal)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/abLUVfSHu8EHRF9q@lstrano-desk.jf.intel.com
2026-03-16 12:21:08 +10:00
Linus Torvalds
11e8c7e947 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "Quite a large pull request, partly due to skipping last week and
  therefore having material from ~all submaintainers in this one. About
  a fourth of it is a new selftest, and a couple more changes are large
  in number of files touched (fixing a -Wflex-array-member-not-at-end
  compiler warning) or lines changed (reformatting of a table in the API
  documentation, thanks rST).

  But who am I kidding---it's a lot of commits and there are a lot of
  bugs being fixed here, some of them on the nastier side like the
  RISC-V ones.

  ARM:

   - Correctly handle deactivation of interrupts that were activated
     from LRs. Since EOIcount only denotes deactivation of interrupts
     that are not present in an LR, start EOIcount deactivation walk
     *after* the last irq that made it into an LR

   - Avoid calling into the stubs to probe for ICH_VTR_EL2.TDS when pKVM
     is already enabled -- not only thhis isn't possible (pKVM will
     reject the call), but it is also useless: this can only happen for
     a CPU that has already booted once, and the capability will not
     change

   - Fix a couple of low-severity bugs in our S2 fault handling path,
     affecting the recently introduced LS64 handling and the even more
     esoteric handling of hwpoison in a nested context

   - Address yet another syzkaller finding in the vgic initialisation,
     where we would end-up destroying an uninitialised vgic with nasty
     consequences

   - Address an annoying case of pKVM failing to boot when some of the
     memblock regions that the host is faulting in are not page-aligned

   - Inject some sanity in the NV stage-2 walker by checking the limits
     against the advertised PA size, and correctly report the resulting
     faults

  PPC:

   - Fix a PPC e500 build error due to a long-standing wart that was
     exposed by the recent conversion to kmalloc_obj(); rip out all the
     ugliness that led to the wart

  RISC-V:

   - Prevent speculative out-of-bounds access using array_index_nospec()
     in APLIC interrupt handling, ONE_REG regiser access, AIA CSR
     access, float register access, and PMU counter access

   - Fix potential use-after-free issues in kvm_riscv_gstage_get_leaf(),
     kvm_riscv_aia_aplic_has_attr(), and kvm_riscv_aia_imsic_has_attr()

   - Fix potential null pointer dereference in
     kvm_riscv_vcpu_aia_rmw_topei()

   - Fix off-by-one array access in SBI PMU

   - Skip THP support check during dirty logging

   - Fix error code returned for Smstateen and Ssaia ONE_REG interface

   - Check host Ssaia extension when creating AIA irqchip

  x86:

   - Fix cases where CPUID mitigation features were incorrectly marked
     as available whenever the kernel used scattered feature words for
     them

   - Validate _all_ GVAs, rather than just the first GVA, when
     processing a range of GVAs for Hyper-V's TLB flush hypercalls

   - Fix a brown paper bug in add_atomic_switch_msr()

   - Use hlist_for_each_entry_srcu() when traversing mask_notifier_list,
     to fix a lockdep warning; KVM doesn't hold RCU, just irq_srcu

   - Ensure AVIC VMCB fields are initialized if the VM has an in-kernel
     local APIC (and AVIC is enabled at the module level)

   - Update CR8 write interception when AVIC is (de)activated, to fix a
     bug where the guest can run in perpetuity with the CR8 intercept
     enabled

   - Add a quirk to skip the consistency check on FREEZE_IN_SMM, i.e. to
     allow L1 hypervisors to set FREEZE_IN_SMM. This reverts (by
     default) an unintentional tightening of userspace ABI in 6.17, and
     provides some amount of backwards compatibility with hypervisors
     who want to freeze PMCs on VM-Entry

   - Validate the VMCS/VMCB on return to a nested guest from SMM,
     because either userspace or the guest could stash invalid values in
     memory and trigger the processor's consistency checks

  Generic:

   - Remove a subtle pseudo-overlay of kvm_stats_desc, which, aside from
     being unnecessary and confusing, triggered compiler warnings due to
     -Wflex-array-member-not-at-end

   - Document that vcpu->mutex is take outside of kvm->slots_lock and
     kvm->slots_arch_lock, which is intentional and desirable despite
     being rather unintuitive

  Selftests:

   - Increase the maximum number of NUMA nodes in the guest_memfd
     selftest to 64 (from 8)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
  KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8
  Documentation: kvm: fix formatting of the quirks table
  KVM: x86: clarify leave_smm() return value
  selftests: kvm: add a test that VMX validates controls on RSM
  selftests: kvm: extract common functionality out of smm_test.c
  KVM: SVM: check validity of VMCB controls when returning from SMM
  KVM: VMX: check validity of VMCS controls when returning from SMM
  KVM: SVM: Set/clear CR8 write interception when AVIC is (de)activated
  KVM: SVM: Initialize AVIC VMCB fields if AVIC is enabled with in-kernel APIC
  KVM: x86: Introduce KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM
  KVM: x86: Fix SRCU list traversal in kvm_fire_mask_notifiers()
  KVM: VMX: Fix a wrong MSR update in add_atomic_switch_msr()
  KVM: x86: hyper-v: Validate all GVAs during PV TLB flush
  KVM: x86: synthesize CPUID bits only if CPU capability is set
  KVM: PPC: e500: Rip out "struct tlbe_ref"
  KVM: PPC: e500: Fix build error due to using kmalloc_obj() with wrong type
  KVM: selftests: Increase 'maxnode' for guest_memfd tests
  KVM: arm64: pkvm: Don't reprobe for ICH_VTR_EL2.TDS on CPU hotplug
  KVM: arm64: vgic: Pick EOIcount deactivations from AP-list tail
  KVM: arm64: Remove the redundant ISB in __kvm_at_s1e2()
  ...
2026-03-15 12:22:10 -07:00
Jiri Pirko
725d5fdb7b devlink: support index-based lookup via bus_name/dev_name handle
Devlink instances without a backing device use bus_name
"devlink_index" and dev_name set to the decimal index string.
When user space sends this handle, detect the pattern and perform
a direct xarray lookup by index instead of iterating all instances.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20260312100407.551173-6-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-14 13:08:48 -07:00
Jiri Pirko
68deca0f0f devlink: expose devlink instance index over netlink
Each devlink instance has an internally assigned index used for xarray
storage. Expose it as a new DEVLINK_ATTR_INDEX uint attribute alongside
the existing bus_name and dev_name handle.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20260312100407.551173-2-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-14 13:08:46 -07:00
Tycho Andersen (AMD)
3ac9498813 include/psp-sev.h: fix structure member in comment
The member is 'data', not 'opaque'.

Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-03-14 14:03:19 +09:00
Kuniyuki Iwashima
74f0cca110 udp: Remove UDPLITE_SEND_CSCOV and UDPLITE_RECV_CSCOV.
UDP-Lite supports variable-length checksum and has two socket
options, UDPLITE_SEND_CSCOV and UDPLITE_RECV_CSCOV, to control
the checksum coverage.

Let's remove the support.

setsockopt(UDPLITE_SEND_CSCOV / UDPLITE_RECV_CSCOV) was only
available for UDP-Lite and returned -ENOPROTOOPT for UDP.

Now, the options are handled in ip_setsockopt() and
ipv6_setsockopt(), which still return the same error.

getsockopt(UDPLITE_SEND_CSCOV / UDPLITE_RECV_CSCOV) was available
for UDP and always returned 0, meaning full checksum, but now
-ENOPROTOOPT is returned.

Given that getsockopt() is meaningless for UDP and even the options
are not defined under include/uapi/, this should not be a problem.

  $ man 7 udplite
  ...
  BUGS
       Where glibc support is missing, the following definitions
       are needed:

           #define IPPROTO_UDPLITE     136
           #define UDPLITE_SEND_CSCOV  10
           #define UDPLITE_RECV_CSCOV  11

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260311052020.1213705-10-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-13 18:57:45 -07:00
Nitin Gote
341a2c99c8 drm/xe/uapi: Fix kernel-doc for DRM_XE_VM_BIND_FLAG_DECOMPRESS
There is kernel-doc warning for DRM_XE_VM_BIND_FLAG_DECOMPRESS:

  ./include/uapi/drm/xe_drm.h:1060: WARNING: Block quote ends without
  a blank line; unexpected unindent.

Fix the warning by adding the missing '%' prefix to
DRM_XE_VM_BIND_FLAG_DECOMPRESS in the kernel-doc list entry for
struct drm_xe_vm_bind_op.

Fixes: 2270bd7124 ("drm/xe: add VM_BIND DECOMPRESS uapi flag")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603121515.gEMrFlTL-lkp@intel.com/
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260312160244.809849-2-nitin.r.gote@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
2026-03-13 11:48:58 +05:30
Johannes Berg
e4b993f2bc wifi: nl80211: split out UHR operation information
The beacon doesn't contain the full UHR operation, a number
of fields (such as NPCA) are only partially there. Add a new
attribute to contain the full information, so it's available
to the driver/mac80211.

Link: https://patch.msgid.link/20260303221710.866bacf82639.Iafdf37fb0f4304bdcdb824977d61e17b38c47685@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-13 07:10:36 +01:00
Jakub Kicinski
72374257ed Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-7.0-rc4).

drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
  db25c42c2e ("net/mlx5e: RX, Fix XDP multi-buf frag counting for striding RQ")
  dff1c3164a ("net/mlx5e: SHAMPO, Always calculate page size")
https://lore.kernel.org/aa7ORohmf67EKihj@sirena.org.uk

drivers/net/ethernet/ti/am65-cpsw-nuss.c
  840c9d13cb ("net: ethernet: ti: am65-cpsw-nuss: Fix rx_filter value for PTP support")
  a23c657e33 ("net: ethernet: ti: am65-cpsw: Use also port number to identify timestamps")
https://lore.kernel.org/abK3EkIXuVgMyGI7@sirena.org.uk

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-12 12:53:34 -07:00
David Woodhouse
2619da73bb KVM: x86: Use __DECLARE_FLEX_ARRAY() for UAPI structures with VLAs
Commit 94dfc73e7c ("treewide: uapi: Replace zero-length arrays with
flexible-array members") broke the userspace API for C++.

These structures ending in VLAs are typically a *header*, which can be
followed by an arbitrary number of entries. Userspace typically creates
a larger structure with some non-zero number of entries, for example in
QEMU's kvm_arch_get_supported_msr_feature():

    struct {
        struct kvm_msrs info;
        struct kvm_msr_entry entries[1];
    } msr_data = {};

While that works in C, it fails in C++ with an error like:
 flexible array member 'kvm_msrs::entries' not at end of 'struct msr_data'

Fix this by using __DECLARE_FLEX_ARRAY() for the VLA, which uses [0]
for C++ compilation.

Fixes: 94dfc73e7c ("treewide: uapi: Replace zero-length arrays with flexible-array members")
Cc: stable@vger.kernel.org
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://patch.msgid.link/3abaf6aefd6e5efeff3b860ac38421d9dec908db.camel@infradead.org
[sean: tag for stable@]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-12 10:56:10 -07:00
Matthew Brost
42d3b66d4c Merge drm/drm-next into drm-xe-next
Backmerging to bring in 7.00-rc3. Important ahead GPU SVM merging THP
support.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
2026-03-12 07:23:23 -07:00
Nicolas Pitre
5cba06c71c vt: add KT_CSI keysym type for modifier-aware CSI sequences
Add a new keysym type KT_CSI that generates CSI tilde sequences with
automatic modifier encoding. The keysym value encodes the CSI parameter
number, producing sequences like ESC [ <value> ~ or ESC [ <value> ; <mod> ~
when Shift, Alt, or Ctrl modifiers are held.

This allows navigation keys (Home, End, Insert, Delete, PgUp, PgDn) and
function keys to generate modifier-aware escape sequences without
consuming string table entries for each modifier combination.

Define key symbols for navigation keys (K_CSI_HOME, K_CSI_END, etc.)
and function keys (K_CSI_F1 through K_CSI_F20) using standard xterm
CSI parameter values.

The modifier encoding follows the xterm convention:
  mod = 1 + (shift ? 1 : 0) + (alt ? 2 : 0) + (ctrl ? 4 : 0)

Allowed CSI parameter values range from 0 to 99.

Note: The Linux console historically uses a non-standard double-bracket
format for F1-F5 (ESC [ [ A through ESC [ [ E) rather than the xterm
tilde format (ESC [ 11 ~ through ESC [ 15 ~). The K_CSI_F1 through
K_CSI_F5 definitions use the xterm format. Converting F1-F5 to KT_CSI
would require updating the "linux" terminfo entry to match. Navigation
keys and F6-F20 already use the tilde format and are fully compatible.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Link: https://patch.msgid.link/20260203045457.1049793-3-nico@fluxnic.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-12 15:07:51 +01:00
Christian Brauner
9d4e752a24 namespace: allow creating empty mount namespaces
Add support for creating a mount namespace that contains only a copy of
the root mount from the caller's mount namespace, with none of the
child mounts.  This is useful for containers and sandboxes that want to
start with a minimal mount table and populate it from scratch rather
than inheriting and then tearing down the full mount tree.

Two new flags are introduced:

- CLONE_EMPTY_MNTNS for clone3(), using the 64-bit flag space.

- UNSHARE_EMPTY_MNTNS for unshare(), reusing the
  CLONE_PARENT_SETTID bit which has no meaning for unshare.

Both flags imply CLONE_NEWNS.  For the unshare path,
UNSHARE_EMPTY_MNTNS is converted to CLONE_EMPTY_MNTNS in
unshare_nsproxy_namespaces() before it reaches copy_mnt_ns(), so the
mount namespace code only needs to handle a single flag.

In copy_mnt_ns(), when CLONE_EMPTY_MNTNS is set, clone_mnt() is used
instead of copy_tree() to clone only the root mount.  The caller's root
and working directory are both reset to the root dentry of the new
mount.

The cleanup variables are changed from vfsmount pointers with
__free(mntput) to struct path with __free(path_put) because the empty
mount namespace path needs to release both mount and dentry references
when replacing the caller's root and pwd.  In the normal (non-empty)
path only the mount component is set, and dput(NULL) is a no-op so
path_put remains correct there as well.

Link: https://patch.msgid.link/20260306-work-empty-mntns-consolidated-v1-1-6eb30529bbb0@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-12 13:33:55 +01:00
Christian Brauner
5e8969bd19 mount: add FSMOUNT_NAMESPACE
Add FSMOUNT_NAMESPACE flag to fsmount() that creates a new mount
namespace with the newly created filesystem attached to a copy of the
real rootfs. This returns a namespace file descriptor instead of an
O_PATH mount fd, similar to how OPEN_TREE_NAMESPACE works for open_tree().

This allows creating a new filesystem and immediately placing it in a
new mount namespace in a single operation, which is useful for container
runtimes and other namespace-based isolation mechanisms.

The rootfs mount is created before copying the real rootfs for the new
namespace meaning that the mount namespace id for the mount of the root
of the namespace is bigger than the child mounted on top of it. We've
never explicitly given the guarantee for such ordering and I doubt
anyone relies on it. Accepting that lets us avoid copying the mount
again and also avoids having to massage may_copy_tree() to grant an
exception for fsmount->mnt->mnt_ns being NULL.

Link: https://patch.msgid.link/20260122-work-fsmount-namespace-v1-3-5ef0a886e646@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-12 13:33:54 +01:00
Nitin Gote
2270bd7124 drm/xe: add VM_BIND DECOMPRESS uapi flag
Add a new VM_BIND flag, DRM_XE_VM_BIND_FLAG_DECOMPRESS, that lets userspace
express intent for the driver to perform on-device in-place decompression
for the GPU mapping created by a MAP bind operation.

This flag is used by subsequent driver changes to trigger scheduling of
GPU work that resolves compressed VRAM pages into an uncompressed PAT
VM mapping.

Behavior and semantics:
- Valid only for DRM_XE_VM_BIND_OP_MAP. IOCTLs using this flag on other ops
  are rejected (-EINVAL).
- The bind's pat_index must select the device "no-compression" PAT entry;
  otherwise the ioctl is rejected (-EINVAL).
- Only meaningful for VRAM-backed BOs on devices that support Flat CCS and
  the required hardware generation (driver will return -EOPNOTSUPP if not).
- On success the driver schedules a migrate/resolve and installs the
  returned dma_fence into the BO's kernel reservation
  (DMA_RESV_USAGE_KERNEL).

Compute PR: https://github.com/intel/compute-runtime/pull/898

v3: Rebase on latest drm-tip and add compute pr info

v2: Add kernel doc (Matt)

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mrozek, Michal <michal.mrozek@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260304123758.3050386-6-nitin.r.gote@intel.com
2026-03-12 09:37:38 +00:00
Maxime Ripard
f08ceb71c5 Merge drm/drm-next into drm-misc-next
Biju Das needs a patch for rz-du merged in 7.0-rc3

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2026-03-12 08:25:41 +01:00
Christian Brauner
c8134b5f13 pidfd: add CLONE_PIDFD_AUTOKILL
Add a new clone3() flag CLONE_PIDFD_AUTOKILL that ties a child's
lifetime to the pidfd returned from clone3(). When the last reference to
the struct file created by clone3() is closed the kernel sends SIGKILL
to the child. A pidfd obtained via pidfd_open() for the same process
does not keep the child alive and does not trigger autokill - only the
specific struct file from clone3() has this property.

This is useful for container runtimes, service managers, and sandboxed
subprocess execution - any scenario where the child must die if the
parent crashes or abandons the pidfd.

CLONE_PIDFD_AUTOKILL requires both CLONE_PIDFD (the whole point is tying
lifetime to the pidfd file) and CLONE_AUTOREAP (a killed child with no
one to reap it would become a zombie). CLONE_THREAD is rejected because
autokill targets a process not a thread.

The clone3 pidfd is identified by the PIDFD_AUTOKILL file flag set on
the struct file at clone3() time. The pidfs .release handler checks this
flag and sends SIGKILL via do_send_sig_info(SIGKILL, SEND_SIG_PRIV, ...)
only when it is set. Files from pidfd_open() or open_by_handle_at() are
distinct struct files that do not carry this flag. dup()/fork() share the
same struct file so they extend the child's lifetime until the last
reference drops.

CLONE_PIDFD_AUTOKILL uses a privilege model based on CLONE_NNP: without
CLONE_NNP the child could escalate privileges via setuid/setgid exec
after being spawned, so the caller must have CAP_SYS_ADMIN in its user
namespace. With CLONE_NNP the child can never gain new privileges so
unprivileged usage is allowed. This is a deliberate departure from the
pdeath_signal model which is reset during secureexec and commit_creds()
rendering it useless for container runtimes that need to deprivilege
themselves.

Link: https://patch.msgid.link/20260226-work-pidfs-autoreap-v5-3-d148b984a989@kernel.org
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-11 23:15:40 +01:00
Christian Brauner
24baca56fa clone: add CLONE_NNP
Add a new clone3() flag CLONE_NNP that sets no_new_privs on the child
process at clone time. This is analogous to prctl(PR_SET_NO_NEW_PRIVS)
but applied at process creation rather than requiring a separate step
after the child starts running.

CLONE_NNP is rejected with CLONE_THREAD. It's conceptually a lot simpler
if the whole thread-group is forced into NNP and not have single threads
running around with NNP.

Link: https://patch.msgid.link/20260226-work-pidfs-autoreap-v5-2-d148b984a989@kernel.org
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-11 23:15:15 +01:00