Merge drm/drm-fixes into drm-misc-fixes

Getting fixes and updates from v7.1-rc1.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Thomas Zimmermann
2026-04-27 10:26:49 +02:00
12106 changed files with 675737 additions and 350073 deletions


@@ -1,6 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
msrv = "1.78.0"
msrv = "1.85.0"
check-private-items = true
@@ -9,3 +9,13 @@ disallowed-macros = [
# it here, see: https://github.com/rust-lang/rust-clippy/issues/11303.
{ path = "kernel::dbg", reason = "the `dbg!` macro is intended as a debugging tool", allow-invalid = true },
]
[[disallowed-methods]]
path = "core::ffi::CStr::as_ptr"
replacement = "kernel::prelude::CStrExt::as_char_ptr"
reason = "kernel's `char` is always unsigned, use `as_char_ptr` instead"
[[disallowed-methods]]
path = "core::ffi::CStr::from_ptr"
replacement = "kernel::prelude::CStrExt::from_char_ptr"
reason = "kernel's `char` is always unsigned, use `from_char_ptr` instead"

4
.gitignore vendored

@@ -13,6 +13,7 @@
.*
*.a
*.asn1.[ch]
*.bc
*.bin
*.bz2
*.c.[012]*.*
@@ -184,3 +185,6 @@ sphinx_*/
# Rust analyzer configuration
/rust-project.json
# bc language scripts (not LLVM bitcode)
!kernel/time/timeconst.bc


@@ -75,6 +75,9 @@ Andreas Herrmann <aherrman@de.ibm.com>
Andreas Hindborg <a.hindborg@kernel.org> <a.hindborg@samsung.com>
Andrej Shadura <andrew.shadura@collabora.co.uk>
Andrej Shadura <andrew@shadura.me> <andrew@beldisplaytech.com>
Andrew Donnellan <andrew+kernel@donnellan.id.au> <andrew@donnellan.id.au>
Andrew Donnellan <andrew+kernel@donnellan.id.au> <ajd@linux.ibm.com>
Andrew Donnellan <andrew+kernel@donnellan.id.au> <andrew.donnellan@au1.ibm.com>
Andrew Morton <akpm@linux-foundation.org>
Andrew Murray <amurray@thegoodpenguin.co.uk> <amurray@embedded-bits.co.uk>
Andrew Murray <amurray@thegoodpenguin.co.uk> <andrew.murray@arm.com>
@@ -196,6 +199,7 @@ Christophe Leroy <chleroy@kernel.org> <christophe.leroy2@cs-soprasteria.com>
Christophe Ricard <christophe.ricard@gmail.com>
Christopher Obbard <christopher.obbard@linaro.org> <chris.obbard@collabora.com>
Christoph Hellwig <hch@lst.de>
Christoph Manszewski <c.manszewski@gmail.com> <christoph.manszewski@intel.com>
Chuck Lever <chuck.lever@oracle.com> <cel@kernel.org>
Chuck Lever <chuck.lever@oracle.com> <cel@netapp.com>
Chuck Lever <chuck.lever@oracle.com> <cel@citi.umich.edu>
@@ -204,6 +208,7 @@ Colin Ian King <colin.i.king@gmail.com> <colin.king@canonical.com>
Corey Minyard <minyard@acm.org>
Damian Hobson-Garcia <dhobsong@igel.co.jp>
Dan Carpenter <error27@gmail.com> <dan.carpenter@oracle.com>
Dan Williams <djbw@kernel.org> <dan.j.williams@intel.com>
Daniel Borkmann <daniel@iogearbox.net> <danborkmann@googlemail.com>
Daniel Borkmann <daniel@iogearbox.net> <danborkmann@iogearbox.net>
Daniel Borkmann <daniel@iogearbox.net> <daniel.borkmann@tik.ee.ethz.ch>
@@ -305,7 +310,10 @@ Gokul Sriram Palanisamy <quic_gokulsri@quicinc.com> <gokulsri@codeaurora.org>
Govindaraj Saminathan <quic_gsamin@quicinc.com> <gsamin@codeaurora.org>
Guo Ren <guoren@kernel.org> <guoren@linux.alibaba.com>
Guo Ren <guoren@kernel.org> <ren_guo@c-sky.com>
Guru Das Srinagesh <quic_gurus@quicinc.com> <gurus@codeaurora.org>
Guru Das Srinagesh <linux@gurudas.dev>
Guru Das Srinagesh <linux@gurudas.dev> <quic_gurus@quicinc.com>
Guru Das Srinagesh <linux@gurudas.dev> <gurus@codeaurora.org>
Guru Das Srinagesh <linux@gurudas.dev> <gurooodas@gmail.com>
Gustavo Padovan <gustavo@las.ic.unicamp.br>
Gustavo Padovan <padovan@profusion.mobi>
Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> <hamza.mahfooz@amd.com>
@@ -314,6 +322,7 @@ Hans de Goede <hansg@kernel.org> <hdegoede@redhat.com>
Hans Verkuil <hverkuil@kernel.org> <hverkuil@xs4all.nl>
Hans Verkuil <hverkuil@kernel.org> <hverkuil-cisco@xs4all.nl>
Hans Verkuil <hverkuil@kernel.org> <hansverk@cisco.com>
Hans Verkuil <hverkuil@kernel.org> <hans.verkuil@cisco.com>
Hao Ge <hao.ge@linux.dev> <gehao@kylinos.cn>
Harry Yoo <harry.yoo@oracle.com> <42.hyeyoo@gmail.com>
Harry Yoo <harry@kernel.org> <harry.yoo@oracle.com>
@@ -329,6 +338,7 @@ Herbert Xu <herbert@gondor.apana.org.au>
Huacai Chen <chenhuacai@kernel.org> <chenhc@lemote.com>
Huacai Chen <chenhuacai@kernel.org> <chenhuacai@loongson.cn>
Ignat Korchagin <ignat@linux.win> <ignat@cloudflare.com>
Igor Korotin <igor.korotin@linux.dev> <igor.korotin.linux@gmail.com>
Ike Panhc <ikepanhc@gmail.com> <ike.pan@canonical.com>
J. Bruce Fields <bfields@fieldses.org> <bfields@redhat.com>
J. Bruce Fields <bfields@fieldses.org> <bfields@citi.umich.edu>
@@ -419,6 +429,7 @@ John Stultz <johnstul@us.ibm.com>
<jon.toppins+linux@gmail.com> <jtoppins@cumulusnetworks.com>
<jon.toppins+linux@gmail.com> <jtoppins@redhat.com>
Jonas Gorski <jonas.gorski@gmail.com> <jogo@openwrt.org>
Jonathan Cameron <jic23@kernel.org> <jonathan.cameron@huawei.com>
Jordan Crouse <jordan@cosmicpenguin.net> <jcrouse@codeaurora.org>
<josh@joshtriplett.org> <josh@freedesktop.org>
<josh@joshtriplett.org> <josh@kernel.org>
@@ -576,6 +587,7 @@ Michel Lespinasse <michel@lespinasse.org> <walken@google.com>
Michel Lespinasse <michel@lespinasse.org> <walken@zoy.org>
Mickaël Salaün <mic@digikod.net> <mic@linux.microsoft.com>
Miguel Ojeda <ojeda@kernel.org> <miguel.ojeda.sandonis@gmail.com>
Mike Leach <mike.leach@arm.com> <mike.leach@linaro.org>
Mike Rapoport <rppt@kernel.org> <mike@compulab.co.il>
Mike Rapoport <rppt@kernel.org> <mike.rapoport@gmail.com>
Mike Rapoport <rppt@kernel.org> <rppt@linux.ibm.com>
@@ -733,11 +745,13 @@ Sarangdhar Joshi <spjoshi@codeaurora.org>
Saravana Kannan <saravanak@kernel.org> <skannan@codeaurora.org>
Saravana Kannan <saravanak@kernel.org> <saravanak@google.com>
Sascha Hauer <s.hauer@pengutronix.de>
Sasha Finkelstein <k@chaosmail.tech> <fnkl.kernel@gmail.com>
Sahitya Tummala <quic_stummala@quicinc.com> <stummala@codeaurora.org>
Sathishkumar Muruganandam <quic_murugana@quicinc.com> <murugana@codeaurora.org>
Satya Priya <quic_skakitap@quicinc.com> <quic_c_skakit@quicinc.com> <skakit@codeaurora.org>
S.Çağlar Onur <caglar@pardus.org.tr>
Sayali Lokhande <quic_sayalil@quicinc.com> <sayalil@codeaurora.org>
Sean Anderson <sean.anderson@linux.dev> <sean.anderson@seco.com>
Sean Christopherson <seanjc@google.com> <sean.j.christopherson@intel.com>
Sean Nyekjaer <sean@geanix.com> <sean.nyekjaer@prevas.dk>
Sean Tranchetti <quic_stranche@quicinc.com> <stranche@codeaurora.org>

39
CREDITS

@@ -71,11 +71,6 @@ D: dosfs, LILO, some fd features, ATM, various other hacks here and there
S: Buenos Aires
S: Argentina
NTFS FILESYSTEM
N: Anton Altaparmakov
E: anton@tuxera.com
D: NTFS filesystem
N: Tim Alpaerts
E: tim_alpaerts@toyota-motor-europe.com
D: 802.2 class II logical link control layer,
@@ -85,8 +80,8 @@ S: B-2610 Wilrijk-Antwerpen
S: Belgium
N: Anton Altaparmakov
E: aia21@cantab.net
W: http://www-stu.christs.cam.ac.uk/~aia21/
E: anton@tuxera.com
W: http://www.tuxera.com/
D: Author of new NTFS driver, various other kernel hacks.
S: Christ's College
S: Cambridge CB2 3BU
@@ -1456,6 +1451,14 @@ N: Andy Gospodarek
E: andy@greyhouse.net
D: Maintenance and contributions to the network interface bonding driver.
N: Vivek Goyal
E: vgoyal@redhat.com
D: KDUMP, KEXEC, and VIRTIO FILE SYSTEM
N: Alexander Graf
E: graf@amazon.com
D: Kexec Handover (KHO)
N: Wolfgang Grandegger
E: wg@grandegger.com
D: Controller Area Network (device drivers)
@@ -3592,6 +3595,16 @@ E: wsalamon@tislabs.com
E: wsalamon@nai.com
D: portions of the Linux Security Module (LSM) framework and security modules
N: Salil Mehta
E: salil.mehta@opnsrc.net
D: Co-authored Huawei/HiSilicon Kunpeng 920 SoC HNS3 PF and VF 100G
D: Ethernet driver
D: Co-authored Huawei/HiSilicon Kunpeng 916 SoC HNS 10G Ethernet
D: driver enhancements
D: Maintained Huawei/HiSilicon HNS and HNS3 10G/100G Ethernet drivers
D: for Kunpeng 916 family, 920 family of SoCs
S: Cambridge, Cambridgeshire, United Kingdom
N: Robert Sanders
E: gt8134b@prism.gatech.edu
D: Dosemu
@@ -3639,6 +3652,11 @@ S: Dag Hammerskjolds v. 3E
S: S-226 64 LUND
S: Sweden
N: Tilman Schmidt
E: tilman@imap.cc
D: Siemens Gigaset ISDN driver author and maintainer
D: ISDN CAPI subsystem contributions
N: Henning P. Schmiedehausen
E: hps@tanstaafl.de
D: added PCI support to the serial driver
@@ -4560,8 +4578,5 @@ D: MD driver
D: EISA/sysfs subsystem
S: France
# Don't add your name here, unless you really _are_ after Marc
# alphabetically. Leonard used to be very proud of being the
# last entry, and he'll get positively pissed if he can't even
# be second-to-last. (and this file really _is_ supposed to be
# in alphabetic order)
# Don't add your name here unless you really are last alphabetically.
# (This file is supposed to be kept in alphabetical order by last name.)


@@ -783,11 +783,9 @@ namespaces/compatibility-list admin-guide/namespaces/compatibility-list
namespaces/index admin-guide/namespaces/index
namespaces/resource-control admin-guide/namespaces/resource-control
networking/altera_tse networking/device_drivers/ethernet/altera/altera_tse
networking/baycom networking/device_drivers/hamradio/baycom
networking/bpf_flow_dissector bpf/prog_flow_dissector
networking/cxacru networking/device_drivers/atm/cxacru
networking/defza networking/device_drivers/fddi/defza
networking/device_drivers/3com/3c509 networking/device_drivers/ethernet/3com/3c509
networking/device_drivers/3com/vortex networking/device_drivers/ethernet/3com/vortex
networking/device_drivers/amazon/ena networking/device_drivers/ethernet/amazon/ena
networking/device_drivers/aquantia/atlantic networking/device_drivers/ethernet/aquantia/atlantic
@@ -822,7 +820,6 @@ networking/device_drivers/microsoft/netvsc networking/device_drivers/ethernet/mi
networking/device_drivers/netronome/nfp networking/device_drivers/ethernet/netronome/nfp
networking/device_drivers/pensando/ionic networking/device_drivers/ethernet/pensando/ionic
networking/device_drivers/qualcomm/rmnet networking/device_drivers/cellular/qualcomm/rmnet
networking/device_drivers/smsc/smc9 networking/device_drivers/ethernet/smsc/smc9
networking/device_drivers/stmicro/stmmac networking/device_drivers/ethernet/stmicro/stmmac
networking/device_drivers/ti/cpsw networking/device_drivers/ethernet/ti/cpsw
networking/device_drivers/ti/cpsw_switchdev networking/device_drivers/ethernet/ti/cpsw_switchdev
@@ -836,19 +833,16 @@ networking/e100 networking/device_drivers/ethernet/intel/e100
networking/e1000 networking/device_drivers/ethernet/intel/e1000
networking/e1000e networking/device_drivers/ethernet/intel/e1000e
networking/fm10k networking/device_drivers/ethernet/intel/fm10k
networking/fore200e networking/device_drivers/atm/fore200e
networking/hinic networking/device_drivers/ethernet/huawei/hinic
networking/i40e networking/device_drivers/ethernet/intel/i40e
networking/iavf networking/device_drivers/ethernet/intel/iavf
networking/ice networking/device_drivers/ethernet/intel/ice
networking/igb networking/device_drivers/ethernet/intel/igb
networking/igbvf networking/device_drivers/ethernet/intel/igbvf
networking/iphase networking/device_drivers/atm/iphase
networking/ixgbe networking/device_drivers/ethernet/intel/ixgbe
networking/ixgbevf networking/device_drivers/ethernet/intel/ixgbevf
networking/netdev-FAQ process/maintainer-netdev
networking/skfp networking/device_drivers/fddi/skfp
networking/z8530drv networking/device_drivers/hamradio/z8530drv
nfc/index driver-api/nfc/index
nfc/nfc-hci driver-api/nfc/nfc-hci
nfc/nfc-pn544 driver-api/nfc/nfc-pn544


@@ -886,6 +886,21 @@ Description:
zone commands, they will be treated as regular block devices and
zoned will report "none".
What: /sys/block/<disk>/queue/zoned_qd1_writes
Date: January 2026
Contact: Damien Le Moal <dlemoal@kernel.org>
Description:
[RW] zoned_qd1_writes indicates if write operations to a zoned
block device are being handled using a single issuer context (a
kernel thread) operating at a maximum queue depth of 1. This
attribute is visible only for zoned block devices. The default
value for zoned block devices that are not rotational devices
(e.g. ZNS SSDs or zoned UFS devices) is 0. For rotational zoned
block devices (e.g. SMR HDDs) the default value is 1. Since
this default may not be appropriate for some devices, e.g.
remotely connected devices over high latency networks, the user
can disable this feature by setting this attribute to 0.
What: /sys/block/<disk>/hidden
Date: March 2023


@@ -16,8 +16,8 @@ What: /sys/accessibility/speakup/bleeps
KernelVersion: 2.6
Contact: speakup@linux-speakup.org
Description: This controls whether one hears beeps through the PC speaker
when using speakup's review commands.
TODO: what values does it accept?
when using speakup's review commands. Range: 0-3. 0 = off, 1 = beeps
only, 2 = announcements only, 3 = beeps and announcements (default).
What: /sys/accessibility/speakup/bleep_time
KernelVersion: 2.6
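The value range documented for bleeps above can be decoded with a small lookup. This is a minimal sketch: the mapping is copied from the description, while the function name and the assumption that the attribute reads back as a plain decimal string are illustrative only.

```python
# Mapping copied from the bleeps description (0-3).
BLEEPS_MODES = {
    0: "off",
    1: "beeps only",
    2: "announcements only",
    3: "beeps and announcements",
}


def describe_bleeps(raw: str) -> str:
    """Translate the text read from the bleeps attribute into its meaning.

    Hypothetical helper: assumes the attribute yields a decimal integer,
    possibly followed by a newline.
    """
    value = int(raw.strip())
    if value not in BLEEPS_MODES:
        raise ValueError(f"bleeps value out of range 0-3: {value}")
    return BLEEPS_MODES[value]
```

A caller would typically pass the contents of /sys/accessibility/speakup/bleeps to describe_bleeps().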


@@ -50,6 +50,13 @@ Description: Dump debug registers from the QM.
Available for PF and VF in host. VF in guest currently only
has one debug register.
What: /sys/kernel/debug/hisi_hpre/<bdf>/dev_usage
Date: Mar 2026
Contact: linux-crypto@vger.kernel.org
Description: Query the real-time bandwidth usage of the device.
Returns the bandwidth usage of each channel on the device.
The returned value is a percentage.
What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/current_q
Date: Sep 2019
Contact: linux-crypto@vger.kernel.org


@@ -24,6 +24,13 @@ Description: The <bdf> is related the function for PF and VF.
1/1000~1000/1000 of total QoS. The driver reading alg_qos to
get related QoS in the host and VM, Such as "cat alg_qos".
What: /sys/kernel/debug/hisi_sec2/<bdf>/dev_usage
Date: Mar 2026
Contact: linux-crypto@vger.kernel.org
Description: Query the real-time bandwidth usage of the device.
Returns the bandwidth usage of each channel on the device.
The returned value is a percentage.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/qm_regs
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org


@@ -36,6 +36,13 @@ Description: The <bdf> is related the function for PF and VF.
1/1000~1000/1000 of total QoS. The driver reading alg_qos to
get related QoS in the host and VM, Such as "cat alg_qos".
What: /sys/kernel/debug/hisi_zip/<bdf>/dev_usage
Date: Mar 2026
Contact: linux-crypto@vger.kernel.org
Description: Query the real-time bandwidth usage of the device.
Returns the bandwidth usage of each channel on the device.
The returned value is a percentage.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org


@@ -26,6 +26,7 @@ Description:
2 Permit modification of EVM-protected metadata at
runtime. Not supported if HMAC validation and
creation is enabled (deprecated).
3 Require asymmetric signatures to be version 3
31 Disable further runtime modification of EVM policy
=== ==================================================


@@ -53,10 +53,7 @@ Description:
where 'imasig' is the original or the signature
format v2.
where 'modsig' is an appended signature,
where 'sigv3' is the signature format v3. (Currently
limited to fsverity digest based signatures
stored in security.ima xattr. Requires
specifying "digest_type=verity" first.)
where 'sigv3' is the signature format v3.
appraise_flag:= [check_blacklist] (deprecated)
Setting the check_blacklist flag is no longer necessary.
@@ -186,6 +183,11 @@ Description:
appraise func=BPRM_CHECK digest_type=verity \
appraise_type=sigv3
Example of a regular IMA file hash 'appraise' rule requiring
signature version 3 format stored in security.ima xattr.
appraise func=BPRM_CHECK appraise_type=sigv3
All of these policy rules could, for example, be constrained
either based on a filesystem's UUID (fsuuid) or based on LSM
labels.


@@ -278,3 +278,13 @@ Date: Aug 2025
KernelVersion: 6.18
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (Read) Show hardware context information of device.
What: /sys/bus/coresight/devices/<tpdm-name>/traceid
Date: March 2026
KernelVersion: 7.1
Contact: Jie Gan <jie.gan@oss.qualcomm.com>
Description:
(R) Show the trace ID that will appear in the trace stream
coming from this TPDM. The trace ID is inherited from the
connected TPDA device and is fixed for the lifetime of the
device. Returns -EINVAL if the device has not been enabled yet.


@@ -508,6 +508,19 @@ Description:
(RO) The size of extended linear cache, if there is an extended
linear cache. Otherwise the attribute will not be visible.
What: /sys/bus/cxl/devices/regionZ/locked
Date: Mar, 2026
KernelVersion: v7.1
Contact: linux-cxl@vger.kernel.org
Description:
(RO) The CXL driver has the capability to lock a region based on
a BIOS or platform dependent configuration. Regions created as
locked are never permitted to be destroyed. Resets to participating
decoders will not result in a region destroy and will not free the
decoder resources.
What: /sys/bus/cxl/devices/regionZ/mode
Date: January, 2023
KernelVersion: v6.3


@@ -172,3 +172,23 @@ Description:
the automatic retries. Exists only when the I3C controller supports
this retry-on-NACK feature.
What: /sys/bus/i3c/devices/i3c-<bus-id>/do_daa
KernelVersion: 7.0
Contact: linux-i3c@vger.kernel.org
Description:
Write-only attribute that triggers a Dynamic Address Assignment
(DAA) procedure which discovers new I3C devices on the bus.
Writing a boolean true value (1, y, yes, true, on) to this
attribute causes the master controller to perform DAA, which
includes broadcasting an ENTDAA (Enter Dynamic Address Assignment)
Common Command Code (CCC) on the bus. Writing a false value
returns -EINVAL.
This is useful for discovering I3C devices that were not present
during initial bus initialization and are unable to issue
Hot-Join. Only devices without a currently assigned dynamic address
will respond to the ENTDAA broadcast and be assigned addresses.
Note that this mechanism is distinct from Hot-Join: this is
controller-initiated discovery, while Hot-Join is a device-initiated
method of provoking the controller's discovery procedure.


@@ -1428,7 +1428,7 @@ KernelVersion: 2.6.35
Contact: linux-iio@vger.kernel.org
Description:
The name of the trigger source being used, as per string given
in /sys/class/iio/triggerY/name.
in /sys/bus/iio/devices/triggerY/name.
What: /sys/bus/iio/devices/iio:deviceX/bufferY/length
KernelVersion: 5.11


@@ -675,7 +675,8 @@ Description:
Valid values:
"Unknown", "SDP", "DCP", "CDP", "ACA", "C", "PD",
"PD_DRP", "PD_PPS", "BrickID"
"PD_DRP", "PD_PPS", "BrickID", "PD_SPR_AVS",
"PD_PPS_SPR_AVS"
**Device Specific Properties**


@@ -0,0 +1,36 @@
What: /sys/class/reboot-mode/<driver>/reboot_modes
Date: March 2026(TBD)
KernelVersion: TBD
Contact: linux-pm@vger.kernel.org
Description:
This interface exposes the reboot-mode arguments
registered with the reboot-mode framework. It is
a read-only interface and provides a space
separated list of reboot-mode arguments supported
on the current platform.
Example:
recovery fastboot bootloader
The exact sysfs path may vary depending on the
name of the driver that registers the arguments.
Example:
/sys/class/reboot-mode/nvmem-reboot-mode/reboot_modes
/sys/class/reboot-mode/syscon-reboot-mode/reboot_modes
/sys/class/reboot-mode/qcom-pon/reboot_modes
The supported arguments can be used by userspace to
invoke device reset using the standard reboot() system
call interface, with the "argument" as string to "*arg"
parameter along with LINUX_REBOOT_CMD_RESTART2.
A driver can expose the supported arguments by
registering them with the reboot-mode framework
using the property names that follow the
mode-<argument> format.
Example:
mode-bootloader, mode-recovery.
This attribute is useful for scripts or initramfs
logic that need to programmatically determine
which reboot-mode arguments are valid before
triggering a reboot.
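The check described above, where a script determines which reboot-mode arguments are valid before triggering a reboot, could be sketched as follows. The function names are hypothetical; only the space-separated list format and the example arguments come from the description.

```python
from typing import List


def parse_reboot_modes(text: str) -> List[str]:
    """Split the contents of a reboot_modes attribute into its arguments.

    The attribute is documented as a space-separated list; split() with no
    separator also tolerates a trailing newline.
    """
    return text.split()


def is_supported(text: str, mode: str) -> bool:
    """Return True if `mode` is one of the advertised reboot-mode arguments."""
    return mode in parse_reboot_modes(text)
```

In practice the text would be read from a path such as /sys/class/reboot-mode/nvmem-reboot-mode/reboot_modes, whose exact name depends on the registering driver.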


@@ -327,6 +327,24 @@ Description: Energy performance preference
This file is only present if the cppc-cpufreq driver is in use.
What: /sys/devices/system/cpu/cpuX/cpufreq/perf_limited
Date: February 2026
Contact: linux-pm@vger.kernel.org
Description: Performance Limited
Read to check if platform throttling (thermal/power/current
limits) caused delivered performance to fall below the
requested level. A non-zero value indicates throttling occurred.
Write the bitmask of bits to clear:
- 0x1 = clear bit 0 (desired performance excursion)
- 0x2 = clear bit 1 (minimum performance excursion)
- 0x3 = clear both bits
The platform sets these bits; OSPM can only clear them.
This file is only present if the cppc-cpufreq driver is in use.
What: /sys/devices/system/cpu/cpu*/cache/index3/cache_disable_{0,1}
Date: August 2008
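The perf_limited bit layout described above can be illustrated with a short decoder. This is a sketch under stated assumptions: the bit meanings come from the description, but the helper names are hypothetical, and the textual format of the attribute (decimal or 0x-prefixed hex) is assumed rather than documented here.

```python
# Bit meanings as given in the perf_limited description.
DESIRED_EXCURSION = 0x1  # bit 0: desired performance excursion
MINIMUM_EXCURSION = 0x2  # bit 1: minimum performance excursion


def decode_perf_limited(raw: str) -> dict:
    """Decode the value read from perf_limited into named flags.

    Assumption: the value is a decimal or 0x-prefixed integer string;
    int(..., 0) accepts both forms.
    """
    value = int(raw.strip(), 0)
    return {
        "desired_excursion": bool(value & DESIRED_EXCURSION),
        "minimum_excursion": bool(value & MINIMUM_EXCURSION),
    }


def clear_mask(desired: bool = False, minimum: bool = False) -> str:
    """Build the bitmask of bits to clear, formatted as hex (assumed format)."""
    mask = 0
    if desired:
        mask |= DESIRED_EXCURSION
    if minimum:
        mask |= MINIMUM_EXCURSION
    return hex(mask)
```

Since the platform sets these bits and OSPM can only clear them, a monitoring script would read the attribute, decode it, and then write back the mask of the bits it has handled.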


@@ -0,0 +1,724 @@
What: /sys/class/leds/go:rgb:joystick_rings/effect
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the display effect of the RGB interface.
Values are monocolor, breathe, chroma, or rainbow.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/class/leds/go:rgb:joystick_rings/effect_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the effect attribute.
Values are monocolor, breathe, chroma, or rainbow.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/class/leds/go:rgb:joystick_rings/enabled
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling or disabling the RGB interface.
Values are true or false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/class/leds/go:rgb:joystick_rings/enabled_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the enabled attribute.
Values are true or false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/class/leds/go:rgb:joystick_rings/mode
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the operating mode of the RGB interface.
Values are dynamic or custom. Custom allows setting the RGB effect and color.
Dynamic is a Windows mode for syncing Lenovo RGB interfaces not currently
supported under Linux.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/class/leds/go:rgb:joystick_rings/mode_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the mode attribute.
Values are dynamic or custom.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/class/leds/go:rgb:joystick_rings/profile
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls selecting the configured RGB profile.
Values are 1-3.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/class/leds/go:rgb:joystick_rings/profile_range
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the profile attribute.
Values are 1-3.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/class/leds/go:rgb:joystick_rings/speed
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the change rate for the breathe, chroma, and rainbow effects.
Values are 0-100.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/class/leds/go:rgb:joystick_rings/speed_range
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the speed attribute.
Values are 0-100.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/firmware_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the firmware version of the internal MCU.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/fps_mode_dpi
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the DPI of the right handle when the FPS mode switch is on.
Values are 500, 800, 1200, and 1800.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/fps_mode_dpi_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the fps_mode_dpi attribute.
Values are 500, 800, 1200, and 1800.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/hardware_generation
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the hardware generation of the internal MCU.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/hardware_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the hardware version of the internal MCU.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/auto_sleep_time
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the sleep timer due to inactivity for the left removable controller.
Values are 0-255.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/auto_sleep_time_range
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the left_handle/auto_sleep_time attribute.
Values are 0-255.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/calibrate_gyro
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This initiates or halts calibration of the left removable controller's IMU.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/calibrate_gyro_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the left_handle/calibrate_gyro attribute.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/calibrate_gyro_status
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the result of the last attempted calibration of the left removable controller's IMU.
Values are unknown, success, failure.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/calibrate_joystick
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This initiates or halts calibration of the left removable controller's joystick.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/calibrate_joystick_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the left_handle/calibrate_joystick attribute.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/calibrate_joystick_status
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the result of the last attempted calibration of the left removable controller's joystick.
Values are unknown, success, failure.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/calibrate_trigger
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This initiates or halts calibration of the left removable controller's trigger.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/calibrate_trigger_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the left_handle/calibrate_trigger attribute.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/calibrate_trigger_status
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the result of the last attempted calibration of the left removable controller's trigger.
Values are unknown, success, failure.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/firmware_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the left removable controller's firmware version.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/hardware_generation
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the hardware generation of the left removable controller.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/hardware_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the hardware version of the left removable controller.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/imu_bypass_enabled
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling or disabling the IMU bypass function of the left removable controller.
Values are true or false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/imu_bypass_enabled_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the left_handle/imu_bypass_enabled attribute.
Values are true or false.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/imu_enabled
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling or disabling the IMU of the left removable controller.
Values are true or false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/imu_enabled_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the left_handle/imu_enabled attribute.
Values are true or false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/product_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the product version of the left removable controller.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/protocol_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the protocol version of the left removable controller.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/reset
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: Resets the left removable controller to factory defaults.
Writing 1 to this path initiates the reset.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/rumble_mode
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls setting the response behavior for rumble events for the left removable controller.
Values are fps, racing, standard, spg, rpg.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/rumble_mode_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the left_handle/rumble_mode attribute.
Values are fps, racing, standard, spg, rpg.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
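The pairing of an attribute with its `_index` companion suggests a validate-then-write pattern. The following is a minimal sketch, not part of the driver: the helper names `read_options` and `set_option` are hypothetical, and it assumes the `_index` file lists its options separated by whitespace, which this document does not specify.

```python
from pathlib import Path


def read_options(device_dir: Path, attr: str) -> list[str]:
    """Return the options listed in <attr>_index.

    Assumption: options are separated by whitespace; the ABI text does
    not describe the exact file format.
    """
    return (device_dir / f"{attr}_index").read_text().split()


def set_option(device_dir: Path, attr: str, value: str) -> None:
    """Write value to <attr> after checking it against <attr>_index."""
    options = read_options(device_dir, attr)
    if value not in options:
        raise ValueError(f"{value!r} is not one of {options}")
    (device_dir / attr).write_text(value)
```

A caller would pass the `left_handle` directory of the HID device as `device_dir`, for example `set_option(left_handle_dir, "rumble_mode", "rpg")`.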
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/rumble_notification
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling haptic rumble events for the left removable controller.
Values are true, false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/left_handle/rumble_notification_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the left_handle/rumble_notification attribute.
Values are true, false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/mode
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the operating mode of the built-in controller.
Values are xinput or dinput.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/mode_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the mode attribute.
Values are xinput or dinput.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/os_mode
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the behavior of built-in chord combinations.
Values are windows or linux.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/os_mode_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the os_mode attribute.
Values are windows or linux.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/product_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the product version of the internal MCU.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/protocol_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the protocol version of the internal MCU.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/reset_mcu
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: Resets the internal MCU to factory defaults.
Writing 1 to this path initiates the reset.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/auto_sleep_time
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the sleep timer due to inactivity for the right removable controller.
Values are 0-255.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/auto_sleep_time_range
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the right_handle/auto_sleep_time attribute.
Values are 0-255.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/calibrate_gyro
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This initiates or halts calibration of the right removable controller's IMU.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/calibrate_gyro_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the right_handle/calibrate_gyro attribute.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/calibrate_gyro_status
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the result of the last attempted calibration of the right removable controller's IMU.
Values are unknown, success, failure.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/calibrate_joystick
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This initiates or halts calibration of the right removable controller's joystick.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/calibrate_joystick_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the right_handle/calibrate_joystick attribute.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/calibrate_joystick_status
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the result of the last attempted calibration of the right removable controller's joystick.
Values are unknown, success, failure.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/calibrate_trigger
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This initiates or halts calibration of the right removable controller's trigger.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/calibrate_trigger_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the right_handle/calibrate_trigger attribute.
Values are start, stop.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/calibrate_trigger_status
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the result of the last attempted calibration of the right removable controller's trigger.
Values are unknown, success, failure.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
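The calibrate_* and calibrate_*_status attributes above can be combined into a single calibration pass. The sketch below is illustrative only: the function name `run_calibration`, the start/wait/stop ordering, and the hold time are assumptions, since this document lists only the accepted values.

```python
import time
from pathlib import Path


def run_calibration(handle_dir: Path, target: str, hold_seconds: float = 5.0) -> str:
    """Start, then stop, one calibration and return the reported status.

    target is "gyro", "joystick" or "trigger". The start/wait/stop order
    and the hold time are assumptions, not documented behavior.
    """
    control = handle_dir / f"calibrate_{target}"
    status = handle_dir / f"calibrate_{target}_status"

    control.write_text("start")
    try:
        time.sleep(hold_seconds)  # time for the user to move the control
    finally:
        control.write_text("stop")  # always halt, even if interrupted

    result = status.read_text().strip()
    if result not in ("unknown", "success", "failure"):
        raise RuntimeError(f"unexpected status {result!r}")
    return result
```

Writing stop in a `finally` clause means an interrupted run does not leave the controller in calibration mode.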
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/firmware_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the right removable controller's firmware version.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/hardware_generation
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the hardware generation of the right removable controller.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/hardware_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the hardware version of the right removable controller.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/imu_bypass_enabled
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling or disabling the IMU bypass function of the right removable controller.
Values are true or false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/imu_bypass_enabled_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the right_handle/imu_bypass_enabled attribute.
Values are true or false.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/imu_enabled
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling or disabling the IMU of the right removable controller.
Values are true or false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/imu_enabled_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the right_handle/imu_enabled attribute.
Values are true or false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/product_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the product version of the right removable controller.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/protocol_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the protocol version of the right removable controller.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/reset
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: Resets the right removable controller to factory defaults.
Writing 1 to this path initiates the reset.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/rumble_mode
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls setting the response behavior for rumble events for the right removable controller.
Values are fps, racing, standard, spg, rpg.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/rumble_mode_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the right_handle/rumble_mode attribute.
Values are fps, racing, standard, spg, rpg.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/rumble_notification
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling haptic rumble events for the right removable controller.
Values are true, false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/right_handle/rumble_notification_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the right_handle/rumble_notification attribute.
Values are true, false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/rumble_intensity
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls setting the rumble intensity for both removable controllers.
Values are off, low, medium, high.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/rumble_intensity_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the rumble_intensity attribute.
Values are off, low, medium, high.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/enabled
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling or disabling the touchpad.
Values are true, false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/enabled_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the touchpad/enabled attribute.
Values are true, false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/vibration_enabled
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling haptic rumble events for the touchpad.
Values are true, false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/vibration_enabled_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the touchpad/vibration_enabled attribute.
Values are true, false.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/vibration_intensity
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls setting the intensity of the touchpad haptics.
Values are off, low, medium, high.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/vibration_intensity_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the touchpad/vibration_intensity attribute.
Values are off, low, medium, high.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/tx_dongle/firmware_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the firmware version of the internal wireless transmission dongle.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/tx_dongle/hardware_generation
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the hardware generation of the internal wireless transmission dongle.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/tx_dongle/hardware_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the hardware version of the internal wireless transmission dongle.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/tx_dongle/product_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the product version of the internal wireless transmission dongle.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/tx_dongle/protocol_version
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the protocol version of the internal wireless transmission dongle.
Applies to Lenovo Legion Go and Go 2 line of handheld devices.

@@ -0,0 +1,304 @@
What: /sys/class/leds/go_s:rgb:joystick_rings/effect
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the display effect of the RGB interface.
Values are monocolor, breathe, chroma, or rainbow.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/class/leds/go_s:rgb:joystick_rings/effect_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the effect attribute.
Values are monocolor, breathe, chroma, or rainbow.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/class/leds/go_s:rgb:joystick_rings/enabled
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling or disabling the RGB interface.
Values are true or false.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/class/leds/go_s:rgb:joystick_rings/enabled_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the enabled attribute.
Values are true or false.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/class/leds/go_s:rgb:joystick_rings/mode
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the operating mode of the RGB interface.
Values are dynamic or custom. Custom allows setting the RGB effect and color.
Dynamic is a Windows mode for syncing Lenovo RGB interfaces not currently
supported under Linux.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/class/leds/go_s:rgb:joystick_rings/mode_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the mode attribute.
Values are dynamic or custom.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/class/leds/go_s:rgb:joystick_rings/profile
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls selecting the configured RGB profile.
Values are 1-3.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/class/leds/go_s:rgb:joystick_rings/profile_range
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the profile attribute.
Values are 1-3.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/class/leds/go_s:rgb:joystick_rings/speed
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the change rate for the breathe, chroma, and rainbow effects.
Values are 0-100.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/class/leds/go_s:rgb:joystick_rings/speed_range
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the speed attribute.
Values are 0-100.
Applies to Lenovo Legion Go S line of handheld devices.
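Numeric attributes such as profile and speed are paired with a `_range` file. A minimal sketch of range-checked writes follows; `parse_range` and `set_ranged` are hypothetical helpers, and the "low-high" file format is an assumption taken from the "Values are 0-100" wording rather than from the driver.

```python
from pathlib import Path


def parse_range(text: str) -> tuple[int, int]:
    """Parse "low-high" (for example "0-100") into two integers.

    Assumption: the *_range files use this form, as in the Values lines.
    """
    low, high = text.strip().split("-", 1)
    return int(low), int(high)


def set_ranged(led_dir: Path, attr: str, value: int) -> None:
    """Write value to attr if it lies within <attr>_range."""
    low, high = parse_range((led_dir / f"{attr}_range").read_text())
    if not low <= value <= high:
        raise ValueError(f"{attr}={value} is outside {low}-{high}")
    (led_dir / attr).write_text(str(value))
```

Here `led_dir` would be the /sys/class/leds/go_s:rgb:joystick_rings directory, and `attr` either "speed" or "profile".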
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/gamepad/auto_sleep_time
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the sleep timer due to inactivity for the built-in controller.
Values are 0-255.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/gamepad/auto_sleep_time_range
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the gamepad/auto_sleep_time attribute.
Values are 0-255.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/gamepad/dpad_mode
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the operating mode of the built-in controller's D-pad.
Values are 4-way or 8-way.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/gamepad/dpad_mode_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the gamepad/dpad_mode attribute.
Values are 4-way or 8-way.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/gamepad/mode
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the operating mode of the built-in controller.
Values are xinput or dinput.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/gamepad/mode_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the gamepad/mode attribute.
Values are xinput or dinput.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/gamepad/poll_rate
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls the poll rate in Hz of the built-in controller.
Values are 125, 250, 500, or 1000.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/gamepad/poll_rate_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the gamepad/poll_rate attribute.
Values are 125, 250, 500, or 1000.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/imu/bypass_enabled
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling or disabling the IMU bypass function. When enabled, the IMU data is reported directly to the OS through
a hidraw interface.
Values are true or false.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/imu/bypass_enabled_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the imu/bypass_enabled attribute.
Values are true or false.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/imu/manufacturer
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the manufacturer of the inertial measurement unit.
Values are Bosch or ST.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/imu/sensor_enabled
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling or disabling the IMU.
Values are true, false, or wake-2s.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/imu/sensor_enabled_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the imu/sensor_enabled attribute.
Values are true, false, or wake-2s.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/mcu_id
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the MCU identification number.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/mouse/step
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls which value is used for the mouse sensitivity.
Values are 1-127.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/mouse/step_range
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the mouse/step attribute.
Values are 1-127.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/os_mode
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls which value is used for the touchpad's operating mode.
Values are windows or linux.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/os_mode_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the os_mode attribute.
Values are windows or linux.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/enabled
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls enabling or disabling the built-in touchpad.
Values are true or false.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/enabled_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the touchpad/enabled attribute.
Values are true or false.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/linux_mode
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls behavior of the touchpad events when os_mode is set to linux.
Values are absolute or relative.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/linux_mode_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the touchpad/linux_mode attribute.
Values are absolute or relative.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/windows_mode
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This controls behavior of the touchpad events when os_mode is set to windows.
Values are absolute or relative.
Applies to Lenovo Legion Go S line of handheld devices.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/touchpad/windows_mode_index
Date: April 2026
Contact: linux-input@vger.kernel.org
Description: This displays the available options for the touchpad/windows_mode attribute.
Values are absolute or relative.
Applies to Lenovo Legion Go S line of handheld devices.
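Because os_mode selects which of touchpad/linux_mode and touchpad/windows_mode applies, the effective touchpad behavior can be resolved in two reads. This is a sketch under assumptions: `active_touchpad_mode` is a hypothetical helper, and it presumes that reading an attribute returns its current value as a single token.

```python
from pathlib import Path


def active_touchpad_mode(device_dir: Path) -> str:
    """Return the touchpad mode selected by the current os_mode.

    Assumption: reading an attribute returns its current value as a
    single token ("linux", "absolute", ...).
    """
    os_mode = (device_dir / "os_mode").read_text().strip()
    if os_mode not in ("windows", "linux"):
        raise ValueError(f"unexpected os_mode {os_mode!r}")
    return (device_dir / "touchpad" / f"{os_mode}_mode").read_text().strip()
```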

@@ -129,6 +129,37 @@ Description:
-EIO if FW refuses to change the provisioning.
What: /sys/bus/pci/drivers/xe/.../sriov_admin/.bulk_profile/vram_quota
What: /sys/bus/pci/drivers/xe/.../sriov_admin/vf<n>/profile/vram_quota
Date: February 2026
KernelVersion: 7.0
Contact: intel-xe@lists.freedesktop.org
Description:
These files allow initial VRAM provisioning of the VFs before the VFs are
enabled, or changing the VRAM provisioning once the VFs are enabled.
Any non-zero initial VRAM provisioning will block VFs auto-provisioning.
Without initial VRAM provisioning, these files will show the result of the
VRAM auto-provisioning performed by the PF once the VFs are enabled.
Once the VFs are disabled, all VRAM provisioning will be released.
These files are visible only on discrete Intel Xe platforms with VRAM
and are writeable only if dynamic VFs VRAM provisioning is supported.
.bulk_profile/vram_quota: (WO) unsigned integer
The amount of the provisioned VRAM in [bytes] for each VF.
Actual quota value might be aligned per HW/FW requirements.
profile/vram_quota: (RW) unsigned integer
The amount of the provisioned VRAM in [bytes] for this VF.
Actual quota value might be aligned per HW/FW requirements.
Default is 0 (unprovisioned).
Writes to these attributes may fail with errors like:
-EINVAL if provided input is malformed or not recognized,
-EPERM if change is not applicable on given HW/FW,
-EIO if FW refuses to change the provisioning.
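A minimal sketch of how a script might provision VRAM for a single VF through the per-VF attribute described above. The helper name ``set_vf_vram_quota`` is hypothetical, and the caller must supply the path to the ``sriov_admin`` directory, since this entry elides part of it. The value read back may differ from the request because the quota can be aligned per HW/FW requirements.

```shell
# Hypothetical helper, not part of the kernel tree.
# $1 = path to the device's sriov_admin directory (caller supplies it)
# $2 = VF number
# $3 = requested VRAM quota in bytes
set_vf_vram_quota() (
    attr="$1/vf$2/profile/vram_quota"

    # The attribute is writable only if dynamic VFs VRAM provisioning
    # is supported, so check before writing.
    [ -w "$attr" ] || { echo "not writable: $attr" >&2; exit 1; }

    # A failed write surfaces the -EINVAL/-EPERM/-EIO cases listed above.
    printf '%s\n' "$3" > "$attr" || exit 1

    # Print the quota that is now in effect (possibly aligned).
    cat "$attr"
)
```

The helper prints the effective quota, so a caller can compare it with the requested value.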
What: /sys/bus/pci/drivers/xe/.../sriov_admin/vf<n>/stop
Date: October 2025
KernelVersion: 6.19

View File

@@ -0,0 +1,114 @@
What: /sys/bus/pci/devices/<BDF>/qat_svn/
Date: June 2026
KernelVersion: 7.1
Contact: qat-linux@intel.com
Description: Directory containing Security Version Number (SVN) attributes for
the Anti-Rollback (ARB) feature. The ARB feature prevents downloading
older firmware versions to the acceleration device.
What: /sys/bus/pci/devices/<BDF>/qat_svn/enforced_min
Date: June 2026
KernelVersion: 7.1
Contact: qat-linux@intel.com
Description:
(RO) Reports the minimum allowed firmware SVN.
Returns an integer greater than zero. Firmware with SVN lower than
this value is rejected.
A write to qat_svn/commit will update this value. The update is not
persistent across reboot; on reboot, this value is reset from
qat_svn/permanent_min.
Example usage::
# cat /sys/bus/pci/devices/<BDF>/qat_svn/enforced_min
2
This attribute is available only on devices that support
Anti-Rollback.
What: /sys/bus/pci/devices/<BDF>/qat_svn/permanent_min
Date: June 2026
KernelVersion: 7.1
Contact: qat-linux@intel.com
Description:
(RO) Reports the persistent minimum SVN used to initialize
qat_svn/enforced_min on each reboot.
Returns an integer greater than zero. A write to qat_svn/commit
may update this value, depending on platform/BIOS settings.
Example usage::
# cat /sys/bus/pci/devices/<BDF>/qat_svn/permanent_min
3
This attribute is available only on devices that support
Anti-Rollback.
What: /sys/bus/pci/devices/<BDF>/qat_svn/active
Date: June 2026
KernelVersion: 7.1
Contact: qat-linux@intel.com
Description:
(RO) Reports the SVN of the currently active firmware image.
Returns an integer greater than zero.
Example usage::
# cat /sys/bus/pci/devices/<BDF>/qat_svn/active
2
This attribute is available only on devices that support
Anti-Rollback.
What: /sys/bus/pci/devices/<BDF>/qat_svn/commit
Date: June 2026
KernelVersion: 7.1
Contact: qat-linux@intel.com
Description:
(WO) Commits the currently active SVN as the minimum allowed SVN.
Writing 1 sets qat_svn/enforced_min to the value of qat_svn/active,
preventing future firmware loads with lower SVN.
Depending on platform/BIOS settings, a commit may also update
qat_svn/permanent_min.
Note that on reboot, qat_svn/enforced_min reverts to
qat_svn/permanent_min.
It is advisable to use this attribute with caution, only when
it is necessary to set a new minimum SVN for the firmware.
Before committing the SVN update, it is crucial to check the
current values of qat_svn/active, qat_svn/enforced_min and
qat_svn/permanent_min. This verification helps ensure that the
commit operation aligns with the intended outcome.
While writing to the file, any value other than '1' will result
in an error and have no effect.
Example usage::
## Read current values
# cat /sys/bus/pci/devices/<BDF>/qat_svn/enforced_min
2
# cat /sys/bus/pci/devices/<BDF>/qat_svn/permanent_min
2
# cat /sys/bus/pci/devices/<BDF>/qat_svn/active
3
## Commit active SVN
# echo 1 > /sys/bus/pci/devices/<BDF>/qat_svn/commit
## Read updated values
# cat /sys/bus/pci/devices/<BDF>/qat_svn/enforced_min
3
# cat /sys/bus/pci/devices/<BDF>/qat_svn/permanent_min
3
This attribute is available only on devices that support
Anti-Rollback.
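The recommended check-before-commit sequence can be wrapped in a small guard. This is only a sketch: the helper name ``qat_svn_commit_checked`` is hypothetical, and skipping the commit when ``active`` does not exceed ``enforced_min`` is a policy choice of this example, not a documented driver behaviour.

```shell
# Hypothetical helper, not part of the QAT driver or its tooling.
# $1 = path to the device's qat_svn directory
qat_svn_commit_checked() (
    dir="$1"
    active=$(cat "$dir/active") || exit 1
    enforced=$(cat "$dir/enforced_min") || exit 1
    permanent=$(cat "$dir/permanent_min") || exit 1

    # Show the three values so the operator can verify the intended outcome.
    echo "active=$active enforced_min=$enforced permanent_min=$permanent"

    # Assumption of this sketch: only commit if it would raise enforced_min.
    if [ "$active" -le "$enforced" ]; then
        echo "nothing to commit"
        exit 0
    fi

    # Only the value 1 is accepted by the commit attribute.
    echo 1 > "$dir/commit"
)
```

Because the helper takes the directory as an argument, it can be exercised against a scratch directory before being pointed at a real device.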

View File

@@ -1768,3 +1768,26 @@ Description:
==================== ===========================
The attribute is read only.
What: /sys/bus/platform/drivers/ufshcd/*/dme_qos_notification
What: /sys/bus/platform/devices/*.ufs/dme_qos_notification
Date: March 2026
Contact: Can Guo <can.guo@oss.qualcomm.com>
Description:
This attribute reports and clears pending DME (Device Management
Entity) Quality of Service (QoS) notifications. This attribute
is a bitfield with the following bit assignments:
Bit Description
=== ======================================
0 DME QoS Monitor has been reset by host
1 QoS from TX is detected
2 QoS from RX is detected
3 QoS from PA_INIT is detected
Reading this attribute returns the pending DME QoS notification
bits. Writing '0' to this attribute clears pending DME QoS
notification bits. Writing any non-zero value is invalid and
will be rejected.
The attribute is read/write.
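As an illustration of the bit assignments above, the following sketch decodes a notification value into its descriptions. The helper name ``decode_dme_qos`` is hypothetical, and the example assumes the value has already been read and is passed as a decimal or ``0x``-prefixed integer; the exact text format returned by the attribute is not specified here.

```shell
# Hypothetical helper: decode a DME QoS notification bitfield.
# $1 = value read from dme_qos_notification (decimal or 0x-prefixed hex)
decode_dme_qos() (
    val=$(( $1 ))
    bit=0
    # Descriptions in bit order 0..3, taken from the table above.
    for desc in \
        "DME QoS Monitor has been reset by host" \
        "QoS from TX is detected" \
        "QoS from RX is detected" \
        "QoS from PA_INIT is detected"
    do
        if [ $(( (val >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $desc"
        fi
        bit=$(( bit + 1 ))
    done
)
```

After the pending bits have been handled, writing ``0`` to the attribute clears them.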

View File

@@ -51,3 +51,30 @@ Description:
Reading this file returns the current status of the breathing animation
functionality.
What: /sys/bus/platform/devices/INOU0000:XX/ctgp_offset
Date: January 2026
KernelVersion: 7.0
Contact: Werner Sembach <wse@tuxedocomputers.com>
Description:
Allows userspace applications to set the configurable TGP offset on top of the base
TGP. Base TGP and max TGP and therefore the max cTGP offset are device specific.
Note that setting the maximum cTGP leaves no window open for Dynamic Boost as
Dynamic Boost also can not go over max TGP. Setting the cTGP to maximum is
effectively disabling Dynamic Boost and telling the device to always prioritize the
GPU over the CPU.
Reading this file returns the current configurable TGP offset.
What: /sys/bus/platform/devices/INOU0000:XX/usb_c_power_priority
Date: February 2026
KernelVersion: 7.1
Contact: Werner Sembach <wse@tuxedocomputers.com>
Description:
Allows userspace applications to choose the USB-C power distribution profile between
one that offers a bigger share of the power to the battery and one that offers more
of it to the CPU. Writing "charging"/"performance" into this file selects the
respective profile.
Reading this file returns the profile names with the currently active one in
brackets.
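A script that only needs the active profile can extract the bracketed name. This is a sketch under an assumption: the exact output layout is not given above, so the example assumes a single line such as ``charging [performance]``. The helper name ``active_profile`` is hypothetical.

```shell
# Hypothetical helper: print the profile name enclosed in brackets.
# Reads from the file given as $1, or from standard input if none is given.
active_profile() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$@"
}
```

For example, ``active_profile /sys/bus/platform/devices/INOU0000:XX/usb_c_power_priority`` would print the currently selected profile on a device that exposes the attribute.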

View File

@@ -41,6 +41,12 @@ Description:
platform runtime firmware S3 resume, just prior to
handoff to the OS waking vector. In nanoseconds.
FBPT: The raw binary contents of the Firmware Basic Boot
Performance Table (FBPT) subtable.
S3PT: The raw binary contents of the S3 Performance Table
(S3PT) subtable.
What: /sys/firmware/acpi/bgrt/
Date: January 2012
Contact: Matthew Garrett <mjg@redhat.com>

View File

@@ -407,6 +407,12 @@ Contact: "Hridya Valsaraju" <hridya@google.com>
Description: Average number of valid blocks.
Available when CONFIG_F2FS_STAT_FS=y.
What: /sys/fs/f2fs/<disk>/defrag_blocks
Date: February 2026
Contact: "Jinbao Liu" <liujinbao1@xiaomi.com>
Description: Number of blocks moved by defragmentation.
Available when CONFIG_F2FS_STAT_FS=y.
What: /sys/fs/f2fs/<disk>/mounted_time_sec
Date: February 2020
Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>

View File

@@ -316,6 +316,12 @@ Contact: SeongJae Park <sj@kernel.org>
Description: Writing to and reading from this file sets and gets the path
parameter of the goal.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/quotas/goal_tuner
Date: Mar 2026
Contact: SeongJae Park <sj@kernel.org>
Description: Writing to and reading from this file sets and gets the
goal-based effective quota auto-tuning algorithm to use.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/quotas/weights/sz_permil
Date: Mar 2022
Contact: SeongJae Park <sj@kernel.org>

View File

@@ -48,6 +48,15 @@ Contact: Kay Sievers <kay.sievers@vrfy.org>
Description: Show the initialization state (live, coming, going) of
the module.
What: /sys/module/*/import_ns
Date: January 2026
KernelVersion: 7.1
Contact: linux-modules@vger.kernel.org
Description: List of symbol namespaces imported by this module via
MODULE_IMPORT_NS(). Each namespace appears on a separate line.
This file only exists for modules that import at least one
namespace.
What: /sys/module/*/taint
Date: Jan 2012
KernelVersion: 3.3

View File

@@ -0,0 +1,13 @@
What: /sys/devices/virtual/nvme-fabrics/ctl/.../tls_configured_key
Date: November 2025
KernelVersion: 6.19
Contact: Linux NVMe mailing list <linux-nvme@lists.infradead.org>
Description:
The file is available when using a secure concatenation
connection to an NVMe target. Reading the file will return
the serial of the currently negotiated key.
Writing 0 to the file will trigger a PSK reauthentication
(REPLACETLSPSK) with the target. After a reauthentication
the value returned by tls_configured_key will be the new
serial.

View File

@@ -113,8 +113,11 @@ vectors, use the following function::
int pci_irq_vector(struct pci_dev *dev, unsigned int nr);
Any allocated resources should be freed before removing the device using
the following function::
If the driver enables the device using pcim_enable_device(), the driver
shouldn't call pci_free_irq_vectors() because pcim_enable_device()
activates automatic management for IRQ vectors. Otherwise, the driver should
free any allocated IRQ vectors before removing the device using the following
function::
void pci_free_irq_vectors(struct pci_dev *dev);

View File

@@ -79,10 +79,10 @@ To retrieve a Steering Tag for a target memory associated with a specific
CPU, use the following function::
int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type type,
unsigned int cpu_uid, u16 *tag);
unsigned int cpu, u16 *tag);
The `type` argument is used to specify the memory type, either volatile
or persistent, of the target memory. The `cpu_uid` argument specifies the
or persistent, of the target memory. The `cpu` argument specifies the
CPU with which the memory is associated.
After the ST value is retrieved, the device driver can use the following

View File

@@ -2787,6 +2787,13 @@ which avoids the read-side memory barriers, at least for architectures
that apply noinstr to kernel entry/exit code (or that build with
``CONFIG_TASKS_TRACE_RCU_NO_MB=y``).
Now that the implementation is based on SRCU-fast, a call
to synchronize_rcu_tasks_trace() implies at least one call to
synchronize_rcu(), that is, every Tasks Trace RCU grace period contains
at least one plain vanilla RCU grace period. Should there ever
be a synchronize_rcu_tasks_trace_expedited(), this guarantee would
*not* necessarily apply to this hypothetical API member.
The tasks-trace-RCU API is also reasonably compact,
consisting of rcu_read_lock_trace(), rcu_read_unlock_trace(),
rcu_read_lock_trace_held(), call_rcu_tasks_trace(),

View File

@@ -618,7 +618,7 @@ cache_replacement_policy
One of either lru, fifo or random.
freelist_percent
Size of the freelist as a percentage of nbuckets. Can be written to to
Size of the freelist as a percentage of nbuckets. Can be written to
increase the number of buckets kept on the freelist, which lets you
artificially reduce the size of the cache at runtime. Mostly for testing
purposes (i.e. testing how different size caches affect your hit rate).

View File

@@ -62,7 +62,7 @@ The options available for the add command can be listed by reading the
/dev/zloop-control device::
$ cat /dev/zloop-control
add id=%d,capacity_mb=%u,zone_size_mb=%u,zone_capacity_mb=%u,conv_zones=%u,base_dir=%s,nr_queues=%u,queue_depth=%u,buffered_io
add id=%d,capacity_mb=%u,zone_size_mb=%u,zone_capacity_mb=%u,conv_zones=%u,max_open_zones=%u,base_dir=%s,nr_queues=%u,queue_depth=%u,buffered_io,zone_append=%u,ordered_zone_append,discard_write_cache
remove id=%d
In more details, the options that can be used with the "add" command are as
@@ -80,6 +80,9 @@ zone_capacity_mb Device zone capacity (must always be equal to or lower
conv_zones Total number of conventional zones starting from
sector 0
Default: 8
max_open_zones Maximum number of open sequential write required zones
(0 for no limit).
Default: 0
base_dir Path to the base directory where to create the directory
containing the zone files of the device.
Default=/var/local/zloop.
@@ -104,6 +107,11 @@ ordered_zone_append Enable zloop mitigation of zone append reordering.
(extents), as when enabled, this can significantly reduce
the number of data extents needed for a file data
mapping.
discard_write_cache Discard all data that was not explicitly persisted using a
flush operation when the device is removed by truncating
each zone file to the size recorded during the last flush
operation. This simulates power fail events where
uncommitted data is lost.
=================== =========================================================
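As a sketch of how the ``max_open_zones`` and ``discard_write_cache`` options combine in an "add" command, the helper below builds the command string. The name ``zloop_add_cmd`` is hypothetical, and the example assumes, as elsewhere in this document, that the command is written to /dev/zloop-control; all other options are left at their defaults.

```shell
# Hypothetical helper: build a zloop "add" command string.
# $1 = device id, $2 = maximum number of open zones (0 for no limit)
zloop_add_cmd() {
    printf 'add id=%d,max_open_zones=%d,discard_write_cache\n' "$1" "$2"
}

# Usage (requires the zloop driver and root privileges):
#   zloop_add_cmd 0 16 > /dev/zloop-control
```

Keeping the string construction separate from the write makes it easy to inspect the command before it reaches the control device.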
3) Deleting a Zoned Device

View File

@@ -462,7 +462,7 @@ know it via /sys/block/zram0/bd_stat's 3rd column.
recompression
-------------
With CONFIG_ZRAM_MULTI_COMP, zram can recompress pages using alternative
With `CONFIG_ZRAM_MULTI_COMP`, zram can recompress pages using alternative
(secondary) compression algorithms. The basic idea is that alternative
compression algorithm can provide better compression ratio at a price of
(potentially) slower compression/decompression speeds. Alternative compression
@@ -471,7 +471,7 @@ that default algorithm failed to compress). Another application is idle pages
recompression - pages that are cold and sit in the memory can be recompressed
using more effective algorithm and, hence, reduce zsmalloc memory usage.
With CONFIG_ZRAM_MULTI_COMP, zram supports up to 4 compression algorithms:
With `CONFIG_ZRAM_MULTI_COMP`, zram supports up to 4 compression algorithms:
one primary and up to 3 secondary ones. Primary zram compressor is explained
in "3) Select compression algorithm", secondary algorithms are configured
using recomp_algorithm device attribute.
@@ -495,56 +495,43 @@ configuration:::
#select deflate recompression algorithm, priority 2
echo "algo=deflate priority=2" > /sys/block/zramX/recomp_algorithm
Another device attribute that CONFIG_ZRAM_MULTI_COMP enables is recompress,
Another device attribute that `CONFIG_ZRAM_MULTI_COMP` enables is `recompress`,
which controls recompression.
Examples:::
#IDLE pages recompression is activated by `idle` mode
echo "type=idle" > /sys/block/zramX/recompress
echo "type=idle priority=1" > /sys/block/zramX/recompress
#HUGE pages recompression is activated by `huge` mode
echo "type=huge" > /sys/block/zram0/recompress
echo "type=huge priority=2" > /sys/block/zram0/recompress
#HUGE_IDLE pages recompression is activated by `huge_idle` mode
echo "type=huge_idle" > /sys/block/zramX/recompress
echo "type=huge_idle priority=1" > /sys/block/zramX/recompress
The number of idle pages can be significant, so user-space can pass a size
threshold (in bytes) to the recompress knob: zram will recompress only pages
of equal or greater size:::
#recompress all pages larger than 3000 bytes
echo "threshold=3000" > /sys/block/zramX/recompress
echo "threshold=3000 priority=1" > /sys/block/zramX/recompress
#recompress idle pages larger than 2000 bytes
echo "type=idle threshold=2000" > /sys/block/zramX/recompress
echo "type=idle threshold=2000 priority=1" > \
/sys/block/zramX/recompress
It is also possible to limit the number of pages zram re-compression will
attempt to recompress:::
echo "type=huge_idle max_pages=42" > /sys/block/zramX/recompress
echo "type=huge_idle priority=1 max_pages=42" > \
/sys/block/zramX/recompress
During re-compression for every page, that matches re-compression criteria,
ZRAM iterates the list of registered alternative compression algorithms in
order of their priorities. ZRAM stops either when re-compression was
successful (re-compressed object is smaller in size than the original one)
and matches re-compression criteria (e.g. size threshold) or when there are
no secondary algorithms left to try. If none of the secondary algorithms can
successfully re-compressed the page such a page is marked as incompressible,
so ZRAM will not attempt to re-compress it in the future.
This re-compression behaviour, when it iterates through the list of
registered compression algorithms, increases our chances of finding the
algorithm that successfully compresses a particular page. Sometimes, however,
it is convenient (and sometimes even necessary) to limit recompression to
only one particular algorithm so that it will not try any other algorithms.
This can be achieved by providing a `algo` or `priority` parameter:::
#use zstd algorithm only (if registered)
echo "type=huge algo=zstd" > /sys/block/zramX/recompress
#use zstd algorithm only (if zstd was registered under priority 1)
echo "type=huge priority=1" > /sys/block/zramX/recompress
It is advised to always specify `priority` parameter. While it is also
possible to specify `algo` parameter, so that `zram` will use algorithm's
name to determine the priority, it is not recommended, since it can lead to
unexpected results when the same algorithm is configured with different
priorities (e.g. different parameters). `priority` is the only way to
guarantee that the expected algorithm will be used.
memory tracking
===============

View File

@@ -1734,6 +1734,11 @@ The following nested keys are defined.
zswpwb
Number of pages written from zswap to swap.
zswap_incomp
Number of incompressible pages currently stored in zswap
without compression. These pages could not be compressed to
a size smaller than PAGE_SIZE, so they are stored as-is.
thp_fault_alloc (npn)
Number of transparent hugepages which were allocated to satisfy
a page fault. This counter is not present when CONFIG_TRANSPARENT_HUGEPAGE

View File

@@ -0,0 +1,357 @@
.. SPDX-License-Identifier: GPL-2.0
=============
CPU Isolation
=============
Introduction
============
"CPU Isolation" means leaving a CPU exclusive to a given workload
without any undesired code interference from the kernel.
Those interferences, commonly pointed out as "noise", can be triggered
by asynchronous events (interrupts, timers, scheduler preemption by
workqueues and kthreads, ...) or synchronous events (syscalls and page
faults).
Such noise usually goes unnoticed. After all, synchronous events are a
component of the requested kernel service. And asynchronous events are
either sufficiently well-distributed by the scheduler when executed
as tasks or reasonably fast when executed as interrupts. The timer
interrupt can even execute 1024 times per second without a significant
and measurable impact most of the time.
However, some rare and extreme workloads can be quite sensitive to
those kinds of noise. This is the case, for example, with high
bandwidth network processing that can't afford losing a single packet
or very low latency network processing. Typically those use cases
involve DPDK, bypassing the kernel networking stack and performing
direct access to the networking device from userspace.
In order to run a CPU without or with limited kernel noise, the
related housekeeping work needs to be either shut down, migrated or
offloaded.
Housekeeping
============
In the CPU isolation terminology, housekeeping is the work, often
asynchronous, that the kernel needs to process in order to maintain
all its services. It matches the noises and disturbances enumerated
above except when at least one CPU is isolated. Then housekeeping may
make use of further coping mechanisms if CPU-tied work must be
offloaded.
Housekeeping CPUs are the non-isolated CPUs where the kernel noise
is moved away from isolated CPUs.
The isolation can be implemented in several ways depending on the
nature of the noise:
- Unbound work, where "unbound" means not tied to any CPU, can be
simply migrated away from isolated CPUs to housekeeping CPUs.
This is the case of unbound workqueues, kthreads and timers.
- Bound work, where "bound" means tied to a specific CPU, usually
can't be moved away as-is by nature. Either:
- The work must switch to a locked implementation. E.g.:
RCU with CONFIG_RCU_NOCB_CPU.
- The related feature must be shut down and considered
incompatible with isolated CPUs. E.g.: Lockup watchdog,
unreliable clocksources, etc...
- An elaborate and heavyweight coping mechanism stands as a
replacement. E.g.: the timer tick is shut down on nohz_full
CPUs but with the constraint of running a single task on
them. A significant cost penalty is added on kernel entry/exit
and a residual 1Hz scheduler tick is offloaded to housekeeping
CPUs.
In any case, housekeeping work has to be handled, which is why there
must be at least one housekeeping CPU in the system, preferably more
if the machine runs a lot of CPUs. For example one per node on NUMA
systems.
Also CPU isolation often means a tradeoff between noise-free isolated
CPUs and added overhead on housekeeping CPUs, sometimes even on
isolated CPUs entering the kernel.
Isolation features
==================
Different levels of isolation can be configured in the kernel, each of
which has its own drawbacks and tradeoffs.
Scheduler domain isolation
--------------------------
This feature isolates a CPU from the scheduler topology. As a result,
the target isn't part of the load balancing. Tasks won't migrate
either from or to it unless affined explicitly.
As a side effect the CPU is also isolated from unbound workqueues and
unbound kthreads.
Requirements
~~~~~~~~~~~~
- CONFIG_CPUSETS=y for the cpusets-based interface
Tradeoffs
~~~~~~~~~
By nature, the system load is overall less distributed since some CPUs
are extracted from the global load balancing.
Interfaces
~~~~~~~~~~
- Documentation/admin-guide/cgroup-v2.rst cpuset isolated partitions are recommended
because they are tunable at runtime.
- The 'isolcpus=' kernel boot parameter with the 'domain' flag is a
less flexible alternative that doesn't allow for runtime
reconfiguration.
IRQs isolation
--------------
Isolate the IRQs whenever possible, so that they don't fire on the
target CPUs.
Interfaces
~~~~~~~~~~
- The file /proc/irq/\*/smp_affinity as explained in detail in
Documentation/core-api/irq/irq-affinity.rst page.
- The "irqaffinity=" kernel boot parameter for a default setting.
- The "managed_irq" flag in the "isolcpus=" kernel boot parameter
tries a best effort affinity override for managed IRQs.
Full Dynticks (aka nohz_full)
-----------------------------
Full dynticks extends the dynticks idle mode, which stops the tick when
the CPU is idle, to CPUs running a single task in userspace. That is,
the timer tick is stopped if the environment allows it.
Global timer callbacks are also isolated from the nohz_full CPUs.
Requirements
~~~~~~~~~~~~
- CONFIG_NO_HZ_FULL=y
Constraints
~~~~~~~~~~~
- The isolated CPUs must run a single task only. Multitask requires
the tick to maintain preemption. This is usually fine since the
workload usually can't stand the latency of random context switches.
- No call to the kernel from isolated CPUs, at the risk of triggering
random noise.
- No use of POSIX CPU timers on isolated CPUs.
- Architecture must have a stable and reliable clocksource (no
unreliable TSC that requires the watchdog).
Tradeoffs
~~~~~~~~~
In terms of cost, this is the most invasive isolation feature. It is
assumed to be used when the workload spends most of its time in
userspace and doesn't rely on the kernel except for preparatory
work because:
- RCU adds more overhead due to the locked, offloaded and threaded
callbacks processing (the same that would be obtained with "rcu_nocbs"
boot parameter).
- Kernel entry/exit through syscalls, exceptions and IRQs are more
costly due to fully ordered RmW operations that maintain userspace
as RCU extended quiescent state. Also the CPU time is accounted on
kernel boundaries instead of periodically from the tick.
- Housekeeping CPUs must run a 1Hz residual remote scheduler tick
on behalf of the isolated CPUs.
Checklist
=========
You have set up each of the above isolation features but you still
observe jitters that trash your workload? Make sure to check a few
elements before proceeding.
Some of these checklist items are similar to those of real-time
workloads:
- Use mlock() to prevent your pages from being swapped away. Page
faults are usually not compatible with jitter sensitive workloads.
- Avoid SMT to prevent your hardware thread from being "preempted"
by another one.
- CPU frequency changes may induce subtle sorts of jitter in a
workload. Cpufreq should be used and tuned with caution.
- Deep C-states may result in latency issues upon wake-up. If this
happens to be a problem, C-states can be limited via kernel boot
parameters such as processor.max_cstate or intel_idle.max_cstate.
More fine-grained tunings are described in the
Documentation/admin-guide/pm/cpuidle.rst page.
- Your system may be subject to firmware-originating interrupts - x86 has
System Management Interrupts (SMIs) for example. Check your system BIOS
to disable such interference, and with some luck your vendor will have
a BIOS tuning guidance for low-latency operations.
Full isolation example
======================
In this example, the system has 8 CPUs and the 8th is to be fully
isolated. Since CPUs start from 0, the 8th CPU is CPU 7.
Kernel parameters
-----------------
Set the following kernel boot parameters to disable SMT and set up tick
and IRQ isolation:
- Full dynticks: nohz_full=7
- IRQs isolation: irqaffinity=0-6
- Managed IRQs isolation: isolcpus=managed_irq,7
- Prevent SMT: nosmt
The full command line is then:
nohz_full=7 irqaffinity=0-6 isolcpus=managed_irq,7 nosmt
CPUSET configuration (cgroup v2)
--------------------------------
Assuming cgroup v2 is mounted to /sys/fs/cgroup, the following script
isolates CPU 7 from scheduler domains.
::
cd /sys/fs/cgroup
# Activate the cpuset subsystem
echo +cpuset > cgroup.subtree_control
# Create partition to be isolated
mkdir test
cd test
echo +cpuset > cgroup.subtree_control
# Isolate CPU 7
echo 7 > cpuset.cpus
echo "isolated" > cpuset.cpus.partition
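After running the script above, it can be worth confirming that the partition really became isolated. The sketch below relies on the cgroup v2 behaviour that reading ``cpuset.cpus.partition`` returns ``isolated`` for a valid isolated partition and a string such as ``isolated invalid (<reason>)`` otherwise. The helper name ``partition_is_isolated`` is hypothetical.

```shell
# Hypothetical helper: succeed only if the cgroup is a valid isolated partition.
# $1 = path to the cgroup directory, e.g. /sys/fs/cgroup/test
partition_is_isolated() (
    state=$(cat "$1/cpuset.cpus.partition") || exit 1
    # An invalid partition reads back with an "invalid (...)" suffix,
    # so only an exact match counts as success.
    [ "$state" = "isolated" ]
)

# Usage:
#   partition_is_isolated /sys/fs/cgroup/test || echo "CPU 7 is not isolated"
```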
The userspace workload
----------------------
To fake a pure userspace workload, the program below runs a dummy
userspace loop on the isolated CPU 7.
::
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

int main(void)
{
	// Move the current task to the isolated cpuset (bind to CPU 7)
	int fd = open("/sys/fs/cgroup/test/cgroup.procs", O_WRONLY);
	if (fd < 0) {
		perror("Can't open cpuset file");
		return 1;
	}

	// Writing "0" to cgroup.procs moves the calling task itself
	if (write(fd, "0\n", 2) < 0) {
		perror("Can't move task to the cpuset");
		close(fd);
		return 1;
	}
	close(fd);

	// Run an endless dummy loop until the launcher kills us
	while (1)
		;

	return 0;
}
Build it and save it for a later step:
::
# gcc user_loop.c -o user_loop
The launcher
------------
The below launcher runs the above program for 10 seconds and traces
the noise resulting from preempting tasks and IRQs.
::
TRACING=/sys/kernel/tracing/
# Make sure tracing is off for now
echo 0 > $TRACING/tracing_on
# Flush previous traces
echo > $TRACING/trace
# Record disturbance from other tasks
echo 1 > $TRACING/events/sched/sched_switch/enable
# Record disturbance from interrupts
echo 1 > $TRACING/events/irq_vectors/enable
# Now we can start tracing
echo 1 > $TRACING/tracing_on
# Run the dummy user_loop for 10 seconds on CPU 7
./user_loop &
USER_LOOP_PID=$!
sleep 10
kill $USER_LOOP_PID
# Disable tracing and save traces from CPU 7 in a file
echo 0 > $TRACING/tracing_on
cat $TRACING/per_cpu/cpu7/trace > trace.7
If no specific problem arose, the output of trace.7 should look like
the following:
::
<idle>-0 [007] d..2. 1980.976624: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=user_loop next_pid=1553 next_prio=120
user_loop-1553 [007] d.h.. 1990.946593: reschedule_entry: vector=253
user_loop-1553 [007] d.h.. 1990.946593: reschedule_exit: vector=253
That is, no specific noise triggered between the first trace and the
second during 10 seconds when user_loop was running.
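For longer traces, a per-event count gives a quicker overview than reading the file line by line. The sketch below assumes the default trace line layout shown above, with the event name in the fifth whitespace-separated field, which holds only when the task name contains no spaces. The helper name ``summarize_trace`` is hypothetical.

```shell
# Hypothetical helper: count trace events by name.
# $1 = trace file, e.g. trace.7
summarize_trace() {
    # Skip the "#" header lines, strip the trailing ":" from the event
    # name, then print "<count> <event>" sorted by event name.
    grep -v '^#' "$1" |
        awk 'NF >= 5 { sub(/:$/, "", $5); n[$5]++ }
             END { for (e in n) print n[e], e }' |
        sort -k2
}
```

On a clean run like the one above, each event shows up once; additional entries or higher counts point at the sources of noise to investigate.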
Debugging
=========
Of course things are never so easy, especially on this matter.
Chances are that actual noise will be observed in the aforementioned
trace.7 file.
The best way to investigate further is to enable finer grained
tracepoints such as those of subsystems producing asynchronous
events: workqueue, timer, irq_vector, etc... It also can be
interesting to enable the tick_stop event to diagnose why the tick is
retained when that happens.
Some tools may also be useful for higher level analysis:
- Documentation/tools/rtla/rtla.rst provides a suite of tools to analyze
latency and noise in the system. For example Documentation/tools/rtla/rtla-osnoise.rst
runs a kernel tracer that analyzes and outputs a summary of the noises.
- dynticks-testing does something similar to rtla-osnoise but in userspace. It is available
at git://git.kernel.org/pub/scm/linux/kernel/git/frederic/dynticks-testing.git

View File

@@ -102,29 +102,42 @@ ignore_zero_blocks
that are not guaranteed to contain zeroes.
use_fec_from_device <fec_dev>
Use forward error correction (FEC) to recover from corruption if hash
verification fails. Use encoding data from the specified device. This
may be the same device where data and hash blocks reside, in which case
fec_start must be outside data and hash areas.
Use forward error correction (FEC) parity data from the specified device to
try to automatically recover from corruption and I/O errors.
If the encoding data covers additional metadata, it must be accessible
on the hash device after the hash blocks.
If this option is given, then <fec_roots> and <fec_blocks> must also be
given. <hash_block_size> must also be equal to <data_block_size>.
Note: block sizes for data and hash devices must match. Also, if the
verity <dev> is encrypted the <fec_dev> should be too.
<fec_dev> can be the same as <dev>, in which case <fec_start> must be
outside the data area. It can also be the same as <hash_dev>, in which case
<fec_start> must be outside the hash and optional additional metadata areas.
If the data <dev> is encrypted, the <fec_dev> should be too.
For more information, see `Forward error correction`_.
fec_roots <num>
Number of generator roots. This equals to the number of parity bytes in
the encoding data. For example, in RS(M, N) encoding, the number of roots
is M-N.
The number of parity bytes in each 255-byte Reed-Solomon codeword. The
Reed-Solomon code used will be an RS(255, k) code where k = 255 - fec_roots.
The supported values are 2 through 24 inclusive. Higher values provide
stronger error correction. However, the minimum value of 2 already provides
strong error correction due to the use of interleaving, so 2 is the
recommended value for most users. fec_roots=2 corresponds to an
RS(255, 253) code, which has a space overhead of about 0.8%.
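The space overhead quoted above follows from the ratio of parity bytes to message bytes in each codeword, fec_roots / (255 - fec_roots). A small sketch to evaluate it for other values; the helper name ``fec_overhead_percent`` is hypothetical.

```shell
# Hypothetical helper: parity overhead of an RS(255, 255 - fec_roots) code,
# as a percentage of the protected message data.
# $1 = fec_roots (2 through 24)
fec_overhead_percent() {
    awk -v r="$1" 'BEGIN { printf "%.2f\n", 100 * r / (255 - r) }'
}
```

For fec_roots=2 this evaluates 2/253, which is where the figure of about 0.8% comes from.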
fec_blocks <num>
The number of encoding data blocks on the FEC device. The block size for
the FEC device is <data_block_size>.
The total number of <data_block_size> blocks that are error-checked using
FEC. This must be at least the sum of <num_data_blocks> and the number of
blocks needed by the hash tree. It can include additional metadata blocks,
which are assumed to be accessible on <hash_dev> following the hash blocks.
Note that this is *not* the number of parity blocks. The number of parity
blocks is inferred from <fec_blocks>, <fec_roots>, and <data_block_size>.
fec_start <offset>
This is the offset, in <data_block_size> blocks, from the start of the
FEC device to the beginning of the encoding data.
This is the offset, in <data_block_size> blocks, from the start of <fec_dev>
to the beginning of the parity data.
check_at_most_once
Verify data blocks only the first time they are read from the data device,
@@ -180,11 +193,6 @@ per-block basis. This allows for a lightweight hash computation on first read
into the page cache. Block hashes are stored linearly, aligned to the nearest
block size.
If forward error correction (FEC) support is enabled any recovery of
corrupted data will be verified using the cryptographic hash of the
corresponding data. This is why combining error correction with
integrity checking is essential.
Hash Tree
---------
@@ -212,6 +220,80 @@ The tree looks something like:
/ ... \ / . . . \ / \
blk_0 ... blk_127 blk_16256 blk_16383 blk_32640 . . . blk_32767
Forward error correction
------------------------
dm-verity's optional forward error correction (FEC) support adds strong error
correction capabilities to dm-verity. It allows systems that would be rendered
inoperable by errors to continue operating, albeit with reduced performance.
FEC uses Reed-Solomon (RS) codes that are interleaved across the entire
device(s), allowing long bursts of corrupt or unreadable blocks to be recovered.
dm-verity validates any FEC-corrected block against the wanted hash before using
it. Therefore, FEC doesn't affect the security properties of dm-verity.
The integration of FEC with dm-verity provides significant benefits over a
separate error correction layer:
- dm-verity invokes FEC only when a block's hash doesn't match the wanted hash
or the block cannot be read at all. As a result, FEC doesn't add overhead to
the common case where no error occurs.
- dm-verity hashes are also used to identify erasure locations for RS decoding.
This allows correcting twice as many errors.
FEC uses an RS(255, k) code where k = 255 - fec_roots. fec_roots is usually 2.
This means that each k (usually 253) message bytes have fec_roots (usually 2)
bytes of parity data added to get a 255-byte codeword. (Many external sources
call RS codewords "blocks". Since dm-verity already uses the term "block" to
mean something else, we'll use the clearer term "RS codeword".)
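As a rough illustration of the RS(255, k) parameters described above, the following Python sketch derives k, the space overhead, and the per-codeword correction limits from fec_roots. The function name and the dictionary keys are made up for this example. The limits use the general Reed-Solomon bound (2 * errors + erasures <= number of parity bytes), which is an assumption about how the figures relate, not a description of the dm-verity code itself.

```python
def rs_parameters(fec_roots=2):
    """Return basic properties of an RS(255, k) code (illustrative helper)."""
    n = 255                    # codeword length in bytes
    k = n - fec_roots          # message bytes per codeword
    return {
        "k": k,
        # Parity bytes added per message byte.
        "overhead": fec_roots / k,
        # With known error locations (erasures), every parity byte can
        # repair one byte of the codeword.
        "max_erasures": fec_roots,
        # With unknown locations, each error costs two parity bytes.
        "max_errors": fec_roots // 2,
    }
```

With the usual fec_roots of 2, this gives k = 253 and an overhead below 1%, and it shows why knowing the erasure locations doubles the number of correctable bytes per codeword.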
FEC checks fec_blocks blocks of message data in total, consisting of:
1. The data blocks from the data device
2. The hash blocks from the hash device
3. Optional additional metadata that follows the hash blocks on the hash device
dm-verity assumes that the FEC parity data was computed as if the following
procedure were followed:
1. Concatenate the message data from the above sources.
2. Zero-pad to the next multiple of k blocks. Let msg be the resulting byte
array, and msglen its length in bytes.
3. For 0 <= i < msglen / k (for each RS codeword):
a. Select msg[i + j * msglen / k] for 0 <= j < k.
Consider these to be the 'k' message bytes of an RS codeword.
b. Compute the corresponding 'fec_roots' parity bytes of the RS codeword,
and concatenate them to the FEC parity data.
Step 3a interleaves the RS codewords across the entire device using an
interleaving degree of data_block_size * ceil(fec_blocks / k). This is the
maximal interleaving, such that the message data consists of a region containing
byte 0 of all the RS codewords, then a region containing byte 1 of all the RS
codewords, and so on up to the region for byte 'k - 1'. Note that the number of
codewords is set to a multiple of data_block_size; thus, the regions are
block-aligned, and there is an implicit zero padding of up to 'k - 1' blocks.
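The procedure above can be sketched in Python as follows. This is a minimal model of the layout arithmetic only: it does not compute any parity bytes, and the function names and the default block size of 4096 are assumptions made for this example.

```python
def fec_layout(fec_blocks, data_block_size=4096, fec_roots=2):
    """Return (k, msglen, num_codewords, parity_bytes) for the given sizes."""
    k = 255 - fec_roots
    # Step 2: zero-pad the message to the next multiple of k blocks.
    padded_blocks = ((fec_blocks + k - 1) // k) * k
    msglen = padded_blocks * data_block_size
    # One RS codeword per index i in step 3; this equals
    # data_block_size * ceil(fec_blocks / k), the interleaving degree.
    num_codewords = msglen // k
    # Step 3b: each codeword contributes fec_roots parity bytes.
    parity_bytes = num_codewords * fec_roots
    return k, msglen, num_codewords, parity_bytes


def codeword_offsets(i, msglen, k):
    """Step 3a: byte offsets into msg of the k message bytes of codeword i."""
    stride = msglen // k
    return [i + j * stride for j in range(k)]
```

For example, with fec_blocks = 1000 the message is padded to 1012 blocks, which yields 16384 codewords, and the 253 message bytes of any one codeword each fall in a different block.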
This interleaving allows long bursts of errors to be corrected. It provides
much stronger error correction than storage devices typically provide, while
keeping the space overhead low.
The cost is slow decoding: correcting a single block usually requires reading
254 extra blocks spread evenly across the device(s). However, that is
acceptable because dm-verity uses FEC only when there is actually an error.
The list below contains additional details about the RS codes used by
dm-verity's FEC. Userspace programs that generate the parity data need to use
these parameters for the parity data to match exactly:
- Field used is GF(256)
- Bytes are mapped to/from GF(256) elements in the natural way, where bits 0
through 7 (low-order to high-order) map to the coefficients of x^0 through x^7
- Field generator polynomial is x^8 + x^4 + x^3 + x^2 + 1
- The codes used are systematic, BCH-view codes
- Primitive element alpha is 'x'
- First consecutive root of code generator polynomial is 'x^0'
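The field parameters in the list above can be checked with a short Python sketch. It implements only GF(256) multiplication with the stated field generator polynomial and enumerates the powers of alpha; it is not an RS encoder, and the names gf_mul and alpha_powers are chosen for this example.

```python
FIELD_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1, bit i is the x^i coefficient


def gf_mul(a, b):
    """Multiply two GF(256) elements given as bytes (shift-and-add method)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= FIELD_POLY      # reduce modulo the field polynomial
        b >>= 1
    return result


def alpha_powers():
    """Return alpha^0 .. alpha^254, where alpha is 'x', i.e. the byte 0x02."""
    powers = []
    value = 1
    for _ in range(255):
        powers.append(value)
        value = gf_mul(value, 2)
    return powers
```

Because alpha is primitive, the 255 powers are all distinct and cover every nonzero byte value, which is a quick sanity check for a userspace parity generator that builds its own log and antilog tables.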
On-disk format
==============


@@ -94,6 +94,7 @@ likely to be of interest on almost any system.
cgroup-v2
cgroup-v1/index
cpu-isolation
cpu-load
mm/index
module-signing


@@ -141,7 +141,7 @@ nodemask_t
The size of a nodemask_t type. Used to compute the number of online
nodes.
(page, flags|_refcount|mapping|lru|_mapcount|private|compound_order|compound_head)
(page, flags|_refcount|mapping|lru|_mapcount|private|compound_order|compound_info)
----------------------------------------------------------------------------------
User-space tools compute their values based on the offset of these


@@ -6,7 +6,6 @@
APPARMOR AppArmor support is enabled.
ARM ARM architecture is enabled.
ARM64 ARM64 architecture is enabled.
AX25 Appropriate AX.25 support is enabled.
CLK Common clock infrastructure is enabled.
CMA Contiguous Memory Area support is enabled.
DRM Direct Rendering Management support is enabled.
@@ -190,6 +189,14 @@ Kernel parameters
unusable. The "log_buf_len" parameter may be useful
if you need to capture more output.
acpi.poweroff_on_fatal= [ACPI]
{0 | 1}
Causes the system to poweroff when the ACPI bytecode signals
a fatal error. The default value of this setting is 1.
Overriding this value should only be done for diagnosing
ACPI firmware problems, as the system might behave erratically
after having encountered a fatal ACPI error.
acpi_enforce_resources= [ACPI]
{ strict | lax | no }
Check for resource conflicts between native drivers
@@ -493,6 +500,13 @@ Kernel parameters
disable
Disable amd-pstate preferred core.
amd_dynamic_epp=
[X86]
disable
Disable amd-pstate dynamic EPP.
enable
Enable amd-pstate dynamic EPP.
amijoy.map= [HW,JOY] Amiga joystick support
Map of devices attached to JOY0DAT and JOY1DAT
Format: <a>,<b>
@@ -618,23 +632,6 @@ Kernel parameters
1 - Enable the BAU.
unset - Disable the BAU.
baycom_epp= [HW,AX25]
Format: <io>,<mode>
baycom_par= [HW,AX25] BayCom Parallel Port AX.25 Modem
Format: <io>,<mode>
See header of drivers/net/hamradio/baycom_par.c.
baycom_ser_fdx= [HW,AX25]
BayCom Serial Port AX.25 Modem (Full Duplex Mode)
Format: <io>,<irq>,<mode>[,<baud>]
See header of drivers/net/hamradio/baycom_ser_fdx.c.
baycom_ser_hdx= [HW,AX25]
BayCom Serial Port AX.25 Modem (Half Duplex Mode)
Format: <io>,<irq>,<mode>
See header of drivers/net/hamradio/baycom_ser_hdx.c.
bdev_allow_write_mounted=
Format: <bool>
Control the ability to open a mounted block device
@@ -1750,8 +1747,8 @@ Kernel parameters
fred= [X86-64]
Enable/disable Flexible Return and Event Delivery.
Format: { on | off }
on: enable FRED when it's present.
off: disable FRED, the default setting.
on: enable FRED when it's present, the default setting.
off: disable FRED.
ftrace=[tracer]
[FTRACE] will set and start the specified tracer
@@ -2395,23 +2392,6 @@ Kernel parameters
[IMA] Define a custom template format.
Format: { "field1|...|fieldN" }
ima.ahash_minsize= [IMA] Minimum file size for asynchronous hash usage
Format: <min_file_size>
Set the minimal file size for using asynchronous hash.
If left unspecified, ahash usage is disabled.
ahash performance varies for different data sizes on
different crypto accelerators. This option can be used
to achieve the best performance for a particular HW.
ima.ahash_bufsize= [IMA] Asynchronous hash buffer size
Format: <bufsize>
Set hashing buffer size. Default: 4k.
ahash performance varies for different chunk sizes on
different crypto accelerators. This option can be used
to achieve best performance for particular HW.
ima= [IMA] Enable or disable IMA
Format: { "off" | "on" }
Default: "on"
@@ -2615,15 +2595,11 @@ Kernel parameters
Intel machines). This can be used to prevent the usage
of an available hardware IOMMU.
[X86]
pt
[X86]
nopt
[PPC/POWERNV]
nobypass
nobypass [PPC/POWERNV]
Disable IOMMU bypass, using IOMMU for PCI devices.
[X86]
AMD Gart HW IOMMU-specific options:
<size>
@@ -2959,6 +2935,12 @@ Kernel parameters
Format: <bool>
Default: CONFIG_KFENCE_DEFERRABLE
kfence.fault= [MM,KFENCE] Controls the behavior when a KFENCE
error is detected.
report - print the error report and continue (default).
oops - print the error report and oops.
panic - print the error report and panic.
kfence.sample_interval=
[MM,KFENCE] KFENCE's sample interval in milliseconds.
Format: <unsigned integer>
@@ -3247,8 +3229,8 @@ Kernel parameters
for the host. To force nVHE on VHE hardware, add
"arm64_sw.hvhe=0 id_aa64mmfr1.vh=0" to the
command-line.
"nested" is experimental and should be used with
extreme caution.
"nested" and "protected" are experimental and should be
used with extreme caution.
kvm-arm.vgic_v3_group0_trap=
[KVM,ARM,EARLY] Trap guest accesses to GICv3 group-0
@@ -6746,7 +6728,7 @@ Kernel parameters
Default is 'on'.
initramfs_options= [KNL]
Specify mount options for for the initramfs mount.
Specify mount options for the initramfs mount.
rootfstype= [KNL] Set root filesystem type
@@ -7963,12 +7945,7 @@ Kernel parameters
(HPET or PM timer) on systems whose TSC frequency was
obtained from HW or FW using either an MSR or CPUID(0x15).
Warn if the difference is more than 500 ppm.
[x86] watchdog: Use TSC as the watchdog clocksource with
which to check other HW timers (HPET or PM timer), but
only on systems where TSC has been deemed trustworthy.
This will be suppressed by an earlier tsc=nowatchdog and
can be overridden by a later tsc=nowatchdog. A console
message will flag any such suppression or overriding.
[x86] watchdog: Enforce the clocksource watchdog on TSC
tsc_early_khz= [X86,EARLY] Skip early TSC calibration and use the given
value instead. Useful when the early TSC frequency discovery
@@ -8392,7 +8369,9 @@ Kernel parameters
emulate Vsyscalls turn into traps and are emulated
reasonably safely. The vsyscall page is
readable.
readable. This disables the Linear
Address Space Separation (LASS) security
feature and makes the system less secure.
xonly [default] Vsyscalls turn into traps and are
emulated reasonably safely. The vsyscall
@@ -8535,7 +8514,8 @@ Kernel parameters
workqueue.default_affinity_scope=
Select the default affinity scope to use for unbound
workqueues. Can be one of "cpu", "smt", "cache",
"numa" and "system". Default is "cache". For more
"cache_shard", "numa" and "system". Default is
"cache_shard". For more
information, see the Affinity Scopes section in
Documentation/core-api/workqueue.rst.


@@ -1522,6 +1522,27 @@ Currently 2 antenna types are supported as mentioned below:
The property is read-only. If the platform doesn't support it, the sysfs
class is not created.
doubletap_enable
----------------
sysfs: doubletap_enable
Controls whether TrackPoint doubletap events are filtered out. Doubletap is a
feature where quickly tapping the TrackPoint twice triggers a special function key event.
The available commands are::
cat /sys/devices/platform/thinkpad_acpi/doubletap_enable
echo 1 | sudo tee /sys/devices/platform/thinkpad_acpi/doubletap_enable
echo 0 | sudo tee /sys/devices/platform/thinkpad_acpi/doubletap_enable
Values:
* 1 - doubletap events are processed (default)
* 0 - doubletap events are filtered out (ignored)
This setting can also be toggled via the Fn+doubletap hotkey.
Auxmac
------


@@ -50,6 +50,10 @@ between 1 and 100 percent are supported.
Additionally the driver signals the presence of battery charging issues through the standard
``health`` power supply sysfs attribute.
It also lets you set whether a USB-C power source should prioritise charging the battery or
delivering immediate power to the CPU. See Documentation/ABI/testing/sysfs-driver-uniwill-laptop for
details.
Lightbar
--------
@@ -58,3 +62,11 @@ LED class device. The default name of this LED class device is ``uniwill:multico
See Documentation/ABI/testing/sysfs-driver-uniwill-laptop for details on how to control the various
animation modes of the lightbar.
Configurable TGP
----------------
The ``uniwill-laptop`` driver allows setting the configurable TGP for devices with NVIDIA GPUs that
allow it.
See Documentation/ABI/testing/sysfs-driver-uniwill-laptop for details.


@@ -16,7 +16,7 @@ details), and a compile option, "BOOTPARAM_SOFTLOCKUP_PANIC", are
provided for this.
A 'hardlockup' is defined as a bug that causes the CPU to loop in
kernel mode for more than 10 seconds (see "Implementation" below for
kernel mode for several seconds (see "Implementation" below for
details), without letting other interrupts have a chance to run.
Similarly to the softlockup case, the current stack trace is displayed
upon detection and the system will stay locked up unless the default
@@ -30,39 +30,135 @@ timeout is set through the confusingly named "kernel.panic" sysctl),
to cause the system to reboot automatically after a specified amount
of time.
Configuration
=============
A kernel knob is provided that allows administrators to configure
this period. The "watchdog_thresh" parameter (default 10 seconds)
controls the threshold. The right value for a particular environment
is a trade-off between fast response to lockups and detection overhead.
Implementation
==============
The soft and hard lockup detectors are built on top of the hrtimer and
perf subsystems, respectively. A direct consequence of this is that,
in principle, they should work in any architecture where these
subsystems are present.
The soft and hard lockup detectors are built around an hrtimer.
In addition, the softlockup detector regularly schedules a job, and
the hard lockup detector might use Perf/NMI events on architectures
that support it.
A periodic hrtimer runs to generate interrupts and kick the watchdog
job. An NMI perf event is generated every "watchdog_thresh"
(compile-time initialized to 10 and configurable through sysctl of the
same name) seconds to check for hardlockups. If any CPU in the system
does not receive any hrtimer interrupt during that time the
'hardlockup detector' (the handler for the NMI perf event) will
generate a kernel warning or call panic, depending on the
configuration.
Frequency and Heartbeats
------------------------
The watchdog job runs in a stop scheduling thread that updates a
timestamp every time it is scheduled. If that timestamp is not updated
for 2*watchdog_thresh seconds (the softlockup threshold) the
The core of the detectors is an hrtimer. It serves multiple purposes:
- schedules watchdog job for the softlockup detector
- bumps the interrupt counter for hardlockup detectors (heartbeat)
- detects softlockups
- detects hardlockups in Buddy mode
The period of this hrtimer is 2*watchdog_thresh/5, which is 4 seconds
by default. The hrtimer has two or three chances to generate an interrupt
(heartbeat) before the hardlockup detector kicks in.
Softlockup Detector
-------------------
The watchdog job is scheduled by the hrtimer and runs in a stop scheduling
thread. It updates a timestamp every time it is scheduled. If that timestamp
is not updated for 2*watchdog_thresh seconds (the softlockup threshold) the
'softlockup detector' (coded inside the hrtimer callback function)
will dump useful debug information to the system log, after which it
will call panic if it was instructed to do so or resume execution of
other kernel code.
The period of the hrtimer is 2*watchdog_thresh/5, which means it has
two or three chances to generate an interrupt before the hardlockup
detector kicks in.
Hardlockup Detector (NMI/Perf)
------------------------------
As explained above, a kernel knob is provided that allows
administrators to configure the period of the hrtimer and the perf
event. The right value for a particular environment is a trade-off
between fast response to lockups and detection overhead.
On architectures that support NMI (Non-Maskable Interrupt) perf events,
a periodic NMI is generated every "watchdog_thresh" seconds.
If any CPU in the system does not receive any hrtimer interrupt
(heartbeat) during the "watchdog_thresh" window, the 'hardlockup
detector' (the handler for the NMI perf event) will generate a kernel
warning or call panic.
**Detection Overhead (NMI):**
The time to detect a lockup can vary depending on when the lockup
occurs relative to the NMI check window. Examples below assume a watchdog_thresh of 10.
* **Best Case:** The lockup occurs just before the first heartbeat is
due. The detector will notice the missing hrtimer interrupt almost
immediately during the next check.
::
Time 100.0: cpu 1 heartbeat
Time 100.1: hardlockup_check, cpu1 stores its state
Time 103.9: Hard Lockup on cpu1
Time 104.0: cpu 1 heartbeat never comes
Time 110.1: hardlockup_check, cpu1 checks the state again, should be the same, declares lockup
Time to detection: ~6 seconds
* **Worst Case:** The lockup occurs shortly after a valid interrupt
(heartbeat) which itself happened just after the NMI check. The next
NMI check sees that the interrupt count has changed (due to that one
heartbeat), assumes the CPU is healthy, and resets the baseline. The
lockup is only detected at the subsequent check.
::
Time 100.0: hardlockup_check, cpu1 stores its state
Time 100.1: cpu 1 heartbeat
Time 100.2: Hard Lockup on cpu1
Time 110.0: hardlockup_check, cpu1 stores its state (misses lockup as state changed)
Time 120.0: hardlockup_check, cpu1 checks the state again, should be the same, declares lockup
Time to detection: ~20 seconds
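The two timelines above can be replayed with a small Python model. This is only a sketch of the comparison described in the text (a check declares a lockup when the heartbeat count has not changed since the previous check); it is not the kernel implementation, the function name is made up, and times are given in milliseconds to avoid floating-point rounding.

```python
def nmi_detection_time(heartbeats, checks, lockup):
    """Return the time of the check that declares the lockup, or None.

    heartbeats: times at which hrtimer interrupts would fire
    checks:     times of the periodic NMI checks
    lockup:     time at which the CPU stops taking interrupts
    """
    last_seen = None
    for t in sorted(checks):
        # Only heartbeats delivered before the lockup are counted.
        count = sum(1 for h in heartbeats if h <= t and h < lockup)
        if last_seen is not None and count == last_seen:
            return t
        last_seen = count
    return None


# Best case from the text: lockup at 103.9, declared at 110.1.
best = nmi_detection_time([100_000, 104_000], [100_100, 110_100], 103_900)
# Worst case from the text: lockup at 100.2, declared at 120.0.
worst = nmi_detection_time([100_100], [100_000, 110_000, 120_000], 100_200)
```

Subtracting the lockup time from the returned check time gives 6.2 seconds for the first timeline and 19.8 seconds for the second, matching the approximate figures quoted above.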
Hardlockup Detector (Buddy)
---------------------------
On architectures or configurations where NMI perf events are not
available (or disabled), the kernel may use the "buddy" hardlockup
detector. This mechanism requires SMP (Symmetric Multi-Processing).
In this mode, each CPU is assigned a "buddy" CPU to monitor. The
monitoring CPU runs its own hrtimer (the same one used for softlockup
detection) and checks if the buddy CPU's hrtimer interrupt count has
increased.
To ensure timeliness and avoid false positives, the buddy system performs
checks at every hrtimer interval (2*watchdog_thresh/5, which is 4 seconds
by default). It uses a missed-interrupt threshold of 3. If the buddy's
interrupt count has not changed for 3 consecutive checks, it is assumed
that the buddy CPU is hardlocked (interrupts disabled). The monitoring
CPU will then trigger the hardlockup response (warning or panic).
**Detection Overhead (Buddy):**
With a default check interval of 4 seconds (watchdog_thresh = 10):
* **Best case:** Lockup occurs just before a check.
Detected in ~8s (0s till 1st check + 4s till 2nd + 4s till 3rd).
* **Worst case:** Lockup occurs just after a check.
Detected in ~12s (4s till 1st check + 4s till 2nd + 4s till 3rd).
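The best-case and worst-case figures above follow directly from the check interval and the missed-interrupt threshold. The Python sketch below reproduces that arithmetic; the function and parameter names are chosen for this example and are not kernel identifiers.

```python
def buddy_detection_bounds(watchdog_thresh=10, missed_threshold=3):
    """Return (interval, best_case, worst_case) in seconds."""
    # The buddy check runs at every hrtimer interval.
    interval = 2 * watchdog_thresh / 5
    # Best case: the lockup happens just before a check, so the first of
    # the consecutive unchanged checks comes almost immediately.
    best_case = (missed_threshold - 1) * interval
    # Worst case: the lockup happens just after a check, so a full
    # interval passes before the first unchanged check.
    worst_case = missed_threshold * interval
    return interval, best_case, worst_case
```

With the default watchdog_thresh of 10 this returns an interval of 4 seconds and bounds of 8 and 12 seconds; raising watchdog_thresh scales all three values linearly.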
**Limitations of the Buddy Detector:**
1. **All-CPU Lockup:** If all CPUs lock up simultaneously, the buddy
detector cannot detect the condition because the monitoring CPUs
are also frozen.
2. **Stack Traces:** Unlike the NMI detector, the buddy detector
cannot directly interrupt the locked CPU to grab a stack trace.
It relies on architecture-specific mechanisms (like NMI backtrace
support) to try and retrieve the status of the locked CPU. If
such support is missing, the log may only show that a lockup
occurred without providing the locked CPU's stack.
Watchdog Core Exclusion
=======================
By default, the watchdog runs on all online cores. However, on a
kernel configured with NO_HZ_FULL, by default the watchdog runs only


@@ -74,6 +74,7 @@ Common FPDL3/GMSL input parameters
| 0 - OLDI/JEIDA
| 1 - SPWG/VESA (default)
| 2 - ZDML
**link_status** (R):
Video link status. If the link is locked, chips are properly connected and
@@ -240,6 +241,13 @@ Common FPDL3/GMSL output parameters
*Note: This parameter can not be changed while the output v4l2 device is
open.*
**color_mapping** (RW):
Mapping of the outgoing bits in the signal to the colour bits of the pixels.
| 0 - OLDI/JEIDA
| 1 - SPWG/VESA (default)
| 2 - ZDML
**frame_rate** (RW):
Output video signal frame rate limit in frames per second. Due to
the limited output pixel clock steps, the card can not always generate


@@ -1,72 +0,0 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
================================
Starfive Camera Subsystem driver
================================
Introduction
------------
This file documents the driver for the Starfive Camera Subsystem found on
Starfive JH7110 SoC. The driver is located under drivers/staging/media/starfive/
camss.
The driver implements V4L2, Media controller and v4l2_subdev interfaces. Camera
sensor using V4L2 subdev interface in the kernel is supported.
The driver has been successfully used on the Gstreamer 1.18.5 with v4l2src
plugin.
Starfive Camera Subsystem hardware
----------------------------------
The Starfive Camera Subsystem hardware consists of::
|\ +---------------+ +-----------+
+----------+ | \ | | | |
| | | | | | | |
| MIPI |----->| |----->| ISP |----->| |
| | | | | | | |
+----------+ | | | | | Memory |
|MUX| +---------------+ | Interface |
+----------+ | | | |
| | | |---------------------------->| |
| Parallel |----->| | | |
| | | | | |
+----------+ | / | |
|/ +-----------+
- MIPI: The MIPI interface, receiving data from a MIPI CSI-2 camera sensor.
- Parallel: The parallel interface, receiving data from a parallel sensor.
- ISP: The ISP, processing raw Bayer data from an image sensor and producing
YUV frames.
Topology
--------
The media controller pipeline graph is as follows:
.. _starfive_camss_graph:
.. kernel-figure:: starfive_camss_graph.dot
:alt: starfive_camss_graph.dot
:align: center
The driver has 2 video devices:
- capture_raw: The capture device, capturing image data directly from a sensor.
- capture_yuv: The capture device, capturing YUV frame data processed by the
ISP module
The driver has 3 subdevices:
- stf_isp: is responsible for all the isp operations, outputs YUV frames.
- cdns_csi2rx: a CSI-2 bridge supporting up to 4 CSI lanes in input, and 4
different pixel streams in output.
- imx219: an image sensor, image data is sent through MIPI CSI-2.


@@ -1,12 +0,0 @@
digraph board {
rankdir=TB
n00000001 [label="{{<port0> 0} | stf_isp\n/dev/v4l-subdev0 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
n00000001:port1 -> n00000008 [style=dashed]
n00000004 [label="capture_raw\n/dev/video0", shape=box, style=filled, fillcolor=yellow]
n00000008 [label="capture_yuv\n/dev/video1", shape=box, style=filled, fillcolor=yellow]
n0000000e [label="{{<port0> 0} | cdns_csi2rx.19800000.csi-bridge\n | {<port1> 1 | <port2> 2 | <port3> 3 | <port4> 4}}", shape=Mrecord, style=filled, fillcolor=green]
n0000000e:port1 -> n00000001:port0 [style=dashed]
n0000000e:port1 -> n00000004 [style=dashed]
n00000018 [label="{{} | imx219 6-0010\n/dev/v4l-subdev1 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
n00000018:port0 -> n0000000e:port0 [style=bold]
}


@@ -33,7 +33,6 @@ Video4Linux (V4L) driver-specific documentation
si470x
si4713
si476x
starfive_camss
vimc
visl
vivid


@@ -79,6 +79,10 @@ of parametrs except ``enabled`` again. Once the re-reading is done, this
parameter is set as ``N``. If invalid parameters are found while the
re-reading, DAMON_LRU_SORT will be disabled.
Once ``Y`` is written to this parameter, the user must not write to any
parameters until reading ``commit_inputs`` again returns ``N``. If users
violate this rule, the kernel may exhibit undefined behavior.
active_mem_bp
-------------
@@ -91,8 +95,8 @@ increases and decreases the effective level of the quota aiming the LRU
Disabled by default.
Auto-tune monitoring intervals
------------------------------
autotune_monitoring_intervals
-----------------------------
If this parameter is set as ``Y``, DAMON_LRU_SORT automatically tunes DAMON's
sampling and aggregation intervals. The auto-tuning aims to capture meaningful
@@ -221,6 +225,10 @@ But, setting this too high could result in increased monitoring overhead.
Please refer to the DAMON documentation (:doc:`usage`) for more detail. 10 by
default.
Note that this must be 3 or higher. Please refer to the :ref:`Monitoring
<damon_design_monitoring>` section of the design document for the rationale
behind this lower bound.
max_nr_regions
--------------
@@ -351,3 +359,8 @@ the LRU-list based page granularity reclamation. ::
# echo 400 > wmarks_mid
# echo 200 > wmarks_low
# echo Y > enabled
Note that this module (damon_lru_sort) cannot run simultaneously with other
DAMON-based special-purpose modules. Refer to :ref:`DAMON design special
purpose modules exclusivity <damon_design_special_purpose_modules_exclusivity>`
for more details.


@@ -71,6 +71,10 @@ of parametrs except ``enabled`` again. Once the re-reading is done, this
parameter is set as ``N``. If invalid parameters are found while the
re-reading, DAMON_RECLAIM will be disabled.
Once ``Y`` is written to this parameter, the user must not write to any
parameters until reading ``commit_inputs`` again returns ``N``. If users
violate this rule, the kernel may exhibit undefined behavior.
min_age
-------
@@ -204,6 +208,10 @@ monitoring. This can be used to set lower-bound of the monitoring quality.
But, setting this too high could result in increased monitoring overhead.
Please refer to the DAMON documentation (:doc:`usage`) for more detail.
Note that this must be 3 or higher. Please refer to the :ref:`Monitoring
<damon_design_monitoring>` section of the design document for the rationale
behind this lower bound.
max_nr_regions
--------------
@@ -318,6 +326,11 @@ granularity reclamation. ::
# echo 200 > wmarks_low
# echo Y > enabled
Note that this module (damon_reclaim) cannot run simultaneously with other
DAMON-based special-purpose modules. Refer to :ref:`DAMON design special
purpose modules exclusivity <damon_design_special_purpose_modules_exclusivity>`
for more details.
.. [1] https://research.google/pubs/pub48551/
.. [2] https://lwn.net/Articles/787611/
.. [3] https://www.kernel.org/doc/html/latest/mm/free_page_reporting.html


@@ -45,6 +45,11 @@ You can enable DAMON_STAT by setting the value of this parameter as ``Y``.
Setting it as ``N`` disables DAMON_STAT. The default value is set by
``CONFIG_DAMON_STAT_ENABLED_DEFAULT`` build config option.
Note that this module (damon_stat) cannot run simultaneously with other
DAMON-based special-purpose modules. Refer to :ref:`DAMON design special
purpose modules exclusivity <damon_design_special_purpose_modules_exclusivity>`
for more details.
.. _damon_stat_aggr_interval_us:
aggr_interval_us


@@ -83,7 +83,7 @@ comma (",").
│ │ │ │ │ │ │ │ sz/min,max
│ │ │ │ │ │ │ │ nr_accesses/min,max
│ │ │ │ │ │ │ │ age/min,max
│ │ │ │ │ │ │ :ref:`quotas <sysfs_quotas>`/ms,bytes,reset_interval_ms,effective_bytes
│ │ │ │ │ │ │ :ref:`quotas <sysfs_quotas>`/ms,bytes,reset_interval_ms,effective_bytes,goal_tuner
│ │ │ │ │ │ │ │ weights/sz_permil,nr_accesses_permil,age_permil
│ │ │ │ │ │ │ │ :ref:`goals <sysfs_schemes_quota_goals>`/nr_goals
│ │ │ │ │ │ │ │ │ 0/target_metric,target_value,current_value,nid,path
@@ -377,9 +377,9 @@ schemes/<N>/quotas/
The directory for the :ref:`quotas <damon_design_damos_quotas>` of the given
DAMON-based operation scheme.
Under ``quotas`` directory, four files (``ms``, ``bytes``,
``reset_interval_ms``, ``effective_bytes``) and two directories (``weights`` and
``goals``) exist.
Under ``quotas`` directory, five files (``ms``, ``bytes``,
``reset_interval_ms``, ``effective_bytes`` and ``goal_tuner``) and two
directories (``weights`` and ``goals``) exist.
You can set the ``time quota`` in milliseconds, ``size quota`` in bytes, and
``reset interval`` in milliseconds by writing the values to the three files,
@@ -390,6 +390,14 @@ apply the action to only up to ``bytes`` bytes of memory regions within the
quota limits unless at least one :ref:`goal <sysfs_schemes_quota_goals>` is
set.
You can set the goal-based effective quota auto-tuning algorithm to use, by
writing the algorithm name to ``goal_tuner`` file. Reading the file returns
the currently selected tuner algorithm. Refer to the design documentation of
:ref:`automatic quota tuning goals <damon_design_damos_quotas_auto_tuning>` for
the background design of the feature and the name of the selectable algorithms.
Refer to :ref:`goals directory <sysfs_schemes_quota_goals>` for the goals
setup.
The time quota is internally transformed to a size quota. Between the
transformed size quota and user-specified size quota, smaller one is applied.
Based on the user-specified :ref:`goal <sysfs_schemes_quota_goals>`, the


@@ -28,20 +28,10 @@ per NUMA node scratch regions on boot.
Perform a KHO kexec
===================
First, before you perform a KHO kexec, you need to move the system into
the :ref:`KHO finalization phase <kho-finalization-phase>` ::
$ echo 1 > /sys/kernel/debug/kho/out/finalize
After this command, the KHO FDT is available in
``/sys/kernel/debug/kho/out/fdt``. Other subsystems may also register
their own preserved sub FDTs under
``/sys/kernel/debug/kho/out/sub_fdts/``.
Next, load the target payload and kexec into it. It is important that you
use the ``-s`` parameter to use the in-kernel kexec file loader, as user
space kexec tooling currently has no support for KHO with the user space
based file loader ::
To perform a KHO kexec, load the target payload and kexec into it. It
is important that you use the ``-s`` parameter to use the in-kernel
kexec file loader, as user space kexec tooling currently has no
support for KHO with the user space based file loader ::
# kexec -l /path/to/bzImage --initrd /path/to/initrd -s
# kexec -e
@@ -52,40 +42,58 @@ For example, if you used ``reserve_mem`` command line parameter to create
an early memory reservation, the new kernel will have that memory at the
same physical address as the old kernel.
Abort a KHO exec
================
Kexec Metadata
==============
You can move the system out of KHO finalization phase again by calling ::
KHO automatically tracks metadata about the kexec chain, passing information
about the previous kernel to the next kernel. This feature helps diagnose
bugs that only reproduce when kexecing from specific kernel versions.
$ echo 0 > /sys/kernel/debug/kho/out/active
On each KHO kexec, the kernel logs the previous kernel's version and the
number of kexec reboots since the last cold boot::
After this command, the KHO FDT is no longer available in
``/sys/kernel/debug/kho/out/fdt``.
[ 0.000000] KHO: exec from: 6.19.0-rc4-next-20260107 (count 1)
The metadata includes:
``previous_release``
The kernel version string (from ``uname -r``) of the kernel that
initiated the kexec.
``kexec_count``
The number of kexec boots since the last cold boot. On cold boot,
this counter starts at 0 and increments with each kexec. This helps
identify issues that only manifest after multiple consecutive kexec
reboots.
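For crash triage tooling, the log line shown above can be parsed with a short Python sketch. The regular expression assumes the message format stays exactly as in the example line, which may not hold across kernel versions, and the function name is made up for this illustration.

```python
import re

# Matches the message part of the log line, ignoring the timestamp prefix.
KHO_LINE = re.compile(
    r"KHO: exec from: (?P<release>\S+) \(count (?P<count>\d+)\)"
)


def parse_kho_metadata(line):
    """Return (previous_release, kexec_count), or None if the line differs."""
    match = KHO_LINE.search(line)
    if match is None:
        return None
    return match.group("release"), int(match.group("count"))
```

Feeding each line of the kernel log through parse_kho_metadata() and keeping the first non-None result gives the previous kernel's release string and the kexec count for the current boot.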
Use Cases
---------
This metadata is particularly useful for debugging kexec transition bugs,
where a buggy kernel kexecs into a new kernel and the bug manifests only
in the second kernel. Examples of such bugs include:
- Memory corruption from the previous kernel affecting the new kernel
- Incorrect hardware state left by the previous kernel
- Firmware/ACPI state issues that only appear in kexec scenarios
At scale, correlating crashes to the previous kernel version enables
faster root cause analysis when issues only occur in specific kernel
transition scenarios.
debugfs Interfaces
==================
These debugfs interfaces are available when the kernel is compiled with
``CONFIG_KEXEC_HANDOVER_DEBUGFS`` enabled.
Currently KHO creates the following debugfs interfaces. Notice that these
interfaces may change in the future. They will be moved to sysfs once KHO is
stabilized.
``/sys/kernel/debug/kho/out/finalize``
Kexec HandOver (KHO) allows Linux to transition the state of
compatible drivers into the next kexec'ed kernel. To do so,
device drivers will instruct KHO to preserve memory regions,
which could contain serialized kernel state.
While the state is serialized, they are unable to perform
any modifications to state that was serialized, such as
handed over memory allocations.
When this file contains "1", the system is in the transition
state. When contains "0", it is not. To switch between the
two states, echo the respective number into this file.
``/sys/kernel/debug/kho/out/fdt``
When KHO state tree is finalized, the kernel exposes the
flattened device tree blob that carries its current KHO
state in this file. Kexec user space tooling can use this
The kernel exposes the flattened device tree blob that carries its
current KHO state in this file. Kexec user space tooling can use this
as input file for the KHO payload image.
``/sys/kernel/debug/kho/out/scratch_len``
@@ -100,8 +108,8 @@ stabilized.
it should place its payload images.
``/sys/kernel/debug/kho/out/sub_fdts/``
In the KHO finalization phase, KHO producers register their own
FDT blob under this directory.
KHO producers can register their own FDT or another binary blob under
this directory.
``/sys/kernel/debug/kho/in/fdt``
When the kernel was booted with Kexec HandOver (KHO),
@@ -111,5 +119,5 @@ stabilized.
it finished to interpret their metadata.
``/sys/kernel/debug/kho/in/sub_fdts/``
Similar to ``kho/out/sub_fdts/``, but contains sub FDT blobs
Similar to ``kho/out/sub_fdts/``, but contains sub blobs
of KHO producers passed from the old kernel.

View File

@@ -217,7 +217,7 @@ MPOL_PREFERRED
the MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES flags
described below.
MPOL_INTERLEAVED
MPOL_INTERLEAVE
This mode specifies that page allocations be interleaved, on a
page granularity, across the nodes specified in the policy.
This mode also behaves slightly differently, based on the

View File

@@ -40,3 +40,33 @@ how to translate the device into a serial number from SCSI EVPD 0x80::
echo "fencing client ${CLIENT} serial ${EVPD}" >> /var/log/pnfsd-fence.log
EOF
If the nfsd server needs to fence a non-responding client and the
fencing operation fails, the server logs a warning message in the
system log with the following format:
FENCE failed client[IP_address] clid[#n] device[dev_name]
where:
- IP_address: refers to the IP address of the affected client.
- #n: indicates the unique client identifier.
- dev_name: specifies the name of the block device related
to the fencing attempt.
The server retries the operation indefinitely. During
this time, access to the affected file is restricted for all other
clients. This is to prevent potential data corruption if multiple
clients access the same file simultaneously.
To restore access to the affected file for other clients, the admin
needs to take the following actions:
- shut down or power off the client being fenced.
- manually expire the client to release all its state on the server::
echo 'expire' > /proc/fs/nfsd/clients/clid/ctl
where:
- clid: is the unique client identifier displayed in the system log.
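The lookup of the client identifier can be sketched as follows. This is only an illustration: the helper name ``fence_clid`` and the sample log line are made up, and the sketch assumes the identifier is the text inside ``clid[...]``, optionally preceded by ``#``.

```shell
# Hypothetical helper: extract the client identifier from a "FENCE failed"
# log line. Assumes the identifier is whatever sits inside clid[...],
# with an optional leading '#'.
fence_clid() {
    printf '%s\n' "$1" |
        sed -n 's/.*FENCE failed .*clid\[#\{0,1\}\([^]]*\)].*/\1/p'
}

# Made-up sample line, not taken from a real system log.
line='FENCE failed client[192.0.2.10] clid[42] device[sdb]'
clid=$(fence_clid "$line")

# Only print the control file path; nothing is written here.
echo "/proc/fs/nfsd/clients/$clid/ctl"
```

The printed path is the file that the ``expire`` command described above is written to.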

View File

@@ -22,3 +22,34 @@ option and the underlying SCSI device support persistent reservations.
On the client make sure the kernel has the CONFIG_PNFS_BLOCK option
enabled, and the file system is mounted using the NFSv4.1 protocol
version (mount -o vers=4.1).
If the nfsd server needs to fence a non-responding client and the
fencing operation fails, the server logs a warning message in the
system log with the following format:
FENCE failed client[IP_address] clid[#n] device[dev_name]
where:
- IP_address: refers to the IP address of the affected client.
- #n: indicates the unique client identifier.
- dev_name: specifies the name of the block device related
to the fencing attempt.
The server retries the operation indefinitely. During
this time, access to the affected file is restricted for all other
clients. This is to prevent potential data corruption if multiple
clients access the same file simultaneously.
To restore access to the affected file for other clients, the admin
needs to take the following actions:
- shut down or power off the client being fenced.
- manually expire the client to release all its state on the server::
echo 'expire' > /proc/fs/nfsd/clients/clid/ctl
where:
- clid: is the unique client identifier displayed in the system log.

View File

@@ -24,7 +24,8 @@ Performance monitor support
thunderx2-pmu
alibaba_pmu
dwc_pcie_pmu
nvidia-pmu
nvidia-tegra241-pmu
nvidia-tegra410-pmu
meson-ddr-pmu
cxl
ampere_cspmu

View File

@@ -1,8 +1,8 @@
=========================================================
NVIDIA Tegra SoC Uncore Performance Monitoring Unit (PMU)
=========================================================
============================================================
NVIDIA Tegra241 SoC Uncore Performance Monitoring Unit (PMU)
============================================================
The NVIDIA Tegra SoC includes various system PMUs to measure key performance
The NVIDIA Tegra241 SoC includes various system PMUs to measure key performance
metrics like memory bandwidth, latency, and utilization:
* Scalable Coherency Fabric (SCF)

View File

@@ -0,0 +1,522 @@
=====================================================================
NVIDIA Tegra410 SoC Uncore Performance Monitoring Unit (PMU)
=====================================================================
The NVIDIA Tegra410 SoC includes various system PMUs to measure key performance
metrics like memory bandwidth, latency, and utilization:
* Unified Coherence Fabric (UCF)
* PCIE
* PCIE-TGT
* CPU Memory (CMEM) Latency
* NVLink-C2C
* NV-CLink
* NV-DLink
PMU Driver
----------
The PMU driver describes the available events and configuration of each PMU in
sysfs. Please see the sections below to get the sysfs path of each PMU. Like
other uncore PMU drivers, the driver provides a "cpumask" sysfs attribute to show
the CPU id used to handle the PMU event. There is also an "associated_cpus"
sysfs attribute, which contains a list of CPUs associated with the PMU instance.
UCF PMU
-------
The Unified Coherence Fabric (UCF) in the NVIDIA Tegra410 SoC serves as a
distributed last-level cache for CPU memory and CXL memory, and as a
cache-coherent interconnect that supports hardware coherence across multiple
coherently caching agents, including:
* CPU clusters
* GPU
* PCIe Ordering Controller Unit (OCU)
* Other IO-coherent requesters
The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_ucf_pmu_<socket-id>.
Some of the events available in this PMU can be used to measure bandwidth and
utilization:
* slc_access_rd: count the number of read requests to SLC.
* slc_access_wr: count the number of write requests to SLC.
* slc_bytes_rd: count the number of bytes transferred by slc_access_rd.
* slc_bytes_wr: count the number of bytes transferred by slc_access_wr.
* mem_access_rd: count the number of read requests to local or remote memory.
* mem_access_wr: count the number of write requests to local or remote memory.
* mem_bytes_rd: count the number of bytes transferred by mem_access_rd.
* mem_bytes_wr: count the number of bytes transferred by mem_access_wr.
* cycles: counts the UCF cycles.
The average bandwidth is calculated as::
AVG_SLC_READ_BANDWIDTH_IN_GBPS = SLC_BYTES_RD / ELAPSED_TIME_IN_NS
AVG_SLC_WRITE_BANDWIDTH_IN_GBPS = SLC_BYTES_WR / ELAPSED_TIME_IN_NS
AVG_MEM_READ_BANDWIDTH_IN_GBPS = MEM_BYTES_RD / ELAPSED_TIME_IN_NS
AVG_MEM_WRITE_BANDWIDTH_IN_GBPS = MEM_BYTES_WR / ELAPSED_TIME_IN_NS
The average request rate is calculated as::
AVG_SLC_READ_REQUEST_RATE = SLC_ACCESS_RD / CYCLES
AVG_SLC_WRITE_REQUEST_RATE = SLC_ACCESS_WR / CYCLES
AVG_MEM_READ_REQUEST_RATE = MEM_ACCESS_RD / CYCLES
AVG_MEM_WRITE_REQUEST_RATE = MEM_ACCESS_WR / CYCLES
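As a rough sketch of the two sets of formulas above, the helpers below turn raw counter values into the derived metrics. The function names and sample numbers are hypothetical and are not perf output; awk is used only because POSIX shell arithmetic is integer-only.

```shell
# Hypothetical helpers; the arguments are raw counter values.
# One byte per nanosecond equals 10^9 bytes per second, i.e. 1 GB/s.
avg_bandwidth_gbps() {  # usage: avg_bandwidth_gbps BYTES ELAPSED_TIME_IN_NS
    awk -v bytes="$1" -v ns="$2" 'BEGIN { printf "%.3f\n", bytes / ns }'
}

avg_request_rate() {    # usage: avg_request_rate REQUESTS CYCLES
    awk -v req="$1" -v cyc="$2" 'BEGIN { printf "%.3f\n", req / cyc }'
}

# Made-up sample: 64e9 bytes read over 2 seconds, 5e8 requests in 4e9 cycles.
avg_bandwidth_gbps 64000000000 2000000000   # prints 32.000
avg_request_rate 500000000 4000000000       # prints 0.125
```

The same two helpers apply to any of the byte and request counters listed above, for example ``slc_bytes_rd`` with the elapsed time or ``mem_access_wr`` with ``cycles``.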
Details about the other available events can be found in the Tegra410 SoC
technical reference manual.
The events can be filtered based on source or destination. The source filter
indicates the traffic initiator to the SLC, e.g. local CPU, non-CPU device, or
remote socket. The destination filter specifies the destination memory type,
e.g. local system memory (CMEM), local GPU memory (GMEM), or remote memory. The
local/remote classification of the destination filter is based on the home
socket of the address, not where the data actually resides. The available
filters are described in
/sys/bus/event_source/devices/nvidia_ucf_pmu_<socket-id>/format/.
The list of UCF PMU event filters:
* Source filter:
* src_loc_cpu: if set, count events from local CPU
* src_loc_noncpu: if set, count events from local non-CPU device
* src_rem: if set, count events from CPU, GPU, PCIE devices of remote socket
* Destination filter:
* dst_loc_cmem: if set, count events to local system memory (CMEM) address
* dst_loc_gmem: if set, count events to local GPU memory (GMEM) address
* dst_loc_other: if set, count events to local CXL memory address
* dst_rem: if set, count events to CPU, GPU, and CXL memory address of remote socket
If the source is not specified, the PMU will count events from all sources. If
the destination is not specified, the PMU will count events to all destinations.
Example usage:
* Count event id 0x0 in socket 0 from all sources and to all destinations::
perf stat -a -e nvidia_ucf_pmu_0/event=0x0/
* Count event id 0x0 in socket 0 with source filter = local CPU and destination
filter = local system memory (CMEM)::
perf stat -a -e nvidia_ucf_pmu_0/event=0x0,src_loc_cpu=0x1,dst_loc_cmem=0x1/
* Count event id 0x0 in socket 1 with source filter = local non-CPU device and
destination filter = remote memory::
perf stat -a -e nvidia_ucf_pmu_1/event=0x0,src_loc_noncpu=0x1,dst_rem=0x1/
PCIE PMU
--------
This PMU is located in the SOC fabric connecting the PCIE root complex (RC) and
the memory subsystem. It monitors all read/write traffic from the root port(s)
or a particular BDF in a PCIE RC to local or remote memory. There is one PMU per
PCIE RC in the SoC. Each RC can have up to 16 lanes that can be bifurcated into
up to 8 root ports. The traffic from each root port can be filtered using RP or
BDF filter. For example, specifying "src_rp_mask=0xFF" means the PMU counter will
capture traffic from all RPs. Please see below for more details.
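A root port bitmask can be derived from a list of RP indices, one bit per index. The helper below is a minimal sketch; the name ``rp_mask`` is hypothetical and not part of the driver or of perf.

```shell
# Hypothetical helper: build a root port bitmask from RP indices (0 to 7).
rp_mask() {
    mask=0
    for rp in "$@"; do
        mask=$(( mask | (1 << rp) ))   # set the bit for this root port
    done
    printf '0x%X\n' "$mask"
}

rp_mask 0 1 2 3            # prints 0xF
rp_mask 0 1 2 3 4 5 6 7    # prints 0xFF
```

The printed value is what would be passed as ``src_rp_mask=`` in the event string.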
The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_pcie_pmu_<socket-id>_rc_<pcie-rc-id>.
The events in this PMU can be used to measure bandwidth, utilization, and
latency:
* rd_req: count the number of read requests by PCIE device.
* wr_req: count the number of write requests by PCIE device.
* rd_bytes: count the number of bytes transferred by rd_req.
* wr_bytes: count the number of bytes transferred by wr_req.
* rd_cum_outs: count outstanding rd_req each cycle.
* cycles: count the clock cycles of SOC fabric connected to the PCIE interface.
The average bandwidth is calculated as::
AVG_RD_BANDWIDTH_IN_GBPS = RD_BYTES / ELAPSED_TIME_IN_NS
AVG_WR_BANDWIDTH_IN_GBPS = WR_BYTES / ELAPSED_TIME_IN_NS
The average request rate is calculated as::
AVG_RD_REQUEST_RATE = RD_REQ / CYCLES
AVG_WR_REQUEST_RATE = WR_REQ / CYCLES
The average latency is calculated as::
FREQ_IN_GHZ = CYCLES / ELAPSED_TIME_IN_NS
AVG_LATENCY_IN_CYCLES = RD_CUM_OUTS / RD_REQ
AVERAGE_LATENCY_IN_NS = AVG_LATENCY_IN_CYCLES / FREQ_IN_GHZ
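The three latency formulas chain together as in the sketch below. The helper name and the sample values are hypothetical; the guard against zero counters is an addition for robustness and is not something the text above prescribes.

```shell
# Hypothetical helper: average read latency in nanoseconds.
avg_latency_ns() {  # usage: avg_latency_ns RD_CUM_OUTS RD_REQ CYCLES ELAPSED_TIME_IN_NS
    awk -v cum="$1" -v req="$2" -v cyc="$3" -v ns="$4" 'BEGIN {
        # Avoid dividing by zero when nothing was counted.
        if (req == 0 || cyc == 0 || ns == 0) { print "n/a"; exit }
        freq_ghz   = cyc / ns                    # FREQ_IN_GHZ
        lat_cycles = cum / req                   # AVG_LATENCY_IN_CYCLES
        printf "%.1f\n", lat_cycles / freq_ghz   # AVERAGE_LATENCY_IN_NS
    }'
}

# Made-up sample: 300 cycles per request on a 2 GHz fabric clock.
avg_latency_ns 3000000 10000 2000000000 1000000000   # prints 150.0
```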
The PMU events can be filtered based on the traffic source and destination.
The source filter indicates the PCIE devices that will be monitored. The
destination filter specifies the destination memory type, e.g. local system
memory (CMEM), local GPU memory (GMEM), or remote memory. The local/remote
classification of the destination filter is based on the home socket of the
address, not where the data actually resides. These filters can be found in
/sys/bus/event_source/devices/nvidia_pcie_pmu_<socket-id>_rc_<pcie-rc-id>/format/.
The list of event filters:
* Source filter:
* src_rp_mask: bitmask of root ports that will be monitored. Each bit in this
bitmask represents the RP index in the RC. If the bit is set, all devices under
the associated RP will be monitored. E.g. "src_rp_mask=0xF" will monitor
devices under root ports 0 to 3.
* src_bdf: the BDF that will be monitored. This is a 16-bit value that
follows the formula: (bus << 8) + (device << 3) + (function). For example, the
value of BDF 27:01.1 is 0x2709.
* src_bdf_en: enable the BDF filter. If this is set, the BDF filter value in
"src_bdf" is used to filter the traffic.
Note that Root-Port and BDF filters are mutually exclusive and the PMU in
each RC has a single BDF filter shared by all counters. If the BDF filter
is enabled, the BDF filter value will be applied to all events.
* Destination filter:
* dst_loc_cmem: if set, count events to local system memory (CMEM) address
* dst_loc_gmem: if set, count events to local GPU memory (GMEM) address
* dst_loc_pcie_p2p: if set, count events to local PCIE peer address
* dst_loc_pcie_cxl: if set, count events to local CXL memory address
* dst_rem: if set, count events to remote memory address
If the source filter is not specified, the PMU will count events from all root
ports. If the destination filter is not specified, the PMU will count events
to all destinations.
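The ``src_bdf`` value can be computed from an lspci-style address with the formula given in the filter list. The helper below is a sketch under assumptions: the name ``encode_bdf`` is hypothetical, and the input must be ``bus:device.function`` in hex without a segment prefix.

```shell
# Hypothetical helper: encode "bus:device.function" (hex, no segment prefix)
# as (bus << 8) + (device << 3) + function.
encode_bdf() {
    bus=${1%%:*}        # text before the first ':'
    rest=${1#*:}        # text after the first ':'
    dev=${rest%%.*}     # text before the '.'
    fn=${rest#*.}       # text after the '.'
    printf '0x%04x\n' $(( (0x$bus << 8) + (0x$dev << 3) + 0x$fn ))
}

encode_bdf 3a:02.5   # prints 0x3a15
encode_bdf 00:1f.7   # prints 0x00ff
```

The result is intended for use together with ``src_bdf_en=0x1``, since the BDF value is only applied when the BDF filter is enabled.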
Example usage:
* Count event id 0x0 from root port 0 of PCIE RC-0 on socket 0 targeting all
destinations::
perf stat -a -e nvidia_pcie_pmu_0_rc_0/event=0x0,src_rp_mask=0x1/
* Count event id 0x1 from root port 0 and 1 of PCIE RC-1 on socket 0 and
targeting just local CMEM of socket 0::
perf stat -a -e nvidia_pcie_pmu_0_rc_1/event=0x1,src_rp_mask=0x3,dst_loc_cmem=0x1/
* Count event id 0x2 from root port 0 of PCIE RC-2 on socket 1 targeting all
destinations::
perf stat -a -e nvidia_pcie_pmu_1_rc_2/event=0x2,src_rp_mask=0x1/
* Count event id 0x3 from root port 0 and 1 of PCIE RC-3 on socket 1 and
targeting just local CMEM of socket 1::
perf stat -a -e nvidia_pcie_pmu_1_rc_3/event=0x3,src_rp_mask=0x3,dst_loc_cmem=0x1/
* Count event id 0x4 from BDF 01:01.0 of PCIE RC-4 on socket 0 targeting all
destinations::
perf stat -a -e nvidia_pcie_pmu_0_rc_4/event=0x4,src_bdf=0x0108,src_bdf_en=0x1/
.. _NVIDIA_T410_PCIE_PMU_RC_Mapping_Section:
Mapping the RC# to lspci segment number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Mapping the RC# to the lspci segment number can be non-trivial; hence a new NVIDIA
Designated Vendor-Specific Extended Capability (DVSEC) register is added to the PCIE config space
for each RP. This DVSEC has vendor id "10de" and DVSEC id "0x4". The DVSEC register
contains the following information to map PCIE devices under the RP back to its RC#:
- Bus# (byte 0xc) : bus number as reported by the lspci output
- Segment# (byte 0xd) : segment number as reported by the lspci output
- RP# (byte 0xe) : port number as reported by LnkCap attribute from lspci for a device with Root Port capability
- RC# (byte 0xf): root complex number associated with the RP
- Socket# (byte 0x10): socket number associated with the RP
Example script for mapping lspci BDF to RC# and socket#::
#!/bin/bash
# For every NVIDIA device, find DVSEC ID 0x4 and read the mapping bytes from it.
while read -r bdf rest; do
  # Config space offset of the DVSEC; the 3-argument match() needs GNU awk.
  dvsec4_reg=$(lspci -vv -s "$bdf" | awk '
    /Designated Vendor-Specific: Vendor=10de ID=0004/ {
      match($0, /\[([0-9a-fA-F]+)/, arr);
      print "0x" arr[1];
      exit
    }
  ')
  if [ -n "$dvsec4_reg" ]; then
    bus=$(setpci -s "$bdf" "$(printf '0x%x' $((${dvsec4_reg} + 0xc))).b")
    segment=$(setpci -s "$bdf" "$(printf '0x%x' $((${dvsec4_reg} + 0xd))).b")
    rp=$(setpci -s "$bdf" "$(printf '0x%x' $((${dvsec4_reg} + 0xe))).b")
    rc=$(setpci -s "$bdf" "$(printf '0x%x' $((${dvsec4_reg} + 0xf))).b")
    socket=$(setpci -s "$bdf" "$(printf '0x%x' $((${dvsec4_reg} + 0x10))).b")
    echo "$bdf: Bus=$bus, Segment=$segment, RP=$rp, RC=$rc, Socket=$socket"
  fi
done < <(lspci -d 10de:)
Example output::
0001:00:00.0: Bus=00, Segment=01, RP=00, RC=00, Socket=00
0002:80:00.0: Bus=80, Segment=02, RP=01, RC=01, Socket=00
0002:a0:00.0: Bus=a0, Segment=02, RP=02, RC=01, Socket=00
0002:c0:00.0: Bus=c0, Segment=02, RP=03, RC=01, Socket=00
0002:e0:00.0: Bus=e0, Segment=02, RP=04, RC=01, Socket=00
0003:00:00.0: Bus=00, Segment=03, RP=00, RC=02, Socket=00
0004:00:00.0: Bus=00, Segment=04, RP=00, RC=03, Socket=00
0005:00:00.0: Bus=00, Segment=05, RP=00, RC=04, Socket=00
0005:40:00.0: Bus=40, Segment=05, RP=01, RC=04, Socket=00
0005:c0:00.0: Bus=c0, Segment=05, RP=02, RC=04, Socket=00
0006:00:00.0: Bus=00, Segment=06, RP=00, RC=05, Socket=00
0009:00:00.0: Bus=00, Segment=09, RP=00, RC=00, Socket=01
000a:80:00.0: Bus=80, Segment=0a, RP=01, RC=01, Socket=01
000a:a0:00.0: Bus=a0, Segment=0a, RP=02, RC=01, Socket=01
000a:e0:00.0: Bus=e0, Segment=0a, RP=03, RC=01, Socket=01
000b:00:00.0: Bus=00, Segment=0b, RP=00, RC=02, Socket=01
000c:00:00.0: Bus=00, Segment=0c, RP=00, RC=03, Socket=01
000d:00:00.0: Bus=00, Segment=0d, RP=00, RC=04, Socket=01
000d:40:00.0: Bus=40, Segment=0d, RP=01, RC=04, Socket=01
000d:c0:00.0: Bus=c0, Segment=0d, RP=02, RC=04, Socket=01
000e:00:00.0: Bus=00, Segment=0e, RP=00, RC=05, Socket=01
PCIE-TGT PMU
------------
This PMU is located in the SOC fabric connecting the PCIE root complex (RC) and
the memory subsystem. It monitors traffic targeting PCIE BAR and CXL HDM ranges.
There is one PCIE-TGT PMU per PCIE RC in the SoC. Each RC in Tegra410 SoC can
have up to 16 lanes that can be bifurcated into up to 8 root ports (RP). The PMU
provides RP filter to count PCIE BAR traffic to each RP and address filter to
count access to PCIE BAR or CXL HDM ranges. The details of the filters are
described in the following sections.
Mapping the RC# to lspci segment number is similar to the PCIE PMU. Please see
:ref:`NVIDIA_T410_PCIE_PMU_RC_Mapping_Section` for more info.
The events and configuration options of this PMU device are available in sysfs,
see /sys/bus/event_source/devices/nvidia_pcie_tgt_pmu_<socket-id>_rc_<pcie-rc-id>.
The events in this PMU can be used to measure bandwidth and utilization:
* rd_req: count the number of read requests to PCIE.
* wr_req: count the number of write requests to PCIE.
* rd_bytes: count the number of bytes transferred by rd_req.
* wr_bytes: count the number of bytes transferred by wr_req.
* cycles: count the clock cycles of SOC fabric connected to the PCIE interface.
The average bandwidth is calculated as::
AVG_RD_BANDWIDTH_IN_GBPS = RD_BYTES / ELAPSED_TIME_IN_NS
AVG_WR_BANDWIDTH_IN_GBPS = WR_BYTES / ELAPSED_TIME_IN_NS
The average request rate is calculated as::
AVG_RD_REQUEST_RATE = RD_REQ / CYCLES
AVG_WR_REQUEST_RATE = WR_REQ / CYCLES
The PMU events can be filtered based on the destination root port or target
address range. Filtering based on RP is only available for PCIE BAR traffic.
Address filter works for both PCIE BAR and CXL HDM ranges. These filters can be
found in sysfs, see
/sys/bus/event_source/devices/nvidia_pcie_tgt_pmu_<socket-id>_rc_<pcie-rc-id>/format/.
Destination filter settings:
* dst_rp_mask: bitmask to select the root port(s) to monitor. E.g. "dst_rp_mask=0xFF"
corresponds to all root ports (from 0 to 7) in the PCIE RC. Note that this filter is
only available for PCIE BAR traffic.
* dst_addr_base: BAR or CXL HDM filter base address.
* dst_addr_mask: BAR or CXL HDM filter address mask.
* dst_addr_en: enable BAR or CXL HDM address range filter. If this is set, the
address range specified by "dst_addr_base" and "dst_addr_mask" will be used to filter
the PCIE BAR and CXL HDM traffic address. The PMU uses the following comparison
to determine if the traffic destination address falls within the filter range::
(txn's addr & dst_addr_mask) == (dst_addr_base & dst_addr_mask)
If the comparison succeeds, then the event will be counted.
If the destination filter is not specified, the RP filter will be configured by default
to count PCIE BAR traffic to all root ports.
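The address comparison used by ``dst_addr_en`` can be tried out in plain shell arithmetic. The sketch below is illustrative only; the helper name ``addr_matches`` is hypothetical, and it models just the comparison as written, not the hardware.

```shell
# Hypothetical helper: succeeds when
# (addr & dst_addr_mask) == (dst_addr_base & dst_addr_mask).
addr_matches() {  # usage: addr_matches ADDR BASE MASK
    [ $(( $1 & $3 )) -eq $(( $2 & $3 )) ]
}

for addr in 0x10000 0x100FF 0x10100; do
    if addr_matches "$addr" 0x10000 0xFFF00; then
        echo "$addr counted"
    else
        echo "$addr ignored"
    fi
done
```

With base 0x10000 and mask 0xFFF00, the first two addresses are counted and the third is ignored. Under the comparison as written, address bits that are cleared in the mask are not compared at all, so a mask that leaves upper bits clear also matches addresses that differ only in those upper bits.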
Example usage:
* Count event id 0x0 to root port 0 and 1 of PCIE RC-0 on socket 0::
perf stat -a -e nvidia_pcie_tgt_pmu_0_rc_0/event=0x0,dst_rp_mask=0x3/
* Count event id 0x1 for accesses to PCIE BAR or CXL HDM address range
0x10000 to 0x100FF on socket 0's PCIE RC-1::
perf stat -a -e nvidia_pcie_tgt_pmu_0_rc_1/event=0x1,dst_addr_base=0x10000,dst_addr_mask=0xFFF00,dst_addr_en=0x1/
CPU Memory (CMEM) Latency PMU
-----------------------------
This PMU monitors latency events of memory read requests from the edge of the
Unified Coherence Fabric (UCF) to local CPU DRAM:
* RD_REQ counters: count read requests (32B per request).
* RD_CUM_OUTS counters: accumulated outstanding request counter, which tracks
how many cycles the read requests are in flight.
* CYCLES counter: counts the number of elapsed cycles.
The average latency is calculated as::
FREQ_IN_GHZ = CYCLES / ELAPSED_TIME_IN_NS
AVG_LATENCY_IN_CYCLES = RD_CUM_OUTS / RD_REQ
AVERAGE_LATENCY_IN_NS = AVG_LATENCY_IN_CYCLES / FREQ_IN_GHZ
The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_cmem_latency_pmu_<socket-id>.
Example usage::
perf stat -a -e '{nvidia_cmem_latency_pmu_0/rd_req/,nvidia_cmem_latency_pmu_0/rd_cum_outs/,nvidia_cmem_latency_pmu_0/cycles/}'
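The counter values can be post-processed into an average latency in cycles. The sketch below assumes the CSV layout produced by ``perf stat -x,`` in system-wide mode, where the first field is the counter value and the third is the event name; the function name and the sample lines are made up and are not real measurements.

```shell
# Hypothetical helper: read "value,unit,event,..." lines on stdin and print
# RD_CUM_OUTS / RD_REQ, i.e. the average latency in cycles.
cmem_latency_cycles() {
    awk -F, '
        $3 ~ /rd_req/      { req = $1 }
        $3 ~ /rd_cum_outs/ { cum = $1 }
        END {
            if (req > 0) printf "%.1f\n", cum / req
            else print "n/a"
        }
    '
}

# Made-up sample input.
cmem_latency_cycles <<'EOF'
20000,,nvidia_cmem_latency_pmu_0/rd_req/,1000000000,100.00,,
5000000,,nvidia_cmem_latency_pmu_0/rd_cum_outs/,1000000000,100.00,,
2000000000,,nvidia_cmem_latency_pmu_0/cycles/,1000000000,100.00,,
EOF
```

For the sample input this prints 250.0, since 5000000 outstanding cycles are spread over 20000 requests.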
NVLink-C2C PMU
--------------
This PMU monitors latency events of memory read/write requests that pass through
the NVIDIA Chip-to-Chip (C2C) interface. Bandwidth events are not available
in this PMU, unlike the C2C PMU in Grace (Tegra241 SoC).
The events and configuration options of this PMU device are available in sysfs,
see /sys/bus/event_source/devices/nvidia_nvlink_c2c_pmu_<socket-id>.
The list of events:
* IN_RD_CUM_OUTS: accumulated outstanding request (in cycles) of incoming read requests.
* IN_RD_REQ: the number of incoming read requests.
* IN_WR_CUM_OUTS: accumulated outstanding request (in cycles) of incoming write requests.
* IN_WR_REQ: the number of incoming write requests.
* OUT_RD_CUM_OUTS: accumulated outstanding request (in cycles) of outgoing read requests.
* OUT_RD_REQ: the number of outgoing read requests.
* OUT_WR_CUM_OUTS: accumulated outstanding request (in cycles) of outgoing write requests.
* OUT_WR_REQ: the number of outgoing write requests.
* CYCLES: NVLink-C2C interface cycle counts.
The incoming events count the reads/writes from remote device to the SoC.
The outgoing events count the reads/writes from the SoC to remote device.
The sysfs /sys/bus/event_source/devices/nvidia_nvlink_c2c_pmu_<socket-id>/peer
contains the information about the connected device.
When the C2C interface is connected to GPU(s), the user can use the
"gpu_mask" parameter to filter traffic to/from specific GPU(s). Each bit represents a GPU
index, e.g. "gpu_mask=0x1" corresponds to GPU 0 and "gpu_mask=0x3" to GPUs 0 and 1.
If "gpu_mask" is not specified, the PMU monitors all GPUs.
When connected to another SoC, only the read events are available.
The events can be used to calculate the average latency of the read/write requests::
C2C_FREQ_IN_GHZ = CYCLES / ELAPSED_TIME_IN_NS
IN_RD_AVG_LATENCY_IN_CYCLES = IN_RD_CUM_OUTS / IN_RD_REQ
IN_RD_AVG_LATENCY_IN_NS = IN_RD_AVG_LATENCY_IN_CYCLES / C2C_FREQ_IN_GHZ
IN_WR_AVG_LATENCY_IN_CYCLES = IN_WR_CUM_OUTS / IN_WR_REQ
IN_WR_AVG_LATENCY_IN_NS = IN_WR_AVG_LATENCY_IN_CYCLES / C2C_FREQ_IN_GHZ
OUT_RD_AVG_LATENCY_IN_CYCLES = OUT_RD_CUM_OUTS / OUT_RD_REQ
OUT_RD_AVG_LATENCY_IN_NS = OUT_RD_AVG_LATENCY_IN_CYCLES / C2C_FREQ_IN_GHZ
OUT_WR_AVG_LATENCY_IN_CYCLES = OUT_WR_CUM_OUTS / OUT_WR_REQ
OUT_WR_AVG_LATENCY_IN_NS = OUT_WR_AVG_LATENCY_IN_CYCLES / C2C_FREQ_IN_GHZ
Example usage:
* Count incoming traffic from all GPUs connected via NVLink-C2C::
perf stat -a -e nvidia_nvlink_c2c_pmu_0/in_rd_req/
* Count incoming traffic from GPU 0 connected via NVLink-C2C::
perf stat -a -e nvidia_nvlink_c2c_pmu_0/in_rd_cum_outs,gpu_mask=0x1/
* Count incoming traffic from GPU 1 connected via NVLink-C2C::
perf stat -a -e nvidia_nvlink_c2c_pmu_0/in_rd_cum_outs,gpu_mask=0x2/
* Count outgoing traffic to all GPUs connected via NVLink-C2C::
perf stat -a -e nvidia_nvlink_c2c_pmu_0/out_rd_req/
* Count outgoing traffic to GPU 0 connected via NVLink-C2C::
perf stat -a -e nvidia_nvlink_c2c_pmu_0/out_rd_cum_outs,gpu_mask=0x1/
* Count outgoing traffic to GPU 1 connected via NVLink-C2C::
perf stat -a -e nvidia_nvlink_c2c_pmu_0/out_rd_cum_outs,gpu_mask=0x2/
NV-CLink PMU
------------
This PMU monitors latency events of memory read requests that pass through
the NV-CLINK interface. Bandwidth events are not available in this PMU.
In Tegra410 SoC, the NV-CLink interface is used to connect to another Tegra410
SoC and this PMU only counts read traffic.
The events and configuration options of this PMU device are available in sysfs,
see /sys/bus/event_source/devices/nvidia_nvclink_pmu_<socket-id>.
The list of events:
* IN_RD_CUM_OUTS: accumulated outstanding request (in cycles) of incoming read requests.
* IN_RD_REQ: the number of incoming read requests.
* OUT_RD_CUM_OUTS: accumulated outstanding request (in cycles) of outgoing read requests.
* OUT_RD_REQ: the number of outgoing read requests.
* CYCLES: NV-CLINK interface cycle counts.
The incoming events count the reads from remote device to the SoC.
The outgoing events count the reads from the SoC to remote device.
The events can be used to calculate the average latency of the read requests::
CLINK_FREQ_IN_GHZ = CYCLES / ELAPSED_TIME_IN_NS
IN_RD_AVG_LATENCY_IN_CYCLES = IN_RD_CUM_OUTS / IN_RD_REQ
IN_RD_AVG_LATENCY_IN_NS = IN_RD_AVG_LATENCY_IN_CYCLES / CLINK_FREQ_IN_GHZ
OUT_RD_AVG_LATENCY_IN_CYCLES = OUT_RD_CUM_OUTS / OUT_RD_REQ
OUT_RD_AVG_LATENCY_IN_NS = OUT_RD_AVG_LATENCY_IN_CYCLES / CLINK_FREQ_IN_GHZ
Example usage:
* Count incoming read traffic from remote SoC connected via NV-CLINK::
perf stat -a -e nvidia_nvclink_pmu_0/in_rd_req/
* Count outgoing read traffic to remote SoC connected via NV-CLINK::
perf stat -a -e nvidia_nvclink_pmu_0/out_rd_req/
NV-DLink PMU
------------
This PMU monitors latency events of memory read requests that pass through
the NV-DLINK interface. Bandwidth events are not available in this PMU.
In Tegra410 SoC, this PMU only counts CXL memory read traffic.
The events and configuration options of this PMU device are available in sysfs,
see /sys/bus/event_source/devices/nvidia_nvdlink_pmu_<socket-id>.
The list of events:
* IN_RD_CUM_OUTS: accumulated outstanding read requests (in cycles) to CXL memory.
* IN_RD_REQ: the number of read requests to CXL memory.
* CYCLES: NV-DLINK interface cycle counts.
The events can be used to calculate the average latency of the read requests::
DLINK_FREQ_IN_GHZ = CYCLES / ELAPSED_TIME_IN_NS
IN_RD_AVG_LATENCY_IN_CYCLES = IN_RD_CUM_OUTS / IN_RD_REQ
IN_RD_AVG_LATENCY_IN_NS = IN_RD_AVG_LATENCY_IN_CYCLES / DLINK_FREQ_IN_GHZ
Example usage:
* Count read events to CXL memory::
perf stat -a -e '{nvidia_nvdlink_pmu_0/in_rd_req/,nvidia_nvdlink_pmu_0/in_rd_cum_outs/}'

View File

@@ -239,8 +239,12 @@ control its functionality at the system level. They are located in the
root@hr-test1:/home/ray# ls /sys/devices/system/cpu/cpufreq/policy0/*amd*
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_highest_perf
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_hw_prefcore
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_lowest_nonlinear_freq
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_max_freq
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_floor_freq
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_floor_count
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_prefcore_ranking
``amd_pstate_highest_perf / amd_pstate_max_freq``
@@ -264,14 +268,46 @@ This attribute is read-only.
``amd_pstate_hw_prefcore``
Whether the platform supports the preferred core feature and it has been
enabled. This attribute is read-only.
Whether the platform supports the preferred core feature and it has
been enabled. This attribute is read-only. This file is only visible
on platforms which support the preferred core feature.
``amd_pstate_prefcore_ranking``
The performance ranking of the core. This number doesn't have any unit, but
larger numbers are preferred at the time of reading. This can change at
runtime based on platform conditions. This attribute is read-only.
runtime based on platform conditions. This attribute is read-only. This file
is only visible on platforms which support the preferred core feature.
``amd_pstate_floor_freq``
The floor frequency associated with each CPU. Userspace can write any
value between ``cpuinfo_min_freq`` and ``scaling_max_freq`` into this
file. When the system is under power or thermal constraints, the
platform firmware will attempt to throttle the CPU frequency to the
value specified in ``amd_pstate_floor_freq`` before throttling it
further. This allows userspace to specify different floor frequencies
to different CPUs. For optimal results, threads of the same core
should have the same floor frequency value. This file is only visible
on platforms that support the CPPC Performance Priority feature.
``amd_pstate_floor_count``
The number of distinct Floor Performance levels supported by the
platform. For example, if this value is 2, then the number of unique
values obtained from the command ``cat
/sys/devices/system/cpu/cpufreq/policy*/amd_pstate_floor_freq |
sort -n | uniq`` should be at most this number for the behavior
described in ``amd_pstate_floor_freq`` to take effect. A zero value
implies that the platform supports unlimited floor performance levels.
This file is only visible on platforms that support the CPPC
Performance Priority feature.
**Note**: When ``amd_pstate_floor_count`` is non-zero, the frequency to
which the CPU is throttled under power or thermal constraints is
undefined when the number of unique values of ``amd_pstate_floor_freq``
across all CPUs in the system exceeds ``amd_pstate_floor_count``.
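The constraint in the note can be checked before relying on the floor frequencies. The helper below is a minimal sketch; the name ``floor_levels_ok`` is hypothetical, and the frequencies in the example are made-up values.

```shell
# Hypothetical helper: succeeds when the number of distinct floor frequencies
# does not exceed the floor count. A floor count of 0 means no limit.
floor_levels_ok() {  # usage: floor_levels_ok FLOOR_COUNT FREQ [FREQ...]
    limit=$1
    shift
    distinct=$(printf '%s\n' "$@" | sort -n | uniq | wc -l)
    distinct=$(( distinct ))   # drop padding that some wc versions add
    [ "$limit" -eq 0 ] || [ "$distinct" -le "$limit" ]
}

# Made-up sample: two distinct values against a floor count of 2.
if floor_levels_ok 2 1200000 1200000 2000000; then
    echo "within limit"
else
    echo "too many distinct floor frequencies"
fi
```

On a real system the first argument would come from ``amd_pstate_floor_count`` and the remaining ones from the ``amd_pstate_floor_freq`` files of all policies.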
``energy_performance_available_preferences``
@@ -280,16 +316,22 @@ A list of all the supported EPP preferences that could be used for
These profiles represent different hints that are provided
to the low-level firmware about the user's desired energy vs efficiency
tradeoff. ``default`` means that the EPP value is set by platform
firmware. This attribute is read-only.
firmware. ``custom`` designates that integer values 0-255 may be written
as well. This attribute is read-only.
``energy_performance_preference``
The current energy performance preference can be read from this attribute.
and user can change current preference according to energy or performance needs
Please get all support profiles list from
``energy_performance_available_preferences`` attribute, all the profiles are
integer values defined between 0 to 255 when EPP feature is enabled by platform
firmware, if EPP feature is disabled, driver will ignore the written value
Coarse named profiles are available in the attribute
``energy_performance_available_preferences``.
Users can also write individual integer values between 0 and 255.
When dynamic EPP is enabled, writes to ``energy_performance_preference`` are blocked
even when the EPP feature is enabled by platform firmware. Lower EPP values shift the bias
towards improved performance while higher EPP values shift the bias towards
power savings. The exact impact can change from one platform to the other.
If a valid integer was last written, then a number will be returned on future reads.
If a valid string was last written then a string will be returned on future reads.
This attribute is read-write.
``boost``
@@ -311,6 +353,24 @@ boost or `1` to enable it, for the respective CPU using the sysfs path
Other performance and frequency values can be read back from
``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.
Dynamic energy performance profile
==================================
The amd-pstate driver supports dynamically selecting the energy performance
profile based on whether the machine is running on AC or DC power.
Whether this behavior is enabled by default depends on the kernel
config option ``CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP``. This behavior can also be overridden
at runtime by the sysfs file ``/sys/devices/system/cpu/cpufreq/policyX/dynamic_epp``.
When set to enabled, the driver will select a different energy performance
profile depending on whether the machine is running on battery or AC power.
The driver will also register with the platform profile handler to receive
notifications of the user's desired power state and react to those.
When set to disabled, the driver will not change the energy performance profile
based on the power source and will not react to user desired power state.
Attempting to manually write to the ``energy_performance_preference`` sysfs
file will fail when ``dynamic_epp`` is enabled.
``amd-pstate`` vs ``acpi-cpufreq``
======================================
@@ -422,6 +482,13 @@ For systems that support ``amd-pstate`` preferred core, the core rankings will
always be advertised by the platform. But OS can choose to ignore that via the
kernel parameter ``amd_prefcore=disable``.
``amd_dynamic_epp``
When AMD pstate is in auto mode, dynamic EPP will control whether the kernel
autonomously changes the EPP mode. The default is configured by
``CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP`` but can be explicitly enabled with
``amd_dynamic_epp=enable`` or disabled with ``amd_dynamic_epp=disable``.
User Space Interface in ``sysfs`` - General
===========================================
@@ -790,13 +857,13 @@ Reference
===========
.. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming,
https://www.amd.com/system/files/TechDocs/24593.pdf
https://docs.amd.com/v/u/en-US/24593_3.44_APM_Vol2
.. [2] Advanced Configuration and Power Interface Specification,
https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf
.. [3] Processor Programming Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors
https://www.amd.com/system/files/TechDocs/56569-A1-PUB.zip
https://docs.amd.com/v/u/en-US/56569-A1-PUB_3.03
.. [4] Linux Kernel Selftests,
https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html

@@ -287,7 +287,7 @@ level.
Check presence of other Intel(R) SST features
---------------------------------------------
Each of the performance profiles also specifies weather there is support of
Each of the performance profiles also specifies whether there is support of
other two Intel(R) SST features (Intel(R) Speed Select Technology - Base Frequency
(Intel(R) SST-BF) and Intel(R) Speed Select Technology - Turbo Frequency (Intel
SST-TF)).

@@ -349,12 +349,14 @@ again.
.. _submit_improvements_qbtl:
Did you run into trouble following any of the above steps that is not cleared up
by the reference section below? Or do you have ideas how to improve the text?
Then please take a moment of your time and let the maintainer of this document
know by email (Thorsten Leemhuis <linux@leemhuis.info>), ideally while CCing the
Linux docs mailing list (linux-doc@vger.kernel.org). Such feedback is vital to
improve this document further, which is in everybody's interest, as it will
Did you run into trouble following the step-by-step guide not cleared up by the
reference section below? Did you spot errors? Or do you have ideas on how to
improve the guide?
If any of that applies, please let the developers know by sending a short note
or a patch to Thorsten Leemhuis <linux@leemhuis.info> while ideally CCing the
public Linux docs mailing list <linux-doc@vger.kernel.org>. Such feedback is
vital to improve this text further, which is in everybody's interest, as it will
enable more people to master the task described here.
Reference section for the step-by-step guide

@@ -48,6 +48,16 @@ Once the report is out, answer any questions that come up and help where you
can. That includes keeping the ball rolling by occasionally retesting with newer
releases and sending a status update afterwards.
..
Note: If you see this note, you are reading the text's source file. You
might want to switch to a rendered version: It makes it a lot easier to
read and navigate this document -- especially when you want to look something
up in the reference section, then jump back to where you left off.
..
Find the latest rendered version of this text here:
https://docs.kernel.org/admin-guide/reporting-issues.html
Step-by-step guide how to report issues to the kernel maintainers
=================================================================
@@ -231,45 +241,54 @@ kernels regularly rebased on those. If that is the case, follow these steps:
The reference section below explains each of these steps in more detail.
Conclusion of the step-by-step guide
------------------------------------
Did you run into trouble following the step-by-step guide not cleared up by the
reference section below? Did you spot errors? Or do you have ideas on how to
improve the guide?
If any of that applies, please let the developers know by sending a short note
or a patch to Thorsten Leemhuis <linux@leemhuis.info> while ideally CCing the
public Linux docs mailing list <linux-doc@vger.kernel.org>. Such feedback is
vital to improve this text further, which is in everybody's interest, as it will
enable more people to master the task described here.
Reference section: Reporting issues to the kernel maintainers
=============================================================
The detailed guides above outline all the major steps in brief fashion, which
should be enough for most people. But sometimes there are situations where even
experienced users might wonder how to actually do one of those steps. That's
what this section is for, as it will provide a lot more details on each of the
above steps. Consider this as reference documentation: it's possible to read it
from top to bottom. But it's mainly meant to skim over and a place to look up
details how to actually perform those steps.
The step-by-step guide above outlines all the major steps in brief fashion,
which usually covers everything required. But even experienced users will
sometimes wonder how to actually realize some of those steps or why they are
needed; there are also corner cases the guide ignores for readability. That is
what the entries in this reference section are for, which provide additional
information for each of the steps in the guide.
A few words of general advice before digging into the details:
A few words of general advice:
* The Linux kernel developers are well aware this process is complicated and
demands more than other FLOSS projects. We'd love to make it simpler. But
that would require work in various places as well as some infrastructure,
which would need constant maintenance; nobody has stepped up to do that
work, so that's just how things are for now.
* The Linux developers are well aware that reporting bugs to them is more
complicated and demanding than in other FLOSS projects. Some of it is because
the kernel is different, among others due to its mail-driven development
process and because it consists mostly of drivers. Some of it is because
improving things would require work in several technical areas and people
triaging bugs and nobody has stepped up to do or fund that work.
* A warranty or support contract with some vendor doesn't entitle you to
request fixes from developers in the upstream Linux kernel community: such
contracts are completely outside the scope of the Linux kernel, its
development community, and this document. That's why you can't demand
anything such a contract guarantees in this context, not even if the
developer handling the issue works for the vendor in question. If you want
to claim your rights, use the vendor's support channel instead. When doing
so, you might want to mention you'd like to see the issue fixed in the
upstream Linux kernel; motivate them by saying it's the only way to ensure
the fix in the end will get incorporated in all Linux distributions.
* A warranty or support contract with some vendor doesn't entitle you to
request fixes from the upstream Linux developers: Such contracts are
completely outside the scope of the upstream Linux kernel, its development
community, and this document -- even if those handling the issue work for the
vendor who issued the contract. If you want to claim your rights, use the
vendor's support channel.
* If you never reported an issue to a FLOSS project before you should consider
reading `How to Report Bugs Effectively
<https://www.chiark.greenend.org.uk/~sgtatham/bugs.html>`_, `How To Ask
Questions The Smart Way
<http://www.catb.org/esr/faqs/smart-questions.html>`_, and `How to ask good
questions <https://jvns.ca/blog/good-questions/>`_.
* If you never reported an issue to a FLOSS project before, consider skimming
guides like `How to ask good questions
<https://jvns.ca/blog/good-questions/>`_, `How To Ask Questions The Smart Way
<http://www.catb.org/esr/faqs/smart-questions.html>`_, and `How to Report
Bugs Effectively <https://www.chiark.greenend.org.uk/~sgtatham/bugs.html>`_.
With that off the table, find below the details on how to properly report
issues to the Linux kernel developers.
With that off the table, find below details for the steps from the detailed
guide on reporting issues to the Linux kernel developers.
Make sure you're using the upstream Linux kernel
@@ -1674,72 +1693,59 @@ for the subsystem where the issue seems to have its roots; CC the mailing list
for the subsystem as well as the stable mailing list (stable@vger.kernel.org).
Why some issues won't get any reaction or remain unfixed after being reported
=============================================================================
Appendix: Why it is somewhat hard to report kernel bugs
=======================================================
When reporting a problem to the Linux developers, be aware only 'issues of high
priority' (regressions, security issues, severe problems) are definitely going
to get resolved. The maintainers or if all else fails Linus Torvalds himself
will make sure of that. They and the other kernel developers will fix a lot of
other issues as well. But be aware that sometimes they can't or won't help; and
sometimes there isn't even anyone to send a report to.
The Linux kernel developers are well aware that reporting bugs to them is harder
than in other Free/Libre Open Source Projects. Many reasons for that lie in the
nature of kernels, Linux' development model, and how the world uses the kernel:
This is best explained with kernel developers that contribute to the Linux
kernel in their spare time. Quite a few of the drivers in the kernel were
written by such programmers, often because they simply wanted to make their
hardware usable on their favorite operating system.
* *Most kernels of Linux distributions are totally unsuitable for reporting bugs
upstream.* The reference section above already explained this in detail:
outdated codebases as well as modifications and add-ons lead to kernel bugs
that were fixed upstream a long time ago or never happened there in the first
place. Developers of other Open Source software face these problems as well,
but the situation is a lot worse when it comes to the kernel, as the changes
and their impact are much more severe -- which is why many kernel developers
expect reports with kernels built from fresh and nearly unmodified sources.
These programmers most of the time will happily fix problems other people
report. But nobody can force them to do, as they are contributing voluntarily.
* *Bugs often only occur in a special environment.* That is because Linux is
mostly drivers and can be used in a multitude of ways. Developers often do not
have a matching setup at hand -- and therefore frequently must rely on bug
reporters for isolating a problem's cause and testing proposed fixes.
Then there are situations where such developers really want to fix an issue,
but can't: sometimes they lack hardware programming documentation to do so.
This often happens when the publicly available docs are superficial or the
driver was written with the help of reverse engineering.
* *The kernel has hundreds of maintainers, but all-rounders are very rare.* That
again is an effect caused by the multitude of features and drivers, due to
which many kernel developers know little about lower or higher layers related
to their code and even less about other areas.
Sooner or later spare time developers will also stop caring for the driver.
Maybe their test hardware broke, got replaced by something more fancy, or is so
old that it's something you don't find much outside of computer museums
anymore. Sometimes developer stops caring for their code and Linux at all, as
something different in their life became way more important. In some cases
nobody is willing to take over the job as maintainer and nobody can be forced
to, as contributing to the Linux kernel is done on a voluntary basis. Abandoned
drivers nevertheless remain in the kernel: they are still useful for people and
removing would be a regression.
* *It is hard finding where to report issues to, among others, due to the lack
of a central bug tracker.* This is something even some kernel developers
dislike, but that's the situation everyone has to deal with currently.
The situation is not that different with developers that are paid for their
work on the Linux kernel. Those contribute most changes these days. But their
employers sooner or later also stop caring for their code or make its
programmer focus on other things. Hardware vendors for example earn their money
mainly by selling new hardware; quite a few of them hence are not investing
much time and energy in maintaining a Linux kernel driver for something they
stopped selling years ago. Enterprise Linux distributors often care for a
longer time period, but in new versions often leave support for old and rare
hardware aside to limit the scope. Often spare time contributors take over once
a company orphans some code, but as mentioned above: sooner or later they will
leave the code behind, too.
* *Stable and longterm kernels are primarily maintained by a dedicated 'stable
team', which only handles regressions introduced within stable and longterm
series.* When someone reports a bug, say, using Linux 6.1.2, the team will,
therefore, always ask if mainline is affected: if the bug already happened
in 6.1 or occurs with latest mainline (say, 6.2-rc3), they in everybody's
interest shove it to the regular developers, as those know the code best.
Priorities are another reason why some issues are not fixed, as maintainers
quite often are forced to set those, as time to work on Linux is limited.
That's true for spare time or the time employers grant their developers to
spend on maintenance work on the upstream kernel. Sometimes maintainers also
get overwhelmed with reports, even if a driver is working nearly perfectly. To
not get completely stuck, the programmer thus might have no other choice than
to prioritize issue reports and reject some of them.
* *Linux developers are free to focus on latest mainline.* Some, thus, react
coldly to reports about bugs in, say, Linux 6.0 when 6.1 is already out;
even the latter might not be enough once 6.2-rc1 is out. Some will also not
be very welcoming to reports with 6.1.5 or 6.1.6, as the problem might be a
series-specific regression the stable team (see above) caused and must fix.
But don't worry too much about all of this, a lot of drivers have active
maintainers who are quite interested in fixing as many issues as possible.
Closing words
=============
Compared with other Free/Libre & Open Source Software it's hard to report
issues to the Linux kernel developers: the length and complexity of this
document and the implications between the lines illustrate that. But that's how
it is for now. The main author of this text hopes documenting the state of the
art will lay some groundwork to improve the situation over time.
* *Sometimes there is nobody to help.* Sometimes this is due to the lack of
hardware documentation -- for example, when a driver was built using reverse
engineering or was taken over by spare-time developers when the hardware
manufacturer left it behind. Other times there is nobody to even report bugs
to: when maintainers move on without a replacement, their code often remains
in the kernel as long as it's useful.
Some of these aspects could be improved to facilitate bug reporting -- many
Linux kernel developers are well aware of this and would be glad if a few
individuals or an entity would make this their mission.
..
end-of-content

@@ -0,0 +1,47 @@
=================
/proc/sys/crypto/
=================
These files show up in ``/proc/sys/crypto/``, depending on the
kernel configuration:
.. contents:: :local:
fips_enabled
============
Read-only flag that indicates whether FIPS mode is enabled.
- ``0``: FIPS mode is disabled (default).
- ``1``: FIPS mode is enabled.
This value is set at boot time via the ``fips=1`` kernel command line
parameter. When enabled, the cryptographic API will restrict the use
of certain algorithms and perform self-tests to ensure compliance with
FIPS (Federal Information Processing Standards) requirements, such as
FIPS 140-2 and the newer FIPS 140-3, depending on the kernel
configuration and the module in use.
fips_name
=========
Read-only file that contains the name of the FIPS module currently in use.
The value is typically configured via the ``CONFIG_CRYPTO_FIPS_NAME``
kernel configuration option.
fips_version
============
Read-only file that contains the version string of the FIPS module.
If ``CONFIG_CRYPTO_FIPS_CUSTOM_VERSION`` is set, it uses the value from
``CONFIG_CRYPTO_FIPS_VERSION``. Otherwise, it defaults to the kernel
release version (``UTS_RELEASE``).
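The three files above can be collected in one pass. The following is a small sketch, assuming only the file names documented in this section; the function names and the ``base`` parameter are hypothetical helpers for illustration.

```python
from pathlib import Path

# File names as documented above for /proc/sys/crypto/.
FIPS_FILES = ("fips_enabled", "fips_name", "fips_version")


def read_fips_sysctls(base="/proc/sys/crypto"):
    """Return {file name: stripped contents, or None if unreadable}."""
    result = {}
    for name in FIPS_FILES:
        try:
            result[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # The file may not exist with the current kernel configuration.
            result[name] = None
    return result


def fips_mode_enabled(values):
    """True only if fips_enabled was read and contains ``1``."""
    return values.get("fips_enabled") == "1"
```

Missing files map to ``None`` because, as noted at the top of this file, which entries show up depends on the kernel configuration.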
Copyright (c) 2026, Shubham Chakraborty <chakrabortyshubham66@gmail.com>
For general info and legal blurb, please look in
Documentation/admin-guide/sysctl/index.rst.
.. See scripts/check-sysctl-docs to keep this up to date:
.. scripts/check-sysctl-docs -vtable="crypto" \
.. $(git grep -l register_sysctl_)

@@ -0,0 +1,52 @@
================
/proc/sys/debug/
================
These files show up in ``/proc/sys/debug/``, depending on the
kernel configuration:
.. contents:: :local:
exception-trace
===============
This flag controls whether the kernel prints information about unhandled
signals (like segmentation faults) to the kernel log (``dmesg``).
- ``0``: Unhandled signals are not traced.
- ``1``: Information about unhandled signals is printed.
The default value is ``1`` on most architectures (like x86, MIPS, RISC-V),
but it is ``0`` on **arm64**.
The actual information printed and the context provided vary
significantly depending on the CPU architecture. For example:
- On **x86**, it typically prints the instruction pointer (IP), error
code, and address that caused a page fault.
- On **PowerPC**, it may print the next instruction pointer (NIP),
link register (LR), and other relevant registers.
When enabled, this feature is often rate-limited to prevent the kernel
log from being flooded during a crash loop.
kprobes-optimization
====================
This flag enables or disables the optimization of Kprobes on certain
architectures (like x86).
- ``0``: Kprobes optimization is turned off.
- ``1``: Kprobes optimization is turned on (default).
For more details on Kprobes and its optimization, please refer to
Documentation/trace/kprobes.rst.
Copyright (c) 2026, Shubham Chakraborty <chakrabortyshubham66@gmail.com>
For general info and legal blurb, please look in
Documentation/admin-guide/sysctl/index.rst.
.. See scripts/check-sysctl-docs to keep this up to date:
.. scripts/check-sysctl-docs -vtable="debug" \
.. $(git grep -l register_sysctl_)

@@ -67,8 +67,8 @@ This documentation is about:
=============== ===============================================================
abi/ execution domains & personalities
<$ARCH> tuning controls for various CPU architecture (e.g. csky, s390)
crypto/ <undocumented>
debug/ <undocumented>
crypto/ cryptographic subsystem
debug/ debugging features
dev/ device specific information (e.g. dev/cdrom/info)
fs/ specific filesystems
filehandle, inode, dentry and quota tuning
@@ -84,7 +84,7 @@ sunrpc/ SUN Remote Procedure Call (NFS)
user/ Per user namespace limits
vm/ memory management tuning
buffer and cache management
xen/ <undocumented>
xen/ Xen hypervisor controls
=============== ===============================================================
These are the subdirs I have on my system or have been discovered by
@@ -96,9 +96,12 @@ it :-)
:maxdepth: 1
abi
crypto
debug
fs
kernel
net
sunrpc
user
vm
xen

@@ -418,7 +418,8 @@ hung_task_detect_count
======================
Indicates the total number of tasks that have been detected as hung since
the system boot.
the system boot or since the counter was reset. The counter is zeroed when
a value of 0 is written.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.

@@ -602,3 +602,31 @@ it does not modify the current namespace or any existing children.
A namespace with ``ns_mode`` set to ``local`` cannot change
``child_ns_mode`` to ``global`` (returns ``-EPERM``).
g2h_fallback
------------
Controls whether connections to CIDs not owned by the host-to-guest (H2G)
transport automatically fall back to the guest-to-host (G2H) transport.
When enabled, if a connect targets a CID that the H2G transport (e.g.
vhost-vsock) does not serve, or if no H2G transport is loaded at all, the
connection is routed via the G2H transport (e.g. virtio-vsock) instead. This
allows a host running both nested VMs (via vhost-vsock) and sibling VMs
reachable through the hypervisor (e.g. Nitro Enclaves) to address both using
a single CID space, without requiring applications to set
``VMADDR_FLAG_TO_HOST``.
When the fallback is taken, ``VMADDR_FLAG_TO_HOST`` is automatically set on
the remote address so that userspace can determine the path via
``getpeername()``.
Note: With this sysctl enabled, user space that attempts to talk to a guest
CID which is not implemented by the H2G transport will create host vsock
traffic. Environments that rely on H2G-only isolation should set it to 0.
Values:
- 0 - Connections to CIDs <= 2 or with VMADDR_FLAG_TO_HOST use G2H;
all others use H2G (or fail with ENODEV if H2G is not loaded).
- 1 - Connections to CIDs not owned by H2G fall back to G2H. (default)
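The value table above can be read as a small decision function. The following is only a simplified model of the documented behaviour, not the kernel's implementation. The function name, its parameters, and the returned labels are hypothetical, and the sketch does not model the case where no G2H transport is available.

```python
VMADDR_CID_HOST = 2  # well-known host CID; CIDs <= 2 are treated as G2H above


def pick_transport(cid, to_host_flag, h2g_loaded, h2g_owns_cid, g2h_fallback):
    """Model which transport a connect() would use, per the table above.

    Returns "G2H", "H2G", or "ENODEV" (as a label, not an errno value).
    """
    # Both settings route low CIDs and VMADDR_FLAG_TO_HOST via G2H.
    if cid <= VMADDR_CID_HOST or to_host_flag:
        return "G2H"
    if g2h_fallback:
        # 1: only CIDs actually served by H2G stay on H2G.
        if h2g_loaded and h2g_owns_cid:
            return "H2G"
        return "G2H"
    # 0: everything else uses H2G, or fails if H2G is not loaded.
    return "H2G" if h2g_loaded else "ENODEV"
```

For example, a connect to CID 5 with no H2G transport loaded yields ``"ENODEV"`` with the sysctl set to 0, but ``"G2H"`` with the default value of 1.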

@@ -0,0 +1,31 @@
===============
/proc/sys/xen/
===============
Copyright (c) 2026, Shubham Chakraborty <chakrabortyshubham66@gmail.com>
For general info and legal blurb, please look in
Documentation/admin-guide/sysctl/index.rst.
------------------------------------------------------------------------------
These files show up in ``/proc/sys/xen/``, depending on the
kernel configuration:
.. contents:: :local:
balloon/hotplug_unpopulated
===========================
This flag controls whether unpopulated memory ranges are automatically
hotplugged as system RAM.
- ``0``: Unpopulated ranges are not hotplugged (default).
- ``1``: Unpopulated ranges are automatically hotplugged.
When enabled, the Xen balloon driver will add memory regions that are
marked as unpopulated in the Xen memory map to the system as usable RAM.
This allows for dynamic memory expansion in Xen guest domains.
This option is only available when the kernel is built with
``CONFIG_XEN_BALLOON_MEMORY_HOTPLUG`` enabled.

@@ -74,7 +74,7 @@ a particular type of taint. It's best to leave that to the aforementioned
script, but if you need something quick you can use this shell command to check
which bits are set::
$ for i in $(seq 18); do echo $(($i-1)) $(($(cat /proc/sys/kernel/tainted)>>($i-1)&1));done
$ for i in $(seq 20); do echo $(($i-1)) $(($(cat /proc/sys/kernel/tainted)>>($i-1)&1));done
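The same check can be sketched in Python. Unlike the shell loop, which prints every bit position together with its value, this version returns only the positions of the bits that are set. The function names are hypothetical; the default of 20 bits mirrors the ``seq 20`` used above.

```python
def set_taint_bits(value, nbits=20):
    """Return the positions (0-based) of the taint bits set in ``value``."""
    return [bit for bit in range(nbits) if (value >> bit) & 1]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint value as an integer."""
    with open(path) as f:
        return int(f.read().strip())
```

On a running system, ``set_taint_bits(read_taint())`` gives the bit numbers to look up in the table below.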
Table for decoding tainted state
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -1062,16 +1062,15 @@ Conclusion
You have reached the end of the step-by-step guide.
Did you run into trouble following any of the above steps not cleared up by the
reference section below? Did you spot errors? Or do you have ideas how to
Did you run into trouble following the step-by-step guide not cleared up by the
reference section below? Did you spot errors? Or do you have ideas on how to
improve the guide?
If any of that applies, please take a moment and let the maintainer of this
document know by email (Thorsten Leemhuis <linux@leemhuis.info>), ideally while
CCing the Linux docs mailing list (linux-doc@vger.kernel.org). Such feedback is
vital to improve this text further, which is in everybody's interest, as it
will enable more people to master the task described here -- and hopefully also
improve similar guides inspired by this one.
If any of that applies, please let the developers know by sending a short note
or a patch to Thorsten Leemhuis <linux@leemhuis.info> while ideally CCing the
public Linux docs mailing list <linux-doc@vger.kernel.org>. Such feedback is
vital to improve this text further, which is in everybody's interest, as it will
enable more people to master the task described here.
Reference section for the step-by-step guide

@@ -550,6 +550,10 @@ For zoned file systems, the following attributes are exposed in:
is limited by the capabilities of the backing zoned device, file system
size and the max_open_zones mount option.
nr_open_zones (Min: 0 Default: Varies Max: UINTMAX)
This read-only attribute exposes the current number of open zones
used by the file system.
zonegc_low_space (Min: 0 Default: 0 Max: 100)
Define a percentage for how much of the unused space that GC should keep
available for writing. A high value will reclaim more of the space

@@ -23,6 +23,7 @@ ARM64 Architecture
memory
memory-tagging-extension
mops
mpam
perf
pointer-authentication
ptdump

@@ -0,0 +1,72 @@
.. SPDX-License-Identifier: GPL-2.0
====
MPAM
====
What is MPAM
============
MPAM (Memory Partitioning and Monitoring) is a feature in the CPUs and memory
system components such as the caches or memory controllers that allows memory
traffic to be labelled, partitioned and monitored.
Traffic is labelled by the CPU, based on the control or monitor group the
current task is assigned to using resctrl. Partitioning policy can be set
using the schemata file in resctrl, and monitor values read via resctrl.
See Documentation/filesystems/resctrl.rst for more details.
This allows tasks that share memory system resources, such as caches, to be
isolated from each other according to the partitioning policy, which limits
interference from so-called noisy neighbours.
Supported Platforms
===================
Use of this feature requires CPU support, support in the memory system
components, and a description from firmware of where the MPAM device controls
are in the MMIO address space (e.g. the 'MPAM' ACPI table).
The MMIO device that provides MPAM controls/monitors for a memory system
component is called a memory system component (MSC).
Because the user interface to MPAM is via resctrl, only MPAM features that are
compatible with resctrl can be exposed to user-space.
MSC are considered as a group based on the topology. MSC that correspond with
the L3 cache are considered together; it is not possible to mix MSC between L2
and L3 to 'cover' a resctrl schema.
The supported features are:
* Cache portion bitmap controls (CPOR) on the L2 or L3 caches. To expose
CPOR at L2 or L3, every CPU must have a corresponding CPU cache at this
level that also supports the feature. Mismatched big/little platforms are
not supported as resctrl's controls would then also depend on task
placement.
* Memory bandwidth maximum controls (MBW_MAX) on or after the L3 cache.
resctrl uses the L3 cache-id to identify where the memory bandwidth
control is applied. For this reason the platform must have an L3 cache
with cache-ids supplied by firmware. (It doesn't need to support MPAM.)
To be exported as the 'MB' schema, the topology of the group of MSC chosen
must match the topology of the L3 cache so that the cache-ids can be
repainted. For example: Platforms with Memory bandwidth maximum controls
on CPU-less NUMA nodes cannot expose the 'MB' schema to resctrl as these
nodes do not have a corresponding L3 cache. If the memory bandwidth
control is on the memory rather than the L3 then there must be a single
global L3 as otherwise it is unknown which L3 the traffic came from. There
must be no caches between the L3 and the memory so that the two ends of
the path have equivalent traffic.
When the MPAM driver finds multiple groups of MSC it can use for the 'MB'
schema, it prefers the group closest to the L3 cache.
* Cache Storage Usage (CSU) counters can expose the 'llc_occupancy' provided
there is at least one CSU monitor on each MSC that makes up the L3 group.
Exposing CSU counters from other caches or devices is not supported.
Reporting Bugs
==============
If you are not seeing the counters or controls you expect, please share the
debug messages produced when enabling dynamic debug and booting with:
dyndbg="file mpam_resctrl.c +pl"

@@ -202,18 +202,29 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Neoverse-V3AE | #3312417 | ARM64_ERRATUM_3194386 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | C1-Pro | #4193714 | ARM64_ERRATUM_4193714 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | MMU-500 | #841119,826419 | ARM_SMMU_MMU_500_CPRE_ERRATA|
| | | #562869,1047329 | |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | MMU-600 | #1076982,1209401| N/A |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | MMU-700 | #2268618,2812531| N/A |
| ARM | MMU-700 | #2133013, | N/A |
| | | #2268618, | |
| | | #2812531, | |
| | | #3777127 | |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | MMU L1 | #3878312 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | MMU S3 | #3995052 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | GIC-700 | #2941627 | ARM64_ERRATUM_2941627 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | SI L1 | #4311569 | ARM64_ERRATUM_4311569 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | CMN-650 | #3642720 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_845719 |
+----------------+-----------------+-----------------+-----------------------------+
| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_843419 |
@@ -247,6 +258,12 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| NVIDIA | T241 GICv3/4.x | T241-FABRIC-4 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
| NVIDIA | T241 MPAM | T241-MPAM-1 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
| NVIDIA | T241 MPAM | T241-MPAM-4 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
| NVIDIA | T241 MPAM | T241-MPAM-6 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
+----------------+-----------------+-----------------+-----------------------------+

@@ -6,6 +6,7 @@ S/390 PCI
Authors:
- Pierre Morel
- Niklas Schnelle
Copyright, IBM Corp. 2020
@@ -27,14 +28,16 @@ Command line parameters
debugfs entries
---------------
The S/390 debug feature (s390dbf) generates views to hold various debug results in sysfs directories of the form:
The S/390 debug feature (s390dbf) generates views to hold various debug results
in sysfs directories of the form:
* /sys/kernel/debug/s390dbf/pci_*/
For example:
- /sys/kernel/debug/s390dbf/pci_msg/sprintf
Holds messages from the processing of PCI events, like machine check handling
holds messages from the processing of PCI events, like machine check handling
and setting of global functionality, like UID checking.
Change the level of logging to be more or less verbose by piping
@@ -47,87 +50,141 @@ Sysfs entries
Entries specific to zPCI functions and entries that hold zPCI information.
* /sys/bus/pci/slots/XXXXXXXX
* /sys/bus/pci/slots/XXXXXXXX:
The slot entries are set up using the function identifier (FID) of the
PCI function. The format depicted as XXXXXXXX above is 8 hexadecimal digits
with 0 padding and lower case hexadecimal digits.
The slot entries are set up using the function identifier (FID) of the PCI
function as slot name. The format depicted as XXXXXXXX above is 8 hexadecimal
digits with 0 padding and lower case hexadecimal digits.
- /sys/bus/pci/slots/XXXXXXXX/power
In addition to using the FID as the name of the slot, the slot directory
also contains the following s390-specific slot attributes.
- uid:
The User-defined identifier (UID) of the function which may be configured
by this slot. See also the corresponding attribute of the device.
A physical function that currently supports a virtual function cannot be
powered off until all virtual functions are removed with:
echo 0 > /sys/bus/pci/devices/XXXX:XX:XX.X/sriov_numvf
echo 0 > /sys/bus/pci/devices/DDDD:BB:dd.f/sriov_numvf
* /sys/bus/pci/devices/XXXX:XX:XX.X/
* /sys/bus/pci/devices/DDDD:BB:dd.f/:
- function_id
A zPCI function identifier that uniquely identifies the function in the Z server.
- function_id:
The zPCI function identifier (FID) is a 32-bit hexadecimal value that
uniquely identifies the PCI function. Unless the hypervisor provides
a virtual FID, e.g. on KVM, this identifier is unique across the machine, even
between different partitions.
- function_handle
Low-level identifier used for a configured PCI function.
It might be useful for debugging.
- function_handle:
This 32-bit hexadecimal value is a low-level identifier used for a PCI
function. Note that the function handle may be changed and become invalid
on PCI events and when enabling/disabling the PCI function.
- pchid
Model-dependent location of the I/O adapter.
- pchid:
This 16-bit hexadecimal value encodes a model-dependent location for
the PCI function.
- pfgid
PCI function group ID, functions that share identical functionality
- pfgid:
PCI function group ID; functions that share identical functionality
use a common identifier.
A PCI group defines interrupts, IOMMU, IOTLB, and DMA specifics.
- vfn
- vfn:
The virtual function number, from 1 to N for virtual functions,
0 for physical functions.
- pft
The PCI function type
- pft:
The PCI function type is an s390-specific type attribute. It indicates
a more general, usage-oriented type than the PCI Specification
class/vendor/device identifiers. That is, PCI functions with the same pft
value may be backed by different hardware implementations. At the same time,
apart from unclassified functions (pft is 0x00), the same pft value
generally implies a similar usage model. Conversely, the same
PCI hardware device may appear with different pft values when used in a
different usage model. For example, NETD and NETH VFs may be implemented
by the same PCI hardware device, but with NETD the parent Physical Function
is user managed while with NETH it is platform managed.
- port
The port corresponds to the physical port the function is attached to.
It also gives an indication of the physical function a virtual function
is attached to.
Currently the following PFT values are defined:
- uid
The user identifier (UID) may be defined as part of the machine
configuration or the z/VM or KVM guest configuration. If the accompanying
uid_is_unique attribute is 1 the platform guarantees that the UID is unique
within that instance and no devices with the same UID can be attached
during the lifetime of the system.
- 0x00 (UNC): Unclassified
- 0x02 (ROCE): RoCE Express
- 0x05 (ISM): Internal Shared Memory
- 0x0a (ROC2): RoCE Express 2
- 0x0b (NVMe): NVMe
- 0x0c (NETH): Network Express hybrid
- 0x0d (CNW): Cloud Network Adapter
- 0x0f (NETD): Network Express direct
- uid_is_unique
Indicates whether the user identifier (UID) is guaranteed to be and remain
unique within this Linux instance.
- port:
The port is a decimal value corresponding to the physical port the function
is attached to. Virtual Functions (VFs) share the port with their parent
Physical Function (PF). A value of 0 indicates that the port attribute is
not applicable for that PCI function type.
- pfip/segmentX
- uid:
The user-defined identifier (UID) for a PCI function is a 32-bit
hexadecimal value. It is defined on a per instance basis as part of the
partition, KVM guest, or z/VM guest configuration. If UID Checking is
enabled the platform ensures that the UID is unique within that instance
and no two PCI functions with the same UID will be visible to the instance.
Independent of this guarantee, and unlike the function ID (FID), the UID may
be the same in different partitions within the same machine. This allows
PCI configurations in multiple partitions to be made identical in the
UID namespace.
- uid_is_unique:
A 0 or 1 flag indicating whether the user-defined identifier (UID) is
guaranteed to be and remain unique within this Linux instance. This
platform feature is called UID Checking.
- pfip/segmentX:
The segments determine the isolation of a function.
They correspond to the physical path to the function.
The more the segments are different, the more the functions are isolated.
- fidparm:
Contains an 8-bit per-PCI-function parameter field in hexadecimal provided
by the platform. The meaning of this field is PCI function type specific.
For NETH VFs a value of 0x01 indicates that the function supports
promiscuous mode.
* /sys/firmware/clp/uid_checking:
In addition to the per-device uid_is_unique attribute this presents a
global indication of whether UID Checking is enabled. This allows users
to check for UID Checking even when no PCI functions are configured.
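The attributes above are plain text files, so user space can interpret them directly. The following is a minimal sketch, not part of the kernel: the names ``PFT_NAMES`` and ``describe_pft`` are hypothetical, the table is copied from the list of PFT values above, and the assumption is that the pft attribute prints a hexadecimal string such as ``0x0c``.

```python
# PFT values copied from the list in the documentation above.
PFT_NAMES = {
    0x00: "UNC (Unclassified)",
    0x02: "ROCE (RoCE Express)",
    0x05: "ISM (Internal Shared Memory)",
    0x0A: "ROC2 (RoCE Express 2)",
    0x0B: "NVMe (NVMe)",
    0x0C: "NETH (Network Express hybrid)",
    0x0D: "CNW (Cloud Network Adapter)",
    0x0F: "NETD (Network Express direct)",
}


def describe_pft(text):
    """Map the text read from a pft attribute to a readable name.

    Assumption: the attribute holds a hexadecimal number, with or
    without a leading "0x", optionally followed by a newline.
    """
    value = int(text.strip(), 16)
    return PFT_NAMES.get(value, "unknown (0x%02x)" % value)
```

A caller would typically pass in the contents of ``/sys/bus/pci/devices/DDDD:BB:dd.f/pft``; values missing from the table are reported as unknown rather than guessed.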
Enumeration and hotplug
=======================
The PCI address consists of four parts: domain, bus, device and function,
and is of this form: DDDD:BB:dd.f
and is of this form: DDDD:BB:dd.f.
* When not using multi-functions (norid is set, or the firmware does not
support multi-functions):
* For a PCI function for which the platform does not expose the RID, or when
the pci=norid kernel parameter is used, or for a so-called isolated Virtual
Function, which does have RID information but is used without its parent
Physical Function being part of the same PCI configuration:
- There is only one function per domain.
- The domain is set from the zPCI function's UID as defined during the
LPAR creation.
- The domain is set from the zPCI function's UID if UID Checking is on;
otherwise the domain ID is generated dynamically and is not stable
across reboots or hot plug.
* When using multi-functions (norid parameter is not set),
zPCI functions are addressed differently:
* For a PCI function for which the platform exposes the RID and which
is not an Isolated Virtual Function:
- There is still only one bus per domain.
- There can be up to 256 functions per bus.
- There can be up to 256 PCI functions per bus.
- The domain part of the address of all functions for
a multi-Function device is set from the zPCI function's UID as defined
in the LPAR creation for the function zero.
- The domain part of the address of all functions within the same topology is
that of the configured PCI function with the lowest devfn within that
topology.
- New functions will only be ready for use after the function zero
(the function with devfn 0) has been enumerated.
- Virtual Functions generated by an SR-IOV capable Physical Function only
become visible once SR-IOV is enabled.
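The DDDD:BB:dd.f address form described above can be split into its four parts as follows. This is an illustrative sketch only: ``parse_pci_address`` is a hypothetical helper, and accepting more than four hexadecimal digits for the domain is an assumption. The ``devfn`` value uses the standard PCI encoding of a 5-bit device number and a 3-bit function number, which gives the 256 functions per bus mentioned above.

```python
import re

# Domain (4 or more hex digits, an assumption), bus, device, function.
_PCI_ADDR = re.compile(
    r"^([0-9a-fA-F]{4,8}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_pci_address(addr):
    """Split a PCI address of the form DDDD:BB:dd.f into its parts."""
    match = _PCI_ADDR.match(addr)
    if match is None:
        raise ValueError("not a PCI address: %r" % addr)
    domain, bus, device, function = (int(g, 16) for g in match.groups())
    if device > 0x1F:
        raise ValueError("device number out of range: %r" % addr)
    return {
        "domain": domain,
        "bus": bus,
        "device": device,
        "function": function,
        # 5 bits of device number followed by 3 bits of function number.
        "devfn": (device << 3) | function,
    }
```

For example, ``parse_pci_address("0015:00:1f.7")`` yields domain 0x15 and the highest possible devfn, 255.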

View File

@@ -431,17 +431,14 @@ matrix device.
* callback interfaces
open_device:
The vfio_ap driver uses this callback to register a
VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the matrix mdev
devices. The open_device callback is invoked by userspace to connect the
VFIO iommu group for the matrix mdev device to the MDEV bus. Access to the
KVM structure used to configure the KVM guest is provided via this callback.
The KVM structure, is used to configure the guest's access to the AP matrix
defined via the vfio_ap mediated device's sysfs attribute files.
the open_device callback is invoked by userspace to connect the
VFIO iommu group for the matrix mdev device to the MDEV bus. The
callback retrieves the KVM structure used to configure the KVM guest
and configures the guest's access to the AP matrix defined via the
vfio_ap mediated device's sysfs attribute files.
close_device:
unregisters the VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the
matrix mdev device and deconfigures the guest's AP matrix.
this callback deconfigures the guest's AP matrix.
ioctl:
this callback handles the VFIO_DEVICE_GET_INFO and VFIO_DEVICE_RESET ioctls
@@ -449,9 +446,8 @@ matrix device.
Configure the guest's AP resources
----------------------------------
Configuring the AP resources for a KVM guest will be performed when the
VFIO_GROUP_NOTIFY_SET_KVM notifier callback is invoked. The notifier
function is called when userspace connects to KVM. The guest's AP resources are
Configuring the AP resources for a KVM guest will be performed at the
time of ``open_device`` and ``close_device``. The guest's AP resources are
configured via its APCB by:
* Setting the bits in the APM corresponding to the APIDs assigned to the

View File

@@ -60,44 +60,18 @@ Besides initializing the TDX module, a per-cpu initialization SEAMCALL
must be done on one cpu before any other SEAMCALLs can be made on that
cpu.
The kernel provides two functions, tdx_enable() and tdx_cpu_enable() to
allow the user of TDX to enable the TDX module and enable TDX on local
cpu respectively.
Making SEAMCALL requires VMXON has been done on that CPU. Currently only
KVM implements VMXON. For now both tdx_enable() and tdx_cpu_enable()
don't do VMXON internally (not trivial), but depends on the caller to
guarantee that.
To enable TDX, the caller of TDX should: 1) temporarily disable CPU
hotplug; 2) do VMXON and tdx_enable_cpu() on all online cpus; 3) call
tdx_enable(). For example::
cpus_read_lock();
on_each_cpu(vmxon_and_tdx_cpu_enable());
ret = tdx_enable();
cpus_read_unlock();
if (ret)
goto no_tdx;
// TDX is ready to use
And the caller of TDX must guarantee the tdx_cpu_enable() has been
successfully done on any cpu before it wants to run any other SEAMCALL.
A typical usage is do both VMXON and tdx_cpu_enable() in CPU hotplug
online callback, and refuse to online if tdx_cpu_enable() fails.
User can consult dmesg to see whether the TDX module has been initialized.
If the TDX module is initialized successfully, dmesg shows something
like below::
[..] virt/tdx: 262668 KBs allocated for PAMT
[..] virt/tdx: module initialized
[..] virt/tdx: TDX-Module initialized
If the TDX module failed to initialize, dmesg also shows it failed to
initialize::
[..] virt/tdx: module initialization failed ...
[..] virt/tdx: TDX-Module initialization failed ...
TDX Interaction to Other Kernel Components
------------------------------------------
@@ -129,9 +103,9 @@ CPU Hotplug
~~~~~~~~~~~
TDX module requires the per-cpu initialization SEAMCALL must be done on
one cpu before any other SEAMCALLs can be made on that cpu. The kernel
provides tdx_cpu_enable() to let the user of TDX to do it when the user
wants to use a new cpu for TDX task.
one cpu before any other SEAMCALLs can be made on that cpu. The kernel,
via the CPU hotplug framework, performs the necessary initialization when
a CPU is first brought online.
TDX doesn't support physical (ACPI) CPU hotplug. During machine boot,
TDX verifies all boot-time present logical CPUs are TDX compatible before

View File

@@ -153,7 +153,7 @@ blk-crypto-fallback completes the original bio. If the original bio is too
large, multiple bounce bios may be required; see the code for details.
For decryption, blk-crypto-fallback "wraps" the bio's completion callback
(``bi_complete``) and private data (``bi_private``) with its own, unsets the
(``bi_end_io``) and private data (``bi_private``) with its own, unsets the
bio's encryption context, then submits the bio. If the read completes
successfully, blk-crypto-fallback restores the bio's original completion
callback and private data, then decrypts the bio's data in-place using the

View File

@@ -485,6 +485,125 @@ Limitations
in case that too many ublk devices are handled by this single io_ring_ctx
and each one has very large queue depth
Shared Memory Zero Copy (UBLK_F_SHMEM_ZC)
------------------------------------------
The ``UBLK_F_SHMEM_ZC`` feature provides an alternative zero-copy path
that works by sharing physical memory pages between the client application
and the ublk server. Unlike the io_uring fixed buffer approach above,
shared memory zero copy does not require io_uring buffer registration
per I/O — instead, it relies on the kernel matching physical pages
at I/O time. This allows the ublk server to access the shared
buffer directly, which is not the case for the io_uring fixed buffer
approach.
Motivation
~~~~~~~~~~
Shared memory zero copy takes a different approach: if the client
application and the ublk server both map the same physical memory, there is
nothing to copy. The kernel detects the shared pages automatically and
tells the server where the data already lives.
``UBLK_F_SHMEM_ZC`` can be thought of as a supplement for optimized client
applications — when the client is willing to allocate I/O buffers from
shared memory, the entire data path becomes zero-copy.
Use Cases
~~~~~~~~~
This feature is useful when the client application can be configured to
use a specific shared memory region for its I/O buffers:
- **Custom storage clients** that allocate I/O buffers from shared memory
(memfd, hugetlbfs) and issue direct I/O to the ublk device
- **Database engines** that use pre-allocated buffer pools with O_DIRECT
How It Works
~~~~~~~~~~~~
1. The ublk server and client both ``mmap()`` the same file (memfd or
hugetlbfs) with ``MAP_SHARED``. This gives both processes access to the
same physical pages.
2. The ublk server registers its mapping with the kernel::
struct ublk_shmem_buf_reg buf = { .addr = mmap_va, .len = size };
ublk_ctrl_cmd(UBLK_U_CMD_REG_BUF, .addr = &buf);
The kernel pins the pages and builds a PFN lookup tree.
3. When the client issues direct I/O (``O_DIRECT``) to ``/dev/ublkb*``,
the kernel checks whether the I/O buffer pages match any registered
pages by comparing PFNs.
4. On a match, the kernel sets ``UBLK_IO_F_SHMEM_ZC`` in the I/O
descriptor and encodes the buffer index and offset in ``addr``::
if (iod->op_flags & UBLK_IO_F_SHMEM_ZC) {
/* Data is already in our shared mapping — zero copy */
index = ublk_shmem_zc_index(iod->addr);
offset = ublk_shmem_zc_offset(iod->addr);
buf = shmem_table[index].mmap_base + offset;
}
5. If pages do not match (e.g., the client used a non-shared buffer),
the I/O falls back to the normal copy path silently.
The shared memory can be set up via two methods:
- **Socket-based**: the client sends a memfd to the ublk server via
``SCM_RIGHTS`` on a unix socket. The server mmaps and registers it.
- **Hugetlbfs-based**: both processes ``mmap(MAP_SHARED)`` the same
hugetlbfs file. No IPC needed — same file gives same physical pages.
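The effect of step 1 can be observed from user space. The sketch below is an illustration under stated assumptions, not ublk code: both mappings live in one process, whereas a real client and server are separate processes that share the file descriptor (for example via ``SCM_RIGHTS``). The helper names are hypothetical, and the temporary-file fallback exists only so that the sketch runs where ``memfd_create`` is unavailable.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def open_shared_file(size):
    """Return a file descriptor backing `size` bytes of shareable memory."""
    if hasattr(os, "memfd_create"):
        fd = os.memfd_create("ublk-shmem-demo")  # anonymous file, Linux only
    else:
        # Stand-in for systems without memfd_create.
        tmp = tempfile.TemporaryFile()
        fd = os.dup(tmp.fileno())
        tmp.close()
    os.ftruncate(fd, size)
    return fd


def demo():
    """Write through one MAP_SHARED mapping and read through another."""
    fd = open_shared_file(PAGE)
    try:
        client_view = mmap.mmap(fd, PAGE, flags=mmap.MAP_SHARED)
        server_view = mmap.mmap(fd, PAGE, flags=mmap.MAP_SHARED)
        client_view[:5] = b"hello"       # the "client" fills its I/O buffer
        seen = bytes(server_view[:5])    # the "server" sees the same pages
        client_view.close()
        server_view.close()
        return seen
    finally:
        os.close(fd)
```

Because both mappings are backed by the same pages, the data written through one view is visible through the other without any copy, which is the property the zero-copy path relies on.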
Advantages
~~~~~~~~~~
- **Simple**: no per-I/O buffer registration or unregistration commands.
Once the shared buffer is registered, all matching I/O is zero-copy
automatically.
- **Direct buffer access**: the ublk server can read and write the shared
buffer directly via its own mmap, without going through io_uring fixed
buffer operations. This is more friendly for server implementations.
- **Fast**: PFN matching is a single maple tree lookup per bvec. No
io_uring command round-trips for buffer management.
- **Compatible**: non-matching I/O silently falls back to the copy path.
The device works normally for any client, with zero-copy as an
optimization when shared memory is available.
Limitations
~~~~~~~~~~~
- **Requires client cooperation**: the client must allocate its I/O
buffers from the shared memory region. This requires a custom or
configured client — standard applications using their own buffers
will not benefit.
- **Direct I/O only**: buffered I/O (without ``O_DIRECT``) goes through
the page cache, which allocates its own pages. These kernel-allocated
pages will never match the registered shared buffer. Only ``O_DIRECT``
puts the client's buffer pages directly into the block I/O.
- **Contiguous data only**: each I/O request's data must be contiguous
within a single registered buffer. Scatter/gather I/O that spans
multiple non-adjacent registered buffers cannot use the zero-copy path.
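The matching rule and the contiguity limitation can be modelled in a few lines. This is a toy model and not the kernel implementation: ``PfnTable`` is a hypothetical class, a dictionary stands in for the kernel's lookup tree, and offsets are counted in pages instead of bytes.

```python
class PfnTable:
    """Toy model of PFN matching for registered shared buffers."""

    def __init__(self):
        self._by_pfn = {}       # pfn -> (buffer index, page number in buffer)
        self._next_index = 0

    def register(self, pfns):
        """Register a buffer given its page frame numbers in buffer order."""
        index = self._next_index
        self._next_index += 1
        for page_no, pfn in enumerate(pfns):
            self._by_pfn[pfn] = (index, page_no)
        return index

    def match(self, request_pfns):
        """Return (buffer index, first page offset) or None.

        None models the silent fallback to the copy path: some page is
        not registered, or the pages are not consecutive within a
        single registered buffer.
        """
        if not request_pfns:
            return None
        first = self._by_pfn.get(request_pfns[0])
        if first is None:
            return None
        index, offset = first
        for i, pfn in enumerate(request_pfns[1:], start=1):
            if self._by_pfn.get(pfn) != (index, offset + i):
                return None
        return index, offset
```

In this model a request whose pages are consecutive within one buffer matches even if the underlying page frames are physically scattered, while a request that crosses from one registered buffer into another does not.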
Control Commands
~~~~~~~~~~~~~~~~
- ``UBLK_U_CMD_REG_BUF``
Register a shared memory buffer. ``ctrl_cmd.addr`` points to a
``struct ublk_shmem_buf_reg`` containing the buffer virtual address and size.
Returns the assigned buffer index (>= 0) on success. The kernel pins
pages and builds the PFN lookup tree. Queue freeze is handled
internally.
- ``UBLK_U_CMD_UNREG_BUF``
Unregister a previously registered buffer. ``ctrl_cmd.data[0]`` is the
buffer index. Unpins pages and removes PFN entries from the lookup
tree.
References
==========

View File

@@ -26,8 +26,8 @@ about these objects, including id, type and name.
The main use-case `bpf_inspect.py`_ covers is to show BPF programs of types
``BPF_PROG_TYPE_EXT`` and ``BPF_PROG_TYPE_TRACING`` attached to other BPF
programs via ``freplace``/``fentry``/``fexit`` mechanisms, since there is no
user-space API to get this information.
programs via ``freplace``/``fentry``/``fexit``/``fsession`` mechanisms, since
there is no user-space API to get this information.
Getting started
===============

View File

@@ -207,6 +207,10 @@ described in more detail in the footnotes.
+ + +----------------------------------+-----------+
| | | ``fexit.s+`` [#fentry]_ | Yes |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_TRACE_FSESSION`` | ``fsession+`` [#fentry]_ | |
+ + +----------------------------------+-----------+
| | | ``fsession.s+`` [#fentry]_ | Yes |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_TRACE_ITER`` | ``iter+`` [#iter]_ | |
+ + +----------------------------------+-----------+
| | | ``iter.s+`` [#iter]_ | Yes |

View File

@@ -455,6 +455,7 @@ if html_theme == "alabaster":
# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
html_logo = "images/logo.svg"
html_favicon = "images/logo.svg"
# Output file base name for HTML help builder.
htmlhelp_basename = "TheLinuxKerneldoc"

View File

@@ -15,7 +15,7 @@ various deferrals etc...
Sometimes housekeeping is just some unbound work (unbound workqueues,
unbound timers, ...) that gets easily assigned to non-isolated CPUs.
But sometimes housekeeping is tied to a specific CPU and requires
elaborated tricks to be offloaded to non-isolated CPUs (RCU_NOCB, remote
elaborate tricks to be offloaded to non-isolated CPUs (RCU_NOCB, remote
scheduler tick, etc...).
Thus, a housekeeping CPU can be considered as the reverse of an isolated

View File

@@ -9,3 +9,4 @@ IRQs
irq-affinity
irq-domain
irqflags-tracing
managed_irq

View File

@@ -0,0 +1,116 @@
.. SPDX-License-Identifier: GPL-2.0
===========================
Affinity managed interrupts
===========================
The IRQ core provides support for managing interrupts according to a specified
CPU affinity. Under normal operation, an interrupt is associated with a
particular CPU. If that CPU is taken offline, the interrupt is migrated to
another online CPU.
Devices with large numbers of interrupt vectors can stress the available vector
space. For example, an NVMe device with 128 I/O queues typically requests one
interrupt per queue on systems with at least 128 CPUs. Two such devices
therefore request 256 interrupts. On x86, the interrupt vector space is
notoriously low, providing only 256 vectors per CPU, and the kernel reserves a
subset of these, further reducing the number available for device interrupts.
In practice this is not an issue because the interrupts are distributed across
many CPUs, so each CPU only receives a small number of vectors.
During system suspend, however, all secondary CPUs are taken offline and all
interrupts are migrated to the single CPU that remains online. This can exhaust
the available interrupt vectors on that CPU and cause the suspend operation to
fail.
Affinity-managed interrupts address this limitation. Each interrupt is assigned
a CPU affinity mask that specifies the set of CPUs on which the interrupt may
be targeted. When a CPU in the mask goes offline, the interrupt is moved to the
next CPU in the mask. If the last CPU in the mask goes offline, the interrupt
is shut down. Drivers using affinity-managed interrupts must ensure that the
associated queue is quiesced before the interrupt is disabled so that no
further interrupts are generated. When a CPU in the affinity mask comes back
online, the interrupt is re-enabled.
Implementation
--------------
Devices must provide per-instance interrupts, such as per-I/O-queue interrupts
for storage devices like NVMe. The driver allocates interrupt vectors with the
required affinity settings using struct irq_affinity. For MSI-X devices, this
is done via pci_alloc_irq_vectors_affinity() with the PCI_IRQ_AFFINITY flag
set.
Based on the provided affinity information, the IRQ core attempts to spread the
interrupts evenly across the system. The affinity masks are computed during
this allocation step, but the final IRQ assignment is performed when
request_irq() is invoked.
Isolated CPUs
-------------
The affinity of managed interrupts is handled entirely in the kernel and cannot
be modified from user space through the /proc interfaces. The managed_irq
subparameter of the isolcpus boot option specifies a CPU mask that managed
interrupts should attempt to avoid. This isolation is best-effort and only
applies if the automatically assigned interrupt mask also contains online CPUs
outside the avoided mask. If the requested mask contains only isolated CPUs,
the setting has no effect.
CPUs listed in the avoided mask remain part of the interrupt's affinity mask.
This means that if all non-isolated CPUs go offline while isolated CPUs remain
online, the interrupt will be assigned to one of the isolated CPUs.
The following examples assume a system with 8 CPUs.
- A QEMU instance is booted with "-device virtio-scsi-pci".
The MSI-X device exposes 11 interrupts: 3 "management" interrupts and 8
"queue" interrupts. The driver requests the 8 queue interrupts, each of which
is affine to exactly one CPU. If that CPU goes offline, the interrupt is shut
down.
Assuming interrupt 48 is one of the queue interrupts, the following appears::
/proc/irq/48/effective_affinity_list:7
/proc/irq/48/smp_affinity_list:7
This indicates that the interrupt is served only by CPU7. Shutting down CPU7
does not migrate the interrupt to another CPU::
/proc/irq/48/effective_affinity_list:0
/proc/irq/48/smp_affinity_list:7
This can be verified via the debugfs interface
(/sys/kernel/debug/irq/irqs/48). The dstate field will include
IRQD_IRQ_DISABLED, IRQD_IRQ_MASKED and IRQD_MANAGED_SHUTDOWN.
- A QEMU instance is booted with "-device virtio-scsi-pci,num_queues=2"
and the kernel command line includes:
"irqaffinity=0,1 isolcpus=domain,2-7 isolcpus=managed_irq,1-3,5-7".
The MSI-X device exposes 5 interrupts: 3 management interrupts and 2 queue
interrupts. The management interrupts follow the irqaffinity= setting. The
queue interrupts are spread across available CPUs::
/proc/irq/47/effective_affinity_list:0
/proc/irq/47/smp_affinity_list:0-3
/proc/irq/48/effective_affinity_list:4
/proc/irq/48/smp_affinity_list:4-7
The two queue interrupts are evenly distributed. Interrupt 48 is placed on CPU4
because the managed_irq mask avoids CPUs 5-7 when possible.
Replacing the managed_irq argument with "isolcpus=managed_irq,1-3,4-5,7"
results in::
/proc/irq/48/effective_affinity_list:6
/proc/irq/48/smp_affinity_list:4-7
Interrupt 48 is now served on CPU6 because the system avoids CPUs 4, 5 and
7. If CPU6 is taken offline, the interrupt migrates to one of the "isolated"
CPUs::
/proc/irq/48/effective_affinity_list:7
/proc/irq/48/smp_affinity_list:4-7
The interrupt is shut down once all CPUs listed in its smp_affinity mask are
offline.
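The target selection described in this document can be summarised as a small model. It is a sketch under stated assumptions and not the kernel's algorithm: ``parse_cpu_list`` and ``pick_target`` are hypothetical helpers, and choosing the lowest-numbered candidate is an assumption (in the last example above the kernel chose CPU7, so only set membership should be relied on).

```python
def parse_cpu_list(text):
    """Parse a CPU list such as "1-3,5-7" into a set of CPU numbers."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        low, _, high = part.partition("-")
        cpus.update(range(int(low), int(high or low) + 1))
    return cpus


def pick_target(affinity, online, avoid):
    """Return the CPU serving a managed interrupt, or None if shut down.

    affinity: CPUs in the interrupt's affinity mask
    online:   CPUs that are currently online
    avoid:    CPUs named by isolcpus=managed_irq (best-effort only)
    """
    candidates = affinity & online
    if not candidates:
        return None                    # every CPU in the mask is offline
    preferred = candidates - avoid     # honour isolation when possible
    return min(preferred or candidates)
```

With 8 online CPUs and an affinity mask of 4-7, avoiding ``1-3,5-7`` selects CPU4 and avoiding ``1-3,4-5,7`` selects CPU6, which matches the examples above. Once CPU6 goes offline, the model falls back to one of the isolated CPUs in the mask.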

View File

@@ -22,6 +22,12 @@ memblock preservation ABI
.. kernel-doc:: include/linux/kho/abi/memblock.h
:doc: memblock kexec handover ABI
KHO persistent memory tracker ABI
=================================
.. kernel-doc:: include/linux/kho/abi/kexec_handover.h
:doc: KHO persistent memory tracker
See Also
========

View File

@@ -71,17 +71,17 @@ for boot memory allocations and as target memory for kexec blobs, some parts
of that memory region may be reserved. These reservations are irrelevant for
the next KHO, because kexec can overwrite even the original kernel.
.. _kho-finalization-phase:
Kexec Handover Radix Tree
=========================
KHO finalization phase
======================
.. kernel-doc:: include/linux/kho_radix_tree.h
:doc: Kexec Handover Radix Tree
To enable user space based kexec file loader, the kernel needs to be able to
provide the FDT that describes the current kernel's state before
performing the actual kexec. The process of generating that FDT is
called serialization. When the FDT is generated, some properties
of the system may become immutable because they are already written down
in the FDT. That state is called the KHO finalization phase.
Public API
==========
.. kernel-doc:: kernel/liveupdate/kexec_handover.c
:export:
See Also
========

View File

@@ -96,7 +96,7 @@ NODE_CANCEL_ADDING_FIRST_MEMORY
Generated if NODE_ADDING_FIRST_MEMORY fails.
NODE_ADDED_FIRST_MEMORY
Generated when memory has become available fo this node for the first time.
Generated when memory has become available for this node for the first time.
NODE_REMOVING_LAST_MEMORY
Generated when the last memory available to this node is about to be offlined.

View File

@@ -103,6 +103,42 @@ For debugging purposes there are also two conditionally-compiled macros:
pr_debug() and pr_devel(), which are compiled-out unless ``DEBUG`` (or
also ``CONFIG_DYNAMIC_DEBUG`` in the case of pr_debug()) is defined.
Avoiding lockups from excessive printk() use
============================================
.. note::
This section is relevant only for legacy console drivers (those not
using the nbcon API) and !PREEMPT_RT kernels. Once all console drivers
are updated to nbcon, this documentation can be removed.
Using ``printk()`` in hot paths (such as interrupt handlers, timer
callbacks, or high-frequency network receive routines) with legacy
consoles (e.g., ``console=ttyS0``) may cause lockups. Legacy consoles
synchronously acquire ``console_sem`` and block while flushing messages,
potentially disabling interrupts long enough to trigger hard or soft
lockup detectors.
To avoid this:
- Use rate-limited variants (e.g., ``pr_*_ratelimited()``) or one-time
macros (e.g., ``pr_*_once()``) to reduce message frequency.
- Assign lower log levels (e.g., ``KERN_DEBUG``) to non-essential messages
and filter console output via ``console_loglevel``.
- Use ``printk_deferred()`` to log messages immediately to the ringbuffer
and defer console printing. This is a workaround for legacy consoles.
- Port legacy console drivers to the non-blocking ``nbcon`` API (indicated
by ``CON_NBCON``). This is the preferred solution, as nbcon consoles
offload message printing to a dedicated kernel thread.
For temporary debugging, ``trace_printk()`` can be used, but it must not
appear in mainline code. See ``Documentation/trace/debugging.rst`` for
more information.
If more permanent output is needed in a hot path, trace events can be used.
See ``Documentation/trace/events.rst`` and
``samples/trace_events/trace-events-sample.[ch]``.
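The idea behind the rate-limited variants can be illustrated with a small user-space model. This is a sketch of interval-and-burst rate limiting in general, not the kernel's implementation: the ``RateLimit`` class is hypothetical, and the defaults of 5 seconds and 10 messages are assumptions chosen to resemble the kernel's default rate limit.

```python
class RateLimit:
    """Allow at most `burst` messages per `interval` seconds."""

    def __init__(self, interval=5.0, burst=10):
        self.interval = interval
        self.burst = burst
        self.begin = None    # start time of the current window
        self.printed = 0     # messages let through in this window
        self.missed = 0      # messages suppressed so far

    def allow(self, now):
        """Return True if a message issued at time `now` may be printed."""
        if self.begin is None or now - self.begin >= self.interval:
            self.begin = now          # open a new window
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        self.missed += 1
        return False
```

A hot path that emits 25 messages within a quarter of a second gets only the first 10 through; the rest are counted as missed until the next window opens.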
Function reference
==================

View File

@@ -74,7 +74,7 @@ Exception handlers
Enabling interrupts is especially important on PREEMPT_RT, where certain
locks, such as spinlock_t, become sleepable. For example, handling an
invalid opcode may result in sending a SIGILL signal to the user task. A
debug excpetion will send a SIGTRAP signal.
debug exception will send a SIGTRAP signal.
In both cases, if the exception occurred in user space, it is safe to enable
interrupts early. Sending a signal requires both interrupts and kernel
preemption to be enabled.

View File

@@ -213,7 +213,7 @@ to suspend until the callback completes, ensuring forward progress without
risking livelock.
In order to solve the problem at the API level, the sequence locks were extended
to allow a proper handover between the the spinning reader and the maybe
to allow a proper handover between the spinning reader and the maybe
blocked writer.
Sequence locks

View File

@@ -114,6 +114,11 @@ inspected with modinfo::
import_ns: USB_STORAGE
[...]
For modules that are currently loaded, imported namespaces are also available
via sysfs::
$ cat /sys/module/ums_karma/import_ns
USB_STORAGE
It is advisable to add the MODULE_IMPORT_NS() statement close to other module
metadata definitions like MODULE_AUTHOR() or MODULE_LICENSE().

View File

@@ -378,9 +378,9 @@ Affinity Scopes
An unbound workqueue groups CPUs according to its affinity scope to improve
cache locality. For example, if a workqueue is using the default affinity
scope of "cache", it will group CPUs according to last level cache
boundaries. A work item queued on the workqueue will be assigned to a worker
on one of the CPUs which share the last level cache with the issuing CPU.
scope of "cache_shard", it will group CPUs into sub-LLC shards. A work item
queued on the workqueue will be assigned to a worker on one of the CPUs
within the same shard as the issuing CPU.
Once started, the worker may or may not be allowed to move outside the scope
depending on the ``affinity_strict`` setting of the scope.
@@ -402,7 +402,13 @@ Workqueue currently supports the following affinity scopes.
``cache``
CPUs are grouped according to cache boundaries. Which specific cache
boundary is used is determined by the arch code. L3 is used in a lot of
cases. This is the default affinity scope.
cases.
``cache_shard``
CPUs are grouped into sub-LLC shards of at most ``wq_cache_shard_size``
cores (default 8, tunable via the ``workqueue.cache_shard_size`` boot
parameter). Shards are always split on core (SMT group) boundaries.
This is the default affinity scope.
``numa``
CPUs are grouped according to NUMA boundaries.
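The ``cache_shard`` grouping can be pictured with the following sketch. It is an assumption-laden illustration and not the kernel's algorithm: ``split_into_shards`` is a hypothetical helper that chunks the cores of one last level cache domain in order, and the kernel may distribute cores across shards differently. The two properties taken from the description above are that a shard holds at most ``shard_size`` cores and that SMT siblings are never split.

```python
def split_into_shards(cores, shard_size=8):
    """Group the cores of one LLC domain into shards.

    cores: list of cores, each given as a list of its SMT sibling CPUs.
    Returns a list of shards, each a flat list of CPU numbers.
    """
    if shard_size < 1:
        raise ValueError("shard_size must be at least 1")
    shards = []
    for start in range(0, len(cores), shard_size):
        group = cores[start:start + shard_size]
        # Flatten whole cores so that siblings stay in the same shard.
        shards.append([cpu for core in group for cpu in core])
    return shards
```

For an LLC domain with 12 two-thread cores and the default size of 8, this produces one shard with 8 cores (16 CPUs) and one with 4 cores (8 CPUs).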

View File

@@ -13,6 +13,7 @@ for cryptographic use cases, as well as programming examples.
:caption: Table of contents
:maxdepth: 2
libcrypto
intro
api-intro
architecture
@@ -27,4 +28,3 @@ for cryptographic use cases, as well as programming examples.
descore-readme
device_drivers/index
krb5
sha3

View File

@@ -0,0 +1,19 @@
.. SPDX-License-Identifier: GPL-2.0-or-later
Block ciphers
=============
AES
---
Support for the AES block cipher.
.. kernel-doc:: include/crypto/aes.h
DES
---
Support for the DES block cipher. This algorithm is obsolete and is supported
only for backwards compatibility.
.. kernel-doc:: include/crypto/des.h

View File

@@ -0,0 +1,86 @@
.. SPDX-License-Identifier: GPL-2.0-or-later
Hash functions, MACs, and XOFs
==============================
AES-CMAC and AES-XCBC-MAC
-------------------------
Support for the AES-CMAC and AES-XCBC-MAC message authentication codes.
.. kernel-doc:: include/crypto/aes-cbc-macs.h
BLAKE2b
-------
Support for the BLAKE2b cryptographic hash function.
.. kernel-doc:: include/crypto/blake2b.h
BLAKE2s
-------
Support for the BLAKE2s cryptographic hash function.
.. kernel-doc:: include/crypto/blake2s.h
GHASH and POLYVAL
-----------------
Support for the GHASH and POLYVAL universal hash functions. These algorithms
are used only as internal components of other algorithms.
.. kernel-doc:: include/crypto/gf128hash.h
MD5
---
Support for the MD5 cryptographic hash function and HMAC-MD5. This algorithm is
obsolete and is supported only for backwards compatibility.
.. kernel-doc:: include/crypto/md5.h
NH
--
Support for the NH universal hash function. This algorithm is used only as an
internal component of other algorithms.
.. kernel-doc:: include/crypto/nh.h
Poly1305
--------
Support for the Poly1305 universal hash function. This algorithm is used only
as an internal component of other algorithms.
.. kernel-doc:: include/crypto/poly1305.h
SHA-1
-----
Support for the SHA-1 cryptographic hash function and HMAC-SHA1. This algorithm
is obsolete and is supported only for backwards compatibility.
.. kernel-doc:: include/crypto/sha1.h
SHA-2
-----
Support for the SHA-2 family of cryptographic hash functions, including SHA-224,
SHA-256, SHA-384, and SHA-512. This also includes their corresponding HMACs:
HMAC-SHA224, HMAC-SHA256, HMAC-SHA384, and HMAC-SHA512.
.. kernel-doc:: include/crypto/sha2.h
SHA-3
-----
The SHA-3 functions are documented in :ref:`sha3`.
SM3
---
Support for the SM3 cryptographic hash function.
.. kernel-doc:: include/crypto/sm3.h

View File

@@ -0,0 +1,11 @@
.. SPDX-License-Identifier: GPL-2.0-or-later
Digital signature algorithms
============================
ML-DSA
------
Support for the ML-DSA digital signature algorithm.
.. kernel-doc:: include/crypto/mldsa.h

View File

@@ -0,0 +1,6 @@
.. SPDX-License-Identifier: GPL-2.0-or-later
Utility functions
=================
.. kernel-doc:: include/crypto/utils.h
