Commit Graph

1367303 Commits

Author SHA1 Message Date
Bjorn Helgaas
fc9a7d38d5 Merge branch 'pci/enumeration'
- Allow 'isolated PCI functions' (multi-function devices without a function
  0) for LoongArch, similar to s390 and jailhouse (Huacai Chen)

- Mask out unrelated bits in PCIE_LNKCAP_SLS2SPEED() and
  PCIE_LNKCTL2_TLS2SPEED(), which makes them more robust and fixes a
  WARN_ON_ONCE() in pcie_set_target_speed() (Jiwei Sun)

- Read Link Control 2 again when retraining a link after a training failure
  so we try to increase the link speed (Jiwei Sun)

- Allow built-in drivers, not just modular drivers, to use async initial
  probing (Lukas Wunner)

- Support Immediate Readiness even on devices with no PM Capability (Sean
  Christopherson)

* pci/enumeration:
  PCI: Support Immediate Readiness on devices without PM capabilities
  PCI: Allow built-in drivers to use async initial probing
  PCI: Adjust the position of reading the Link Control 2 register
  PCI: Fix link speed calculation on retrain failure
  PCI: Extend isolated function probing to LoongArch
2025-07-31 16:11:41 -05:00
Bjorn Helgaas
e6035a0809 Merge branch 'pci/boot-display'
- Add pci_is_display() to check for "Display" base class and use it in
  ALSA hda, vfio, vga_switcheroo, vt-d (Mario Limonciello)

* pci/boot-display:
  ALSA: hda: Use pci_is_display()
  iommu/vt-d: Use pci_is_display()
  vga_switcheroo: Use pci_is_display()
  vfio/pci: Use pci_is_display()
  PCI: Add pci_is_display() to check if device is a display controller
2025-07-31 16:11:40 -05:00
Bjorn Helgaas
010c310577 Merge branch 'pci/aspm'
- Change aspm_disabled and aspm_force from int to bool (Hans Zhang)

- Initialize val at declaration (Hans Zhang)

* pci/aspm:
  PCI/ASPM: Consolidate variable declaration and initialization
  PCI/ASPM: Use boolean type for aspm_disabled and aspm_force
2025-07-31 16:11:40 -05:00
Bjorn Helgaas
2de2f9274f Merge branch 'pci/aer'
- Change pcie_aer_disable from int to bool (Hans Zhang)

- Add message if AER interrupt occurs and we find more downstream devices
  with AER errors logged than we can process (Akshay Jindal)

* pci/aer:
  PCI/AER: Add message when AER_MAX_MULTI_ERR_DEVICES limit is hit
  PCI/AER: Use bool for AER disable state tracking
2025-07-31 16:11:39 -05:00
Sean Christopherson
5c0d0ee36f PCI: Support Immediate Readiness on devices without PM capabilities
Query support for Immediate Readiness irrespective of whether or not the
device supports PM capabilities, as nothing in the PCIe spec suggests that
Immediate Readiness is in any way dependent on PM functionality.

Fixes: d6112f8def ("PCI: Add support for Immediate Readiness")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: David Matlack <dmatlack@google.com>
Cc: Vipin Sharma <vipinsh@google.com>
Cc: Aaron Lewis <aaronlewis@google.com>
Link: https://patch.msgid.link/20250722155926.352248-1-seanjc@google.com
2025-07-22 18:02:44 -05:00
Mario Limonciello
6642adf0c1 ALSA: hda: Use pci_is_display()
The inline pci_is_display() helper does the same thing.  Use it.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Link: https://patch.msgid.link/20250717173812.3633478-6-superm1@kernel.org
2025-07-17 15:30:13 -05:00
Mario Limonciello
75952c4975 iommu/vt-d: Use pci_is_display()
The inline pci_is_display() helper does the same thing.  Use it.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Link: https://patch.msgid.link/20250717173812.3633478-5-superm1@kernel.org
2025-07-17 15:30:13 -05:00
Mario Limonciello
b1060ea44a vga_switcheroo: Use pci_is_display()
The inline pci_is_display() helper does the same thing.  Use it.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Link: https://patch.msgid.link/20250717173812.3633478-4-superm1@kernel.org
2025-07-17 15:30:13 -05:00
Mario Limonciello
a7feca7c88 vfio/pci: Use pci_is_display()
The inline pci_is_display() helper does the same thing.  Use it.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://patch.msgid.link/20250717173812.3633478-3-superm1@kernel.org
2025-07-17 15:30:13 -05:00
Mario Limonciello
76720eed7d PCI: Add pci_is_display() to check if device is a display controller
Several places in the kernel do class shifting to match whether a PCI
device is display class.  Add pci_is_display() for those places to use.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Link: https://patch.msgid.link/20250717173812.3633478-2-superm1@kernel.org
2025-07-17 15:30:05 -05:00
Lukas Wunner
9170304169 PCI: Allow built-in drivers to use async initial probing
The PCI core has historically not allowed built-in drivers to opt in to
async initial probing:  Drivers may set "PROBE_PREFER_ASYNCHRONOUS", but
initial probing always happens synchronously.  That's because the PCI core
uses device_attach() instead of device_initial_probe().

Should a driver return -EPROBE_DEFER on initial probe, reprobing later on
does honor the PROBE_PREFER_ASYNCHRONOUS setting.  Modular drivers are
also allowed to probe asynchronously, which is inconsistent.

The choice of device_attach() is likely not deliberate:  It was introduced
in 2013 with commit 58d9a38f6f ("PCI: Skip attaching driver in
device_add()"), but asynchronous probing was added two years later with
commit 765230b5f0 ("driver-core: add asynchronous probing support for
drivers").

According to the kernel-doc of "enum probe_type", "the end goal is to
switch the kernel to use asynchronous probing by default".  To this end,
use device_initial_probe() to allow asynchronous initial probing.  The
function returns void, making the return value check unnecessary.

Initial PCI probing often takes on the order of seconds even on laptops,
so this may speed up booting significantly.

A small number of PCI drivers already opt in to asynchronous probing.
Their maintainers (who are all cc'ed) should watch out for issues, now
that asynchronous probing is not just allowed for deferred and modular
probing, but also initial probing:

  hl_pci_driver        drivers/accel/habanalabs/common/habanalabs_drv.c
  cxl_pci_driver       drivers/cxl/pci.c
  quicki2c_driver      drivers/hid/intel-thc-hid/intel-quicki2c/pci-quicki2c.c
  quickspi_driver      drivers/hid/intel-thc-hid/intel-quickspi/pci-quickspi.c
  i801_driver          drivers/i2c/busses/i2c-i801.c
  mei_me_driver        drivers/misc/mei/pci-me.c
  mei_vsc_drv          drivers/misc/mei/platform-vsc.c
  sdhci_driver         drivers/mmc/host/sdhci-pci-core.c
  nvme_driver          drivers/nvme/host/pci.c
  ehci_pci_driver      drivers/usb/host/ehci-pci.c
  hvfb_pci_stub_driver drivers/video/fbdev/hyperv_fb.c

All other driver maintainers may test asynchronous probing by specifying
the command line parameter "driver_async_probe=drv_name1,drv_name2,...",
and on success setting "probe_type = PROBE_PREFER_ASYNCHRONOUS" in the
pci_driver struct.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
[bhelgaas: updated commit log per https://lore.kernel.org/r/aHYUh7WoDlhHckxd@wunner.de]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/53abe6f5ac7c631f95f5d061aa748b192eda0379.1751614426.git.lukas@wunner.de
2025-07-15 11:23:33 -05:00
Akshay Jindal
a6f494becf PCI/AER: Add message when AER_MAX_MULTI_ERR_DEVICES limit is hit
When a PCIe device detects an error, it logs the error locally and issues
an error Message routed to the Root Complex (PCIe r6.0, sec 6.2.5). If the
Root Port or RCEC supports AER and Linux has enabled the AER interrupt,
aer_isr() traverses the relevant devices and adds those with AER errors
logged to the aer_err_info.dev[] array for error logging and recovery.

If aer_isr() finds more than AER_MAX_MULTI_ERR_DEVICES devices with AER
errors logged, it silently ignores them, and those extra devices are not
included in the recovery flow.

Emit an error message if we find more than AER_MAX_MULTI_ERR_DEVICES
devices with AER errors logged.

Testing details at link below.

Signed-off-by: Akshay Jindal <akshayaj.lkd@gmail.com>
[bhelgaas: commit log, join error message]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250619185041.73240-1-akshayaj.lkd@gmail.com
2025-06-26 15:34:22 -05:00
Jiwei Sun
b85af48de3 PCI: Adjust the position of reading the Link Control 2 register
In a89c82249c ("PCI: Work around PCIe link training failures"), if the
speed limit is set to 2.5 GT/s and the retraining is successful, an attempt
will be made to lift the speed limit. One condition for lifting the speed
limit is to check whether the link speed field of the Link Control 2
register is PCI_EXP_LNKCTL2_TLS_2_5GT.

However, since de9a6c8d5d ("PCI/bwctrl: Add pcie_set_target_speed() to
set PCIe Link Speed"), the `lnkctl2` local variable does not undergo any
changes during the speed limit setting and retraining process. As a result,
the code intended to lift the speed limit is not executed.

To address this issue, adjust the position of the Link Control 2 register
read operation in the code and place it before its use.

Fixes: de9a6c8d5d ("PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed")
Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Jiwei Sun <sunjw10@lenovo.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250123055155.22648-3-sjiwei@163.com
2025-06-24 15:54:26 -05:00
Jiwei Sun
9989e0ca74 PCI: Fix link speed calculation on retrain failure
When pcie_failed_link_retrain() fails to retrain, it tries to revert to the
previous link speed.  However it calculates that speed from the Link
Control 2 register without masking out non-speed bits first.

PCIE_LNKCTL2_TLS2SPEED() converts such incorrect values to
PCI_SPEED_UNKNOWN (0xff), which in turn causes a WARN splat in
pcie_set_target_speed():

  pci 0000:00:01.1: [1022:14ed] type 01 class 0x060400 PCIe Root Port
  pci 0000:00:01.1: broken device, retraining non-functional downstream link at 2.5GT/s
  pci 0000:00:01.1: retraining failed
  WARNING: CPU: 1 PID: 1 at drivers/pci/pcie/bwctrl.c:168 pcie_set_target_speed
  RDX: 0000000000000001 RSI: 00000000000000ff RDI: ffff9acd82efa000
  pcie_failed_link_retrain
  pci_device_add
  pci_scan_single_device

Mask out the non-speed bits in PCIE_LNKCTL2_TLS2SPEED() and
PCIE_LNKCAP_SLS2SPEED() so they don't incorrectly return PCI_SPEED_UNKNOWN.

Fixes: de9a6c8d5d ("PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed")
Reported-by: Andrew <andreasx0@protonmail.com>
Closes: https://lore.kernel.org/r/7iNzXbCGpf8yUMJZBQjLdbjPcXrEJqBxy5-bHfppz0ek-h4_-G93b1KUrm106r2VNF2FV_sSq0nENv4RsRIUGnlYZMlQr2ZD2NyB5sdj5aU=@protonmail.com/
Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Jiwei Sun <sunjw10@lenovo.com>
[bhelgaas: commit log, add details from https://lore.kernel.org/r/1c92ef6bcb314ee6977839b46b393282e4f52e74.1750684771.git.lukas@wunner.de]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: stable@vger.kernel.org	# v6.13+
Link: https://patch.msgid.link/20250123055155.22648-2-sjiwei@163.com
2025-06-24 15:48:35 -05:00
Huacai Chen
a02fd05661 PCI: Extend isolated function probing to LoongArch
Like s390 and the jailhouse hypervisor, LoongArch's PCI architecture allows
passing isolated PCI functions to a guest OS instance. So it is possible
that there is a multi-function device without function 0 for the host or
guest.

Allow probing such functions by adding a IS_ENABLED(CONFIG_LOONGARCH) case
in the hypervisor_isolated_pci_functions() helper.

This is similar to commit 189c6c33ff ("PCI: Extend isolated function
probing to s390").

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250624062927.4037734-1-chenhuacai@loongson.cn
2025-06-24 10:45:32 -05:00
Hans Zhang
9890dd3fb7 PCI/AER: Use bool for AER disable state tracking
Change pcie_aer_disable variable to bool and update pci_no_aer()
to set it to true. Improves code readability and aligns with modern
kernel practices.

Signed-off-by: Hans Zhang <hans.zhang@cixtech.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250516165223.125083-3-18255117159@163.com
2025-06-17 17:32:27 -05:00
Hans Zhang
64fd90ef25 PCI/ASPM: Consolidate variable declaration and initialization
Merge the declaration and initialization of 'val' into a single statement
for clarity. This eliminates a redundant assignment operation and improves
code readability while maintaining the same functionality.

Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250522161533.394689-1-18255117159@163.com
2025-06-17 17:31:54 -05:00
Hans Zhang
c1842b98c9 PCI/ASPM: Use boolean type for aspm_disabled and aspm_force
The aspm_disabled and aspm_force variables are used as boolean flags.
Change their type from int to bool and update assignments to use
true/false instead of 1/0. This improves code clarity.

Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250517154939.139237-1-18255117159@163.com
2025-06-17 17:31:43 -05:00
Linus Torvalds
19272b37aa Linux 6.16-rc1 v6.16-rc1 2025-06-08 13:44:43 -07:00
Linus Torvalds
939f15e640 Merge tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown:

 - Add initial DMR support, which required smarter RAPL probe

 - Fix AMD MSR RAPL energy reporting

 - Add RAPL power limit configuration output

 - Minor fixes

* tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: version 2025.06.08
  tools/power turbostat: Add initial support for BartlettLake
  tools/power turbostat: Add initial support for DMR
  tools/power turbostat: Dump RAPL sysfs info
  tools/power turbostat: Avoid probing the same perf counters
  tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared
  tools/power turbostat: Clean up add perf/msr counter logic
  tools/power turbostat: Introduce add_msr_counter()
  tools/power turbostat: Remove add_msr_perf_counter_()
  tools/power turbostat: Remove add_cstate_perf_counter_()
  tools/power turbostat: Remove add_rapl_perf_counter_()
  tools/power turbostat: Quit early for unsupported RAPL counters
  tools/power turbostat: Always check rapl_joules flag
  tools/power turbostat: Fix AMD package-energy reporting
  tools/power turbostat: Fix RAPL_GFX_ALL typo
  tools/power turbostat: Add Android support for MSR device handling
  tools/power turbostat.8: pm_domain wording fix
  tools/power turbostat.8: fix typo: idle_pct should be pct_idle
2025-06-08 11:44:41 -07:00
Linus Torvalds
be54f8c558 Merge tag 'timers-cleanups-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer cleanup from Thomas Gleixner:
 "The delayed from_timer() API cleanup:

  The renaming to the timer_*() namespace was delayed due massive
  conflicts against Linux-next. Now that everything is upstream finish
  the conversion"

* tag 'timers-cleanups-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  treewide, timers: Rename from_timer() to timer_container_of()
2025-06-08 11:33:00 -07:00
Linus Torvalds
0529ef8c36 Merge tag 'x86-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A small set of x86 fixes:

   - Cure IO bitmap inconsistencies

     A failed fork cleans up all resources of the newly created thread
     via exit_thread(). exit_thread() invokes io_bitmap_exit() which
     does the IO bitmap cleanups, which unfortunately assume that the
     cleanup is related to the current task, which is obviously bogus.

     Make it work correctly

   - A lockdep fix in the resctrl code removed the clearing of the
     command buffer in two places, which keeps stale error messages
     around. Bring them back.

   - Remove unused trace events"

* tag 'x86-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  fs/resctrl: Restore the rdt_last_cmd_clear() calls after acquiring rdtgroup_mutex
  x86/iopl: Cure TIF_IO_BITMAP inconsistencies
  x86/fpu: Remove unused trace events
2025-06-08 11:27:20 -07:00
Linus Torvalds
4710eacf8d Merge tag 'timers-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
 "Add the missing seq_file forward declaration in the timer namespace
  header"

* tag 'timers-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timens: Add struct seq_file forward declaration
2025-06-08 11:25:13 -07:00
Len Brown
42fd37dcc4 tools/power turbostat: version 2025.06.08
Add initial DMR support, which required smarter RAPL probe
Fix AMD MSR RAPL energy reporting
Add RAPL power limit configuration output
Minor fixes

Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:17 -04:00
Zhang Rui
d8c0f5d973 tools/power turbostat: Add initial support for BartlettLake
Add initial support for BartlettLake.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:17 -04:00
Zhang Rui
83075bd59d tools/power turbostat: Add initial support for DMR
Add initial support for DMR.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
2a535d6cc3 tools/power turbostat: Dump RAPL sysfs info
for example:

intel-rapl:1: psys 28.0s:100W 976.0us:100W
intel-rapl:0: package-0 28.0s:57W,max:15W 2.4ms:57W
intel-rapl:0/intel-rapl:0:0: core disabled
intel-rapl:0/intel-rapl:0:1: uncore disabled
intel-rapl-mmio:0: package-0 28.0s:28W,max:15W 2.4ms:57W

[lenb: simplified format]

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

squish me

Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
69078520fd tools/power turbostat: Avoid probing the same perf counters
For the RAPL package energy status counter, Intel and AMD share the same
perf_subsys and perf_name, but with different MSR addresses.

Both rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] are
introduced to describe this counter for different Vendors.

As a result, the perf counter is probed twice, and causes a failure in
in get_rapl_counters() because expected_read_size and actual_read_size
don't match.

Fix the problem by skipping the already probed counter.

Note, this is not a perfect fix. For example, if different
vendors/platforms use the same MSR value for different purpose, the code
can be fooled when it probes a rapl_counter_arch_infos[] entry that does
not belong to the running Vendor/Platform.

In a long run, better to put rapl_counter_arch_infos[] into the
platform_features so that this becomes Vendor/Platform specific.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
ff3d019e98 tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared
platform_features->rapl_msrs describes the RAPL MSRs supported. While
RAPL Perf counters can be exposed from different kernel backend drivers,
e.g. RAPL MSR I/F driver, or RAPL TPMI I/F driver.

Thus, turbostat should first blindly probe all the available RAPL Perf
counters, and falls back to the RAPL MSR counters if they are listed in
platform_features->rapl_msrs.

With this, platforms that don't have RAPL MSRs can clear the
platform_features->rapl_msrs bits and use RAPL Perf counters only.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
0362337968 tools/power turbostat: Clean up add perf/msr counter logic
Increase the code readability by moving the no_perf/no_msr flag and the
cai->perf_name/cai->msr sanity checks into the counter probe functions.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
1ab2e19b4c tools/power turbostat: Introduce add_msr_counter()
probe_rapl_msr() is reused for probing RAPL MSR counters, cstate MSR
counters and MPERF/APERF/SMI MSR counters, thus its name is misleading.

Similar to add_perf_counter(), introduce add_msr_counter() to probe a
counter via MSR. Introduce wrapper function add_rapl_msr_counter() at
the same time to add extra check for Zero return value for specified
RAPL counters.

No functional change intended.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
3403e89f97 tools/power turbostat: Remove add_msr_perf_counter_()
As the only caller of add_msr_perf_counter_(), add_msr_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.

Remove add_msr_perf_counter_() and move all the logic to
add_msr_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
4d6ced7bef tools/power turbostat: Remove add_cstate_perf_counter_()
As the only caller of add_cstate_perf_counter_(),
add_cstate_perf_counter() just gives extra debug output on top. There is
no need to keep both functions.

Remove add_cstate_perf_counter_() and move all the logic to
add_cstate_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
c8bca955da tools/power turbostat: Remove add_rapl_perf_counter_()
As the only caller of add_rapl_perf_counter_(), add_rapl_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.

Remove add_rapl_perf_counter_() and move all the logic to
add_rapl_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
57b53787f0 tools/power turbostat: Quit early for unsupported RAPL counters
Quit early for unsupported RAPL counters.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
fdea6b883b tools/power turbostat: Always check rapl_joules flag
rapl_joules bit should always be checked even if
platform_features->rapl_msrs is not set or no_msr flag is used.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Gautham R. Shenoy
adb49732c8 tools/power turbostat: Fix AMD package-energy reporting
commit 05a2f07db8 ("tools/power turbostat: read RAPL counters via
perf") that adds support to read RAPL counters via perf defines the
notion of a RAPL domain_id which is set to physical_core_id on
platforms which support per_core_rapl counters (Eg: AMD processors
Family 17h onwards) and is set to the physical_package_id on all the
other platforms.

However, the physical_core_id is only unique within a package and on
platforms with multiple packages more than one core can have the same
physical_core_id and thus the same domain_id. (For eg, the first cores
of each package have the physical_core_id = 0). This results in all
these cores with the same physical_core_id using the same entry in the
rapl_counter_info_perdomain[]. Since rapl_perf_init() skips the
perf-initialization for cores whose domain_ids have already been
visited, cores that have the same physical_core_id always read the
perf file corresponding to the physical_core_id of the first package
and thus the package-energy is incorrectly reported to be the same
value for different packages.

Note: This issue only arises when RAPL counters are read via perf and
not when they are read via MSRs since in the latter case the MSRs are
read separately on each core.

Fix this issue by associating each CPU with rapl_core_id which is
unique across all the packages in the system.

Fixes: 05a2f07db8 ("tools/power turbostat: read RAPL counters via perf")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Kaushlendra Kumar
b4a734d383 tools/power turbostat: Fix RAPL_GFX_ALL typo
Fix typo in the currently unused RAPL_GFX_ALL macro definition.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Kaushlendra Kumar
5663785ae0 tools/power turbostat: Add Android support for MSR device handling
It uses /dev/msrN device paths on Android instead of /dev/cpu/N/msr,
updates error messages and permission checks to reflect the Android
device path, and wraps platform-specific code with #if defined(ANDROID)
to ensure correct behavior on both Android and non-Android systems.
These changes improve compatibility and usability of turbostat on
Android devices.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Len Brown
c967900fcb tools/power turbostat.8: pm_domain wording fix
turbostat.8: clarify that uncore "domains" are Power Management domains,
aka pm_domains.

Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Len Brown
394c1127ab tools/power turbostat.8: fix typo: idle_pct should be pct_idle
idle_pct should be pct_idle

Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:01 -04:00
Linus Torvalds
d9864e7d15 Merge tag 'perf-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf fix from Thomas Gleixner:
 "A single fix for the x86 performance counters on Intel CPUs:

  The MSR offset calculations for fixed performance counters are stored
  at the wrong index in the configuration array causing the general
  purpose counter MSR offset to be overwritten, so both the general
  purpose and the fixed counters offsets are incorrect.

  Correct the array index calculation to fix that"

* tag 'perf-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix incorrect MSR index calculations in intel_pmu_config_acr()
2025-06-08 11:07:33 -07:00
Linus Torvalds
70b7d651ca Merge tag 'irq-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Thomas Gleixner:
 "A single fix for the PCI/MSI code:

  The conversion to per device MSI domains created a MSI domain with
  size 1 instead of sizing it to the maximum possible number of MSI
  interrupts for the device. This "worked" as the subsequent allocations
  resized the domain, but the recent change to move the prepare() call
  into the domain creation path broke this works by chance mechanism.

  Size the domain properly at creation time"

* tag 'irq-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  PCI/MSI: Size device MSI domain with the maximum number of vectors
2025-06-08 11:02:53 -07:00
Linus Torvalds
35b574a6c2 Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull mount fixes from Al Viro:
 "Various mount-related bugfixes:

   - split the do_move_mount() checks in subtree-of-our-ns and
     entire-anon cases and adapt detached mount propagation selftest for
     mount_setattr

   - allow clone_private_mount() for a path on real rootfs

   - fix a race in call of has_locked_children()

   - fix move_mount propagation graph breakage by MOVE_MOUNT_SET_GROUP

   - make sure clone_private_mnt() caller has CAP_SYS_ADMIN in the right
     userns

   - avoid false negatives in path_overmount()

   - don't leak MNT_LOCKED from parent to child in finish_automount()

   - do_change_type(): refuse to operate on unmounted/not ours mounts"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  do_change_type(): refuse to operate on unmounted/not ours mounts
  clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns
  selftests/mount_setattr: adapt detached mount propagation test
  do_move_mount(): split the checks in subtree-of-our-ns and entire-anon cases
  fs: allow clone_private_mount() for a path on real rootfs
  fix propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)
  finish_automount(): don't leak MNT_LOCKED from parent to child
  path_overmount(): avoid false negatives
  fs/fhandle.c: fix a race in call of has_locked_children()
2025-06-08 10:35:12 -07:00
Linus Torvalds
522cd6acd2 Merge tag '6.16-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull more smb client updates from Steve French:

 - multichannel/reconnect fixes

 - move smbdirect (smb over RDMA) defines to fs/smb/common so they will
   be able to be used in the future more broadly, and a documentation
   update explaining setting up smbdirect mounts

 - update email address for Paulo

* tag '6.16-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal version number
  MAINTAINERS, mailmap: Update Paulo Alcantara's email address
  cifs: add documentation for smbdirect setup
  cifs: do not disable interface polling on failure
  cifs: serialize other channels when query server interfaces is pending
  cifs: deal with the channel loading lag while picking channels
  smb: client: make use of common smbdirect_socket_parameters
  smb: smbdirect: introduce smbdirect_socket_parameters
  smb: client: make use of common smbdirect_socket
  smb: smbdirect: add smbdirect_socket.h
  smb: client: make use of common smbdirect.h
  smb: smbdirect: add smbdirect.h with public structures
  smb: client: make use of common smbdirect_pdu.h
  smb: smbdirect: add smbdirect_pdu.h with protocol definitions
2025-06-08 10:20:21 -07:00
Linus Torvalds
538c429a4b Merge tag 'trace-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull more tracing fixes from Steven Rostedt:

 - Fix regression of waiting a long time on updating trace event filters

   When the faultable trace points were added, it needed task trace RCU
   synchronization.

   This was added to the tracepoint_synchronize_unregister() function.
   The filter logic always called this function whenever it updated the
   trace event filters before freeing the old filters. This increased
   the time of "trace-cmd record" from taking 13 seconds to running over
   2 minutes to complete.

   Move the freeing of the filters to call_rcu*() logic, which brings
   the time back down to 13 seconds.

 - Fix ring_buffer_subbuf_order_set() error path lock protection

   The error path of the ring_buffer_subbuf_order_set() released the
   mutex too early and allowed subsequent accesses to setting the
   subbuffer size to corrupt the data and cause a bug.

   By moving the mutex locking to the end of the error path, it prevents
   the reentrant access to the critical data and also allows the
   function to convert the taking of the mutex over to the guard()
   logic.

 - Remove unused power management clock events

   The clock events were added in 2010 for power management. In 2011 arm
   used them. In 2013 the code they were used in was removed. These
   events have been wasting memory since then.

 - Fix sparse warnings

   There was a few places that sparse warned about trace_events_filter.c
   where file->filter was referenced directly, but it is annotated with
   an __rcu tag. Use the helper functions and fix them up to use
   rcu_dereference() properly.

* tag 'trace-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Add rcu annotation around file->filter accesses
  tracing: PM: Remove unused clock events
  ring-buffer: Fix buffer locking in ring_buffer_subbuf_order_set()
  tracing: Fix regression of filter waiting a long time on RCU synchronization
2025-06-08 08:19:01 -07:00
Ingo Molnar
41cb08555c treewide, timers: Rename from_timer() to timer_container_of()
Move this API to the canonical timer_*() namespace.

[ tglx: Redone against pre rc1 ]

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
2025-06-08 09:07:37 +02:00
Linus Torvalds
8630c59e99 Merge tag 'kbuild-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:

 - Add support for the EXPORT_SYMBOL_GPL_FOR_MODULES() macro, which
   exports a symbol only to specified modules

 - Improve ABI handling in gendwarfksyms

 - Forcibly link lib-y objects to vmlinux even if CONFIG_MODULES=n

 - Add checkers for redundant or missing <linux/export.h> inclusion

 - Deprecate the extra-y syntax

 - Fix a genksyms bug when including enum constants from *.symref files

* tag 'kbuild-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (28 commits)
  genksyms: Fix enum consts from a reference affecting new values
  arch: use always-$(KBUILD_BUILTIN) for vmlinux.lds
  kbuild: set y instead of 1 to KBUILD_{BUILTIN,MODULES}
  efi/libstub: use 'targets' instead of extra-y in Makefile
  module: make __mod_device_table__* symbols static
  scripts/misc-check: check unnecessary #include <linux/export.h> when W=1
  scripts/misc-check: check missing #include <linux/export.h> when W=1
  scripts/misc-check: add double-quotes to satisfy shellcheck
  kbuild: move W=1 check for scripts/misc-check to top-level Makefile
  scripts/tags.sh: allow to use alternative ctags implementation
  kconfig: introduce menu type enum
  docs: symbol-namespaces: fix reST warning with literal block
  kbuild: link lib-y objects to vmlinux forcibly even when CONFIG_MODULES=n
  tinyconfig: enable CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
  docs/core-api/symbol-namespaces: drop table of contents and section numbering
  modpost: check forbidden MODULE_IMPORT_NS("module:") at compile time
  kbuild: move kbuild syntax processing to scripts/Makefile.build
  Makefile: remove dependency on archscripts for header installation
  Documentation/kbuild: Add new gendwarfksyms kABI rules
  Documentation/kbuild: Drop section numbers
  ...
2025-06-07 10:05:35 -07:00
Linus Torvalds
b3154a6ff1 Merge tag 'sh-for-v6.16-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux
Pull sh updates from John Paul Adrian Glaubitz:

 - replace the __ASSEMBLY__ with __ASSEMBLER__ macro in all headers
   since the latter is now defined automatically by both GCC and Clang
   when compiling assembly code (Thomas Huth)

 - set the default SPI mode for the ecovec24 board which became
   necessary after a new mode member as added to the sh_msiof_spi_info
   struct in cf9e4784f3 ("spi: sh-msiof: Add slave mode support")
   (Geert Uytterhoeven)

 - remove unused variables in the kprobes code in
   kprobe_exceptions_notify() (Mike Rapoport)

* tag 'sh-for-v6.16-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
  sh: kprobes: Remove unused variables in kprobe_exceptions_notify()
  sh: ecovec24: Make SPI mode explicit
  sh: Replace __ASSEMBLY__ with __ASSEMBLER__ in all headers
2025-06-07 10:00:03 -07:00
Linus Torvalds
b7191581a9 Merge tag 'loongarch-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:

 - Adjust the 'make install' operation

 - Support SCHED_MC (Multi-core scheduler)

 - Enable ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS

 - Enable HAVE_ARCH_STACKLEAK

 - Increase max supported CPUs up to 2048

 - Introduce the numa_memblks conversion

 - Add PWM controller nodes in dts

 - Some bug fixes and other small changes

* tag 'loongarch-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  platform/loongarch: laptop: Unregister generic_sub_drivers on exit
  platform/loongarch: laptop: Add backlight power control support
  platform/loongarch: laptop: Get brightness setting from EC on probe
  LoongArch: dts: Add PWM support to Loongson-2K2000
  LoongArch: dts: Add PWM support to Loongson-2K1000
  LoongArch: dts: Add PWM support to Loongson-2K0500
  LoongArch: vDSO: Correctly use asm parameters in syscall wrappers
  LoongArch: Fix panic caused by NULL-PMD in huge_pte_offset()
  LoongArch: Preserve firmware configuration when desired
  LoongArch: Avoid using $r0/$r1 as "mask" for csrxchg
  LoongArch: Introduce the numa_memblks conversion
  LoongArch: Increase max supported CPUs up to 2048
  LoongArch: Enable HAVE_ARCH_STACKLEAK
  LoongArch: Enable ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS
  LoongArch: Add SCHED_MC (Multi-core scheduler) support
  LoongArch: Add some annotations in archhelp
  LoongArch: Using generic scripts/install.sh in `make install`
  LoongArch: Add a default install.sh
2025-06-07 09:56:18 -07:00