Fixes a logic issue in mlxreg_lc_completion_notify() where the
intention was to check if MLXREG_LC_POWERED flag is not set before
powering on the device.
The original code used "state & ~MLXREG_LC_POWERED" to check for the
absence of the POWERED bit. However this condition evaluates to true
even when other bits are set, leading to potentially incorrect
behavior.
Corrected the logic to explicitly check for the absence of
MLXREG_LC_POWERED using !(state & MLXREG_LC_POWERED).
Fixes: 62f9529b8d ("platform/mellanox: mlxreg-lc: Add initial support for Nvidia line card devices")
Suggested-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Link: https://lore.kernel.org/r/20250630105812.601014-1-alok.a.tiwari@oracle.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Clang warns (or errors with CONFIG_WERROR=y):
drivers/platform/mellanox/nvsw-sn2201.c:531:32: error: variable 'nvsw_sn2201_busbar_items' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
531 | static struct mlxreg_core_item nvsw_sn2201_busbar_items[] = {
| ^~~~~~~~~~~~~~~~~~~~~~~~
nvsw_sn2201_busbar_items is only used in ARRAY_SIZE(), which uses
sizeof(), so this variable is only used at compile time. It appears that
this may be a copy and paste issue, so use nvsw_sn2201_busbar_items as
the .items member in nvsw_sn2201_busbar_hotplug, clearing up the
warning.
Fixes: 56b0bb7f90 ("platform: mellanox: nvsw-sn2200: Add support for new system flavour")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250509-nvsw-sn2200-fix-items-busbar-hotplug-v1-1-8844fff38dc8@kernel.org
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Add list of events and counters from the following blocks: APT (ARM Processor
Tile), GGA (Global Generic Accelerator), MSN (Memory Stasher and Navigator),
EMI (External Memory Interface) and PRNF (PCIe Request Node).
If any of the fields populated from the ACPI table (like apt_num) cannot be
read, assign the corresponding block count to be 0 instead of failing probe
to maintain compatibility with older firmware.
Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
Reviewed-by: David Thompson <davthompson@nvidia.com>
Link: https://lore.kernel.org/r/20250423083103.5240-1-shravankr@nvidia.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Add support for SN5640 and SN5610 Nvidia switches:
- SN5610 is a 51.2Tbps switch based on Nvidia SPC-4 ASIC equipped with 64
OSFP ports supporting 2.5Gbps - 400Gbps speeds.
- SN5640 is a 51.2Tbps switch based on Nvidia SPC-5 ASIC equipped with 64
OSFP ports supporting 10Gbps - 800Gbps speeds.
Both equipped with:
- Air-cooled with 4 + 1 redundant fan units.
- 2 + 2 redundant 2000W PSUs.
- System management board based on AMD CPU with secure-boot support.
Reviewed-by: Oleksandr Shamray <oleksandrs@nvidia.com>
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20250421092051.7687-5-vadimp@nvidia.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Provide platform support for Nvidia Smart Switch SN4280.
The Smart Switch equipped with:
- Nvidia COME module based on AMD EPYC™ Embedded 3451 CPU.
- Nvidia Spectrum-3 ASIC.
- Four DPUs, each equipped with Nvidia BF3 ARM based processor and
with Lattice LFD2NX-40 FPGA device.
- 28xQSFP-DD external ports.
- Two power supplies.
- Four cooling drawers.
Reviewed-by: Ciju Rajan K <crajank@nvidia.com>
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20250421092051.7687-3-vadimp@nvidia.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Provide platform support for Nvidia (DPU) Data Processor Unit for the
Smart Switch SN4280.
The Smart Switch equipped with:
- Nvidia COME module based on AMD EPYC™ Embedded 3451 CPU.
- Nvidia Spectrum-3 ASIC.
- Four DPUs, each equipped with Nvidia BF3 ARM based processor and
with Lattice LFD2NX-40 FPGA device.
- 28xQSFP-DD external ports.
- Two power supplies.
- Four cooling drawers.
Driver provides support for the platform management and monitoring
of DPU components. It includes support for: health events, resets and
boot progress indications logic, implemented by FPGA device.
Reviewed-by: Ciju Rajan K <crajank@nvidia.com>
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
[ij: added depends on I2C]
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20250421092051.7687-2-vadimp@nvidia.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
A warning is seen when running the latest kernel on a BlueField SOC:
[251.512704] ------------[ cut here ]------------
[251.512711] invalid sysfs_emit: buf:0000000003aa32ae
[251.512720] WARNING: CPU: 1 PID: 705264 at fs/sysfs/file.c:767 sysfs_emit+0xac/0xc8
The warning is triggered because the mlxbf-bootctl driver invokes
"sysfs_emit()" with a buffer pointer that is not aligned to the
start of the page. The driver should instead use "sysfs_emit_at()"
to support non-zero offsets into the destination buffer.
Fixes: 9886f575de ("platform/mellanox: mlxbf-bootctl: use sysfs_emit() instead of sprintf()")
Signed-off-by: David Thompson <davthompson@nvidia.com>
Link: https://lore.kernel.org/r/20250407132558.2418719-1-davthompson@nvidia.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.
Conversion was done with coccinelle plus manual fixups where necessary.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The continual trickle of small conversion patches is grating on me, and
is really not helping. Just get rid of the 'remove_new' member
function, which is just an alias for the plain 'remove', and had a
comment to that effect:
/*
* .remove_new() is a relic from a prototype conversion of .remove().
* New drivers are supposed to implement .remove(). Once all drivers are
* converted to not use .remove_new any more, it will be dropped.
*/
This was just a tree-wide 'sed' script that replaced '.remove_new' with
'.remove', with some care taken to turn a subsequent tab into two tabs
to make things line up.
I did do some minimal manual whitespace adjustment for places that used
spaces to line things up.
Then I just removed the old (sic) .remove_new member function, and this
is the end result. No more unnecessary conversion noise.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pointer item is checked fo NULL at mlxreg_hotplug_work_helper() and then
it is dereferenced to produce dev_err().
This pointer is also dereferenced before calling this function and should
never be NULL except some piece of hardware is broken as it is said in
the comment before the check. So, this check can be safely removed.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: c6acad68eb ("platform/mellanox: mlxreg-hotplug: Modify to use a regmap interface")
Signed-off-by: Daniil Dulov <d.dulov@aladdin.ru>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20240306153804.6509-1-d.dulov@aladdin.ru
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Currently, the driver has two behaviors to deal with new & unsupported
performance blocks reported by the firmware:
1. For register and unknown block types, the driver will fail to load
with the following error message:
[ 4510.956369] mlxbf-pmc: probe of MLNXBFD2:00 failed with error -22
2. For counter and crspace blocks, the driver will load and sysfs files
will be created but getting the contents of event_list or trying to
setup the counter will fail
Instead, let's ignore and log unsupported blocks. This means the driver
will always load and unsupported blocks will never show up in sysfs.
Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/f8e2e6210b43e825b69824b420c801cd513d401d.1708635408.git.luizcap@redhat.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Merge tag 'platform-drivers-x86-v6.8-2' fixes into pdf86/for-next
because of WMI fixes. The WMI changes done in for-next already created
a minor conflict with the fixes and WMI is actively being improved
currently so besides resolving the current conflict, this is also to
avoid further conflicts.
Per filesystems/sysfs.rst, show() should only use sysfs_emit()
or sysfs_emit_at() when formatting the value to be returned to user space.
coccinelle complains that there are still a couple of functions that use
snprintf(). Convert them to sysfs_emit().
> ./drivers/platform/mellanox/mlxbf-bootctl.c:466:8-16: WARNING: please use sysfs_emit
> ./drivers/platform/mellanox/mlxbf-bootctl.c:584:8-16: WARNING: please use sysfs_emit
> ./drivers/platform/mellanox/mlxbf-bootctl.c:635:8-16: WARNING: please use sysfs_emit
> ./drivers/platform/mellanox/mlxbf-bootctl.c:686:8-16: WARNING: please use sysfs_emit
> ./drivers/platform/mellanox/mlxbf-bootctl.c:737:8-16: WARNING: please use sysfs_emit
> ./drivers/platform/mellanox/mlxbf-bootctl.c:788:8-16: WARNING: please use sysfs_emit
> ./drivers/platform/mellanox/mlxbf-bootctl.c:839:8-16: WARNING: please use sysfs_emit
No functional change intended
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://lore.kernel.org/r/20240116045151.3940401-12-lizhijian@fujitsu.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Starting from Linux 5.16 kernel, Tx timeout mechanism was added
in the virtio_net driver which prints the "Tx timeout" warning
message when a packet stays in Tx queue for too long. Below is an
example of the reported message:
"[494105.316739] virtio_net virtio1 tmfifo_net0: TX timeout on
queue: 0, sq: output.0, vq: 0×1, name: output.0, usecs since
last trans: 3079892256".
This issue could happen when external host driver which drains the
FIFO is restared, stopped or upgraded. To avoid such confusing
"Tx timeout" messages, this commit adds logic to drop the outstanding
Tx packet if it's not able to transmit in two seconds due to Tx FIFO
full, which can be considered as congestion or out-of-resource drop.
This commit also handles the special case that the packet is half-
transmitted into the Tx FIFO. In such case, the packet is discarded
with remaining length stored in vring->rem_padding. So paddings with
zeros can be sent out when Tx space is available to maintain the
integrity of the packet format. The padded packet will be dropped on
the receiving side.
Signed-off-by: Liming Sun <limings@nvidia.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240111173106.96958-1-limings@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Back merge pdx86 fixes into pdx86/for-next for further WMI work
depending on some of the fixes.
platform-drivers-x86 for v6.7-3
Highlights:
- asus-wmi: Solve i8042 filter resource handling, input, and
suspend issues
- wmi: Skip zero instance WMI blocks to avoid issues with
some laptops
- mlxbf-bootctl: Differentiate dev/production keys
- platform/surface: Correct serdev related return value to avoid
leaking errno into userspace
- Error checking fixes
The following is an automated shortlog grouped by driver:
asus-wmi:
- Change q500a_i8042_filter() into a generic i8042-filter
- disable USB0 hub on ROG Ally before suspend
- Filter Volume key presses if also reported via atkbd
- Move i8042 filter install to shared asus-wmi code
mellanox:
- Add null pointer checks for devm_kasprintf()
- Check devm_hwmon_device_register_with_groups() return value
mlxbf-bootctl:
- correctly identify secure boot with development keys
surface: aggregator:
- fix recv_buf() return value
wmi:
- Skip blocks with zero instances
The secure boot state of the BlueField SoC is represented by two bits:
0 = production state
1 = secure boot enabled
2 = non-secure (secure boot disabled)
3 = RMA state
There is also a single bit to indicate whether production keys or
development keys are being used when secure boot is enabled.
This single bit (specified by MLXBF_BOOTCTL_SB_DEV_MASK) only has
meaning if secure boot state equals 1 (secure boot enabled).
The secure boot states are as follows:
- “GA secured” is when secure boot is enabled with official production keys.
- “Secured (development)” is when secure boot is enabled with development keys.
Without this fix “GA Secured” is displayed on development cards which is
misleading. This patch updates the logic in "lifecycle_state_show()" to
handle the case where the SoC is configured for secure boot and is using
development keys.
Fixes: 79e29cb8fb ("platform/mellanox: Add bootctl driver for Mellanox BlueField Soc")
Reviewed-by: Khalil Blaiech <kblaiech@nvidia.com>
Signed-off-by: David Thompson <davthompson@nvidia.com>
Link: https://lore.kernel.org/r/20231130183515.17214-1-davthompson@nvidia.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Pull x86 platform driver updates from Ilpo Järvinen:
- asus-wmi: Support for screenpad and solve brightness key press
duplication
- int3472: Eliminate the last use of deprecated GPIO functions
- mlxbf-pmc: New HW support
- msi-ec: Support new EC configurations
- thinkpad_acpi: Support reading aux MAC address during passthrough
- wmi: Fixes & improvements
- x86-android-tablets: Detection fix and avoid use of GPIO private APIs
- Debug & metrics interface improvements
- Miscellaneous cleanups / fixes / improvements
* tag 'platform-drivers-x86-v6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (80 commits)
platform/x86: inspur-platform-profile: Add platform profile support
platform/x86: thinkpad_acpi: Add battery quirk for Thinkpad X120e
platform/x86: wmi: Decouple WMI device removal from wmi_block_list
platform/x86: wmi: Fix opening of char device
platform/x86: wmi: Fix probe failure when failing to register WMI devices
platform/x86: wmi: Fix refcounting of WMI devices in legacy functions
platform/x86: wmi: Decouple probe deferring from wmi_block_list
platform/x86/amd/hsmp: Fix iomem handling
platform/x86: asus-wmi: Do not report brightness up/down keys when also reported by acpi_video
platform/x86: thinkpad_acpi: replace deprecated strncpy with memcpy
tools/power/x86/intel-speed-select: v1.18 release
tools/power/x86/intel-speed-select: Use cgroup isolate for CPU 0
tools/power/x86/intel-speed-select: Increase max CPUs in one request
tools/power/x86/intel-speed-select: Display error for core-power support
tools/power/x86/intel-speed-select: No TRL for non compute domains
tools/power/x86/intel-speed-select: turbo-mode enable disable swapped
tools/power/x86/intel-speed-select: Update help for TRL
tools/power/x86/intel-speed-select: Sanitize integer arguments
platform/x86: acer-wmi: Remove void function return
platform/x86/amd/pmc: Add dump_custom_stb module parameter
...