There have been instances when the default size (1M) of the STB is not
sufficient to get the complete traces of the failure. In such scenarios
we can use a module_param to enable full trace that shall contain more
debugging data. This is not a regular case and hence not enabling this
capability by default.
With this change, there will be two cases on how the driver fetches the
stb data:
1) A special case (proposed now) - which is required only for certain
platforms. Here, a new module param will be supplied to the driver that
will have a special PMFW supporting enhanced dram sizes for getting
the stb data. Without the special PMFW support, just setting the module
param will not help to get the enhanced stb data.
To adapt to this change, we will have a new amd_pmc_stb_handle_efr() to
handle enhanced firmware reporting mechanism. Note that, since num_samples
based r/w pointer offset calculation is not required for enhanced firmware
reporting we will have this mailbox command sent only in case of regular
STB cases.
2) Current code branch which fetches the stb data based on the parameters
like the num_samples, fsize and the r/w pointer.
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Co-developed-by: Harsh Jain <Harsh.Jain@amd.com>
Signed-off-by: Harsh Jain <Harsh.Jain@amd.com>
Signed-off-by: Sanket Goswami <Sanket.Goswami@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://lore.kernel.org/r/20231010145003.139932-3-Shyam-sundar.S-k@amd.com
[ij: Renamed flex_arr -> stb_data_arr]
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
In amd_pmc_stb_debugfs_open_v2(), the stb buffer is created based on the
num_samples and the read/write pointer offset. This holds good when the
num_samples reported by PMFW is less than S2D_TELEMETRY_BYTES_MAX; where
the stb buffer gets filled from 0th position until
S2D_TELEMETRY_BYTES_MAX - 1 based on the read/write pointer offset.
But when the num_samples exceeds the S2D_TELEMETRY_BYTES_MAX, the current
code does not handle it well as it does not account for the cases where
the stb buffer has to filled up as a circular buffer.
Handle this scenario into two cases, where first memcpy will have the
samples from location:
(num_samples % S2D_TELEMETRY_BYTES_MAX) - (S2D_TELEMETRY_BYTES_MAX - 1)
and next memcpy will have the newest ones i.e.
0 - (num_samples % S2D_TELEMETRY_BYTES_MAX - 1)
Suggested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sanket Goswami <Sanket.Goswami@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://lore.kernel.org/r/20231010145003.139932-2-Shyam-sundar.S-k@amd.com
[ij: renamed flex_arr -> stb_data_arr]
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
AMD MI300 MCM provides GET_METRICS_TABLE message to retrieve
all the system management information from SMU.
The metrics table is made available as hexadecimal sysfs binary file
under per socket sysfs directory created at
/sys/devices/platform/amd_hsmp/socket%d/metrics_bin
Metrics table definitions will be documented as part of Public PPR.
The same is defined in the amd_hsmp.h header.
Signed-off-by: Suma Hegde <suma.hegde@amd.com>
Reviewed-by: Naveen Krishna Chatradhi <nchatrad@amd.com>
Link: https://lore.kernel.org/r/20231010120310.3464066-2-suma.hegde@amd.com
[ij: lseek -> lseek(), dram -> DRAM in dev_err()]
[ij: added period to terminate a documentation sentence]
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Fixes in the mellanox init branch due for v6.6.
platform-drivers-x86-mellanox-init-v6.6: v6.6-rc1 + fixes in
the platform-drivers-x86-mellanox-init branch to avoid a feature
conflict during the v6.7 merge window.
Immutable branch between pdx86 int3472 branch and GPIO due for the v6.7 merge window.
platform-drivers-x86-ib-int3472-v6.7: v6.6-rc1 + platform-drivers-x86-int3472
for merging into the GPIO subsystem for v6.7.
As described in the added code comment, a reference to .exit.text is ok
for drivers registered via module_platform_driver_probe(). Make this
explicit to prevent a section mismatch warning:
WARNING: modpost: drivers/platform/x86/hp/hp-wmi: section mismatch in reference: hp_wmi_driver+0x8 (section: .data) -> hp_wmi_bios_remove (section: .exit.text)
Fixes: c165b80cfe ("hp-wmi: fix handling of platform device")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20231004111624.2667753-1-u.kleine-koenig@pengutronix.de
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Scan image loading flow for newer IFS generations are slightly different
from that of current generation. In newer schemes, loading need not be
done once for each socket as was done in gen0.
Also the width of NUM_CHUNKS bitfield in SCAN_HASHES_STATUS MSR has
increased from 8 -> 16 bits. Similarly there are width differences for
CHUNK_AUTHENTICATION_STATUS too.
Further the parameter to AUTHENTICATE_AND_COPY_CHUNK is passed
differently in newer generations.
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Link: https://lore.kernel.org/r/20231005195137.3117166-4-jithu.joseph@intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Fix kernel-doc notation for structs and struct members to prevent
these warnings:
mlxbf-tmfifo.c:73: warning: cannot understand function prototype: 'struct mlxbf_tmfifo_vring '
mlxbf-tmfifo.c:128: warning: cannot understand function prototype: 'struct mlxbf_tmfifo_vdev '
mlxbf-tmfifo.c:146: warning: cannot understand function prototype: 'struct mlxbf_tmfifo_irq_info '
mlxbf-tmfifo.c:158: warning: cannot understand function prototype: 'struct mlxbf_tmfifo_io '
mlxbf-tmfifo.c:182: warning: cannot understand function prototype: 'struct mlxbf_tmfifo '
mlxbf-tmfifo.c:208: warning: cannot understand function prototype: 'struct mlxbf_tmfifo_msg_hdr '
mlxbf-tmfifo.c:138: warning: Function parameter or member 'config' not described in 'mlxbf_tmfifo_vdev'
mlxbf-tmfifo.c:212: warning: Function parameter or member 'unused' not described in 'mlxbf_tmfifo_msg_hdr'
Fixes: 1357dfd726 ("platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc")
Fixes: bc05ea63b3 ("platform/mellanox: Add BlueField-3 support in the tmfifo driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: lore.kernel.org/r/202309252330.saRU491h-lkp@intel.com
Cc: Liming Sun <lsun@mellanox.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: Mark Gross <markgross@kernel.org>
Cc: Vadim Pasternak <vadimp@nvidia.com>
Cc: platform-driver-x86@vger.kernel.org
Link: https://lore.kernel.org/r/20230926054013.11450-1-rdunlap@infradead.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Couple of error paths in do_core_test() was returning directly without
doing a necessary cpus_read_unlock().
Following lockdep warning was observed when exercising these scenarios
with PROVE_RAW_LOCK_NESTING enabled:
[ 139.304775] ================================================
[ 139.311185] WARNING: lock held when returning to user space!
[ 139.317593] 6.6.0-rc2ifs01+ #11 Tainted: G S W I
[ 139.324499] ------------------------------------------------
[ 139.330908] bash/11476 is leaving the kernel with locks still held!
[ 139.338000] 1 lock held by bash/11476:
[ 139.342262] #0: ffffffffaa26c930 (cpu_hotplug_lock){++++}-{0:0}, at:
do_core_test+0x35/0x1c0 [intel_ifs]
Fix the flow so that all scenarios release the lock prior to returning
from the function.
Fixes: 5210fb4e18 ("platform/x86/intel/ifs: Sysfs interface for Array BIST")
Cc: stable@vger.kernel.org
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Link: https://lore.kernel.org/r/20230927184824.2566086-1-jithu.joseph@intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
If a duplicate attribute is found using kset_find_obj(), a reference
to that attribute is returned which needs to be disposed accordingly
using kobject_put(). Use kobject_put() to dispose the duplicate
attribute in such a case.
As a side note, a very similar bug was fixed in
commit 7295a996fd ("platform/x86: dell-sysman: Fix reference leak"),
so it seems that the bug was copied from that driver.
Compile-tested only.
Fixes: a34fc329b1 ("platform/x86: hp-bioscfg: bioscfg")
Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Reviewed-by: Jorge Lopez <jorge.lopez2@hp.com>
Link: https://lore.kernel.org/r/20230925142819.74525-3-W_Armin@gmx.de
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
If a duplicate attribute is found using kset_find_obj(), a reference
to that attribute is returned which needs to be disposed accordingly
using kobject_put(). Move the setting name validation into a separate
function to allow for this change without having to duplicate the
cleanup code for this setting.
As a side note, a very similar bug was fixed in
commit 7295a996fd ("platform/x86: dell-sysman: Fix reference leak"),
so it seems that the bug was copied from that driver.
Compile-tested only.
Fixes: 1bcad8e510 ("platform/x86: think-lmi: Fix issues with duplicate attributes")
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20230925142819.74525-2-W_Armin@gmx.de
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The hardware definition of every TPMI feature contains a major and minor
version. When there is a change in the MMIO offset or change in the
definition of a field, hardware will change major version. For addition
of new fields without modifying existing MMIO offsets or fields, only the
minor version is changed.
Driver is developed to support uncore frequency control (UFS) for a major
and minor version. If the hardware changes major version, since offsets
and definitions are changed, driver cannot continue to provide UFS
interface to users. Driver can still function with minor version change
as it will just miss the new functionality added by the hardware.
The current implementation logs information message and skips adding
uncore sysfs entry for a resource for any version mismatch. Check major
and minor version mismatch for every valid resource and fail on any major
version mismatch by logging an error message. A valid resource has a
version which is not 0xFF.
If there is mismatch with the minor version, continue with a log message.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20231003184916.1860084-4-srinivas.pandruvada@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
The hardware definition of every TPMI feature contains a major and minor
version. When there is a change in the MMIO offset or change in the
definition of a field, hardware will change major version. For addition
of new fields without modifying existing MMIO offsets or fields, only the
minor version is changed.
Driver is developed to support SST functionality for a major and minor
version. If the hardware changes major version, since offsets and
definitions are changed, driver cannot continue to provide SST interface
to users. Driver can still function with a minor version change as it will
just miss the new functionality added by the hardware. The current
implementation doesn't ignore any version change.
If there is mismatch with the minor version, continue with an information
log message. If there is mismatch with the major version, log error and
exit.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20231003184916.1860084-3-srinivas.pandruvada@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>