Commit Graph

229909 Commits

Author SHA1 Message Date
Roman Kisel
f285d99574 hyperv: Do not overlap the hvcall IO areas in hv_vtl_apicid_to_vp_id()
The Top-Level Functional Specification for Hyper-V, Section 3.6 [1, 2],
disallows overlapping of the input and output hypercall areas, and
hv_vtl_apicid_to_vp_id() overlaps them.

Use the output hypercall page of the current vCPU for the hypercall.

[1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/hypercall-interface
[2] https://github.com/MicrosoftDocs/Virtualization-Documentation/tree/main/tlfs

Reported-by: Michael Kelley <mhklinux@outlook.com>
Closes: https://lore.kernel.org/lkml/SN6PR02MB4157B98CD34781CC87A9D921D40D2@SN6PR02MB4157.namprd02.prod.outlook.com/
Signed-off-by: Roman Kisel <romank@linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250108222138.1623703-6-romank@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250108222138.1623703-6-romank@linux.microsoft.com>
2025-01-10 00:54:21 +00:00
Roman Kisel
07412e1f16 hyperv: Do not overlap the hvcall IO areas in get_vtl()
The Top-Level Functional Specification for Hyper-V, Section 3.6 [1, 2],
disallows overlapping of the input and output hypercall areas, and
get_vtl(void) does overlap them.

Use the output hypercall page of the current vCPU for the hypercall.

[1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/hypercall-interface
[2] https://github.com/MicrosoftDocs/Virtualization-Documentation/tree/main/tlfs

Fixes: 8387ce06d7 ("x86/hyperv: Set Virtual Trust Level in VMBus init message")
Signed-off-by: Roman Kisel <romank@linux.microsoft.com>
Reviewed-by: Tianyu Lan <tiala@microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250108222138.1623703-5-romank@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250108222138.1623703-5-romank@linux.microsoft.com>
2025-01-10 00:54:21 +00:00
Michael Kelley
a7ae41cd80 x86/hyperv: Don't assume cpu_possible_mask is dense
Current code allocates the hv_vp_assist_page array with size
num_possible_cpus(). This code assumes cpu_possible_mask is dense,
which is not true in the general case per [1]. If cpu_possible_mask
is sparse, the array might be indexed by a value beyond the size of
the array.

However, the configurations that Hyper-V provides to guest VMs on x86
hardware, in combination with how x86 code assigns Linux CPU numbers,
*does* always produce a dense cpu_possible_mask. So the dense assumption
is not currently causing failures. But for robustness against future
changes in how cpu_possible_mask is populated, update the code to no
longer assume dense.

The correct approach is to allocate the array with size "nr_cpu_ids".
While this leaves unused array entries corresponding to holes in
cpu_possible_mask, the holes are assumed to be minimal and hence the
amount of memory wasted by unused entries is minimal.

[1] https://lore.kernel.org/lkml/SN6PR02MB4157210CC36B2593F8572E5ED4692@SN6PR02MB4157.namprd02.prod.outlook.com/

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241003035333.49261-2-mhklinux@outlook.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20241003035333.49261-2-mhklinux@outlook.com>
2025-01-10 00:54:21 +00:00
Nuno Das Neves
962a4c7ea8 hyperv: Remove the now unused hyperv-tlfs.h files
Remove all hyperv-tlfs.h files. These are no longer included
anywhere. hyperv/hvhdk.h serves the same role, but with an easier
path for adding new definitions.

Remove the relevant lines in MAINTAINERS.

Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Link: https://lore.kernel.org/r/1732577084-2122-6-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1732577084-2122-6-git-send-email-nunodasneves@linux.microsoft.com>
2025-01-10 00:54:21 +00:00
Nuno Das Neves
ef5a3c92a8 hyperv: Switch from hyperv-tlfs.h to hyperv/hvhdk.h
Switch to using hvhdk.h everywhere in the kernel. This header
includes all the new Hyper-V headers in include/hyperv, which form a
superset of the definitions found in hyperv-tlfs.h.

This makes it easier to add new Hyper-V interfaces without being
restricted to those in the TLFS doc (reflected in hyperv-tlfs.h).

To be more consistent with the original Hyper-V code, the names of
some definitions are changed slightly. Update those where needed.

Update comments in mshyperv.h files to point to include/hyperv for
adding new definitions.

Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Signed-off-by: Roman Kisel <romank@linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Link: https://lore.kernel.org/r/1732577084-2122-5-git-send-email-nunodasneves@linux.microsoft.com
Link: https://lore.kernel.org/r/20250108222138.1623703-3-romank@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-01-10 00:54:21 +00:00
Cristian Ciocaltea
06835ccec2 arm64: defconfig: Enable Rockchip extensions for Synopsys DW HDMI QP
Enable Rockchip specific extensions for Synopsys DesignWare HDMI
Quad-Pixel (QP) driver.  This is needed to provide HDMI output support
for the boards based on RK3588 SoC.

Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Link: https://lore.kernel.org/r/20241202-dw-hdmi-qp-rk-defconfig-v1-1-38757fc053d0@collabora.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-10 00:21:25 +01:00
Nicolas Dufresne
893b4ea693 arm64: defconfig: Enable RFKILL GPIO
Enable as a module the RFKILL GPIO driver. This is needed to power WiFi
and Bluetooth radio on various RK3588 board. Without this module, rfkill
will report the switch has hard blocked, which prevents using the WiFi
or Bluetooth feature.

Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Link: https://lore.kernel.org/r/20241218-rk3588-rfkill-gpio-config-v1-1-dfdf464f97d4@collabora.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-10 00:20:14 +01:00
Joel Stanley
983833061d arm64: dts: qcom: x1e80100-romulus: Update firmware nodes
Other x1e machines use _dtbs.elf for these firmwares, which matches the
filenames shipped by Windows.

Fixes: 09d77be560 ("arm64: dts: qcom: Add support for X1-based Surface Laptop 7 devices")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250108124500.44011-1-joel@jms.id.au
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-01-09 17:13:58 -06:00
Shimrra Shai
ebe82df46f arm64: dts: rockchip: add DTs for Firefly ITX-3588J and its Core-3588J SoM
Add the device tree and Makefile update.

Signed-off-by: Shimrra Shai <shimrrashai@gmail.com>
Link: https://lore.kernel.org/r/20241216214152.58387-3-shimrrashai@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-10 00:13:16 +01:00
Jimmy Hon
c600d252dc arm64: dts: rockchip: Add Orange Pi 5 Max board
The RK3588 Single Board Computer includes
- eMMC
- microSD
- UART
- 2 PWM LEDs
- RTC
- RTL8125 network controller on PCIe 2.0x1.
- M.2 M-key connector routed to PCIe 3.0x4
- PWM controlled heat sink fan.
- 2 USB2 ports
- lower USB3 port
- upper USB3 port with OTG capability
- Mali GPU
- SPI NOR flash
- Mask Rom button
- Analog audio using es8388 codec via the headset jack and onboard mic
- HDMI0
- HDMI1

the vcc5v0_usb30 regulator shares the same enable gpio pin as the
vcc5v0_usb20 regulator.

The Orange Pi 5 Max and Orange Pi 5 Ultra are both credit-card sized
boards with similar layout, so these boards will share a common dtsi.
The 5 Max has an extra HDMI0 while the 5 Ultra has a HDMI IN instead.

Signed-off-by: Jimmy Hon <honyuenkwun@gmail.com>
Link: https://lore.kernel.org/r/20250109051619.1825-4-honyuenkwun@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-09 23:42:19 +01:00
Jimmy Hon
ea63f4666e arm64: dts: rockchip: refactor common rk3588-orangepi-5.dtsi
Orange Pi now has multiple SBCs using the RK3588.

Refactor the common parts of the Orange Pi 5 Plus DTS so it can be
shared with the 5 Max and the 5 Ultra.

Signed-off-by: Jimmy Hon <honyuenkwun@gmail.com>
Link: https://lore.kernel.org/r/20250109051619.1825-2-honyuenkwun@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-09 23:42:19 +01:00
Arnd Bergmann
00f63a0f5f Merge tag 'imx-fixes-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 6.13:

- Add fallback for i.MX8QM ESAI compatible to fix a dt-schema warning
  caused by bindings update
- Fix uSDHC1 clock for i.MX RT1050
- Enable SND_SOC_SPDIF in imx_v6_v7_defconfig to fix a regression caused
  by an i.MX6 SPDIF sound card change in DT
- Fix address length of i.MX95 netcmix_blk_ctrl

* tag 'imx-fixes-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imxrt1050: Fix clocks for mmc
  ARM: imx_v6_v7_defconfig: enable SND_SOC_SPDIF
  arm64: dts: imx95: correct the address length of netcmix_blk_ctrl
  arm64: dts: imx8-ss-audio: add fallback compatible string fsl,imx6ull-esai for esai

Link: https://lore.kernel.org/r/Z3Jf9zbv/xH3YzuB@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-01-09 22:56:15 +01:00
Arnd Bergmann
627522c8bf Merge tag 'qcom-arm64-fixes-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm Arm64 DeviceTree fixes for v6.13

Revert the enablement of OTG support on primary and secondary USB Type-C
controllers of X1 Elite, for now, as this broke support for USB hotplug.

Disable the TPDM DCC device on SA8775P, as this is inaccessible per
current firmware configuration. Also correct the PCIe "addr_space"
region to enable larger BAR sizes.

Also fix the address space of PCIe6a found in X1 Elite.

* tag 'qcom-arm64-fixes-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  arm64: dts: qcom: sa8775p: fix the secure device bootup issue
  Revert "arm64: dts: qcom: x1e80100: enable OTG on USB-C controllers"
  Revert "arm64: dts: qcom: x1e80100-crd: enable otg on usb ports"
  arm64: dts: qcom: x1e80100: Fix up BAR space size for PCIe6a
  Revert "arm64: dts: qcom: x1e78100-t14s: enable otg on usb-c ports"
  arm64: dts: qcom: sa8775p: Fix the size of 'addr_space' regions

Link: https://lore.kernel.org/r/20250103024945.4649-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-01-09 22:54:53 +01:00
Palmer Dabbelt
6f6ecce59d Merge patch series "SBI PMU event related fixes"
Atish Patra <atishp@rivosinc.com> says:

Here are two minor improvement/fixes in the PMU event path. The first patch
was part of the series[1]. The 2nd patch was suggested during the series
review.

While the series can only be merged once SBI v3.0 is frozen, these two
patches can be independent of SBI v3.0 and can be merged sooner. Hence, these
two patches are sent as a separate series.

* b4-shazam-merge:
  drivers/perf: riscv: Do not allow invalid raw event config
  drivers/perf: riscv: Return error for default case
  drivers/perf: riscv: Fix Platform firmware event data

Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-0-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09 09:37:12 -08:00
Atish Patra
fc58db9aeb drivers/perf: riscv: Fix Platform firmware event data
Platform firmware event data field is allowed to be 62 bits for
Linux as uppper most two bits are reserved to indicate SBI fw or
platform specific firmware events.
However, the event data field is masked as per the hardware raw
event mask which is not correct.

Fix the platform firmware event data field with proper mask.

Fixes: f0c9363db2 ("perf/riscv-sbi: Add platform specific firmware event handling")

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-1-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09 09:37:08 -08:00
Sebastian Reichel
3948b4a9bb arm64: dts: rockchip: add WLAN to rk3588-evb1 controller
The RK3588 EVB1 has an onboard AP6275P WLAN/BT module. This adds
support for the WLAN side, which is connected to the second
PCIe bus. The Bluetooth side is connected to UART and handled
separately.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Link: https://lore.kernel.org/r/20241210162452.116767-1-sebastian.reichel@collabora.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-09 16:28:52 +01:00
Peter Geis
3699f2c43e arm64: dts: rockchip: add hevc power domain clock to rk3328
There is a race condition at startup between disabling power domains not
used and disabling clocks not used on the rk3328. When the clocks are
disabled first, the hevc power domain fails to shut off leading to a
splat of failures. Add the hevc core clock to the rk3328 power domain
node to prevent this condition.

rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 3-.... }
1087 jiffies s: 89 root: 0x8/.
rcu: blocking rcu_node structures (internal RCU debug):
Sending NMI from CPU 0 to CPUs 3:
NMI backtrace for cpu 3
CPU: 3 UID: 0 PID: 86 Comm: kworker/3:3 Not tainted 6.12.0-rc5+ #53
Hardware name: Firefly ROC-RK3328-CC (DT)
Workqueue: pm genpd_power_off_work_fn
pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : regmap_unlock_spinlock+0x18/0x30
lr : regmap_read+0x60/0x88
sp : ffff800081123c00
x29: ffff800081123c00 x28: ffff2fa4c62cad80 x27: 0000000000000000
x26: ffffd74e6e660eb8 x25: ffff2fa4c62cae00 x24: 0000000000000040
x23: ffffd74e6d2f3ab8 x22: 0000000000000001 x21: ffff800081123c74
x20: 0000000000000000 x19: ffff2fa4c0412000 x18: 0000000000000000
x17: 77202c31203d2065 x16: 6c6469203a72656c x15: 6c6f72746e6f632d
x14: 7265776f703a6e6f x13: 2063766568206e69 x12: 616d6f64202c3431
x11: 347830206f742030 x10: 3430303034783020 x9 : ffffd74e6c7369e0
x8 : 3030316666206e69 x7 : 205d383738353733 x6 : 332e31202020205b
x5 : ffffd74e6c73fc88 x4 : ffffd74e6c73fcd4 x3 : ffffd74e6c740b40
x2 : ffff800080015484 x1 : 0000000000000000 x0 : ffff2fa4c0412000
Call trace:
regmap_unlock_spinlock+0x18/0x30
rockchip_pmu_set_idle_request+0xac/0x2c0
rockchip_pd_power+0x144/0x5f8
rockchip_pd_power_off+0x1c/0x30
_genpd_power_off+0x9c/0x180
genpd_power_off.part.0.isra.0+0x130/0x2a8
genpd_power_off_work_fn+0x6c/0x98
process_one_work+0x170/0x3f0
worker_thread+0x290/0x4a8
kthread+0xec/0xf8
ret_from_fork+0x10/0x20
rockchip-pm-domain ff100000.syscon:power-controller: failed to get ack on domain 'hevc', val=0x88220

Fixes: 52e02d377a ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs")
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Reviewed-by: Dragan Simic <dsimic@manjaro.org>
Link: https://lore.kernel.org/r/20241214224339.24674-1-pgwipeout@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-09 16:22:49 +01:00
Jakob Unterwurzacher
9d241b0680 arm64: dts: rockchip: increase gmac rx_delay on rk3399-puma
During mass manufacturing, we noticed the mmc_rx_crc_error counter,
as reported by "ethtool -S eth0 | grep mmc_rx_crc_error", to increase
above zero during nuttcp speedtests. Most of the time, this did not
affect the achieved speed, but it prompted this investigation.

Cycling through the rx_delay range on six boards (see table below) of
various ages shows that there is a large good region from 0x12 to 0x35
where we see zero crc errors on all tested boards.

The old rx_delay value (0x10) seems to have always been on the edge for
the KSZ9031RNX that is usually placed on Puma.

Choose "rx_delay = 0x23" to put us smack in the middle of the good
region. This works fine as well with the KSZ9131RNX PHY that was used
for a small number of boards during the COVID chip shortages.

	Board S/N        PHY        rx_delay good region
	---------        ---        --------------------
	Puma TT0069903   KSZ9031RNX 0x11 0x35
	Puma TT0157733   KSZ9031RNX 0x11 0x35
	Puma TT0681551   KSZ9031RNX 0x12 0x37
	Puma TT0681156   KSZ9031RNX 0x10 0x38
	Puma 17496030079 KSZ9031RNX 0x10 0x37 (Puma v1.2 from 2017)
	Puma TT0681720   KSZ9131RNX 0x02 0x39 (alternative PHY used in very few boards)

	Intersection of good regions = 0x12 0x35
	Middle of good region = 0x23

Fixes: 2c66fc34e9 ("arm64: dts: rockchip: add RK3399-Q7 (Puma) SoM")
Cc: stable@vger.kernel.org
Reviewed-by: Quentin Schulz <quentin.schulz@cherry.de>
Tested-by: Quentin Schulz <quentin.schulz@cherry.de> # Puma v2.1 and v2.3 with KSZ9031
Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzacher@cherry.de>
Link: https://lore.kernel.org/r/20241213-puma_rx_delay-v4-1-8e8e11cc6ed7@cherry.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-09 16:21:46 +01:00
Dragan Simic
3ca11da6e8 arm64: dts: rockchip: Delete redundant RK3328 GMAC stability fixes
Since the commit 8a469ee356 ("arm64: dts: rockchip: Add txpbl node for
RK3399/RK3328"), having "snps,txpbl" properties defined as Ethernet stability
fixes in RK3328-based board dts(i) files is redundant, because that commit
added the required fix to the RK3328 SoC dtsi, so let's delete them.

It has been determined that the Ethernet stability fixes no longer require
"snps,rxpbl", "snps,aal" and "snps,force_thresh_dma_mode" properties, [1][2]
out of which the last two also induce performance penalties, so let's delete
these properties from the relevant RK3328-based board dts(i) files.

This commit completes the removal of these redundant "snps,*" DT properties
that was started by a patch from Peter Geis. [3]

[1] https://lore.kernel.org/linux-rockchip/CAMdYzYpj3d7Rq0O0QjV4r6HEf_e07R0QAhPT2NheZdQV3TnQ6g@mail.gmail.com/
[2] https://lore.kernel.org/linux-rockchip/CAMdYzYpnx=pHJ+oqshgfZFp=Mfqp3TcMmEForqJ+s9KuhkgnqA@mail.gmail.com/
[3] https://lore.kernel.org/linux-rockchip/20241210013010.81257-7-pgwipeout@gmail.com/

Cc: Peter Geis <pgwipeout@gmail.com>
Acked-by: Peter Geis <pgwipeout@gmail.com>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Link: https://lore.kernel.org/r/fe05ecccc9fe27a678ad3e700ea022429f659724.1733943615.git.dsimic@manjaro.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-09 16:21:05 +01:00
Sumit Gupta
a5e6fc0a10 arm64: tegra: Disable Tegra234 sce-fabric node
Access to safety cluster engine (SCE) fabric registers was blocked
by firewall after the introduction of Functional Safety Island in
Tegra234. After that, any access by software to SCE registers is
correctly resulting in the internal bus error. However, when CPUs
try accessing the SCE-fabric registers to print error info,
another firewall error occurs as the fabric registers are also
firewall protected. This results in a second error to be printed.
Disable the SCE fabric node to avoid printing the misleading error.
The first error info will be printed by the interrupt from the
fabric causing the actual access.

Cc: stable@vger.kernel.org
Fixes: 302e154000 ("arm64: tegra: Add node for CBB 2.0 on Tegra234")
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Signed-off-by: Ivy Huang <yijuh@nvidia.com>
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20241218000737.1789569-3-yijuh@nvidia.com
Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-01-09 14:18:00 +01:00
Sumit Gupta
604120fd9e arm64: tegra: Fix typo in Tegra234 dce-fabric compatible
The compatible string for the Tegra DCE fabric is currently defined as
'nvidia,tegra234-sce-fabric' but this is incorrect because this is the
compatible string for SCE fabric. Update the compatible for the DCE
fabric to correct the compatible string.

This compatible needs to be correct in order for the interconnect
to catch things such as improper data accesses.

Cc: stable@vger.kernel.org
Fixes: 302e154000 ("arm64: tegra: Add node for CBB 2.0 on Tegra234")
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Signed-off-by: Ivy Huang <yijuh@nvidia.com>
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20241218000737.1789569-2-yijuh@nvidia.com
Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-01-09 14:17:41 +01:00
Akhil R
346bf459db arm64: tegra: Fix DMA ID for SPI2
DMA ID for SPI2 is '16'. Update the incorrect value in the devicetree.

Fixes: bb9667d818 ("arm64: tegra: Add SPI device tree nodes for Tegra234")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Link: https://lore.kernel.org/r/20241206105201.53596-1-akhilrajeev@nvidia.com
Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-01-09 14:17:06 +01:00
Alexander Gordeev
03b3e82a78 s390/tlb: Add missing TLB range adjustment
While converting to generic mmu_gather with commit 9de7d833e3
("s390/tlb: Convert to generic mmu_gather") __tlb_adjust_range()
is called from pte|pmd|p4d_free_tlb(), but not for pud_free_tlb().

__tlb_adjust_range() adjusts the span of TLB range to be flushed,
but s390 does not make use of it. Thus, this change is only for
consistency.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-09 13:44:48 +01:00
Thomas Weißschuh
3675a926fe sysfs: constify bin_attribute argument of sysfs_bin_attr_simple_read()
Most users use this function through the BIN_ATTR_SIMPLE* macros,
they can handle the switch transparently.
Also adapt the two non-macro users in the same change.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Tested-by: Aditya Gupta <adityag@linux.ibm.com>
Link: https://lore.kernel.org/r/20241228-sysfs-const-bin_attr-simple-v2-1-7c6f3f1767a3@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-09 10:43:58 +01:00
Nikunj A Dadhania
0563ee35ae x86/sev: Add the Secure TSC feature for SNP guests
Now that all the required plumbing is done for enabling Secure TSC, add it to
the SNP features present list.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
Link: https://lore.kernel.org/r/20250106124633.1418972-14-nikunj@amd.com
2025-01-09 10:21:56 +01:00
Stephan Gerhold
46316370e9 arm64: dts: qcom: msm8916-samsung-serranove: Add display panel
Add the Samsung S6E88A0-AMS427AP24 panel to the device tree for the
Samsung Galaxy S4 Mini Value Edition. By default the panel displays
everything horizontally flipped, so add "flip-horizontal" to the panel
node to correct that.

Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Co-developed-by: Jakob Hauser <jahau@rocketmail.com>
Signed-off-by: Jakob Hauser <jahau@rocketmail.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20241114220718.12248-1-jahau@rocketmail.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-01-08 17:04:29 -06:00
Neil Armstrong
9eb81b31ab arm64: dts: qcom: sm8650: Add 'global' interrupt to the PCIe RC nodes
Qcom PCIe RC controllers are capable of generating 'global' SPI interrupt
to the host CPUs. This interrupt can be used by the device driver to
identify events such as PCIe link specific events, safety events, etc...

Hence, add it to the PCIe RC node along with the existing MSI interrupts.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20241126-topic-sm8x50-pcie-global-irq-v1-3-4049cfccd073@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-01-08 16:44:05 -06:00
Neil Armstrong
3e14b14ec8 arm64: dts: qcom: sm8550: Add 'global' interrupt to the PCIe RC nodes
Qcom PCIe RC controllers are capable of generating 'global' SPI interrupt
to the host CPUs. This interrupt can be used by the device driver to
identify events such as PCIe link specific events, safety events, etc...

Hence, add it to the PCIe RC node along with the existing MSI interrupts.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20241126-topic-sm8x50-pcie-global-irq-v1-2-4049cfccd073@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-01-08 16:44:05 -06:00
Rob Herring (Arm)
6888a95590 arm64: dts: qcom: Remove unused and undocumented properties
Remove properties which are both unused in the kernel and undocumented.
Most likely they are leftovers from downstream.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20241115193435.3618831-1-robh@kernel.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-01-08 16:41:55 -06:00
Neil Armstrong
cddaf23136 arm64: dts: qcom: sdm450-lenovo-tbx605f: add DSI panel nodes
Add the necessary nodes to enable the DSI panel on the
Lenovo Smart Tab M10 tablet.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20241115-topic-sdm450-upstream-lab-ibb-v1-2-8a8e74befbfe@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-01-08 16:41:34 -06:00
Neil Armstrong
f8ed8fd084 arm64: dts: qcom: pmi8950: add LAB-IBB nodes
Add the PMI8950 LAB-IBB regulator nodes, with the
PMI8998 compatible as fallback.

The LAB-IBB regulators are used as panels supplies
on existing phones or tablets.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20241115-topic-sdm450-upstream-lab-ibb-v1-1-8a8e74befbfe@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-01-08 16:41:34 -06:00
Manikanta Mylavarapu
b6f4f8c769 arm64: dts: qcom: ipq5424: enable the download mode support
Enable support for download mode to collect RAM dumps in case
of system crash, facilitating post mortem analysis.

Signed-off-by: Manikanta Mylavarapu <quic_mmanikan@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20241204141416.1352545-3-quic_mmanikan@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-01-08 16:37:55 -06:00
Manikanta Mylavarapu
2561c1377d arm64: dts: qcom: ipq5424: add scm node
Add an scm node to interact with the secure world.

Signed-off-by: Manikanta Mylavarapu <quic_mmanikan@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20241204133627.1341760-3-quic_mmanikan@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-01-08 16:35:48 -06:00
Maxim Levitsky
37c3ddfe52 KVM: VMX: read the PML log in the same order as it was written
Intel's PRM specifies that the CPU writes to the PML log 'backwards'
or in other words, it first writes entry 511, then entry 510 and so on.

I also confirmed on the bare metal that the CPU indeed writes the entries
in this order.

KVM on the other hand, reads the entries in the opposite order, from the
last written entry and towards entry 511 and dumps them in this order to
the dirty ring.

Usually this doesn't matter, except for one complex nesting case:

KVM reties the instructions that cause MMU faults.
This might cause an emulated PML log entry to be visible to L1's hypervisor
before the actual memory write was committed.

This happens when the L0 MMU fault is followed directly by the VM exit to
L1, for example due to a pending L1 interrupt or due to the L1's
'PML log full' event.

This problem doesn't have a noticeable real-world impact because this
write retry is not much different from the guest writing to the same page
multiple times, which is also not reflected in the dirty log. The users of
the dirty logging only rely on correct reporting of the clean pages, or
in other words they assume that if a page is clean, then no writes were
committed to it since the moment it was marked clean.

However KVM has a kvm_dirty_log_test selftest, a test that tests both
the clean and the dirty pages vs the memory contents, and can fail if it
detects a dirty page which has an old value at the offset 0 which the test
writes.

To avoid failure, the test has a workaround for this specific problem:

The test skips checking memory that belongs to the last dirty ring entry,
which it has seen, relying on the fact that as long as memory writes are
committed in-order, only the last entry can belong to a not yet committed
memory write.

However, since L1's KVM is reading the PML log in the opposite direction
that L0 wrote it, the last dirty ring entry often will be not the last
entry written by the L0.

To fix this, switch the order in which KVM reads the PML log.

Note that this issue is not present on the bare metal, because on the
bare metal, an update of the A/D bits of a present entry, PML logging and
the actual memory write are all done by the CPU without any hypervisor
intervention and pending interrupt evaluation, thus once a PML log and/or
vCPU kick happens, all memory writes that are in the PML log are
committed to memory.

The only exception to this rule is when the guest hits a not present EPT
entry, in which case KVM first reads (backward) the PML log, dumps it to
the dirty ring, and *then* sets up a SPTE entry with A/D bits set, and logs
this to the dirty ring, thus making the entry be the last one in the
dirty ring.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20241219221034.903927-3-mlevitsk@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-01-08 14:31:25 -08:00
Maxim Levitsky
ae81ce936f KVM: VMX: refactor PML terminology
Rename PML_ENTITY_NUM to PML_LOG_NR_ENTRIES
Add PML_HEAD_INDEX to specify the first entry that CPU writes.

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20241219221034.903927-2-mlevitsk@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-01-08 14:19:25 -08:00
Gao Shiyuan
4d141e444e KVM: VMX: Fix comment of handle_vmx_instruction()
Fix a goof in handle_vmx_instruction()'s comment where it references the
non-existent nested_vmx_setup(); the function that overwrites the exit
handlers is nested_vmx_hardware_setup().

Note, this isn't a case of a stale comment, e.g. due to the function being
renamed.  The comment has always been wrong.

Fixes: e4027cfafd ("KVM: nVMX: Set callbacks for nested functions during hardware setup")
Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
Link: https://lore.kernel.org/r/20250103153814.73903-1-gaoshiyuan@baidu.com
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-01-08 14:12:31 -08:00
Costas Argyris
b5fd068473 KVM: VMX: Reinstate __exit attribute for vmx_exit()
Tag vmx_exit() with __exit now that it's no longer used by vmx_init().

Commit a7b9020b06 ("x86/l1tf: Handle EPT disabled state proper") dropped
the "__exit" attribute from vmx_exit() because vmx_init() was changed to
call vmx_exit().

However, commit e32b120071 ("KVM: VMX: Do _all_ initialization before
exposing /dev/kvm to userspace") changed vmx_init() to call __vmx_exit()
instead of the "full" vmx_exit().  This made it possible to mark vmx_exit()
as "__exit" again, as it originally was, and enjoy the benefits that it
provides (the function can be discarded from memory in situations where it
cannot be called, like the module being built-in or module unloading being
disabled in the kernel).

Signed-off-by: Costas Argyris <costas.argyris@amd.com>
Link: https://lore.kernel.org/r/20250102154050.2403-1-costas.argyris@amd.com
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-01-08 14:10:05 -08:00
Thorsten Blum
800173cf75 KVM: SVM: Use str_enabled_disabled() helper in sev_hardware_setup()
Remove hard-coded strings by using the str_enabled_disabled() helper
function.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Pavan Kumar Paluri <papaluri@amd.com>
Reviewed-by: Nikunj A Dadhania <nikunj@amd.com>
Link: https://lore.kernel.org/r/20241227094450.674104-2-thorsten.blum@linux.dev
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-01-08 14:08:41 -08:00
Sean Christopherson
4c20cd4cee KVM: x86: Avoid double RDPKRU when loading host/guest PKRU
Use the raw wrpkru() helper when loading the guest/host's PKRU on switch
to/from guest context, as the write_pkru() wrapper incurs an unnecessary
rdpkru().  In both paths, KVM is guaranteed to have performed RDPKRU since
the last possible write, i.e. KVM has a fresh cache of the current value
in hardware.

This effectively restores KVM's behavior to that of KVM prior to commit
c806e88734 ("x86/pkeys: Provide *pkru() helpers"), which renamed the raw
helper from __write_pkru() => wrpkru(), and turned __write_pkru() into a
wrapper.  Commit 577ff465f5 ("x86/fpu: Only write PKRU if it is different
from current") then added the extra RDPKRU to avoid an unnecessary WRPKRU,
but completely missed that KVM already optimized away pointless writes.

Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 577ff465f5 ("x86/fpu: Only write PKRU if it is different from current")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20241221011647.3747448-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-01-08 14:08:25 -08:00
Liam Ni
d6470627f5 KVM: x86: Use LVT_TIMER instead of an open coded literal
Use LVT_TIMER instead of the literal '0' to clean up the apic_lvt_mask
lookup when emulating handling writes to APIC_LVTT.

No functional change intended.

Signed-off-by: Liam Ni <zhiguangni01@gmail.com>
[sean: manually regenerate patch (whitespace damaged), massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-01-08 14:05:19 -08:00
Nikunj A Dadhania
73bbf3b0fb x86/tsc: Init the TSC for Secure TSC guests
Use the GUEST_TSC_FREQ MSR to discover the TSC frequency instead of
relying on kvm-clock based frequency calibration.  Override both CPU and
TSC frequency calibration callbacks with securetsc_get_tsc_khz(). Since
the difference between CPU base and TSC frequency does not apply in this
case, the same callback is being used.

  [ bp: Carve out from
    https://lore.kernel.org/r/20250106124633.1418972-11-nikunj@amd.com ]

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20250106124633.1418972-11-nikunj@amd.com
2025-01-08 21:26:19 +01:00
Clément Léger
5cd900b8b7 riscv: use local label names instead of global ones in assembly
Local labels should be prefix by '.L' or they'll be exported in the
symbol table. Additionally, this messes up the backtrace by displaying
an incorrect symbol:

  ...
  [   12.751810] [<ffffffff80441628>] _copy_from_user+0x28/0xc2
  [   12.752035] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc
  [   12.752310] [<ffffffff80a033e8>] do_trap_load_misaligned+0x24/0xee
  [   12.752596] [<ffffffff80a0dcae>] _new_vmalloc_restore_context_a0+0xc2/0xce

After:
  ...
  [   10.243916] [<ffffffff804415e4>] _copy_from_user+0x28/0xc2
  [   10.244026] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc
  [   10.244150] [<ffffffff80a033a0>] do_trap_load_misaligned+0x24/0xee
  [   10.244268] [<ffffffff80a0dc66>] handle_exception+0x146/0x152

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Fixes: 503638e0ba ("riscv: Stop emitting preventive sfence.vma for new vmalloc mappings")
Link: https://lore.kernel.org/r/20250103141814.508865-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:46:14 -08:00
Guo Ren
40e6073e76 riscv: qspinlock: Fixup _Q_PENDING_LOOPS definition
When CONFIG_RISCV_QUEUED_SPINLOCKS=y, the _Q_PENDING_LOOPS
definition is missing. Add the _Q_PENDING_LOOPS definition for
pure qspinlock usage.

Fixes: ab83647fad ("riscv: Add qspinlock support")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241215135252.201983-1-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:46:01 -08:00
Clément Léger
51356ce60e riscv: stacktrace: fix backtracing through exceptions
Prior to commit 5d5fc33ce5 ("riscv: Improve exception and system call
latency"), backtrace through exception worked since ra was filled with
ret_from_exception symbol address and the stacktrace code checked 'pc' to
be equal to that symbol. Now that handle_exception uses regular 'call'
instructions, this isn't working anymore and backtrace stops at
handle_exception(). Since there are multiple call site to C code in the
exception handling path, rather than checking multiple potential return
addresses, add a new symbol at the end of exception handling and check pc
to be in that range.

Fixes: 5d5fc33ce5 ("riscv: Improve exception and system call latency")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241209155714.1239665-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:45:49 -08:00
Xu Lu
f754f27e98 riscv: mm: Fix the out of bound issue of vmemmap address
In sparse vmemmap model, the virtual address of vmemmap is calculated as:
((struct page *)VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT)).
And the struct page's va can be calculated with an offset:
(vmemmap + (pfn)).

However, when initializing struct pages, kernel actually starts from the
first page from the same section that phys_ram_base belongs to. If the
first page's physical address is not (phys_ram_base >> PAGE_SHIFT), then
we get an va below VMEMMAP_START when calculating va for it's struct page.

For example, if phys_ram_base starts from 0x82000000 with pfn 0x82000, the
first page in the same section is actually pfn 0x80000. During
init_unavailable_range(), we will initialize struct page for pfn 0x80000
with virtual address ((struct page *)VMEMMAP_START - 0x2000), which is
below VMEMMAP_START as well as PCI_IO_END.

This commit fixes this bug by introducing a new variable
'vmemmap_start_pfn' which is aligned with memory section size and using
it to calculate vmemmap address instead of phys_ram_base.

Fixes: a11dd49dcb ("riscv: Sparse-Memory/vmemmap out-of-bounds fix")
Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20241209122617.53341-1-luxu.kernel@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:45:34 -08:00
Nam Cao
13134cc949 riscv: kprobes: Fix incorrect address calculation
p->ainsn.api.insn is a pointer to u32, therefore arithmetic operations are
multiplied by four. This is clearly undesirable for this case.

Cast it to (void *) first before any calculation.

Below is a sample before/after. The dumped memory is two kprobe slots, the
first slot has

  - c.addiw a0, 0x1c (0x7125)
  - ebreak           (0x00100073)

and the second slot has:

  - c.addiw a0, -4   (0x7135)
  - ebreak           (0x00100073)

Before this patch:

(gdb) x/16xh 0xff20000000135000
0xff20000000135000:	0x7125	0x0000	0x0000	0x0000	0x7135	0x0010	0x0000	0x0000
0xff20000000135010:	0x0073	0x0010	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000

After this patch:

(gdb) x/16xh 0xff20000000125000
0xff20000000125000:	0x7125	0x0073	0x0010	0x0000	0x7135	0x0073	0x0010	0x0000
0xff20000000125010:	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000

Fixes: b1756750a3 ("riscv: kprobes: Use patch_text_nosync() for insn slots")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241119111056.2554419-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:39:39 -08:00
Nam Cao
6a97f4118a riscv: Fix sleeping in invalid context in die()
die() can be called in exception handler, and therefore cannot sleep.
However, die() takes spinlock_t which can sleep with PREEMPT_RT enabled.
That causes the following warning:

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 285, name: mutex
preempt_count: 110001, expected: 0
RCU nest depth: 0, expected: 0
CPU: 0 UID: 0 PID: 285 Comm: mutex Not tainted 6.12.0-rc7-00022-ge19049cf7d56-dirty #234
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
    dump_backtrace+0x1c/0x24
    show_stack+0x2c/0x38
    dump_stack_lvl+0x5a/0x72
    dump_stack+0x14/0x1c
    __might_resched+0x130/0x13a
    rt_spin_lock+0x2a/0x5c
    die+0x24/0x112
    do_trap_insn_illegal+0xa0/0xea
    _new_vmalloc_restore_context_a0+0xcc/0xd8
Oops - illegal instruction [#1]

Switch to use raw_spinlock_t, which does not sleep even with PREEMPT_RT
enabled.

Fixes: 76d2a0493a ("RISC-V: Init and Halt Code")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20241118091333.1185288-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:23:17 -08:00
Clément Léger
03f0b54853 riscv: module: remove relocation_head rel_entry member allocation
relocation_head's list_head member, rel_entry, doesn't need to be
allocated, its storage can just be part of the allocated relocation_head.
Remove the pointer which allows to get rid of the allocation as well as
an existing memory leak found by Kai Zhang using kmemleak.

Fixes: 8fd6c51423 ("riscv: Add remaining module relocations")
Reported-by: Kai Zhang <zhangkai@iscas.ac.cn>
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241128081636.3620468-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:22:52 -08:00
Frederic Weisbecker
b04e317b52 treewide: Introduce kthread_run_worker[_on_cpu]()
kthread_create() creates a kthread without running it yet. kthread_run()
creates a kthread and runs it.

On the other hand, kthread_create_worker() creates a kthread worker and
runs it.

This difference in behaviours is confusing. Also there is no way to
create a kthread worker and affine it using kthread_bind_mask() or
kthread_affine_preferred() before starting it.

Consolidate the behaviours and introduce kthread_run_worker[_on_cpu]()
that behaves just like kthread_run(). kthread_create_worker[_on_cpu]()
will now only create a kthread worker without starting it.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
2025-01-08 18:15:03 +01:00
Frederic Weisbecker
3a5446612a sched,arm64: Handle CPU isolation on last resort fallback rq selection
When a kthread or any other task has an affinity mask that is fully
offline or unallowed, the scheduler reaffines the task to all possible
CPUs as a last resort.

This default decision doesn't mix up very well with nohz_full CPUs that
are part of the possible cpumask but don't want to be disturbed by
unbound kthreads or even detached pinned user tasks.

Make the fallback affinity setting aware of nohz_full.

Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2025-01-08 18:14:23 +01:00