A recent DRM series purporting to simplify support for "transparent
bridges" and handling of probe deferrals ironically exposed a
use-after-free issue on pmic_glink_altmode probe deferral.
This has manifested itself as the display subsystem occasionally failing
to initialise and NULL-pointer dereferences during boot of machines like
the Lenovo ThinkPad X13s.
Specifically, the dp-hpd bridge is currently registered before all
resources have been acquired which means that it can also be
deregistered on probe deferrals.
In the meantime there is a race window where the new aux bridge driver
(or PHY driver previously) may have looked up the dp-hpd bridge and
stored a (non-reference-counted) pointer to the bridge which is about to
be deallocated.
When the display controller is later initialised, this triggers a
use-after-free when attaching the bridges:
dp -> aux -> dp-hpd (freed)
which may, for example, result in the freed bridge failing to attach:
[drm:drm_bridge_attach [drm]] *ERROR* failed to attach bridge /soc@0/phy@88eb000 to encoder TMDS-31: -16
or a NULL-pointer dereference:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
...
Call trace:
drm_bridge_attach+0x70/0x1a8 [drm]
drm_aux_bridge_attach+0x24/0x38 [aux_bridge]
drm_bridge_attach+0x80/0x1a8 [drm]
dp_bridge_init+0xa8/0x15c [msm]
msm_dp_modeset_init+0x28/0xc4 [msm]
The DRM bridge implementation is clearly fragile and implicitly built on
the assumption that bridges may never go away. In this case, the fix is
to move the bridge registration in the pmic_glink_altmode driver to
after all resources have been looked up.
Incidentally, with the new dp-hpd bridge implementation, which registers
child devices, this is also a requirement due to a long-standing issue
in driver core that can otherwise lead to a probe deferral loop (see
commit fbc35b45f9 ("Add documentation on meaning of -EPROBE_DEFER")).
[DB: slightly fixed commit message by adding the word 'commit']
Fixes: 080b4e2485 ("soc: qcom: pmic_glink: Introduce altmode support")
Fixes: 2bcca96abf ("soc: qcom: pmic-glink: switch to DRM_AUX_HPD_BRIDGE")
Cc: <stable@vger.kernel.org> # 6.3
Cc: Bjorn Andersson <andersson@kernel.org>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240217150228.5788-4-johan+linaro@kernel.org
Combining allocation and registration is an anti-pattern that should be
avoided. Add two new functions for allocating and registering an dp-hpd
bridge with a proper 'devm' prefix so that it is clear that these are
device managed interfaces.
devm_drm_dp_hpd_bridge_alloc()
devm_drm_dp_hpd_bridge_add()
The new interface will be used to fix a use-after-free bug in the
Qualcomm PMIC GLINK driver and may prevent similar issues from being
introduced elsewhere.
The existing drm_dp_hpd_bridge_register() is reimplemented using the
above and left in place for now.
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240217150228.5788-3-johan+linaro@kernel.org
snd_soc_card_get_kcontrol() must be holding a read lock on
card->controls_rwsem while walking the controls list.
Compare with snd_ctl_find_numid().
The existing function is renamed snd_soc_card_get_kcontrol_locked()
so that it can be called from contexts that are already holding
card->controls_rwsem (for example, control get/put functions).
There are few direct or indirect callers of
snd_soc_card_get_kcontrol(), and most are safe. Three require
changes, which have been included in this patch:
codecs/cs35l45.c:
cs35l45_activate_ctl() is called from a control put() function so
is changed to call snd_soc_card_get_kcontrol_locked().
codecs/cs35l56.c:
cs35l56_sync_asp1_mixer_widgets_with_firmware() is called from
control get()/put() functions so is changed to call
snd_soc_card_get_kcontrol_locked().
fsl/fsl_xcvr.c:
fsl_xcvr_activate_ctl() is called from three places, one of which
already holds card->controls_rwsem:
1. fsl_xcvr_mode_put(), a control put function, which will
already be holding card->controls_rwsem.
2. fsl_xcvr_startup(), a DAI startup function.
3. fsl_xcvr_shutdown(), a DAI shutdown function.
To fix this, fsl_xcvr_activate_ctl() has been changed to call
snd_soc_card_get_kcontrol_locked() so that it is safe to call
directly from fsl_xcvr_mode_put().
The fsl_xcvr_startup() and fsl_xcvr_shutdown() functions have been
changed to take a read lock on card->controls_rsem() around calls
to fsl_xcvr_activate_ctl(). While this is not very elegant, it
keeps the change small, to avoid this patch creating a large
collateral churn in fsl/fsl_xcvr.c.
Analysis of other callers of snd_soc_card_get_kcontrol() is that
they do not need any changes, they are not holding card->controls_rwsem
when they call snd_soc_card_get_kcontrol().
Direct callers of snd_soc_card_get_kcontrol():
fsl/fsl_spdif.c: fsl_spdif_dai_probe() - DAI probe function
fsl/fsl_micfil.c: voice_detected_fn() - IRQ handler
Indirect callers via soc_component_notify_control():
codecs/cs42l43: cs42l43_mic_shutter() - IRQ handler
codecs/cs42l43: cs42l43_spk_shutter() - IRQ handler
codecs/ak4118.c: ak4118_irq_handler() - IRQ handler
codecs/wm_adsp.c: wm_adsp_write_ctl() - not currently used
Indirect callers via snd_soc_limit_volume():
qcom/sc8280xp.c: sc8280xp_snd_init() - DAIlink init function
ti/rx51.c: rx51_aic34_init() - DAI init function
I don't have hardware to test the fsl/*, qcom/sc828xp.c, ti/rx51.c
and ak4118.c changes.
Backport note:
The fsl/, qcom/, cs35l45, cs35l56 and cs42l43 callers were added
since the Fixes commit so won't all be present on older kernels.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 209c6cdfd2 ("ASoC: soc-card: move snd_soc_card_get_kcontrol() to soc-card")
Link: https://lore.kernel.org/r/20240221123710.690224-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
RISC-V Devicetree fixes for v6.8-rc6
Two fixes for W=2 issues in devicetrees, which should constitute fixes
for all reasonable-to-fix W=2 problems on RISC-V. The others are caused
by standard USB and MMC property names containing underscores that are
not likely to ever change.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'riscv-dt-fixes-for-v6.8-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
riscv: dts: sifive: add missing #interrupt-cells to pmic
riscv: dts: starfive: replace underscores in node names
Link: https://lore.kernel.org/r/20240221-foil-glade-09dbf1aa3fe2@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Commit 3ce4f9c3fb ("net/ps3_gelic_net: Add gelic_descr structures") of
6.8-rc1 had a copy-and-paste error where the pointer that holds the
allocated SKB (struct gelic_descr.skb) was set to NULL after the SKB was
allocated. This resulted in a kernel panic when the SKB pointer was
accessed.
This fix moves the initialization of the gelic_descr to before the SKB
is allocated.
Reported-by: sambat goson <sombat3960@gmail.com>
Fixes: 3ce4f9c3fb ("net/ps3_gelic_net: Add gelic_descr structures")
Signed-off-by: Geoff Levand <geoff@infradead.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 5d93cfcf73 ("net: dpaa: Convert to phylink"), we support
the "10gbase-r" phy-mode through a driver-based conversion of "xgmii",
but we still don't actually support it when the device tree specifies
"10gbase-r" proper.
This is because boards such as LS1046A-RDB do not define pcs-handle-names
(for whatever reason) in the ethernet@f0000 device tree node, and the
code enters through this code path:
err = of_property_match_string(mac_node, "pcs-handle-names", "xfi");
// code takes neither branch and falls through
if (err >= 0) {
(...)
} else if (err != -EINVAL && err != -ENODATA) {
goto _return_fm_mac_free;
}
(...)
/* For compatibility, if pcs-handle-names is missing, we assume this
* phy is the first one in pcsphy-handle
*/
err = of_property_match_string(mac_node, "pcs-handle-names", "sgmii");
if (err == -EINVAL || err == -ENODATA)
pcs = memac_pcs_create(mac_node, 0); // code takes this branch
else if (err < 0)
goto _return_fm_mac_free;
else
pcs = memac_pcs_create(mac_node, err);
// A default PCS is created and saved in "pcs"
// This determination fails and mistakenly saves the default PCS
// memac->sgmii_pcs instead of memac->xfi_pcs, because at this
// stage, mac_dev->phy_if == PHY_INTERFACE_MODE_10GBASER.
if (err && mac_dev->phy_if == PHY_INTERFACE_MODE_XGMII)
memac->xfi_pcs = pcs;
else
memac->sgmii_pcs = pcs;
In other words, in the absence of pcs-handle-names, the default
xfi_pcs assignment logic only works when in the device tree we have
PHY_INTERFACE_MODE_XGMII.
By reversing the order between the fallback xfi_pcs assignment and the
"xgmii" overwrite with "10gbase-r", we are able to support both values
in the device tree, with identical behavior.
Currently, it is impossible to make the s/xgmii/10gbase-r/ device tree
conversion, because it would break forward compatibility (new device
tree with old kernel). The only way to modify existing device trees to
phy-interface-mode = "10gbase-r" is to fix stable kernels to accept this
value and handle it properly.
One reason why the conversion is desirable is because with pre-phylink
kernels, the Aquantia PHY driver used to warn about the improper use
of PHY_INTERFACE_MODE_XGMII [1]. It is best to have a single (latest)
device tree that works with all supported stable kernel versions.
Note that the blamed commit does not constitute a regression per se.
Older stable kernels like 6.1 still do not work with "10gbase-r", but
for a different reason. That is a battle for another time.
[1] https://lore.kernel.org/netdev/20240214-ls1046-dts-use-10gbase-r-v1-1-8c2d68547393@concurrent-rt.com/
Fixes: 5d93cfcf73 ("net: dpaa: Convert to phylink")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return used most significant bits from sample bit-width rather than the whole
physical sample word size. The starting bit offset is defined in the format
itself.
The behaviour is not changed for 32-bit formats like S32_LE. But with this
change - msbits value 24 instead 32 is returned for 24-bit formats like S24_LE
etc.
Also, commit 2112aa0349 ("ALSA: pcm: Introduce MSBITS subformat interface")
compares sample bit-width not physical sample bit-width to reset MSBITS_MAX bit
from the subformat bitmask.
Probably no applications are using msbits value for other than S32_LE/U32_LE
formats, because no drivers are reducing msbits value for other formats (with
the msb offset) at the moment.
For sanity, increase PCM protocol version, letting the user space to detect
the changed behaviour.
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20240222173649.1447549-1-perex@perex.cz
Signed-off-by: Takashi Iwai <tiwai@suse.de>
When a station has not been uploaded yet, receiving SMPS or channel width
notification action frames can lead to rate_control_rate_update calling
drv_sta_rc_update with uninitialized driver private data.
Fix this by adding a missing check for sta->uploaded.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://msgid.link/20240221140535.16102-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The PTDMA driver sets DMA masks in two different places for the same
device inconsistently. First call is in pt_pci_probe(), where it uses
48bit mask. The second call is in pt_dmaengine_register(), where it
uses a 64bit mask. Using 64bit dma mask causes IO_PAGE_FAULT errors
on DMA transfers between main memory and other devices.
Without the extra call it works fine. Additionally the second call
doesn't check the return value so it can silently fail.
Remove the superfluous dma_set_mask() call and only use 48bit mask.
Cc: stable@vger.kernel.org
Fixes: b0b4a6b105 ("dmaengine: ptdma: register PTDMA controller as a DMA resource")
Reviewed-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Tadeusz Struk <tstruk@gigaio.com>
Link: https://lore.kernel.org/r/20240222163053.13842-1-tstruk@gigaio.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
All the checks currently done in kvm_check_cpucfg can be realized with
early returns, so just do that to avoid extra cognitive burden related
to the return value handling.
While at it, clean up comments of _kvm_get_cpucfg_mask() and
kvm_check_cpucfg(), by removing comments that are merely restatement of
the code nearby, and paraphrasing the rest so they read more natural for
English speakers (that likely are not familiar with the actual Chinese-
influenced grammar).
No functional changes intended.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The function is not actually a getter of guest CPUCFG, but rather
validation of the input CPUCFG ID plus information about the supported
bit flags of that CPUCFG leaf. So rename it to avoid confusion.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The range check for the CPUCFG ID is wrong (should have been a ||
instead of &&) and useless in effect, so fix the obvious mistake.
Furthermore, the juggling of the temp return value is unnecessary,
because it is semantically equivalent and more readable to just
return at every switch case's end. This is done too to avoid potential
bugs in the future related to the unwanted complexity.
Also, the return value of _kvm_get_cpucfg is meant to be checked, but
this was not done, so bad CPUCFG IDs wrongly fall back to the default
case and 0 is incorrectly returned; check the return value to fix the
UAPI behavior.
While at it, also remove the redundant range check in kvm_check_cpucfg,
because out-of-range CPUCFG IDs are already rejected by the -EINVAL
as returned by _kvm_get_cpucfg().
Fixes: db1ecca22e ("LoongArch: KVM: Add LSX (128bit SIMD) support")
Fixes: 118e10cd89 ("LoongArch: KVM: Add LASX (256bit SIMD) support")
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The unflatten_and_copy_device_tree() function contains a call to
memblock_alloc(). This means that memblock is allocating memory before
any of the reserved memory regions are set aside in the arch_mem_init()
function which calls early_init_fdt_scan_reserved_mem(). Therefore,
there is a possibility for memblock to allocate from any of the
reserved memory regions.
Hence, move the call to early_init_fdt_scan_reserved_mem() to be earlier
in the init sequence, so that the reserved memory regions are set aside
before any allocations are done using memblock.
Cc: stable@vger.kernel.org
Fixes: 88d4d957ed ("LoongArch: Add FDT booting support from efi system table")
Signed-off-by: Oreoluwa Babatunde <quic_obabatun@quicinc.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The PAPR spec spells the function name as
"ibm,reset-pe-dma-windows"
but in practice firmware uses the singular form:
"ibm,reset-pe-dma-window"
in the device tree. Since we have the wrong spelling in the RTAS
function table, reverse lookups (token -> name) fail and warn:
unexpected failed lookup for token 86
WARNING: CPU: 1 PID: 545 at arch/powerpc/kernel/rtas.c:659 __do_enter_rtas_trace+0x2a4/0x2b4
CPU: 1 PID: 545 Comm: systemd-udevd Not tainted 6.8.0-rc4 #30
Hardware name: IBM,9105-22A POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NL1060_028) hv:phyp pSeries
NIP [c0000000000417f0] __do_enter_rtas_trace+0x2a4/0x2b4
LR [c0000000000417ec] __do_enter_rtas_trace+0x2a0/0x2b4
Call Trace:
__do_enter_rtas_trace+0x2a0/0x2b4 (unreliable)
rtas_call+0x1f8/0x3e0
enable_ddw.constprop.0+0x4d0/0xc84
dma_iommu_dma_supported+0xe8/0x24c
dma_set_mask+0x5c/0xd8
mlx5_pci_init.constprop.0+0xf0/0x46c [mlx5_core]
probe_one+0xfc/0x32c [mlx5_core]
local_pci_probe+0x68/0x12c
pci_call_probe+0x68/0x1ec
pci_device_probe+0xbc/0x1a8
really_probe+0x104/0x570
__driver_probe_device+0xb8/0x224
driver_probe_device+0x54/0x130
__driver_attach+0x158/0x2b0
bus_for_each_dev+0xa8/0x120
driver_attach+0x34/0x48
bus_add_driver+0x174/0x304
driver_register+0x8c/0x1c4
__pci_register_driver+0x68/0x7c
mlx5_init+0xb8/0x118 [mlx5_core]
do_one_initcall+0x60/0x388
do_init_module+0x7c/0x2a4
init_module_from_file+0xb4/0x108
idempotent_init_module+0x184/0x34c
sys_finit_module+0x90/0x114
And oopses are possible when lockdep is enabled or the RTAS
tracepoints are active, since those paths dereference the result of
the lookup.
Use the correct spelling to match firmware's behavior, adjusting the
related constants to match.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 8252b88294 ("powerpc/rtas: improve function information lookups")
Reported-by: Gaurav Batra <gbatra@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240222-rtas-fix-ibm-reset-pe-dma-window-v1-1-7aaf235ac63c@linux.ibm.com
When kdump kernel tries to copy dump data over SR-IOV, LPAR panics due
to NULL pointer exception:
Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc000000020847ad4
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: mlx5_core(+) vmx_crypto pseries_wdt papr_scm libnvdimm mlxfw tls psample sunrpc fuse overlay squashfs loop
CPU: 12 PID: 315 Comm: systemd-udevd Not tainted 6.4.0-Test102+ #12
Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_008) hv:phyp pSeries
NIP: c000000020847ad4 LR: c00000002083b2dc CTR: 00000000006cd18c
REGS: c000000029162ca0 TRAP: 0300 Not tainted (6.4.0-Test102+)
MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 48288244 XER: 00000008
CFAR: c00000002083b2d8 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 1
...
NIP _find_next_zero_bit+0x24/0x110
LR bitmap_find_next_zero_area_off+0x5c/0xe0
Call Trace:
dev_printk_emit+0x38/0x48 (unreliable)
iommu_area_alloc+0xc4/0x180
iommu_range_alloc+0x1e8/0x580
iommu_alloc+0x60/0x130
iommu_alloc_coherent+0x158/0x2b0
dma_iommu_alloc_coherent+0x3c/0x50
dma_alloc_attrs+0x170/0x1f0
mlx5_cmd_init+0xc0/0x760 [mlx5_core]
mlx5_function_setup+0xf0/0x510 [mlx5_core]
mlx5_init_one+0x84/0x210 [mlx5_core]
probe_one+0x118/0x2c0 [mlx5_core]
local_pci_probe+0x68/0x110
pci_call_probe+0x68/0x200
pci_device_probe+0xbc/0x1a0
really_probe+0x104/0x540
__driver_probe_device+0xb4/0x230
driver_probe_device+0x54/0x130
__driver_attach+0x158/0x2b0
bus_for_each_dev+0xa8/0x130
driver_attach+0x34/0x50
bus_add_driver+0x16c/0x300
driver_register+0xa4/0x1b0
__pci_register_driver+0x68/0x80
mlx5_init+0xb8/0x100 [mlx5_core]
do_one_initcall+0x60/0x300
do_init_module+0x7c/0x2b0
At the time of LPAR dump, before kexec hands over control to kdump
kernel, DDWs (Dynamic DMA Windows) are scanned and added to the FDT.
For the SR-IOV case, default DMA window "ibm,dma-window" is removed from
the FDT and DDW added, for the device.
Now, kexec hands over control to the kdump kernel.
When the kdump kernel initializes, PCI busses are scanned and IOMMU
group/tables created, in pci_dma_bus_setup_pSeriesLP(). For the SR-IOV
case, there is no "ibm,dma-window". The original commit: b1fc44eaa9,
fixes the path where memory is pre-mapped (direct mapped) to the DDW.
When TCEs are direct mapped, there is no need to initialize IOMMU
tables.
iommu_table_setparms_lpar() only considers "ibm,dma-window" property
when initiallizing IOMMU table. In the scenario where TCEs are
dynamically allocated for SR-IOV, newly created IOMMU table is not
initialized. Later, when the device driver tries to enter TCEs for the
SR-IOV device, NULL pointer execption is thrown from iommu_area_alloc().
The fix is to initialize the IOMMU table with DDW property stored in the
FDT. There are 2 points to remember:
1. For the dedicated adapter, kdump kernel would encounter both
default and DDW in FDT. In this case, DDW property is used to
initialize the IOMMU table.
2. A DDW could be direct or dynamic mapped. kdump kernel would
initialize IOMMU table and mark the existing DDW as
"dynamic". This works fine since, at the time of table
initialization, iommu_table_clear() makes some space in the
DDW, for some predefined number of TCEs which are needed for
kdump to succeed.
Fixes: b1fc44eaa9 ("pseries/iommu/ddw: Fix kdump to work in absence of ibm,dma-window")
Signed-off-by: Gaurav Batra <gbatra@linux.vnet.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240125203017.61014-1-gbatra@linux.ibm.com
Currently, mctp_local_output only takes ownership of skb on success, and
we may leak an skb if mctp_local_output fails in specific states; the
skb ownership isn't transferred until the actual output routing occurs.
Instead, make mctp_local_output free the skb on all error paths up to
the route action, so it always consumes the passed skb.
Fixes: 833ef3b91d ("mctp: Populate socket implementation")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240220081053.1439104-1-jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-02-20 (ice)
This series contains updates to ice driver only.
Yochai sets parent device to properly reflect connection state between
source DPLL and output pin.
Arkadiusz fixes additional issues related to DPLL; proper reporting of
phase_adjust value and preventing use/access of data while resetting.
Amritha resolves ASSERT_RTNL() being triggered on certain reset/rebuild
flows.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: Fix ASSERT_RTNL() warning during certain scenarios
ice: fix pin phase adjust updates on PF reset
ice: fix dpll periodic work data updates on PF reset
ice: fix dpll and dpll_pin data access on PF reset
ice: fix dpll input pin phase_adjust value updates
ice: fix connection state of DPLL and out pin
====================
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240220214444.1039759-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
syzkaller triggered following kasan splat:
BUG: KASAN: use-after-free in __skb_flow_dissect+0x19d1/0x7a50 net/core/flow_dissector.c:1170
Read of size 1 at addr ffff88812fb4000e by task syz-executor183/5191
[..]
kasan_report+0xda/0x110 mm/kasan/report.c:588
__skb_flow_dissect+0x19d1/0x7a50 net/core/flow_dissector.c:1170
skb_flow_dissect_flow_keys include/linux/skbuff.h:1514 [inline]
___skb_get_hash net/core/flow_dissector.c:1791 [inline]
__skb_get_hash+0xc7/0x540 net/core/flow_dissector.c:1856
skb_get_hash include/linux/skbuff.h:1556 [inline]
ip_tunnel_xmit+0x1855/0x33c0 net/ipv4/ip_tunnel.c:748
ipip_tunnel_xmit+0x3cc/0x4e0 net/ipv4/ipip.c:308
__netdev_start_xmit include/linux/netdevice.h:4940 [inline]
netdev_start_xmit include/linux/netdevice.h:4954 [inline]
xmit_one net/core/dev.c:3548 [inline]
dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3564
__dev_queue_xmit+0x7c1/0x3d60 net/core/dev.c:4349
dev_queue_xmit include/linux/netdevice.h:3134 [inline]
neigh_connected_output+0x42c/0x5d0 net/core/neighbour.c:1592
...
ip_finish_output2+0x833/0x2550 net/ipv4/ip_output.c:235
ip_finish_output+0x31/0x310 net/ipv4/ip_output.c:323
..
iptunnel_xmit+0x5b4/0x9b0 net/ipv4/ip_tunnel_core.c:82
ip_tunnel_xmit+0x1dbc/0x33c0 net/ipv4/ip_tunnel.c:831
ipgre_xmit+0x4a1/0x980 net/ipv4/ip_gre.c:665
__netdev_start_xmit include/linux/netdevice.h:4940 [inline]
netdev_start_xmit include/linux/netdevice.h:4954 [inline]
xmit_one net/core/dev.c:3548 [inline]
dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3564
...
The splat occurs because skb->data points past skb->head allocated area.
This is because neigh layer does:
__skb_pull(skb, skb_network_offset(skb));
... but skb_network_offset() returns a negative offset and __skb_pull()
arg is unsigned. IOW, we skb->data gets "adjusted" by a huge value.
The negative value is returned because skb->head and skb->data distance is
more than 64k and skb->network_header (u16) has wrapped around.
The bug is in the ip_tunnel infrastructure, which can cause
dev->needed_headroom to increment ad infinitum.
The syzkaller reproducer consists of packets getting routed via a gre
tunnel, and route of gre encapsulated packets pointing at another (ipip)
tunnel. The ipip encapsulation finds gre0 as next output device.
This results in the following pattern:
1). First packet is to be sent out via gre0.
Route lookup found an output device, ipip0.
2).
ip_tunnel_xmit for gre0 bumps gre0->needed_headroom based on the future
output device, rt.dev->needed_headroom (ipip0).
3).
ip output / start_xmit moves skb on to ipip0. which runs the same
code path again (xmit recursion).
4).
Routing step for the post-gre0-encap packet finds gre0 as output device
to use for ipip0 encapsulated packet.
tunl0->needed_headroom is then incremented based on the (already bumped)
gre0 device headroom.
This repeats for every future packet:
gre0->needed_headroom gets inflated because previous packets' ipip0 step
incremented rt->dev (gre0) headroom, and ipip0 incremented because gre0
needed_headroom was increased.
For each subsequent packet, gre/ipip0->needed_headroom grows until
post-expand-head reallocations result in a skb->head/data distance of
more than 64k.
Once that happens, skb->network_header (u16) wraps around when
pskb_expand_head tries to make sure that skb_network_offset() is unchanged
after the headroom expansion/reallocation.
After this skb_network_offset(skb) returns a different (and negative)
result post headroom expansion.
The next trip to neigh layer (or anything else that would __skb_pull the
network header) makes skb->data point to a memory location outside
skb->head area.
v2: Cap the needed_headroom update to an arbitarily chosen upperlimit to
prevent perpetual increase instead of dropping the headroom increment
completely.
Reported-and-tested-by: syzbot+bfde3bef047a81b8fde6@syzkaller.appspotmail.com
Closes: https://groups.google.com/g/syzkaller-bugs/c/fL9G6GtWskY/m/VKk_PR5FBAAJ
Fixes: 243aad830e ("ip_gre: include route header_len in max_headroom calculation")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240220135606.4939-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
BUG: KMSAN: uninit-value in nla_validate_range_unsigned lib/nlattr.c:222 [inline]
BUG: KMSAN: uninit-value in nla_validate_int_range lib/nlattr.c:336 [inline]
BUG: KMSAN: uninit-value in validate_nla lib/nlattr.c:575 [inline]
BUG: KMSAN: uninit-value in __nla_validate_parse+0x2e20/0x45c0 lib/nlattr.c:631
nla_validate_range_unsigned lib/nlattr.c:222 [inline]
nla_validate_int_range lib/nlattr.c:336 [inline]
validate_nla lib/nlattr.c:575 [inline]
...
The message in question matches this policy:
[NFTA_TARGET_REV] = NLA_POLICY_MAX(NLA_BE32, 255),
but because NLA_BE32 size in minlen array is 0, the validation
code will read past the malformed (too small) attribute.
Note: Other attributes, e.g. BITFIELD32, SINT, UINT.. are also missing:
those likely should be added too.
Reported-by: syzbot+3f497b07aa3baf2fb4d0@syzkaller.appspotmail.com
Reported-by: xingwei lee <xrivendell7@gmail.com>
Closes: https://lore.kernel.org/all/CABOYnLzFYHSnvTyS6zGa-udNX55+izqkOt2sB9WDqUcEGW6n8w@mail.gmail.com/raw
Fixes: ecaf75ffd5 ("netlink: introduce bigendian integer types")
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/20240221172740.5092-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This reports the currently used vram allocations.
userspace using this has been proposed for nvk, but
it's a rather trivial uapi addition.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This returns the BAR resources size so userspace can make
decisions based on rebar support.
userspace using this has been proposed for nvk, but
it's a rather trivial uapi addition.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The new riscv specific arch_hugetlb_migration_supported() must be
guarded with a #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION to avoid
the following build error:
In file included from include/linux/hugetlb.h:851,
from kernel/fork.c:52:
>> arch/riscv/include/asm/hugetlb.h:15:42: error: static declaration of 'arch_hugetlb_migration_supported' follows non-static declaration
15 | #define arch_hugetlb_migration_supported arch_hugetlb_migration_supported
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/hugetlb.h:916:20: note: in expansion of macro 'arch_hugetlb_migration_supported'
916 | static inline bool arch_hugetlb_migration_supported(struct hstate *h)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/riscv/include/asm/hugetlb.h:14:6: note: previous declaration of 'arch_hugetlb_migration_supported' with type 'bool(struct hstate *)' {aka '_Bool(struct hstate *)'}
14 | bool arch_hugetlb_migration_supported(struct hstate *h);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402110258.CV51JlEI-lkp@intel.com/
Fixes: ce68c03545 ("riscv: Fix arch_hugetlb_migration_supported() for NAPOT")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240211083640.756583-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
CALLER_ADDRx returns caller's address at specified level, they are used
for several tracers. These macros eventually use
__builtin_return_address(n) to get the caller's address if arch doesn't
define their own implementation.
In RISC-V, __builtin_return_address(n) only works when n == 0, we need
to walk the stack frame to get the caller's address at specified level.
data.level started from 'level + 3' due to the call flow of getting
caller's address in RISC-V implementation. If we don't have additional
three iteration, the level is corresponding to follows:
callsite -> return_address -> arch_stack_walk -> walk_stackframe
| | | |
level 3 level 2 level 1 level 0
Fixes: 10626c32e3 ("riscv/ftrace: Add basic support")
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Zong Li <zong.li@sifive.com>
Link: https://lore.kernel.org/r/20240202015102.26251-1-zong.li@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This single fix is also part of a larger cleanup, so I'm merging it
into my fixes branch so it can be shared with for-next.
* commit '8246601a7d391ce8207408149d65732f28af81a1':
riscv: tlb: fix __p*d_free_tlb()
Pull block fixes from Jens Axboe:
"Mostly just fixlets for md, but also a sed-opal parsing fix"
* tag 'block-6.8-2024-02-22' of git://git.kernel.dk/linux:
block: sed-opal: handle empty atoms when parsing response
md: Don't suspend the array for interrupted reshape
md: Don't register sync_thread for reshape directly
md: Make sure md_do_sync() will set MD_RECOVERY_DONE
md: Don't ignore read-only array in md_check_recovery()
md: Don't ignore suspended array in md_check_recovery()
md: Fix missing release of 'active_io' for flush
Pull iommufd fixes from Jason Gunthorpe:
- Fix dirty tracking bitmap collection when using reporting bitmaps
that are not neatly aligned to u64's or match the IO page table radix
tree layout.
- Add self tests to cover the cases that were found to be broken.
- Add missing enforcement of invalidation type in the uapi.
- Fix selftest config generation
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
selftests/iommu: fix the config fragment
iommufd: Reject non-zero data_type if no data_len is provided
iommufd/iova_bitmap: Consider page offset for the pages to be pinned
iommufd/selftest: Add mock IO hugepages tests
iommufd/selftest: Hugepage mock domain support
iommufd/selftest: Refactor mock_domain_read_and_clear_dirty()
iommufd/selftest: Refactor dirty bitmap tests
iommufd/iova_bitmap: Handle recording beyond the mapped pages
iommufd/selftest: Test u64 unaligned bitmaps
iommufd/iova_bitmap: Switch iova_bitmap::bitmap to an u8 array
iommufd/iova_bitmap: Bounds check mapped::pages access
Pull x86 platform driver fixes from Hans de Goede:
"Regression fixes:
- Fix INT0002 vGPIO events no longer working after 6.8 ACPI SCI
changes
- AMD-PMF: Fix laptops (e.g. Framework 13 AMD) hanging on suspend
- x86-android-tablets: Fix touchscreen no longer working on Lenovo
Yogabook
- x86-android-tablets: Fix serdev instantiation regression
- intel-vbtn: Fix ThinkPad X1 Tablet Gen2 no longer suspending
Bug fixes:
- think-lmi: Fix changing BIOS settings on Lenovo workstations
- touchscreen_dmi: Fix Hi8 Air touchscreen data sometimes missing
- AMD-PMF: Fix Smart PC support not working after suspend/resume
Other misc small fixes"
* tag 'platform-drivers-x86-v6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: thinkpad_acpi: Only update profile if successfully converted
platform/x86: intel-vbtn: Stop calling "VBDL" from notify_handler
platform/x86: x86-android-tablets: Fix acer_b1_750_goodix_gpios name
platform/x86: x86-android-tablets: Fix serdev instantiation no longer working
platform/x86: Add new get_serdev_controller() helper
platform/x86: x86-android-tablets: Fix keyboard touchscreen on Lenovo Yogabook1 X90
platform/x86/amd/pmf: Fix a potential race with policy binary sideload
platform/x86/amd/pmf: Fixup error handling for amd_pmf_init_smart_pc()
platform/x86/amd/pmf: Add debugging message for missing policy data
platform/x86/amd/pmf: Fix a suspend hang on Framework 13
platform/x86/amd/pmf: Fix TEE enact command failure after suspend and resume
platform/x86/amd/pmf: Remove smart_pc_status enum
platform/x86: touchscreen_dmi: Consolidate Goodix upside-down touchscreen data
platform/x86: touchscreen_dmi: Allow partial (prefix) matches for ACPI names
platform/x86: intel: int0002_vgpio: Pass IRQF_ONESHOT to request_irq()
platform/x86: think-lmi: Fix password opcode ordering for workstations
Pull clk fixes from Stephen Boyd:
"Here are some Samsung clk driver fixes I've been sitting on for far
too long.
They fix the bindings and clk driver for the Google GS101 SoC"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: samsung: clk-gs101: comply with the new dt cmu_misc clock names
dt-bindings: clock: gs101: rename cmu_misc clock-names
Pull vfs fixes from Christian Brauner:
- Fix a memory leak in cachefiles
- Restrict aio cancellations to I/O submitted through the aio
interfaces as this is otherwise causing issues for I/O submitted
via io_uring
- Increase buffer for afs volume status to avoid overflow
- Fix a missing zero-length check in unbuffered writes in the
netfs library. If generic_write_checks() returns zero make
netfs_unbuffered_write_iter() return right away
- Prevent a leak in i_dio_count caused by netfs_begin_read() operating
past i_size. It will return early and leave i_dio_count incremented
- Account for ipv4 addresses as well as ipv6 addresses when processing
incoming callbacks in afs
* tag 'vfs-6.8-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio
afs: Increase buffer size in afs_update_volume_status()
afs: Fix ignored callbacks over ipv4
cachefiles: fix memory leak in cachefiles_add_cache()
netfs: Fix missing zero-length check in unbuffered write
netfs: Fix i_dio_count leak on DIO read past i_size
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf and netfilter.
Current release - regressions:
- af_unix: fix another unix GC hangup
Previous releases - regressions:
- core: fix a possible AF_UNIX deadlock
- bpf: fix NULL pointer dereference in sk_psock_verdict_data_ready()
- netfilter: nft_flow_offload: release dst in case direct xmit path
is used
- bridge: switchdev: ensure MDB events are delivered exactly once
- l2tp: pass correct message length to ip6_append_data
- dccp/tcp: unhash sk from ehash for tb2 alloc failure after
check_estalblished()
- tls: fixes for record type handling with PEEK
- devlink: fix possible use-after-free and memory leaks in
devlink_init()
Previous releases - always broken:
- bpf: fix an oops when attempting to read the vsyscall page through
bpf_probe_read_kernel
- sched: act_mirred: use the backlog for mirred ingress
- netfilter: nft_flow_offload: fix dst refcount underflow
- ipv6: sr: fix possible use-after-free and null-ptr-deref
- mptcp: fix several data races
- phonet: take correct lock to peek at the RX queue
Misc:
- handful of fixes and reliability improvements for selftests"
* tag 'net-6.8.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
l2tp: pass correct message length to ip6_append_data
net: phy: realtek: Fix rtl8211f_config_init() for RTL8211F(D)(I)-VD-CG PHY
selftests: ioam: refactoring to align with the fix
Fix write to cloned skb in ipv6_hop_ioam()
phonet/pep: fix racy skb_queue_empty() use
phonet: take correct lock to peek at the RX queue
net: sparx5: Add spinlock for frame transmission from CPU
net/sched: flower: Add lock protection when remove filter handle
devlink: fix port dump cmd type
net: stmmac: Fix EST offset for dwmac 5.10
tools: ynl: don't leak mcast_groups on init error
tools: ynl: make sure we always pass yarg to mnl_cb_run
net: mctp: put sock on tag allocation failure
netfilter: nf_tables: use kzalloc for hook allocation
netfilter: nf_tables: register hooks last when adding new chain/flowtable
netfilter: nft_flow_offload: release dst in case direct xmit path is used
netfilter: nft_flow_offload: reset dst in route object after setting up flow
netfilter: nf_tables: set dormant flag on hook register failure
selftests: tls: add test for peeking past a record of a different type
selftests: tls: add test for merging of same-type control messages
...
On Tegra186, secure world applications may need to access host1x
during suspend/resume, and rely on the kernel to keep Host1x out
of reset during the suspend cycle. As such, as a quirk,
skip asserting Host1x's reset on Tegra186.
We don't need to keep the clocks enabled, as BPMP ensures the clock
stays on while Host1x is being used. On newer SoC's, the reset line
is inaccessible, so there is no need for the quirk.
Fixes: b7c00cdf6d ("gpu: host1x: Enable system suspend callbacks")
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222010517.1573931-1-cyndis@kapsi.fi