linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-03 04:05:33 -04:00

Author	SHA1	Message	Date
Miaoqian Lin	bf2b09fedc	fsl/fman: Fix missing put_device() call in fman_port_probe The reference taken by 'of_find_device_by_node()' must be released when not needed anymore. Add the corresponding 'put_device()' in the and error handling paths. Fixes: `18a6c85fcc` ("fsl/fman: Add FMan Port Support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-30 13:34:06 +00:00
Gal Pressman	992d8a4e38	net/mlx5e: Fix wrong features assignment in case of error In case of an error in mlx5e_set_features(), 'netdev->features' must be updated with the correct state of the device to indicate which features were updated successfully. To do that we maintain a copy of 'netdev->features' and update it after successful feature changes, so we can assign it to back to 'netdev->features' if needed. However, since not all netdev features are handled by the driver (e.g. GRO/TSO/etc), some features may not be updated correctly in case of an error updating another feature. For example, while requesting to disable TSO (feature which is not handled by the driver) and enable HW-GRO, if an error occurs during HW-GRO enable, 'oper_features' will be assigned with 'netdev->features' and HW-GRO turned off. TSO will remain enabled in such case, which is a bug. To solve that, instead of using 'netdev->features' as the baseline of 'oper_features' and changing it on set feature success, use 'features' instead and update it in case of errors. Fixes: `75b81ce719` ("net/mlx5e: Don't override netdev features field unless in error flow") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-28 22:42:50 -08:00
Roi Dayan	077cdda764	net/mlx5e: TC, Fix memory leak with rules with internal port Fix a memory leak with decap rule with internal port as destination device. The driver allocates a modify hdr action but doesn't set the flow attr modify hdr action which results in skipping releasing the modify hdr action when releasing the flow. backtrace: [<000000005f8c651c>] krealloc+0x83/0xd0 [<000000009f59b143>] alloc_mod_hdr_actions+0x156/0x310 [mlx5_core] [<000000002257f342>] mlx5e_tc_match_to_reg_set_and_get_id+0x12a/0x360 [mlx5_core] [<00000000b44ea75a>] mlx5e_tc_add_fdb_flow+0x962/0x1470 [mlx5_core] [<0000000003e384a0>] __mlx5e_add_fdb_flow+0x54c/0xb90 [mlx5_core] [<00000000ed8b22b6>] mlx5e_configure_flower+0xe45/0x4af0 [mlx5_core] [<00000000024f4ab5>] mlx5e_rep_indr_offload.isra.0+0xfe/0x1b0 [mlx5_core] [<000000006c3bb494>] mlx5e_rep_indr_setup_tc_cb+0x90/0x130 [mlx5_core] [<00000000d3dac2ea>] tc_setup_cb_add+0x1d2/0x420 Fixes: `b16eb3c81f` ("net/mlx5: Support internal port as decap route device") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-28 22:42:50 -08:00
Jakub Kicinski	9665e03a8d	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-12-28 This series contains updates to igc driver only. Vinicius disables support for crosstimestamp on i225-V as lockups are being observed. James McLaughlin fixes Tx timestamping support on non-MSI-X platforms. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igc: Fix TX timestamp support for non-MSI-X platforms igc: Do not enable crosstimestamping for i225-V models ==================== Link: https://lore.kernel.org/r/20211228182421.340354-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-28 16:19:10 -08:00
Christophe JAILLET	140c7bc7d1	ionic: Initialize the 'lif->dbid_inuse' bitmap When allocated, this bitmap is not initialized. Only the first bit is set a few lines below. Use bitmap_zalloc() to make sure that it is cleared before being used. Fixes: `6461b446f2` ("ionic: Add interrupts and doorbells") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Shannon Nelson <snelson@pensando.io> Link: https://lore.kernel.org/r/6a478eae0b5e6c63774e1f0ddb1a3f8c38fa8ade.1640527506.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-28 16:07:35 -08:00
James McLaughlin	f85846bbf4	igc: Fix TX timestamp support for non-MSI-X platforms Time synchronization was not properly enabled on non-MSI-X platforms. Fixes: `2c344ae245` ("igc: Add support for TX timestamping") Signed-off-by: James McLaughlin <james.mclaughlin@qsc.com> Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-28 09:54:11 -08:00
Vinicius Costa Gomes	1e81dcc1ab	igc: Do not enable crosstimestamping for i225-V models It was reported that when PCIe PTM is enabled, some lockups could be observed with some integrated i225-V models. While the issue is investigated, we can disable crosstimestamp for those models and see no loss of functionality, because those models don't have any support for time synchronization. Fixes: `a90ec84837` ("igc: Add support for PTP getcrosststamp()") Link: https://lore.kernel.org/all/924175a188159f4e03bd69908a91e606b574139b.camel@gmx.de/ Reported-by: Stefan Dietrich <roots@gmx.de> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-28 09:54:10 -08:00
Aleksander Jan Bajkowski	5be60a9453	net: lantiq_xrx200: fix statistics of received bytes Received frames have FCS truncated. There is no need to subtract FCS length from the statistics. Fixes: `fe1a56420c` ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-28 12:18:18 +00:00
Christophe JAILLET	1cd5384c88	net: ag71xx: Fix a potential double free in error handling paths 'ndev' is a managed resource allocated with devm_alloc_etherdev(), so there is no need to call free_netdev() explicitly or there will be a double free(). Simplify all error handling paths accordingly. Fixes: `d51b6ce441` ("net: ethernet: add ag71xx driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-28 12:16:34 +00:00
Zekun Shen	5f50153288	atlantic: Fix buff_ring OOB in aq_ring_rx_clean The function obtain the next buffer without boundary check. We should return with I/O error code. The bug is found by fuzzing and the crash report is attached. It is an OOB bug although reported as use-after-free. [ 4.804724] BUG: KASAN: use-after-free in aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.805661] Read of size 4 at addr ffff888034fe93a8 by task ksoftirqd/0/9 [ 4.806505] [ 4.806703] CPU: 0 PID: 9 Comm: ksoftirqd/0 Tainted: G W 5.6.0 #34 [ 4.809030] Call Trace: [ 4.809343] dump_stack+0x76/0xa0 [ 4.809755] print_address_description.constprop.0+0x16/0x200 [ 4.810455] ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.811234] ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.813183] __kasan_report.cold+0x37/0x7c [ 4.813715] ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.814393] kasan_report+0xe/0x20 [ 4.814837] aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.815499] ? hw_atl_b0_hw_ring_rx_receive+0x9a5/0xb90 [atlantic] [ 4.816290] aq_vec_poll+0x179/0x5d0 [atlantic] [ 4.816870] ? _GLOBAL__sub_I_65535_1_aq_pci_func_init+0x20/0x20 [atlantic] [ 4.817746] ? __next_timer_interrupt+0xba/0xf0 [ 4.818322] net_rx_action+0x363/0xbd0 [ 4.818803] ? call_timer_fn+0x240/0x240 [ 4.819302] ? __switch_to_asm+0x40/0x70 [ 4.819809] ? napi_busy_loop+0x520/0x520 [ 4.820324] __do_softirq+0x18c/0x634 [ 4.820797] ? takeover_tasklets+0x5f0/0x5f0 [ 4.821343] run_ksoftirqd+0x15/0x20 [ 4.821804] smpboot_thread_fn+0x2f1/0x6b0 [ 4.822331] ? smpboot_unregister_percpu_thread+0x160/0x160 [ 4.823041] ? __kthread_parkme+0x80/0x100 [ 4.823571] ? smpboot_unregister_percpu_thread+0x160/0x160 [ 4.824301] kthread+0x2b5/0x3b0 [ 4.824723] ? kthread_create_on_node+0xd0/0xd0 [ 4.825304] ret_from_fork+0x35/0x40 Signed-off-by: Zekun Shen <bruceshenzk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 14:49:53 +00:00
Jakub Kicinski	6f6f0ac664	Merge tag 'mlx5-fixes-2021-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-12-22 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2021-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()' net/mlx5e: Delete forward rule for ct or sample action net/mlx5e: Fix ICOSQ recovery flow for XSK net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled net/mlx5e: Wrap the tx reporter dump callback to extract the sq net/mlx5: Fix tc max supported prio for nic mode net/mlx5: Fix SF health recovery flow net/mlx5: Fix error print in case of IRQ request failed net/mlx5: Use first online CPU instead of hard coded CPU net/mlx5: DR, Fix querying eswitch manager vport for ECPF net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources ==================== Link: https://lore.kernel.org/r/20211223190441.153012-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-23 19:04:33 -08:00
Nobuhiro Iwamatsu	391e5975c0	net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5M ETHER_CLK_SEL_FREQ_SEL_2P5M is not 0 bit of the register. This is a value, which is 0. Fix from BIT(0) to 0. Reported-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp> Fixes: `b38dd98ff8` ("net: stmmac: Add Toshiba Visconti SoCs glue driver") Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Link: https://lore.kernel.org/r/20211223073633.101306-1-nobuhiro1.iwamatsu@toshiba.co.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-23 09:58:13 -08:00
Xiaoliang Yang	eccffcf465	net: stmmac: ptp: fix potentially overflowing expression Convert the u32 variable to type u64 in a context where expression of type u64 is required to avoid potential overflow. Fixes: `e9e3720002` ("net: stmmac: ptp: update tas basetime after ptp adjust") Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Link: https://lore.kernel.org/r/20211223073928.37371-1-xiaoliang.yang_1@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-23 09:45:55 -08:00
Christophe JAILLET	4390c6edc0	net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()' All the error handling paths of 'mlx5e_tc_add_fdb_flow()' end to 'err_out' where 'flow_flag_set(flow, FAILED);' is called. All but the new error handling paths added by the commits given in the Fixes tag below. Fix these error handling paths and branch to 'err_out'. Fixes: `166f431ec6` ("net/mlx5e: Add indirect tc offload of ovs internal port") Fixes: `b16eb3c81f` ("net/mlx5: Support internal port as decap route device") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> (cherry picked from commit `31108d142f`)	2021-12-22 20:38:49 -08:00
Chris Mi	2820110d94	net/mlx5e: Delete forward rule for ct or sample action When there is ct or sample action, the ct or sample rule will be deleted and return. But if there is an extra mirror action, the forward rule can't be deleted because of the return. Fix it by removing the return. Fixes: `69e2916ebc` ("net/mlx5: CT: Add support for mirroring") Fixes: `f94d6389f6` ("net/mlx5e: TC, Add support to offload sample action") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:49 -08:00
Maxim Mikityanskiy	19c4aba2d4	net/mlx5e: Fix ICOSQ recovery flow for XSK There are two ICOSQs per channel: one is needed for RX, and the other for async operations (XSK TX, kTLS offload). Currently, the recovery flow for both is the same, and async ICOSQ is mistakenly treated like the regular ICOSQ. This patch prevents running the regular ICOSQ recovery on async ICOSQ. The purpose of async ICOSQ is to handle XSK wakeup requests and post kTLS offload RX parameters, it has nothing to do with RQ and XSKRQ UMRs, so the regular recovery sequence is not applicable here. Fixes: `be5323c837` ("net/mlx5e: Report and recover from CQE error on ICOSQ") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:48 -08:00
Maxim Mikityanskiy	17958d7cd7	net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow Both regular RQ and XSKRQ use the same ICOSQ for UMRs. When doing recovery for the ICOSQ, don't forget to deactivate XSKRQ. XSK can be opened and closed while channels are active, so a new mutex prevents the ICOSQ recovery from running at the same time. The ICOSQ recovery deactivates and reactivates XSKRQ, so any parallel change in XSK state would break consistency. As the regular RQ is running, it's not enough to just flush the recovery work, because it can be rescheduled. Fixes: `be5323c837` ("net/mlx5e: Report and recover from CQE error on ICOSQ") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:48 -08:00
Gal Pressman	a0cb909644	net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled When TC classifier action offloads are disabled (CONFIG_MLX5_CLS_ACT in Kconfig), the mlx5e_rep_tc_receive() function which is responsible for passing the skb to the stack (or freeing it) is defined as a nop, and results in leaking the skb memory. Replace the nop with a call to napi_gro_receive() to resolve the leak. Fixes: `28e7606fa8` ("net/mlx5e: Refactor rx handler of represetor device") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:48 -08:00
Amir Tzin	918fc3855a	net/mlx5e: Wrap the tx reporter dump callback to extract the sq Function mlx5e_tx_reporter_dump_sq() casts its void * argument to struct mlx5e_txqsq , but in TX-timeout-recovery flow the argument is actually of type struct mlx5e_tx_timeout_ctx . mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000 BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae) kernel stack overflow (page fault): 0000 [#1] SMP NOPTI CPU: 5 PID: 95 Comm: kworker/u20:1 Tainted: G W OE 5.13.0_mlnx #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core] RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180 [mlx5_core] Call Trace: mlx5e_tx_reporter_dump+0x43/0x1c0 [mlx5_core] devlink_health_do_dump.part.91+0x71/0xd0 devlink_health_report+0x157/0x1b0 mlx5e_reporter_tx_timeout+0xb9/0xf0 [mlx5_core] ? mlx5e_tx_reporter_err_cqe_recover+0x1d0/0x1d0 [mlx5_core] ? mlx5e_health_queue_dump+0xd0/0xd0 [mlx5_core] ? update_load_avg+0x19b/0x550 ? set_next_entity+0x72/0x80 ? pick_next_task_fair+0x227/0x340 ? finish_task_switch+0xa2/0x280 mlx5e_tx_timeout_work+0x83/0xb0 [mlx5_core] process_one_work+0x1de/0x3a0 worker_thread+0x2d/0x3c0 ? process_one_work+0x3a0/0x3a0 kthread+0x115/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x1f/0x30 --[ end trace 51ccabea504edaff ]--- RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180 PKRU: 55555554 Kernel panic - not syncing: Fatal exception Kernel Offset: disabled end Kernel panic - not syncing: Fatal exception To fix this bug add a wrapper for mlx5e_tx_reporter_dump_sq() which extracts the sq from struct mlx5e_tx_timeout_ctx and set it as the TX-timeout-recovery flow dump callback. Fixes: `5f29458b77` ("net/mlx5e: Support dump callback in TX reporter") Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Amir Tzin <amirtz@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:48 -08:00
Chris Mi	d671e109bd	net/mlx5: Fix tc max supported prio for nic mode Only prio 1 is supported if firmware doesn't support ignore flow level for nic mode. The offending commit removed the check wrongly. Add it back. Fixes: `9a99c8f125` ("net/mlx5e: E-Switch, Offload all chain 0 priorities when modify header and forward action is not supported") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:47 -08:00
Moshe Shemesh	33de865f7b	net/mlx5: Fix SF health recovery flow SF do not directly control the PCI device. During recovery flow SF should not be allowed to do pci disable or pci reset, its PF will do it. It fixes the following kernel trace: mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:387:(pid 40948): starting health recovery flow mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called mlx5_core 0000:03:00.0: wait vital counter value 0xab175 after 1 iterations mlx5_core.sf mlx5_core.sf.25: firmware version: 24.32.532 mlx5_core.sf mlx5_core.sf.23: mlx5_health_try_recover:387:(pid 40946): starting health recovery flow mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called mlx5_core 0000:03:00.0: wait vital counter value 0xab193 after 1 iterations mlx5_core.sf mlx5_core.sf.23: firmware version: 24.32.532 mlx5_core.sf mlx5_core.sf.25: mlx5_cmd_check:813:(pid 40948): ENABLE_HCA(0x104) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x658908) mlx5_core.sf mlx5_core.sf.25: mlx5_function_setup:1292:(pid 40948): enable hca failed mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:389:(pid 40948): health recovery failed Fixes: `1958fc2f07` ("net/mlx5: SF, Add auxiliary device driver") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:47 -08:00
Shay Drory	aa968f9220	net/mlx5: Fix error print in case of IRQ request failed In case IRQ layer failed to find or to request irq, the driver is printing the first cpu of the provided affinity as part of the error print. Empty affinity is a valid input for the IRQ layer, and it is an error to call cpumask_first() on empty affinity. Remove the first cpu print from the error message. Fixes: `c36326d38d` ("net/mlx5: Round-Robin EQs over IRQs") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:47 -08:00
Shay Drory	26a7993c93	net/mlx5: Use first online CPU instead of hard coded CPU Hard coded CPU (0 in our case) might be offline. Hence, use the first online CPU instead. Fixes: `f891b7cdbd` ("net/mlx5: Enable single IRQ for PCI Function") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:46 -08:00
Yevgeny Kliteynik	624bf42c2e	net/mlx5: DR, Fix querying eswitch manager vport for ECPF On BlueField the E-Switch manager is the ECPF (vport 0xFFFE), but when querying capabilities of ECPF eswitch manager, need to query vport 0 with other_vport = 0. Fixes: `9091b821aa` ("net/mlx5: DR, Handle eswitch manager and uplink vports separately") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:46 -08:00
Miaoqian Lin	6b8b425858	net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources The mlx5_get_uars_page() function returns error pointers. Using IS_ERR() to check the return value to fix this. Fixes: `4ec9e7b026` ("net/mlx5: DR, Expose steering domain functionality") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:46 -08:00
Jiasheng Jiang	9b8bdd1eb5	sfc: falcon: Check null pointer of rx_queue->page_ring Because of the possible failure of the kcalloc, it should be better to set rx_queue->page_ptr_mask to 0 when it happens in order to maintain the consistency. Fixes: `5a6681e22c` ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20211220140344.978408-1-jiasheng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-22 12:25:18 -08:00
Jiasheng Jiang	bdf1b5c388	sfc: Check null pointer of rx_queue->page_ring Because of the possible failure of the kcalloc, it should be better to set rx_queue->page_ptr_mask to 0 when it happens in order to maintain the consistency. Fixes: `5a6681e22c` ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20211220135603.954944-1-jiasheng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-22 12:23:18 -08:00
Jiasheng Jiang	99d7fbb5ce	net: ks8851: Check for error irq Because platform_get_irq() could fail and return error irq. Therefore, it might be better to check it if order to avoid the use of error irq. Fixes: `797047f875` ("net: ks8851: Implement Parallel bus operations") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-22 10:23:50 +00:00
Jiasheng Jiang	cb93b3e11d	drivers: net: smc911x: Check for error irq Because platform_get_irq() could fail and return error irq. Therefore, it might be better to check it if order to avoid the use of error irq. Fixes: `ae150435b5` ("smsc: Move the SMC (SMSC) drivers") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-22 10:23:03 +00:00
Heiner Kallweit	ac8c58f5b5	igb: fix deadlock caused by taking RTNL in RPM resume path Recent net core changes caused an issue with few Intel drivers (reportedly igb), where taking RTNL in RPM resume path results in a deadlock. See [0] for a bug report. I don't think the core changes are wrong, but taking RTNL in RPM resume path isn't needed. The Intel drivers are the only ones doing this. See [1] for a discussion on the issue. Following patch changes the RPM resume path to not take RTNL. [0] https://bugzilla.kernel.org/show_bug.cgi?id=215129 [1] https://lore.kernel.org/netdev/20211125074949.5f897431@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/t/ Fixes: `bd869245a3` ("net: core: try to runtime-resume detached device in __dev_open") Fixes: `f32a213765` ("ethtool: runtime-resume netdev parent before ethtool ioctl ops") Tested-by: Martin Stolpe <martin.stolpe@gmail.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20211220201844.2714498-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-20 18:48:09 -08:00
Jeroen de Borst	1f06f7d97f	gve: Correct order of processing device options The legacy raw addressing device option was processed before the new RDA queue format option. This caused the supported features mask, which is provided only on the RDA queue format option, not to be set. This disabled jumbo-frame support when using raw adressing. Fixes: `255489f5b3` ("gve: Add a jumbo-frame device option") Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://lore.kernel.org/r/20211220192746.2900594-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-20 18:47:56 -08:00
Jiasheng Jiang	60ec7fcfe7	qlcnic: potential dereference null pointer of rx_queue->page_ring The return value of kcalloc() needs to be checked. To avoid dereference of null pointer in case of the failure of alloc. Therefore, it might be better to change the return type of qlcnic_sriov_alloc_vlans() and return -ENOMEM when alloc fails and return 0 the others. Also, qlcnic_sriov_set_guest_vlan_mode() and __qlcnic_pci_sriov_enable() should deal with the return value of qlcnic_sriov_alloc_vlans(). Fixes: `154d0c810c` ("qlcnic: VLAN enhancement for 84XX adapters") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-18 12:37:12 +00:00
David S. Miller	aa3cc8a9e4	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-12-17 Maciej Fijalkowski says: It seems that previous [0] Rx fix was not enough and there are still issues with AF_XDP Rx ZC support in ice driver. Elza reported that for multiple XSK sockets configured on a single netdev, some of them were becoming dead after a while. We have spotted more things that needed to be addressed this time. More of information can be found in particular commit messages. It also carries Alexandr's patch that was sent previously which was overlapping with this set. [0]: https://lore.kernel.org/bpf/20211129231746.2767739-1-anthony.l.nguyen@intel.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-18 12:19:15 +00:00
Yevhen Orlov	2efc2256fe	net: marvell: prestera: fix incorrect structure access In line: upper = info->upper_dev; We access upper_dev field, which is related only for particular events (e.g. event == NETDEV_CHANGEUPPER). So, this line cause invalid memory access for another events, when ptr is not netdev_notifier_changeupper_info. The KASAN logs are as follows: [ 30.123165] BUG: KASAN: stack-out-of-bounds in prestera_netdev_port_event.constprop.0+0x68/0x538 [prestera] [ 30.133336] Read of size 8 at addr ffff80000cf772b0 by task udevd/778 [ 30.139866] [ 30.141398] CPU: 0 PID: 778 Comm: udevd Not tainted 5.16.0-rc3 #6 [ 30.147588] Hardware name: DNI AmazonGo1 A7040 board (DT) [ 30.153056] Call trace: [ 30.155547] dump_backtrace+0x0/0x2c0 [ 30.159320] show_stack+0x18/0x30 [ 30.162729] dump_stack_lvl+0x68/0x84 [ 30.166491] print_address_description.constprop.0+0x74/0x2b8 [ 30.172346] kasan_report+0x1e8/0x250 [ 30.176102] __asan_load8+0x98/0xe0 [ 30.179682] prestera_netdev_port_event.constprop.0+0x68/0x538 [prestera] [ 30.186847] prestera_netdev_event_handler+0x1b4/0x1c0 [prestera] [ 30.193313] raw_notifier_call_chain+0x74/0xa0 [ 30.197860] call_netdevice_notifiers_info+0x68/0xc0 [ 30.202924] register_netdevice+0x3cc/0x760 [ 30.207190] register_netdev+0x24/0x50 [ 30.211015] prestera_device_register+0x8a0/0xba0 [prestera] Fixes: `3d5048cc54` ("net: marvell: prestera: move netdev topology validation to prestera_main") Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Link: https://lore.kernel.org/r/20211216171714.11341-1-yevhen.orlov@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-17 19:19:09 -08:00
Yevhen Orlov	8b681bd7c3	net: marvell: prestera: fix incorrect return of port_find In case, when some ports is in list and we don't find requested - we return last iterator state and not return NULL as expected. Fixes: `501ef3066c` ("net: marvell: prestera: Add driver for Prestera family ASIC devices") Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Link: https://lore.kernel.org/r/20211216170736.8851-1-yevhen.orlov@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-17 19:19:00 -08:00
Aleksander Jan Bajkowski	1488fc2045	net: lantiq_xrx200: increase buffer reservation If the user sets a lower mtu on the CPU port than on the switch, then DMA inserts a few more bytes into the buffer than expected. In the worst case, it may exceed the size of the buffer. The experiments showed that the buffer should be a multiple of the burst length value. This patch rounds the length of the rx buffer upwards and fixes this bug. The reservation of FCS space in the buffer has been removed as PMAC strips the FCS. Fixes: `998ac35801` ("net: lantiq: add support for jumbo frames") Reported-by: Thomas Nixon <tom@tomn.co.uk> Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-17 19:18:29 -08:00
Maciej Fijalkowski	dcbaf72aa4	ice: xsk: fix cleaned_count setting Currently cleaned_count is initialized to ICE_DESC_UNUSED(rx_ring) and later on during the Rx processing it is incremented per each frame that driver consumed. This can result in excessive buffers requested from xsk pool based on that value. To address this, just drop cleaned_count and pass ICE_DESC_UNUSED(rx_ring) directly as a function argument to ice_alloc_rx_bufs_zc(). Idea is to ask for buffers as many as consumed. Let us also call ice_alloc_rx_bufs_zc unconditionally at the end of ice_clean_rx_irq_zc. This has been changed in that way for corresponding ice_clean_rx_irq, but not here. Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:18:21 -08:00
Maciej Fijalkowski	8bea15ab74	ice: xsk: allow empty Rx descriptors on XSK ZC data path Commit `ac6f733a7b` ("ice: allow empty Rx descriptors") stated that ice HW can produce empty descriptors that are valid and they should be processed. Add this support to xsk ZC path to avoid potential processing problems. Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:16:55 -08:00
Maciej Fijalkowski	8b51a13c37	ice: xsk: do not clear status_error0 for ntu + nb_buffs descriptor The descriptor that ntu is pointing at when we exit ice_alloc_rx_bufs_zc() should not have its corresponding DD bit cleared as descriptor is not allocated in there and it is not valid for HW usage. The allocation routine at the entry will fill the descriptor that ntu points to after it was set to ntu + nb_buffs on previous call. Even the spec says: "The tail pointer should be set to one descriptor beyond the last empty descriptor in host descriptor ring." Therefore, step away from clearing the status_error0 on ntu + nb_buffs descriptor. Fixes: `db804cfc21` ("ice: Use the xsk batched rx allocation interface") Reported-by: Elza Mathew <elza.mathew@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:15:15 -08:00
Alexander Lobakin	0708b6facb	ice: remove dead store on XSK hotpath The 'if (ntu == rx_ring->count)' block in ice_alloc_rx_buffers_zc() was previously residing in the loop, but after introducing the batched interface it is used only to wrap-around the NTU descriptor, thus no more need to assign 'xdp'. Fixes: `db804cfc21` ("ice: Use the xsk batched rx allocation interface") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:12:05 -08:00
Maciej Fijalkowski	617f3e1b58	ice: xsk: allocate separate memory for XDP SW ring Currently, the zero-copy data path is reusing the memory region that was initially allocated for an array of struct ice_rx_buf for its own purposes. This is error prone as it is based on the ice_rx_buf struct always being the same size or bigger than what the zero-copy path needs. There can also be old values present in that array giving rise to errors when the zero-copy path uses it. Fix this by freeing the ice_rx_buf region and allocating a new array for the zero-copy path that has the right length and is initialized to zero. Fixes: `57f7f8b6bc` ("ice: Use xdp_buf instead of rx_buf for xsk zero-copy") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:09:04 -08:00
Maciej Fijalkowski	afe8a3ba85	ice: xsk: return xsk buffers back to pool when cleaning the ring Currently we only NULL the xdp_buff pointer in the internal SW ring but we never give it back to the xsk buffer pool. This means that buffers can be leaked out of the buff pool and never be used again. Add missing xsk_buff_free() call to the routine that is supposed to clean the entries that are left in the ring so that these buffers in the umem can be used by other sockets. Also, only go through the space that is actually left to be cleaned instead of a whole ring. Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-17 11:07:04 -08:00
Florian Fainelli	8b8e6e7824	net: systemport: Add global locking for descriptor lifecycle The descriptor list is a shared resource across all of the transmit queues, and the locking mechanism used today only protects concurrency across a given transmit queue between the transmit and reclaiming. This creates an opportunity for the SYSTEMPORT hardware to work on corrupted descriptors if we have multiple producers at once which is the case when using multiple transmit queues. This was particularly noticeable when using multiple flows/transmit queues and it showed up in interesting ways in that UDP packets would get a correct UDP header checksum being calculated over an incorrect packet length. Similarly TCP packets would get an equally correct checksum computed by the hardware over an incorrect packet length. The SYSTEMPORT hardware maintains an internal descriptor list that it re-arranges when the driver produces a new descriptor anytime it writes to the WRITE_PORT_{HI,LO} registers, there is however some delay in the hardware to re-organize its descriptors and it is possible that concurrent TX queues eventually break this internal allocation scheme to the point where the length/status part of the descriptor gets used for an incorrect data buffer. The fix is to impose a global serialization for all TX queues in the short section where we are writing to the WRITE_PORT_{HI,LO} registers which solves the corruption even with multiple concurrent TX queues being used. Fixes: `80105befdb` ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20211215202450.4086240-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-16 08:15:31 -08:00
Jiasheng Jiang	407ecd1bd7	sfc_ef100: potential dereference of null pointer The return value of kmalloc() needs to be checked. To avoid use in efx_nic_update_stats() in case of the failure of alloc. Fixes: `b593b6f1b4` ("sfc_ef100: statistics gathering") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-16 10:55:32 +00:00
John Keeping	0546b224cc	net: stmmac: dwmac-rk: fix oob read in rk_gmac_setup KASAN reports an out-of-bounds read in rk_gmac_setup on the line: while (ops->regs[i]) { This happens for most platforms since the regs flexible array member is empty, so the memory after the ops structure is being read here. It seems that mostly this happens to contain zero anyway, so we get lucky and everything still works. To avoid adding redundant data to nearly all the ops structures, add a new flag to indicate whether the regs field is valid and avoid this loop when it is not. Fixes: `3bb3d6b1c1` ("net: stmmac: Add RK3566/RK3568 SoC support") Signed-off-by: John Keeping <john@metanate.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-16 10:47:48 +00:00
David S. Miller	6209dd778f	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-12-15 This series contains updates to igb, igbvf, igc and ixgbe drivers. Karen moves checks for invalid VF MAC filters to occur earlier for igb. Letu Ren fixes a double free issue in igbvf probe. Sasha fixes incorrect min value being used when calculating for max for igc. Robert Schlabbach adds documentation on enabling NBASE-T support for ixgbe. Cyril Novikov adds missing initialization of MDIO bus speed for ixgbe. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-16 10:27:12 +00:00
Ioana Ciornei	972ce7e380	dpaa2-eth: fix ethtool statistics Unfortunately, with the blamed commit I also added a side effect in the ethtool stats shown. Because I added two more fields in the per channel structure without verifying if its size is used in any way, part of the ethtool statistics were off by 2. Fix this by not looking up the size of the structure but instead on a fixed value kept in a macro. Fixes: `fc398bec03` ("net: dpaa2: add adaptive interrupt coalescing") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20211215105831.290070-1-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-15 17:48:54 -08:00
Cyril Novikov	bf0a375055	ixgbe: set X550 MDIO speed before talking to PHY The MDIO bus speed must be initialized before talking to the PHY the first time in order to avoid talking to it using a speed that the PHY doesn't support. This fixes HW initialization error -17 (IXGBE_ERR_PHY_ADDR_INVALID) on Denverton CPUs (a.k.a. the Atom C3000 family) on ports with a 10Gb network plugged in. On those devices, HLREG0[MDCSPD] resets to 1, which combined with the 10Gb network results in a 24MHz MDIO speed, which is apparently too fast for the connected PHY. PHY register reads over MDIO bus return garbage, leading to initialization failure. Reproduced with Linux kernel 4.19 and 5.15-rc7. Can be reproduced using the following setup: * Use an Atom C3000 family system with at least one X552 LAN on the SoC * Disable PXE or other BIOS network initialization if possible (the interface must not be initialized before Linux boots) * Connect a live 10Gb Ethernet cable to an X550 port * Power cycle (not reset, doesn't always work) the system and boot Linux * Observe: ixgbe interfaces w/ 10GbE cables plugged in fail with error -17 Fixes: `e84db72727` ("ixgbe: Introduce function to control MDIO speed") Signed-off-by: Cyril Novikov <cnovikov@lynx.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 11:24:39 -08:00
Robert Schlabbach	271225fd57	ixgbe: Document how to enable NBASE-T support Commit `a296d665ea` ("ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support") introduced suppression of the advertisement of NBASE-T speeds by default, according to Todd Fujinaka to accommodate customers with network switches which could not cope with advertised NBASE-T speeds, as posted in the E1000-devel mailing list: https://sourceforge.net/p/e1000/mailman/message/37106269/ However, the suppression was not documented at all, nor was how to enable NBASE-T support. Properly document the NBASE-T suppression and how to enable NBASE-T support. Fixes: `a296d665ea` ("ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support") Reported-by: Robert Schlabbach <robert_s@gmx.net> Signed-off-by: Robert Schlabbach <robert_s@gmx.net> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 11:09:29 -08:00
Sasha Neftin	0182d1f3fa	igc: Fix typo in i225 LTR functions The LTR maximum value was incorrectly written using the scale from the LTR minimum value. This would cause incorrect values to be sent, in cases where the initial calculation lead to different min/max scales. Fixes: `707abf0695` ("igc: Add initial LTR support") Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-15 11:09:29 -08:00

1 2 3 4 5 ...

40342 Commits