Commit Graph

1140889 Commits

Author SHA1 Message Date
Sascha Hauer
1d89660494 wifi: rtw88: print firmware type in info message
It's confusing to read two different firmware versions in the syslog
for the same device:

rtw_8822cu 2-1:1.2: Firmware version 9.9.4, H2C version 15
rtw_8822cu 2-1:1.2: Firmware version 9.9.11, H2C version 15

Print the firmware type in this message to make clear these are really
two different firmwares for different purposes:

rtw_8822cu 1-1.4:1.2: WOW Firmware version 9.9.4, H2C version 15
rtw_8822cu 1-1.4:1.2: Firmware version 9.9.11, H2C version 15

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221202081224.2779981-2-s.hauer@pengutronix.de
2022-12-08 16:48:40 +02:00
Po-Hao Huang
a0e78d5c60 wifi: rtw89: add join info upon create interface
To support multiple vifs, fw need more information of each role.
Send this info to make things work as expected.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221202061527.505668-5-pkshih@realtek.com
2022-12-08 16:47:59 +02:00
Po-Hao Huang
8fc5d43386 wifi: rtw89: fix unsuccessful interface_add flow
Remove according vifs from list if we couldn't set this interface up.
Otherwise the rtwvif_list could contain unreferenced objects.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221202061527.505668-4-pkshih@realtek.com
2022-12-08 16:47:58 +02:00
Po-Hao Huang
d592b9f742 wifi: rtw89: stop mac port function when stop_ap()
Disable hardware beacon related functions when ap stops. So hardware won't
transmit beacons while interface is already removed.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221202061527.505668-3-pkshih@realtek.com
2022-12-08 16:47:58 +02:00
Po-Hao Huang
fb2b8cec81 wifi: rtw89: add mac TSF sync function
If the interface is in AP/P2P GO mode, we adjust the TSF with random
offset to avoid TBTT of different vifs to overlap and collide.
For every new interface added, we adjust the value and resync for all
interfaces.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221202061527.505668-2-pkshih@realtek.com
2022-12-08 16:47:58 +02:00
Zong-Zhe Yang
13eb07e0be wifi: rtw89: request full firmware only once if it's early requested
Under some condition, we now have to do early request full firmware when
rtw89_early_fw_feature_recognize(). In this case, we can avoid requesting
full firmware twice during probing driver. So, we pass out full firmware
from rtw89_early_fw_feature_recognize() if it's requested successfully.
And then, if firmware is settled, we have no need to request full firmware
again during normal initizating flow.

Setting firmware flow is updated to be as the following.

 platform	| early recognizing 	| normally initizating
-----------------------------------------------------------------------
 deny reading 	| request full FW	| (no more FW requesting)
 partial file 	|			| (obtain FW from early pahse)
-----------------------------------------------------------------------
 able to read	| request partial FW	| async request full FW
 partial file	| (quite small chunk)	|

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221202060521.501512-3-pkshih@realtek.com
2022-12-08 16:47:29 +02:00
Zong-Zhe Yang
3ddfe3bdd3 wifi: rtw89: don't request partial firmware if SECURITY_LOADPIN_ENFORCE
Kernel logs on platform enabling SECURITY_LOADPIN_ENFORCE
------
```
LoadPin: firmware old-api-denied obj=<unknown> pid=810 cmdline="modprobe -q -- rtw89_8852ce"
rtw89_8852ce 0000:01:00.0: loading /lib/firmware/rtw89/rtw8852c_fw.bin failed with error -1
rtw89_8852ce 0000:01:00.0: Direct firmware load for rtw89/rtw8852c_fw.bin failed with error -1
rtw89_8852ce 0000:01:00.0: failed to early request firmware: -1
```

Trace
------
```
request_partial_firmware_into_buf()
> _request_firmware()
>> fw_get_filesystem_firmware()
>>> kernel_read_file_from_path_initns()
>>>> kernel_read_file()
>>>>> security_kernel_read_file()
// It will iterate enabled LSMs' hooks for kernel_read_file.
// With loadpin, it hooks loadpin_read_file.
```

If SECURITY_LOADPIN_ENFORCE is enabled, doing kernel_read_file() on partial
files will be denied and return -EPERM (-1). Then, the outer API based on it,
e.g. request_partial_firmware_into_buf(), will get the error.

In the case, we cannot get the firmware stuffs right, even though there might
be no error other than a permission issue on reading a partial file. So we have
to request full firmware if SECURITY_LOADPIN_ENFORCE is enabled. It makes us
still have a chance to do early firmware work on this kind of platforms.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221202060521.501512-2-pkshih@realtek.com
2022-12-08 16:47:29 +02:00
Wang Yufen
c2f2924bc7 wifi: brcmfmac: Fix error return code in brcmf_sdio_download_firmware()
Fix to return a negative error code instead of 0 when
brcmf_chip_set_active() fails. In addition, change the return
value for brcmf_pcie_exit_download_state() to keep consistent.

Fixes: d380ebc9b6 ("brcmfmac: rename chip download functions")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Reviewed-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/1669959342-27144-1-git-send-email-wangyufen@huawei.com
2022-12-08 16:46:32 +02:00
Bitterblue Smith
7de16123d9 wifi: rtl8xxxu: Introduce rtl8xxxu_update_ra_report
The ra_report struct is used for reporting the TX rate via
sta_statistics. The code which fills it out is duplicated in two
places, and the RTL8188EU will need it in a third place. Move this
code into a new function rtl8xxxu_update_ra_report.

Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/0777ad35-fe03-473c-2e02-e3390bef5dd0@gmail.com
2022-12-08 16:45:16 +02:00
Bitterblue Smith
76c16af2cb wifi: rtl8xxxu: Fix the channel width reporting
The gen 2 chips RTL8192EU and RTL8188FU periodically send the driver
reports about the TX rate, and the driver passes these reports to
sta_statistics. The reports from RTL8192EU may or may not include the
channel width. The reports from RTL8188FU do not include it.

Only access the c2h->ra_report.bw field if the report (skb) is big
enough.

The other problem fixed here is that the code was actually never
changing the channel width initially reported by
rtl8xxxu_bss_info_changed because the value of RATE_INFO_BW_20 is 0.

Fixes: 0985d3a410 ("rtl8xxxu: Feed current txrate information for mac80211")
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/5b41f1ae-72e7-6b7a-2459-b736399a1c40@gmail.com
2022-12-08 16:45:16 +02:00
Bitterblue Smith
dd469a754a wifi: rtl8xxxu: Add __packed to struct rtl8723bu_c2h
This struct is used to access a sequence of bytes received from the
wifi chip. It must not have any padding bytes between the members.

This doesn't change anything on my system, possibly because currently
none of the members need more than byte alignment.

Fixes: b2b43b7837 ("rtl8xxxu: Initial functionality to handle C2H events for 8723bu")
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/1a270918-da22-ff5f-29fc-7855f740c5ba@gmail.com
2022-12-08 16:45:15 +02:00
Arend van Spriel
8041f2bffb wifi: brcmfmac: introduce BRCMFMAC exported symbols namespace
Using a namespace variant to make clear it is only intended to be used
by the vendor-specific modules. The symbol will only truly export the
symbols when the driver and consequently the vendor-specific part are
built as kernel modules.

Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221129135446.151065-8-arend.vanspriel@broadcom.com
2022-12-08 16:44:08 +02:00
Arend van Spriel
7205f9f2fc wifi: brcmfmac: add vendor name in revinfo debugfs file
Upon probe the driver determines the vendor supporting the device.
Expose this information in the revinfo debugfs file.

Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221129135446.151065-7-arend.vanspriel@broadcom.com
2022-12-08 16:44:08 +02:00
Arend van Spriel
b1d94be570 wifi: brcmfmac: add support Broadcom BCA firmware api
Broadcom BCA division develops its own firmware api and as such will
likely diverge over time (or already has). Add support for handling
this.

Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221129135446.151065-6-arend.vanspriel@broadcom.com
2022-12-08 16:44:07 +02:00
Arend van Spriel
f74f1ec22d wifi: brcmfmac: add support for Cypress firmware api
Cypress uses the brcmfmac driver and releases firmware which will
likely diverge over time (or already has). So adding support for
handling that.

Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221129135446.151065-5-arend.vanspriel@broadcom.com
2022-12-08 16:44:07 +02:00
Arend van Spriel
d6a5c56221 wifi: brcmfmac: add support for vendor-specific firmware api
The driver is being used by multiple vendors who develop the firmware
api independently. So far the firmware api as used by the driver has
not diverged (yet). This change adds framework for supporting multiple
firmware apis. The vendor-specific support code has to provide a number
of callback operations. Right now it is only attach and detach callbacks
so no real functionality as the api is still common. This code only
adds WCC variant anyway, which is selected for all devices right now.
The vendor-specific part will be built in a separate module when the
driver is configured to be built as a module through Kconfig, ie. when
CONFIG_BRCMFMAC=m.

Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221129135446.151065-4-arend.vanspriel@broadcom.com
2022-12-08 16:44:07 +02:00
Arend van Spriel
da6d9c8ecd wifi: brcmfmac: add firmware vendor info in driver info
In order to determine the vendor that released a firmware image for
a specific device, the device table now sets the vendor identifier
in driver info and it is stored in struct brcmf_bus::fwvid during
probe.

Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221129135446.151065-3-arend.vanspriel@broadcom.com
2022-12-08 16:44:07 +02:00
Arend van Spriel
76821aad49 wifi: brcmfmac: add function to unbind device to bus layer api
Introduce a new bus callback .remove() which will unbind the device
from the driver. This allows the common driver layer to stop handling
a device.

Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221129135446.151065-2-arend.vanspriel@broadcom.com
2022-12-08 16:44:06 +02:00
Jiapeng Chong
5107778d00 wifi: ipw2x00: Remove some unused functions
Functions write_nic_auto_inc_address() and write_nic_dword_auto_inc() are
defined in the ipw2100.c file, but not called elsewhere, so remove these
unused functions.

drivers/net/wireless/intel/ipw2x00/ipw2100.c:427:20: warning: unused function 'write_nic_dword_auto_inc'.

Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3285
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221129062407.83157-1-jiapeng.chong@linux.alibaba.com
2022-12-08 16:42:44 +02:00
Leon Romanovsky
37d244ad18 net/mlx5e: Open mlx5 driver to accept IPsec packet offload
Enable configuration of IPsec packet offload through XFRM state add
interface together with moving specific to IPsec packet mode limitations
to specific switch-case section.

Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:10 +01:00
Leon Romanovsky
cee137a634 net/mlx5e: Handle ESN update events
Extend event logic to update ESN state (esn_msb, esn_overlap)
for an IPsec Offload context.

Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:09 +01:00
Leon Romanovsky
8c582ddfbb net/mlx5e: Handle hardware IPsec limits events
Enable object changed event to signal IPsec about hitting
soft and hard limits.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:09 +01:00
Leon Romanovsky
1ed78fc033 net/mlx5e: Update IPsec soft and hard limits
Implement mlx5 IPsec callback to update current lifetime counters.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:08 +01:00
Leon Romanovsky
403b383a3c net/mlx5e: Store all XFRM SAs in Xarray
Instead of performing custom hash calculations, rely on FW that returns
unique identifier to every created SA. That identifier is Xarray ready,
which provides better semantic with efficient access.

In addition, store both TX and RX SAs to allow correlation between event
generated by HW when limits are armed and XFRM states.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:08 +01:00
Leon Romanovsky
7bddb659bd net/mlx5e: Provide intermediate pointer to access IPsec struct
Improve readability by providing direct pointer to struct mlx5e_ipsec.

Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:07 +01:00
Leon Romanovsky
6721239672 net/mlx5e: Skip IPsec encryption for TX path without matching policy
Software implementation of IPsec skips encryption of packets in TX
path if no matching policy is found. So align HW implementation to
this behavior, by requiring matching reqid for offloaded policy and
SA.

Reviewed-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:06 +01:00
Raed Salem
81f8fba5ec net/mlx5e: Add statistics for Rx/Tx IPsec offloaded flows
Add the following statistics:
RX successfully IPsec flows:
ipsec_rx_pkts  : Number of packets passed Rx IPsec flow
ipsec_rx_bytes : Number of bytes passed Rx IPsec flow

Rx dropped IPsec policy packets:
ipsec_rx_drop_pkts: Number of packets dropped in Rx datapath due to IPsec drop policy
ipsec_rx_drop_bytes: Number of bytes dropped in Rx datapath due to IPsec drop policy

TX successfully encrypted and encapsulated IPsec packets:
ipsec_tx_pkts  : Number of packets encrypted and encapsulated successfully
ipsec_tx_bytes : Number of bytes encrypted and encapsulated successfully

Tx dropped IPsec policy packets:
ipsec_tx_drop_pkts: Number of packets dropped in Tx datapath due to IPsec drop policy
ipsec_tx_drop_bytes: Number of bytes dropped in Tx datapath due to IPsec drop policy

The above can be seen using:
ethtool -S <ifc> |grep ipsec

Signed-off-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:06 +01:00
Leon Romanovsky
18f38fd267 net/mlx5e: Improve IPsec flow steering autogroup
Flow steering API separates newly created rules based on their
match criteria. Right now, all IPsec tables are created with one
group and suffers from not-optimal FS performance.

Count number of different match criteria for relevant tables, and
set proper value at the table creation.

Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:05 +01:00
Leon Romanovsky
6b5c45e16e net/mlx5e: Configure IPsec packet offload flow steering
In packet offload mode, the HW is responsible to handle ESP headers,
SPI numbers and trailers (ICV) together with different logic for
RX and TX paths.

In order to support packet offload mode, special logic is added
to flow steering rules.

Reviewed-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:05 +01:00
Leon Romanovsky
9af594d8a9 net/mlx5e: Use same coding pattern for Rx and Tx flows
Remove intermediate variable in favor of having similar coding style
for Rx and Tx add rule functions.

Reviewed-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:04 +01:00
Leon Romanovsky
a5b8ca9471 net/mlx5e: Add XFRM policy offload logic
Implement mlx5 flow steering logic and mlx5 IPsec code support
XFRM policy offload.

Reviewed-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:03 +01:00
Leon Romanovsky
8c17295bd4 net/mlx5e: Create IPsec policy offload tables
Add empty table to be used for IPsec policy offload.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:36:02 +01:00
Steffen Klassert
e8a292d6a7 Merge branch 'mlx5 IPsec packet offload support (Part I)'
Leon Romanovsky says:

============
This series follows previously sent "Extend XFRM core to allow packet
offload configuration" series [1].

It is first part with refactoring to mlx5 allow us natively extend
mlx5 IPsec logic to support both crypto and packet offloads.
============

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-08 10:25:30 +01:00
Yanteng Si
1385313d8b docs/zh_CN: Add LoongArch booting description's translation
Translate ../loongarch/booting.rst into Chinese.

Suggested-by: Xiaotian Wu <wuxiaotian@loongson.cn>
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-08 15:03:14 +08:00
Yanteng Si
38eb496d85 docs/LoongArch: Add booting description
1, Describe the information passed from BootLoader to kernel.
2, Describe the meaning and values of the kernel image header field.

Suggested-by: Xiaotian Wu <wuxiaotian@loongson.cn>
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-08 14:59:15 +08:00
Huacai Chen
b681604eda LoongArch: mm: Fix huge page entry update for virtual machine
In virtual machine (guest mode), the tlbwr instruction can not write the
last entry of MTLB, so we need to make it non-present by invtlb and then
write it by tlbfill. This also simplify the whole logic.

Signed-off-by: Rui Wang <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-08 14:59:15 +08:00
Bibo Mao
143d64bdbd LoongArch: Export symbol for function smp_send_reschedule()
Function smp_send_reschedule() is standard kernel API, which is defined
in header file include/linux/smp.h. However, on LoongArch it is defined
as an inline function, this is confusing and kernel modules can not use
this function.

Now we define smp_send_reschedule() as a general function, and add a
EXPORT_SYMBOL_GPL on this function, so that kernel modules can use it.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-08 14:59:15 +08:00
Jakub Kicinski
d8b879c00f Merge branch 'net-ethernet-ti-am65-cpsw-fix-set-channel-operation'
Roger Quadros says:

====================
net: ethernet: ti: am65-cpsw: Fix set channel operation

This contains a critical bug fix for the recently merged suspend/resume
support [1] that broke set channel operation. (ethtool -L eth0 tx <n>)

As there were 2 dependent patches on top of the offending commit [1]
first revert them and then apply them back after the correct fix.

[1] fd23df72f2 ("net: ethernet: ti: am65-cpsw: Add suspend/resume support")
====================

Link: https://lore.kernel.org/r/20221206094419.19478-1-rogerq@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:17:35 -08:00
Roger Quadros
020b232f79 net: ethernet: ti: am65-cpsw: Fix hardware switch mode on suspend/resume
On low power during system suspend the ALE table context is lost.
Save the ALE context before suspend and restore it after resume.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:17:31 -08:00
Roger Quadros
1581cd8b11 net: ethernet: ti: am65-cpsw: retain PORT_VLAN_REG after suspend/resume
During suspend resume the context of PORT_VLAN_REG is lost so
save it during suspend and restore it during resume for
host port and slave ports.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:17:31 -08:00
Roger Quadros
24bc19b05f net: ethernet: ti: am65-cpsw: Add suspend/resume support
Add PM handlers for System suspend/resume.

As DMA driver doesn't yet support suspend/resume we free up
the DMA channels at suspend and acquire and initialize them
at resume.

In this revised approach we do not free the TX/RX IRQs at
am65_cpsw_nuss_common_stop() as it causes problems.
We will now free them only on .suspend() as we need to release
the DMA channels (as DMA looses context) and re-acquiring
them on .resume() may not necessarily give us the same
IRQs.

To make this easier:
- introduce am65_cpsw_nuss_remove_rx_chns() which is
   similar to am65_cpsw_nuss_remove_tx_chns(). These will
   be invoked in pm.suspend() to release the DMA channels
   and free up the IRQs.
- move napi_add() and request_irq() calls to
   am65_cpsw_nuss_init_rx/tx_chns() so we can invoke them
   in pm.resume() to acquire the DMA channels and IRQs.

As CPTS looses contect during suspend/resume, invoke the
necessary CPTS suspend/resume helpers.

ALE_CLEAR command is issued in cpsw_ale_start() so no need
to issue it before the call to cpsw_ale_start().

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:17:31 -08:00
Roger Quadros
1a014663e7 Revert "net: ethernet: ti: am65-cpsw: Add suspend/resume support"
This reverts commit fd23df72f2.

This commit broke set channel operation. Revert this and
implement it with a different approach in a separate patch.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:17:31 -08:00
Roger Quadros
1bae8fa8c4 Revert "net: ethernet: ti: am65-cpsw: retain PORT_VLAN_REG after suspend/resume"
This reverts commit 643cf0e3ab.

This is to make it easier to revert the offending commit
fd23df72f2 ("net: ethernet: ti: am65-cpsw: Add suspend/resume support")

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:17:31 -08:00
Roger Quadros
1a35259672 Revert "net: ethernet: ti: am65-cpsw: Fix hardware switch mode on suspend/resume"
This reverts commit 1af3cb3702.

This is to make it easier to revert the offending commit
fd23df72f2 ("net: ethernet: ti: am65-cpsw: Add suspend/resume support")

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:17:31 -08:00
Eric Dumazet
803e84867d ipv6: avoid use-after-free in ip6_fragment()
Blamed commit claimed rcu_read_lock() was held by ip6_fragment() callers.

It seems to not be always true, at least for UDP stack.

syzbot reported:

BUG: KASAN: use-after-free in ip6_dst_idev include/net/ip6_fib.h:245 [inline]
BUG: KASAN: use-after-free in ip6_fragment+0x2724/0x2770 net/ipv6/ip6_output.c:951
Read of size 8 at addr ffff88801d403e80 by task syz-executor.3/7618

CPU: 1 PID: 7618 Comm: syz-executor.3 Not tainted 6.1.0-rc6-syzkaller-00012-g4312098baf37 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:284 [inline]
 print_report+0x15e/0x45d mm/kasan/report.c:395
 kasan_report+0xbf/0x1f0 mm/kasan/report.c:495
 ip6_dst_idev include/net/ip6_fib.h:245 [inline]
 ip6_fragment+0x2724/0x2770 net/ipv6/ip6_output.c:951
 __ip6_finish_output net/ipv6/ip6_output.c:193 [inline]
 ip6_finish_output+0x9a3/0x1170 net/ipv6/ip6_output.c:206
 NF_HOOK_COND include/linux/netfilter.h:291 [inline]
 ip6_output+0x1f1/0x540 net/ipv6/ip6_output.c:227
 dst_output include/net/dst.h:445 [inline]
 ip6_local_out+0xb3/0x1a0 net/ipv6/output_core.c:161
 ip6_send_skb+0xbb/0x340 net/ipv6/ip6_output.c:1966
 udp_v6_send_skb+0x82a/0x18a0 net/ipv6/udp.c:1286
 udp_v6_push_pending_frames+0x140/0x200 net/ipv6/udp.c:1313
 udpv6_sendmsg+0x18da/0x2c80 net/ipv6/udp.c:1606
 inet6_sendmsg+0x9d/0xe0 net/ipv6/af_inet6.c:665
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg+0xd3/0x120 net/socket.c:734
 sock_write_iter+0x295/0x3d0 net/socket.c:1108
 call_write_iter include/linux/fs.h:2191 [inline]
 new_sync_write fs/read_write.c:491 [inline]
 vfs_write+0x9ed/0xdd0 fs/read_write.c:584
 ksys_write+0x1ec/0x250 fs/read_write.c:637
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fde3588c0d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fde365b6168 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fde359ac050 RCX: 00007fde3588c0d9
RDX: 000000000000ffdc RSI: 00000000200000c0 RDI: 000000000000000a
RBP: 00007fde358e7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fde35acfb1f R14: 00007fde365b6300 R15: 0000000000022000
 </TASK>

Allocated by task 7618:
 kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
 kasan_set_track+0x25/0x30 mm/kasan/common.c:52
 __kasan_slab_alloc+0x82/0x90 mm/kasan/common.c:325
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook mm/slab.h:737 [inline]
 slab_alloc_node mm/slub.c:3398 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc+0x2b4/0x3d0 mm/slub.c:3422
 dst_alloc+0x14a/0x1f0 net/core/dst.c:92
 ip6_dst_alloc+0x32/0xa0 net/ipv6/route.c:344
 ip6_rt_pcpu_alloc net/ipv6/route.c:1369 [inline]
 rt6_make_pcpu_route net/ipv6/route.c:1417 [inline]
 ip6_pol_route+0x901/0x1190 net/ipv6/route.c:2254
 pol_lookup_func include/net/ip6_fib.h:582 [inline]
 fib6_rule_lookup+0x52e/0x6f0 net/ipv6/fib6_rules.c:121
 ip6_route_output_flags_noref+0x2e6/0x380 net/ipv6/route.c:2625
 ip6_route_output_flags+0x76/0x320 net/ipv6/route.c:2638
 ip6_route_output include/net/ip6_route.h:98 [inline]
 ip6_dst_lookup_tail+0x5ab/0x1620 net/ipv6/ip6_output.c:1092
 ip6_dst_lookup_flow+0x90/0x1d0 net/ipv6/ip6_output.c:1222
 ip6_sk_dst_lookup_flow+0x553/0x980 net/ipv6/ip6_output.c:1260
 udpv6_sendmsg+0x151d/0x2c80 net/ipv6/udp.c:1554
 inet6_sendmsg+0x9d/0xe0 net/ipv6/af_inet6.c:665
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg+0xd3/0x120 net/socket.c:734
 __sys_sendto+0x23a/0x340 net/socket.c:2117
 __do_sys_sendto net/socket.c:2129 [inline]
 __se_sys_sendto net/socket.c:2125 [inline]
 __x64_sys_sendto+0xe1/0x1b0 net/socket.c:2125
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 7599:
 kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
 kasan_set_track+0x25/0x30 mm/kasan/common.c:52
 kasan_save_free_info+0x2e/0x40 mm/kasan/generic.c:511
 ____kasan_slab_free mm/kasan/common.c:236 [inline]
 ____kasan_slab_free+0x160/0x1c0 mm/kasan/common.c:200
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 kmem_cache_free+0xee/0x5c0 mm/slub.c:3683
 dst_destroy+0x2ea/0x400 net/core/dst.c:127
 rcu_do_batch kernel/rcu/tree.c:2250 [inline]
 rcu_core+0x81f/0x1980 kernel/rcu/tree.c:2510
 __do_softirq+0x1fb/0xadc kernel/softirq.c:571

Last potentially related work creation:
 kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xbc/0xd0 mm/kasan/generic.c:481
 call_rcu+0x9d/0x820 kernel/rcu/tree.c:2798
 dst_release net/core/dst.c:177 [inline]
 dst_release+0x7d/0xe0 net/core/dst.c:167
 refdst_drop include/net/dst.h:256 [inline]
 skb_dst_drop include/net/dst.h:268 [inline]
 skb_release_head_state+0x250/0x2a0 net/core/skbuff.c:838
 skb_release_all net/core/skbuff.c:852 [inline]
 __kfree_skb net/core/skbuff.c:868 [inline]
 kfree_skb_reason+0x151/0x4b0 net/core/skbuff.c:891
 kfree_skb_list_reason+0x4b/0x70 net/core/skbuff.c:901
 kfree_skb_list include/linux/skbuff.h:1227 [inline]
 ip6_fragment+0x2026/0x2770 net/ipv6/ip6_output.c:949
 __ip6_finish_output net/ipv6/ip6_output.c:193 [inline]
 ip6_finish_output+0x9a3/0x1170 net/ipv6/ip6_output.c:206
 NF_HOOK_COND include/linux/netfilter.h:291 [inline]
 ip6_output+0x1f1/0x540 net/ipv6/ip6_output.c:227
 dst_output include/net/dst.h:445 [inline]
 ip6_local_out+0xb3/0x1a0 net/ipv6/output_core.c:161
 ip6_send_skb+0xbb/0x340 net/ipv6/ip6_output.c:1966
 udp_v6_send_skb+0x82a/0x18a0 net/ipv6/udp.c:1286
 udp_v6_push_pending_frames+0x140/0x200 net/ipv6/udp.c:1313
 udpv6_sendmsg+0x18da/0x2c80 net/ipv6/udp.c:1606
 inet6_sendmsg+0x9d/0xe0 net/ipv6/af_inet6.c:665
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg+0xd3/0x120 net/socket.c:734
 sock_write_iter+0x295/0x3d0 net/socket.c:1108
 call_write_iter include/linux/fs.h:2191 [inline]
 new_sync_write fs/read_write.c:491 [inline]
 vfs_write+0x9ed/0xdd0 fs/read_write.c:584
 ksys_write+0x1ec/0x250 fs/read_write.c:637
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Second to last potentially related work creation:
 kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xbc/0xd0 mm/kasan/generic.c:481
 call_rcu+0x9d/0x820 kernel/rcu/tree.c:2798
 dst_release net/core/dst.c:177 [inline]
 dst_release+0x7d/0xe0 net/core/dst.c:167
 refdst_drop include/net/dst.h:256 [inline]
 skb_dst_drop include/net/dst.h:268 [inline]
 __dev_queue_xmit+0x1b9d/0x3ba0 net/core/dev.c:4211
 dev_queue_xmit include/linux/netdevice.h:3008 [inline]
 neigh_resolve_output net/core/neighbour.c:1552 [inline]
 neigh_resolve_output+0x51b/0x840 net/core/neighbour.c:1532
 neigh_output include/net/neighbour.h:546 [inline]
 ip6_finish_output2+0x56c/0x1530 net/ipv6/ip6_output.c:134
 __ip6_finish_output net/ipv6/ip6_output.c:195 [inline]
 ip6_finish_output+0x694/0x1170 net/ipv6/ip6_output.c:206
 NF_HOOK_COND include/linux/netfilter.h:291 [inline]
 ip6_output+0x1f1/0x540 net/ipv6/ip6_output.c:227
 dst_output include/net/dst.h:445 [inline]
 NF_HOOK include/linux/netfilter.h:302 [inline]
 NF_HOOK include/linux/netfilter.h:296 [inline]
 mld_sendpack+0xa09/0xe70 net/ipv6/mcast.c:1820
 mld_send_cr net/ipv6/mcast.c:2121 [inline]
 mld_ifc_work+0x720/0xdc0 net/ipv6/mcast.c:2653
 process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
 worker_thread+0x669/0x1090 kernel/workqueue.c:2436
 kthread+0x2e8/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306

The buggy address belongs to the object at ffff88801d403dc0
 which belongs to the cache ip6_dst_cache of size 240
The buggy address is located 192 bytes inside of
 240-byte region [ffff88801d403dc0, ffff88801d403eb0)

The buggy address belongs to the physical page:
page:ffffea00007500c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1d403
memcg:ffff888022f49c81
flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000200 ffffea0001ef6580 dead000000000002 ffff88814addf640
raw: 0000000000000000 00000000800c000c 00000001ffffffff ffff888022f49c81
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_HARDWALL), pid 3719, tgid 3719 (kworker/0:6), ts 136223432244, free_ts 136222971441
 prep_new_page mm/page_alloc.c:2539 [inline]
 get_page_from_freelist+0x10b5/0x2d50 mm/page_alloc.c:4288
 __alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5555
 alloc_pages+0x1aa/0x270 mm/mempolicy.c:2285
 alloc_slab_page mm/slub.c:1794 [inline]
 allocate_slab+0x213/0x300 mm/slub.c:1939
 new_slab mm/slub.c:1992 [inline]
 ___slab_alloc+0xa91/0x1400 mm/slub.c:3180
 __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3279
 slab_alloc_node mm/slub.c:3364 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc+0x31a/0x3d0 mm/slub.c:3422
 dst_alloc+0x14a/0x1f0 net/core/dst.c:92
 ip6_dst_alloc+0x32/0xa0 net/ipv6/route.c:344
 icmp6_dst_alloc+0x71/0x680 net/ipv6/route.c:3261
 mld_sendpack+0x5de/0xe70 net/ipv6/mcast.c:1809
 mld_send_cr net/ipv6/mcast.c:2121 [inline]
 mld_ifc_work+0x720/0xdc0 net/ipv6/mcast.c:2653
 process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
 worker_thread+0x669/0x1090 kernel/workqueue.c:2436
 kthread+0x2e8/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
page last free stack trace:
 reset_page_owner include/linux/page_owner.h:24 [inline]
 free_pages_prepare mm/page_alloc.c:1459 [inline]
 free_pcp_prepare+0x65c/0xd90 mm/page_alloc.c:1509
 free_unref_page_prepare mm/page_alloc.c:3387 [inline]
 free_unref_page+0x1d/0x4d0 mm/page_alloc.c:3483
 __unfreeze_partials+0x17c/0x1a0 mm/slub.c:2586
 qlink_free mm/kasan/quarantine.c:168 [inline]
 qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187
 kasan_quarantine_reduce+0x184/0x210 mm/kasan/quarantine.c:294
 __kasan_slab_alloc+0x66/0x90 mm/kasan/common.c:302
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook mm/slab.h:737 [inline]
 slab_alloc_node mm/slub.c:3398 [inline]
 kmem_cache_alloc_node+0x304/0x410 mm/slub.c:3443
 __alloc_skb+0x214/0x300 net/core/skbuff.c:497
 alloc_skb include/linux/skbuff.h:1267 [inline]
 netlink_alloc_large_skb net/netlink/af_netlink.c:1191 [inline]
 netlink_sendmsg+0x9a6/0xe10 net/netlink/af_netlink.c:1896
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg+0xd3/0x120 net/socket.c:734
 __sys_sendto+0x23a/0x340 net/socket.c:2117
 __do_sys_sendto net/socket.c:2129 [inline]
 __se_sys_sendto net/socket.c:2125 [inline]
 __x64_sys_sendto+0xe1/0x1b0 net/socket.c:2125
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 1758fd4688 ("ipv6: remove unnecessary dst_hold() in ip6_fragment()")
Reported-by: syzbot+8c0ac31aa9681abb9e2d@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/r/20221206101351.2037285-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:14:35 -08:00
Yang Yingliang
7d8c19bfc8 net: plip: don't call kfree_skb/dev_kfree_skb() under spin_lock_irq()
It is not allowed to call kfree_skb() or consume_skb() from
hardware interrupt context or with interrupts being disabled.
So replace kfree_skb/dev_kfree_skb() with dev_kfree_skb_irq()
and dev_consume_skb_irq() under spin_lock_irq().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20221207015310.2984909-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:10:47 -08:00
Jakub Kicinski
e1228581b3 Merge branch 'devlink-add-port-function-attribute-to-enable-disable-roce-and-migratable'
Shay Drory says:

====================
devlink: Add port function attribute to enable/disable Roce and migratable

This series is a complete rewrite of the series "devlink: Add port
function attribute to enable/disable roce"
link:
https://lore.kernel.org/netdev/20221102163954.279266-1-danielj@nvidia.com/

Currently mlx5 PCI VF and SF are enabled by default for RoCE
functionality. And mlx5 PCI VF is disable by dafault for migratable
functionality.

Currently a user does not have the ability to disable RoCE for a PCI
VF/SF device before such device is enumerated by the driver.

User is also incapable to do such setting from smartnic scenario for a
VF from the smartnic.

Current 'enable_roce' device knob is limited to do setting only at
driverinit time. By this time device is already created and firmware has
already allocated necessary system memory for supporting RoCE.

Also, Currently a user does not have the ability to enable migratable
for a PCI VF.

The above are a hyper visor level control, to set the functionality of
devices passed through to guests.

This is achieved by extending existing 'port function' object to control
capabilities of a function. This enables users to control capability of
the device before enumeration.

Examples when user prefers to disable RoCE for a VF when using switchdev
mode:

$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev pf0vf0 flavour pcivf controller 0
pfnum 0 vfnum 0 external false splittable false
  function:
    hw_addr 00:00:00:00:00:00 roce enable

$ devlink port function set pci/0000:06:00.0/1 roce disable

$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev pf0vf0 flavour pcivf controller 0
pfnum 0 vfnum 0 external false splittable false
  function:
    hw_addr 00:00:00:00:00:00 roce disable

FAQs:
-----
1. What does roce enable/disable do?
Ans: It disables RoCE capability of the function before its enumerated,
so when driver reads the capability from the device firmware, it is
disabled.
At this point RDMA stack will not be able to create UD, QP1, RC, XRC
type of QPs. When RoCE is disabled, the GID table of all ports of the
device is disabled in the device and software stack.

2. How is the roce 'port function' option different from existing
devlink param?
Ans: RoCE attribute at the port function level disables the RoCE
capability at the specific function level; while enable_roce only does
at the software level.

3. Why is this option for disabling only RoCE and not the whole RDMA
device?
Ans: Because user still wants to use the RDMA device for non RoCE
commands in more memory efficient way.
====================

Link: https://lore.kernel.org/r/20221206185119.380138-1-shayd@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:09:21 -08:00
Shay Drory
e5b9642a33 net/mlx5: E-Switch, Implement devlink port function cmds to control migratable
Implement devlink port function commands to enable / disable migratable.
This is used to control the migratable capability of the device.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:09:18 -08:00
Shay Drory
a8ce7b26a5 devlink: Expose port function commands to control migratable
Expose port function commands to enable / disable migratable
capability, this is used to set the port function as migratable.

Live migration is the process of transferring a live virtual machine
from one physical host to another without disrupting its normal
operation.

In order for a VM to be able to perform LM, all the VM components must
be able to perform migration. e.g.: to be migratable.
In order for VF to be migratable, VF must be bound to VFIO driver with
migration support.

When migratable capability is enabled for a function of the port, the
device is making the necessary preparations for the function to be
migratable, which might include disabling features which cannot be
migrated.

Example of LM with migratable function configuration:
Set migratable of the VF's port function.

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0
vfnum 1
    function:
        hw_addr 00:00:00:00:00:00 migratable disable

$ devlink port function set pci/0000:06:00.0/2 migratable enable

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0
vfnum 1
    function:
        hw_addr 00:00:00:00:00:00 migratable enable

Bind VF to VFIO driver with migration support:
$ echo <pci_id> > /sys/bus/pci/devices/0000:08:00.0/driver/unbind
$ echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:08:00.0/driver_override
$ echo <pci_id> > /sys/bus/pci/devices/0000:08:00.0/driver/bind

Attach VF to the VM.
Start the VM.
Perform LM.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:09:18 -08:00
Yishai Hadas
7db98396ef net/mlx5: E-Switch, Implement devlink port function cmds to control RoCE
Implement devlink port function commands to enable / disable RoCE.
This is used to control the RoCE device capabilities.

This patch implement infrastructure which will be used by downstream
patches that will add additional capabilities.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07 20:09:18 -08:00