Commit Graph

121925 Commits

Author SHA1 Message Date
Beniamino Galvani
daa2ba7ed1 geneve: use generic function for tunnel IPv4 route lookup
The route lookup can be done now via generic function
udp_tunnel_dst_lookup() to replace the custom implementation in
geneve_get_v4_rt().

Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Beniamino Galvani <b.galvani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-16 09:57:52 +01:00
Beniamino Galvani
60a77d11cd geneve: add dsfield helper function
Add a helper function to compute the tos/dsfield. In this way, we can
factor out some duplicate code. Also, the helper will be called from
more places in the next commit.

Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Beniamino Galvani <b.galvani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-16 09:57:52 +01:00
Beniamino Galvani
72fc68c635 ipv4: add new arguments to udp_tunnel_dst_lookup()
We want to make the function more generic so that it can be used by
other UDP tunnel implementations such as geneve and vxlan. To do that,
add the following arguments:

 - source and destination UDP port;
 - ifindex of the output interface, needed by vxlan;
 - the tos, because in some cases it is not taken from struct
   ip_tunnel_info (for example, when it's inherited from the inner
   packet);
 - the dst cache, because not all tunnel types (e.g. vxlan) want to
   use the one from struct ip_tunnel_info.

With these parameters, the function no longer needs the full struct
ip_tunnel_info as argument and we can pass only the relevant part of
it (struct ip_tunnel_key).

Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Beniamino Galvani <b.galvani@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-16 09:57:52 +01:00
Beniamino Galvani
78f3655adc ipv4: remove "proto" argument from udp_tunnel_dst_lookup()
The function is now UDP-specific, the protocol is always IPPROTO_UDP.

Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Beniamino Galvani <b.galvani@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-16 09:57:52 +01:00
Beniamino Galvani
bf3fcbf7e7 ipv4: rename and move ip_route_output_tunnel()
At the moment ip_route_output_tunnel() is used only by bareudp.
Ideally, other UDP tunnel implementations should use it, but to do so
the function needs to accept new parameters that are specific for UDP
tunnels, such as the ports.

Prepare for these changes by renaming the function to
udp_tunnel_dst_lookup() and move it to file
net/ipv4/udp_tunnel_core.c.

Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Beniamino Galvani <b.galvani@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-16 09:57:52 +01:00
Christian Marangi
101c603203 net: cxgb3: simplify logic for rspq_check_napi
Simplify logic for rspq_check_napi.
Drop redundant and wrong napi_is_scheduled call as it's not race free
and directly use the output of napi_schedule to understand if a napi is
pending or not.

rspq_check_napi main logic is to check if is_new_response is true and
check if a napi is not scheduled. The result of this function is then
used to detect if we are missing some interrupt and act on top of
this... With this knowing, we can rework and simplify the logic and make
it less problematic with testing an internal bit for napi.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-16 09:09:54 +01:00
Xuan Zhuo
5720c43d52 virtio_net: fix the missing of the dma cpu sync
Commit 295525e29a ("virtio_net: merge dma operations when filling
mergeable buffers") unmaps the buffer with DMA_ATTR_SKIP_CPU_SYNC when
the dma->ref is zero. We do that with DMA_ATTR_SKIP_CPU_SYNC, because we
do not want to do the sync for the entire page_frag. But that misses the
sync for the current area.

This patch does cpu sync regardless of whether the ref is zero or not.

Fixes: 295525e29a ("virtio_net: merge dma operations when filling mergeable buffers")
Reported-by: Michael Roth <michael.roth@amd.com>
Closes: http://lore.kernel.org/all/20230926130451.axgodaa6tvwqs3ut@amd.com
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-10-15 11:49:57 -07:00
Arkadiusz Kubalewski
90e1c90750 ice: dpll: implement phase related callbacks
Implement new callback ops related to measurement and adjustment of
signal phase for pin-dpll in ice driver.

Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 16:08:24 +01:00
Ivan Vecera
3e02480d5e i40e: Add PBA as board id info to devlink .info_get
Expose stored PBA ID string as unique board identifier via
devlink's .info_get command.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 14:33:42 +01:00
Ivan Vecera
df19ea6966 i40e: Refactor and rename i40e_read_pba_string()
Function i40e_read_pba_string() is currently unused but will be used
by subsequent patch to provide board ID via devlink device info.

The function reads PBA block from NVM so it cannot be called during
adapter reset and as we would like to provide PBA ID via devlink
info it is better to read the PBA ID during i40e_probe() and cache
it in i40e_hw structure to avoid a waiting for potential adapter
reset in devlink info callback.

So...
- Remove pba_num and pba_num_size arguments from the function,
  allocate resource managed buffer to store PBA ID string and
  save resulting pointer to i40e_hw->pba_id field
- Make the function void as the PBA ID can be missing and in this
  case (or in case of NVM reading failure) the i40e_hw->pba_id
  will be NULL
- Rename the function to i40e_get_pba_string() to align with other
  functions like i40e_get_oem_version() i40e_get_port_mac_addr()...
- Call this function on init during i40e_probe()

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 14:33:41 +01:00
Ivan Vecera
5a423552e0 i40e: Add handler for devlink .info_get
Provide devlink .info_get callback to allow the driver to report
detailed version information. The following info is reported:

 "serial_number" -> The PCI DSN of the adapter
 "fw.mgmt" -> The version of the firmware
 "fw.mgmt.api" -> The API version of interface exposed over the AdminQ
 "fw.psid" -> The version of the NVM image
 "fw.bundle_id" -> Unique identifier for the combined flash image
 "fw.undi" -> The combo image version

With this, 'devlink dev info' provides at least the same amount
information as is reported by ETHTOOL_GDRVINFO:

$ ethtool -i enp2s0f0 | egrep '(driver|firmware)'
driver: i40e
firmware-version: 9.30 0x8000e5f3 1.3429.0

$ devlink dev info pci/0000:02:00.0
pci/0000:02:00.0:
  driver i40e
  serial_number c0-de-b7-ff-ff-ef-ec-3c
  versions:
      running:
        fw.mgmt 9.130.73618
        fw.mgmt.api 1.15
        fw.psid 9.30
        fw.bundle_id 0x8000e5f3
        fw.undi 1.3429.0

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 14:33:41 +01:00
Ivan Vecera
7aabde3976 i40e: Split and refactor i40e_nvm_version_str()
The function formats NVM version string according adapter's
EETrackID value. If this value OEM specific (0xffffffff) then
the reported version is with format:
"<gen>.<snap>.<release>"
and in other case
"<nvm_maj>.<nvm_min> <eetrackid> <cvid_maj>.<cvid_bld>.<cvid_min>"

These versions are reported in the subsequent patch in this series
that implements devlink .info_get but separately.

So split the function into separate ones, refactor it to use them
and remove ugly static string buffer.
Additionally convert NVM/OEM version mask macros to use GENMASK and
use FIELD_GET/FIELD_PREP for them in i40e_nvm_version_str() and
i40e_get_oem_version(). This makes code more readable and allows
us to remove related shift macros.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 14:33:41 +01:00
Ivan Vecera
9e479d64dc i40e: Add initial devlink support
Add an initial support for devlink interface to i40e driver.

Similarly to ice driver the implementation doe not enable devlink
to manage device-wide configuration and devlink instance is created
for each physical function of PCIe device.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 14:33:41 +01:00
Pavan Chebbi
b22f21f7a5 tg3: Improve PTP TX timestamping logic
When we are trying to timestamp a TX packet, there may be
occasions when the TX timestamp register is still not
updated with the latest timestamp even if the timestamp
packet descriptor is marked as complete.
This usually happens in cases where the system is under
stress or flow control is affecting the transmit side.

We will solve this problem by saving the snapshot of the
timestamp register when we are posting the TX descriptor.
At this time, the register contains previously timestamped
packet's value and valid timestamp of the current packet must
be different than this.
Upon completion of the current descriptor, we will check if
the timestamp register is updated or not before timestamping
the skb. If not updated, we will schedule the ptp worker to
fetch the updated time later and timestamp the skb.
Also now we restrict number of outstanding PTP TX packet
requests to 1.

Reported-by: Simon White <Simon.White@viavisolutions.com>
Link: https://lore.kernel.org/netdev/CACKFLikGdN9XPtWk-fdrzxdcD=+bv-GHBvfVfSpJzHY7hrW39g@mail.gmail.com/
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 14:30:24 +01:00
Manish Chopra
2f3389c738 qed: fix LL2 RX buffer allocation
Driver allocates the LL2 rx buffers from kmalloc()
area to construct the skb using slab_build_skb()

The required size allocation seems to have overlooked
for accounting both skb_shared_info size and device
placement padding bytes which results into the below
panic when doing skb_put() for a standard MTU sized frame.

skbuff: skb_over_panic: text:ffffffffc0b0225f len:1514 put:1514
head:ff3dabceaf39c000 data:ff3dabceaf39c042 tail:0x62c end:0x566
dev:<NULL>
…
skb_panic+0x48/0x4a
skb_put.cold+0x10/0x10
qed_ll2b_complete_rx_packet+0x14f/0x260 [qed]
qed_ll2_rxq_handle_completion.constprop.0+0x169/0x200 [qed]
qed_ll2_rxq_completion+0xba/0x320 [qed]
qed_int_sp_dpc+0x1a7/0x1e0 [qed]

This patch fixes this by accouting skb_shared_info and device
placement padding size bytes when allocating the buffers.

Cc: David S. Miller <davem@davemloft.net>
Fixes: 0a7fb11c23 ("qed: Add Light L2 support")
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 14:29:14 +01:00
Edward Cree
0c7fe3b372 sfc: support offloading ct(nat) action in RHS rules
If an IP address and/or L4 port for NAPT is available from a CT match,
 the MAE will perform the edits; if no CT lookup has been performed for
 this packet, the CT lookup did not return a match, or the matched CT
 entry did not include NAPT, the action will have no effect.

Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 14:25:03 +01:00
Edward Cree
38f9a08a3e sfc: parse mangle actions (NAT) in conntrack entries
The MAE can edit either address, L4 port, or both, for either source
 or destination.  These can't be mixed; i.e. it can edit source addr
 and source port, but not (say) source addr and dest port.

Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15 14:25:03 +01:00
Leon Romanovsky
627aa13921 net/mlx5e: Allow IPsec soft/hard limits in bytes
Actually the mlx5 code already has needed support to allow users
to configure soft/hard limits in bytes. It is possible due to the
situation with TX path, where CX7 devices are missing hardware
implementation to send events to the software, see commit b2f7b01d36
("net/mlx5e: Simulate missing IPsec TX limits hardware functionality").

That software workaround is not limited to TX and works for bytes too.
So relax the validation logic to not block soft/hard limits in bytes.

Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:33 -07:00
Adham Faris
6dd6eaf43e net/mlx5e: Increase max supported channels number to 256
Increase max supported channels number to 256 (it is not extended
further due to testing disabilities).

Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:33 -07:00
Adham Faris
74a8dadac1 net/mlx5e: Preparations for supporting larger number of channels
Data center server CPUs number keeps getting larger with time.
Currently, our driver limits the number of channels to 128.

Maximum channels number is enforced and bounded by hardcoded
defines (en.h/MLX5E_MAX_NUM_CHANNELS) even though the device and machine
(CPUs num) can allow more.

Refactor current implementation in order to handle further channels.

The maximum supported channels number will be increased in the followup
patch.

Introduce RQT size calculation/allocation scheme below:
1) Preserve current RQT size of 256 for channels number up to 128 (the
   old limit).
2) For greater channels number, RQT size is calculated by multiplying
   the channels number by 2 and rounding up the result to the nearest
   power of 2. If the calculated RQT size exceeds the maximum supported
   size by the NIC, fallback to this maximum RQT size
   (1 << log_max_rqt_size).

Since RQT size is no more static, allocate and free the indirection
table SW shadow dynamically.

Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:33 -07:00
Adham Faris
0d806cf9c0 net/mlx5e: Refactor mlx5e_rss_init() and mlx5e_rss_free() API's
Introduce code refactoring below:
1) Introduce single API for creating and destroying rss object,
   mlx5e_rss_create() and mlx5e_rss_destroy() respectively.
2) mlx5e_rss_create() constructs and initializes RSS object depends
   on a function new param enum mlx5e_rss_create_type. Callers (like
   rx_res.c) will no longer need to allocate RSS object via
   mlx5e_rss_alloc() and initialize it immediately via
   mlx5e_rss_init_no_tirs() or mlx5e_rss_init(), this will be done by
   a single call to mlx5e_rss_create(). Hence, mlx5e_rss_alloc() and
   mlx5e_rss_init_no_tirs() have been removed from rss.h file and became
   static functions.

Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:32 -07:00
Adham Faris
cae8e6dea2 net/mlx5e: Refactor mlx5e_rss_set_rxfh() and mlx5e_rss_get_rxfh()
Initialize indirect table array with memcpy rather than for loop.
This change has made for two reasons:
1) To be consistent with the indirect table array init in
   mlx5e_rss_set_rxfh().
2) In general, prefer to use memcpy for array initializing rather than
   for loop.

Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:32 -07:00
Adham Faris
d90ea84375 net/mlx5e: Refactor rx_res_init() and rx_res_free() APIs
Refactor mlx5e_rx_res_init() and mlx5e_rx_res_free() by wrapping
mlx5e_rx_res_alloc() and mlx5e_rx_res_destroy() API's respectively.

Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:32 -07:00
Yu Liao
5a37b28824 net/mlx5e: Use PTR_ERR_OR_ZERO() to simplify code
Use the standard error pointer macro to shorten the code and simplify.

Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:32 -07:00
Jinjie Ruan
68e81110fb net/mlx5: Use PTR_ERR_OR_ZERO() to simplify code
Return PTR_ERR_OR_ZERO() instead of return 0 or PTR_ERR() to
simplify code.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:31 -07:00
Yue Haibing
0d2d6bc7e7 net/mlx5: Remove unused declaration
Commit 2ac9cfe782 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path")
declared mlx5e_ipsec_inverse_table_init() but never implemented it.
Commit f52f2faee5 ("net/mlx5e: Introduce flow steering API")
declared mlx5e_fs_set_tc() but never implemented it.
Commit f2f3df5501 ("net/mlx5: EQ, Privatize eq_table and friends")
declared mlx5_eq_comp_cpumask() but never implemented it.
Commit cac1eb2cf2 ("net/mlx5: Lag, properly lock eswitch if needed")
removed mlx5_lag_update() but not its declaration.
Commit 35ba005d82 ("net/mlx5: DR, Set flex parser for TNL_MPLS dynamically")
removed mlx5dr_ste_build_tnl_mpls() but not its declaration.

Commit e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
declared but never implemented mlx5_alloc_cmd_mailbox_chain() and mlx5_free_cmd_mailbox_chain().
Commit 0cf53c1247 ("net/mlx5: FWPage, Use async events chain")
removed mlx5_core_req_pages_handler() but not its declaration.
Commit 938fe83c8d ("net/mlx5_core: New device capabilities handling")
removed mlx5_query_odp_caps() but not its declaration.
Commit f6a8a19bb1 ("RDMA/netdev: Hoist alloc_netdev_mqs out of the driver")
removed mlx5_rdma_netdev_alloc() but not its declaration.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:31 -07:00
Shay Drory
b430c1b4f6 net/mlx5: Replace global mlx5_intf_lock with HCA devcom component lock
mlx5_intf_lock is used to sync between LAG changes and its slaves
mlx5 core dev aux devices changes, which means every time mlx5 core
dev add/remove aux devices, mlx5 is taking this global lock, even if
LAG functionality isn't supported over the core dev.
This cause a bottleneck when probing VFs/SFs in parallel.

Hence, replace mlx5_intf_lock with HCA devcom component lock, or no
lock if LAG functionality isn't supported.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:31 -07:00
Shay Drory
e534552c92 net/mlx5: Refactor LAG peer device lookout bus logic to mlx5 devcom
LAG peer device lookout bus logic required the usage of global lock,
mlx5_intf_mutex.
As part of the effort to remove this global lock, refactor LAG peer
device lookout to use mlx5 devcom layer.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:30 -07:00
Shay Drory
89d351c224 net/mlx5: Avoid false positive lockdep warning by adding lock_class_key
Downstream patch will add devcom component which will be locked in
many places. This can lead to a false positive "possible circular
locking dependency" warning by lockdep, on flows which lock more than
one mlx5 devcom component, such as probing ETH aux device.
Hence, add a lock_class_key per mlx5 device.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:30 -07:00
Wei Zhang
15fa898aeb net/mlx5: Redesign SF active work to remove table_lock
active_work is a work that iterates over all
possible SF devices which their SF port
representors are located on different function,
and in case SF is in active state, probes it.
Currently, the active_work in active_wq is
synced with mlx5_vhca_events_work via table_lock
and this lock causing a bottleneck in performance.

To remove table_lock, redesign active_wq logic
so that it now pushes active_work per SF to
mlx5_vhca_events_workqueues. Since the latter
workqueues are ordered, active_work and
mlx5_vhca_events_work with same index will be
pushed into same workqueue, thus it completely
eliminates the need for a lock.

Signed-off-by: Wei Zhang <weizhang@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:30 -07:00
Wei Zhang
3f7f31fff2 net/mlx5: Parallelize vhca event handling
At present, mlx5 driver have a general purpose
event handler which not only handles vhca event
but also many other events. This incurs a huge
bottleneck because the event handler is
implemented by single threaded workqueue and all
events are forced to be handled in serial manner
even though application tries to create multiple
SFs simultaneously.

Introduce a dedicated vhca event handler which
manages SFs parallel creation.

Signed-off-by: Wei Zhang <weizhang@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14 10:16:29 -07:00
Zong-Zhe Yang
b650981501 wifi: rtw89: mac: do bf_monitor only if WiFi 6 chips
Beamforming monitor is used to adjust registers to fine tune performance
and power save, and currently only existing WiFi 6 chips need it.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231012021455.19816-7-pkshih@realtek.com
2023-10-14 09:43:32 +03:00
Zong-Zhe Yang
31b7cd195a wifi: rtw89: mac: set bf_assoc capabilities according to chip gen
When associated peer has beamformer capability, we should enable
beamformee, set CSI parameter, and configure rate to send CSI packets.
Since registers of WiFi 7 chips are very different from existing chips,
separate configuration functions.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231012021455.19816-6-pkshih@realtek.com
2023-10-14 09:43:32 +03:00
Zong-Zhe Yang
5fa1c5d416 wifi: rtw89: mac: set bfee_ctrl() according to chip gen
When associated peer has beamformer capability, enable hardware beamformee
function, and then hardware can run sounding protocol itself. Oppositely,
disable this function when disassociated. Define different registers for
WiFi 6 and 7 generations respectively.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231012021455.19816-5-pkshih@realtek.com
2023-10-14 09:43:32 +03:00
Ping-Ke Shih
79c55327cf wifi: rtw89: mac: add registers of MU-EDCA parameters for WiFi 7 chips
According to chip generation, set MU-EDCA parameters from mac80211 when
connected.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231012021455.19816-4-pkshih@realtek.com
2023-10-14 09:43:31 +03:00
Zong-Zhe Yang
7f69cd4253 wifi: rtw89: mac: generalize register of MU-EDCA switch according to chip gen
When connected with 802.11ax AP, MU-EDCA parameters are given, so enable
this hardware function by registers according to chip generation.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231012021455.19816-3-pkshih@realtek.com
2023-10-14 09:43:31 +03:00
Zong-Zhe Yang
fbd1829d29 wifi: rtw89: mac: update RTS threshold according to chip gen
When TX size or time of packet over RTS threshold set by this register,
hardware will use RTS protection automatically. Since WiFi 6 and 7 chips
have different register address for this, separate the address according
to chip gen.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231012021455.19816-2-pkshih@realtek.com
2023-10-14 09:43:31 +03:00
Dmitry Antipov
4619088252 wifi: rtlwifi: simplify TX command fill callbacks
Since 'rtlpriv->cfg->ops->fill_tx_cmddesc()' is always called
with 'firstseg' and 'lastseg' set to 1 (and the latter is
never actually used), all of the relevant chip-specific
routines may be simplified. Compile tested only.

Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231011154442.52457-2-dmantipov@yandex.ru
2023-10-14 09:42:24 +03:00
Arnd Bergmann
f35ccb65bd wifi: hostap: remove unused ioctl function
The ioctl handler has no actual callers in the kernel and is useless.
All the functionality should be reachable through the regualar interfaces.

Acked-by: Kalle Valo <kvalo@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231011140225.253106-9-arnd@kernel.org
2023-10-14 09:41:22 +03:00
Arnd Bergmann
166ab7ca34 wifi: atmel: remove unused ioctl function
This function has no callers, and for the past 20 years, the request_firmware
interface has been in place instead of the custom firmware loader.

Acked-by: Kalle Valo <kvalo@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231011140225.253106-8-arnd@kernel.org
2023-10-14 09:41:21 +03:00
Jakub Kicinski
2d1c882d44 Merge tag 'mlx5-fixes-2023-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5 fixes 2023-10-12

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2023-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Fix VF representors reporting zero counters to "ip -s" command
  net/mlx5e: Don't offload internal port if filter device is out device
  net/mlx5e: Take RTNL lock before triggering netdev notifiers
  net/mlx5e: XDP, Fix XDP_REDIRECT mpwqe page fragment leaks on shutdown
  net/mlx5e: RX, Fix page_pool allocation failure recovery for legacy rq
  net/mlx5e: RX, Fix page_pool allocation failure recovery for striding rq
  net/mlx5: Handle fw tracer change ownership event based on MTRC
  net/mlx5: Bridge, fix peer entry ageing in LAG mode
  net/mlx5: E-switch, register event handler before arming the event
  net/mlx5: Perform DMA operations in the right locations
====================

Link: https://lore.kernel.org/r/20231012195127.129585-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 18:00:27 -07:00
Mateusz Pacuszka
42066c4d5d ice: Fix safe mode when DDP is missing
One thing is broken in the safe mode, that is
ice_deinit_features() is being executed even
that ice_init_features() was not causing stack
trace during pci_unregister_driver().

Add check on the top of the function.

Fixes: 5b246e533d ("ice: split probe into smaller functions")
Signed-off-by: Mateusz Pacuszka <mateuszx.pacuszka@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Link: https://lore.kernel.org/r/20231011233334.336092-4-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 17:57:05 -07:00
Jesse Brandeburg
0288c3e709 ice: reset first in crash dump kernels
When the system boots into the crash dump kernel after a panic, the ice
networking device may still have pending transactions that can cause errors
or machine checks when the device is re-enabled. This can prevent the crash
dump kernel from loading the driver or collecting the crash data.

To avoid this issue, perform a function level reset (FLR) on the ice device
via PCIe config space before enabling it on the crash kernel. This will
clear any outstanding transactions and stop all queues and interrupts.
Restore the config space after the FLR, otherwise it was found in testing
that the driver wouldn't load successfully.

The following sequence causes the original issue:
- Load the ice driver with modprobe ice
- Enable SR-IOV with 2 VFs: echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
- Trigger a crash with echo c > /proc/sysrq-trigger
- Load the ice driver again (or let it load automatically) with modprobe ice
- The system crashes again during pcim_enable_device()

Fixes: 837f08fdec ("ice: Add basic driver framework for Intel(R) E800 Series")
Reported-by: Vishal Agrawal <vagrawal@redhat.com>
Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Link: https://lore.kernel.org/r/20231011233334.336092-3-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 17:57:05 -07:00
Michal Schmidt
fc6f716a50 i40e: prevent crash on probe if hw registers have invalid values
The hardware provides the indexes of the first and the last available
queue and VF. From the indexes, the driver calculates the numbers of
queues and VFs. In theory, a faulty device might say the last index is
smaller than the first index. In that case, the driver's calculation
would underflow, it would attempt to write to non-existent registers
outside of the ioremapped range and crash.

I ran into this not by having a faulty device, but by an operator error.
I accidentally ran a QE test meant for i40e devices on an ice device.
The test used 'echo i40e > /sys/...ice PCI device.../driver_override',
bound the driver to the device and crashed in one of the wr32 calls in
i40e_clear_hw.

Add checks to prevent underflows in the calculations of num_queues and
num_vfs. With this fix, the wrong device probing reports errors and
returns a failure without crashing.

Fixes: 838d41d92a ("i40e: clear all queues and interrupts")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Link: https://lore.kernel.org/r/20231011233334.336092-2-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 17:57:05 -07:00
Justin Stitt
a02527363a qed: replace uses of strncpy
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

This patch eliminates three uses of strncpy():

Firstly, `dest` is expected to be NUL-terminated which is evident by the
manual setting of a NUL-byte at size - 1. For this use specifically,
strscpy() is a viable replacement due to the fact that it guarantees
NUL-termination on the destination buffer.

The next two cases should simply be memcpy() as the size of the src
string is always 3 and the destination string just wants the first 3
bytes changed.

To be clear, there are no buffer overread bugs in the current code as
the sizes and offsets are carefully managed such that buffers are
NUL-terminated. However, with these changes, the code is now more robust
and less ambiguous (and hopefully easier to read).

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20231012-strncpy-drivers-net-ethernet-qlogic-qed-qed_debug-c-v2-1-16d2c0162b80@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 17:40:21 -07:00
Heiner Kallweit
621735f590 r8169: fix rare issue with broken rx after link-down on RTL8125
In very rare cases (I've seen two reports so far about different
RTL8125 chip versions) it seems the MAC locks up when link goes down
and requires a software reset to get revived.
Realtek doesn't publish hw errata information, therefore the root cause
is unknown. Realtek vendor drivers do a full hw re-initialization on
each link-up event, the slimmed-down variant here was reported to fix
the issue for the reporting user.
It's not fully clear which parts of the NIC are reset as part of the
software reset, therefore I can't rule out side effects.

Fixes: f1bce4ad2f ("r8169: add support for RTL8125")
Reported-by: Martin Kjær Jørgensen <me@lagy.org>
Link: https://lore.kernel.org/netdev/97ec2232-3257-316c-c3e7-a08192ce16a6@gmail.com/T/
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/9edde757-9c3b-4730-be3b-0ef3a374ff71@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 17:34:12 -07:00
MD Danish Anwar
2c0d808f36 net: ti: icssg-prueth: Fix tx_total_bytes count
ICSSG HW stats on TX side considers 8 preamble bytes as data bytes. Due
to this the tx_bytes of ICSSG interface doesn't match the rx_bytes of the
link partner. There is no public errata available yet.

As a workaround to fix this, decrease tx_bytes by 8 bytes for every tx
frame.

Fixes: c1e10d5dc7 ("net: ti: icssg-prueth: Add ICSSG Stats")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Link: https://lore.kernel.org/r/20231012064626.977466-1-danishanwar@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 17:32:19 -07:00
Breno Leitao
5fbd6cdbe3 netconsole: Attach cmdline target to dynamic target
Enable the attachment of a dynamic target to the target created during
boot time. The boot-time targets are named as "cmdline\d", where "\d" is
a number starting at 0.

If the user creates a dynamic target named "cmdline0", it will attach to
the first target created at boot time (as defined in the
`netconsole=...` command line argument). `cmdline1` will attach to the
second target and so forth.

If there is no netconsole target created at boot time, then, the target
name could be reused.

Relevant design discussion:
https://lore.kernel.org/all/ZRWRal5bW93px4km@gmail.com/

Suggested-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20231012111401.333798-4-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 17:26:38 -07:00
Breno Leitao
131eeb45b9 netconsole: Initialize configfs_item for default targets
For netconsole targets allocated during the boot time (passing
netconsole=... argument), netconsole_target->item is not initialized.
That is not a problem because it is not used inside configfs.

An upcoming patch will be using it, thus, initialize the targets with
the name 'cmdline' plus a counter starting from 0.  This name will match
entries in the configfs later.

Suggested-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20231012111401.333798-3-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 17:26:38 -07:00
Breno Leitao
28856ab2c0 netconsole: move init/cleanup functions lower
Move alloc_param_target() and its counterpart (free_param_target())
to the bottom of the file. These functions are called mostly at
initialization/cleanup of the module, and they should be just above the
callers, at the bottom of the file.

From a practical perspective, having alloc_param_target() at the bottom
of the file will avoid forward declaration later (in the following
patch).

Nothing changed other than the functions location.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20231012111401.333798-2-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 17:26:38 -07:00