Replace the variable name err, used for return codes, with the more
generic name ret. An upcoming patch series adding MSI interrupts will
introduce code whose return values are not only error codes. Renaming
the variable to ret allows using it for both purposes. The rename is
applied to the whole file for consistency.
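A sketch of the pattern this enables (the function and the MSI call
site are illustrative, not taken from this series):

	static int kvaser_pciefd_enable_irq(struct pci_dev *pdev)
	{
		int ret;

		/* pci_alloc_irq_vectors() returns a negative error code
		 * or the number of vectors allocated, so a generic "ret"
		 * fits both, where "err" would be misleading.
		 */
		ret = pci_alloc_irq_vectors(pdev, 1, 4, PCI_IRQ_MSI);
		if (ret < 0)
			return ret;

		/* from here "ret" holds a vector count, not an error */
		return 0;
	}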
Signed-off-by: Martin Jocic <martin.jocic@kvaser.com>
Link: https://lore.kernel.org/all/20240614151524.2718287-8-martin.jocic@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This check is already done when the net devices are created in
kvaser_pciefd_setup_can_ctrls(), called from kvaser_pciefd_probe().
If it fails there, the driver won't load, so there should be no need
to repeat the check inside the ISR. The number of channels is read
from the FPGA and should be trusted.
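The removed guard looked roughly like this (a simplified sketch; the
loop body and its helper are illustrative, not the actual ISR code):

	/* Before: re-validating on every interrupt */
	if (pcie->nr_channels > KVASER_PCIEFD_MAX_CAN_CHANNELS)
		return IRQ_NONE;

	/* After: iterate directly over the channels validated at probe */
	for (i = 0; i < pcie->nr_channels; i++)
		process_channel_irq(pcie->can[i]);	/* hypothetical */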
Signed-off-by: Martin Jocic <martin.jocic@kvaser.com>
Link: https://lore.kernel.org/all/20240614151524.2718287-3-martin.jocic@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Shannon Nelson says:
====================
ionic: rework fix for doorbell miss
A latency test in a scaled out setting (many VMs with many queues)
has uncovered an issue with our missed doorbell fix from
commit b69585bfce ("ionic: missed doorbell workaround").
As a refresher, the Elba ASIC has an issue where once in a blue
moon it might miss/drop a queue doorbell notification from
the driver. This can result in Tx timeouts and potential Rx
buffer misses.
The basic problem with the original solution is that we're delaying
things with a timer for every single queue, periodically using
mod_timer() to reset the alarm, and mod_timer() becomes more and more
expensive as there are more and more VFs and queues, each with their
own timer. A ping-pong latency test tends to exacerbate the effect,
such that every napi is doing a mod_timer() in every cycle.
An alternative has been worked out: periodic workqueue items outside
the napi cycle request a napi_schedule, driven by a single
delayed-workqueue per device rather than a timer for every queue.
Also, now that newer firmware actually reports its ASIC type, we can
restrict this workaround to the appropriate chip.
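A minimal sketch of the scheme (field names and the re-arm period are
hypothetical; the actual series defers each queue's napi kick to the
queue's own CPU, see the per-queue patch):

	static void ionic_doorbell_check_dwork(struct work_struct *work)
	{
		struct delayed_work *dwork = to_delayed_work(work);
		struct ionic_lif *lif = container_of(dwork, struct ionic_lif,
						     doorbell_check_dwork);
		unsigned int i;

		/* kick each queue's napi in case a doorbell was missed */
		for (i = 0; i < lif->nxqs; i++)
			napi_schedule(&lif->txqcqs[i]->napi);

		/* one delayed work per device, re-armed here */
		queue_delayed_work(lif->ionic->wq, &lif->doorbell_check_dwork,
				   msecs_to_jiffies(100));
	}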
The testing scenario used 128 VFs in UP state, 16 queues per
VF, and latency tests were done using TCP_RR with adaptive
interrupt coalescing enabled, running on 1 VF. We would see
99th percentile latencies of up to 900us, with some maximum outliers
as high as 4ms.
With these fixes the 99th percentile latencies are typically well
under 50us with the occasional max under 500us.
v1:
https://lore.kernel.org/netdev/20240610230706.34883-1-shannon.nelson@amd.com/
====================
Link: https://lore.kernel.org/r/20240619003257.6138-1-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The maximum size supported for rx_copybreak is (u16)-1, so we can
reduce the field size and move a couple of other fields around to save
a little space in the ionic_lif struct.
Before:
/* size: 17440, cachelines: 273, members: 56 */
/* sum members: 17403, holes: 9, sum holes: 37 */
/* last cacheline: 32 bytes */
After:
/* size: 17424, cachelines: 273, members: 56 */
/* sum members: 17401, holes: 7, sum holes: 23 */
/* last cacheline: 16 bytes */
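The idea, illustrated on a made-up struct rather than the real
ionic_lif layout:

	struct example_before {
		u8  flag;
		/* 3-byte hole */
		u32 rx_copybreak;	/* value never exceeds (u16)-1 */
	};				/* size: 8 */

	struct example_after {
		u16 rx_copybreak;	/* shrunk to match its real range */
		u8  flag;
		/* 1 byte of tail padding */
	};				/* size: 4 */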
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-8-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a work item for each queue that will be run on the queue's
preferred CPU and will schedule another napi. This napi is run in
case the device missed a doorbell and didn't process a packet, a
problem that happens very rarely on the Elba ASIC.
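A rough sketch of the mechanism (struct, field, and function names are
hypothetical):

	/* per-queue work item: kick napi from the queue's CPU */
	static void ionic_doorbell_napi_work(struct work_struct *work)
	{
		struct ionic_qcq *qcq = container_of(work, struct ionic_qcq,
						     doorbell_napi_work);

		napi_schedule(&qcq->napi);
	}

	/* queued from the device-level check, pinned to the right CPU */
	static void ionic_queue_doorbell_check(struct ionic_lif *lif,
					       struct ionic_qcq *qcq)
	{
		queue_work_on(qcq->preferred_cpu, lif->ionic->wq,
			      &qcq->doorbell_napi_work);
	}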
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-6-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Instead of using the system's default workqueue, add a private
workqueue for the device to use for its little jobs. This is
to better support the new work items we will be adding in the
next patches for PF and VF specific jobs, without inundating
the system workqueue in a couple of customer cases where our
devices get scaled out to 100-200 VFs.
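The shape of the change, sketched (the "wq" field is hypothetical):

	/* probe: */
	ionic->wq = alloc_workqueue("%s", 0, 0, dev_name(ionic->dev));
	if (!ionic->wq)
		return -ENOMEM;

	/* remove: */
	flush_workqueue(ionic->wq);
	destroy_workqueue(ionic->wq);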
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-4-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently the driver either sets the initial interrupt affinity for its
adminq and tx/rx queues on probe or resets it on various
down/up/reconfigure flows. If any user and/or user process
(e.g. irqbalance) changes the IRQ affinity for any of the driver's
interrupts, it will be reset to the driver defaults whenever any
down/up/reconfigure operation happens. This is incorrect and is fixed
by making 2 changes:
1. Allocate an array of cpumasks on probe, destroyed only on remove.
2. Update the cpumask(s) for interrupts that are in use by registering
for affinity notifiers, as sketched below.
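A sketch of the notifier half (the ionic struct and field names here
are hypothetical; irq_set_affinity_notifier() and the callback
signatures are the kernel's):

	static void ionic_irq_aff_notify(struct irq_affinity_notify *notify,
					 const cpumask_t *mask)
	{
		struct ionic_intr_info *intr =
			container_of(notify, struct ionic_intr_info,
				     aff_notify);

		/* record the user's choice instead of overwriting it */
		cpumask_copy(intr->affinity_mask, mask);
	}

	static void ionic_irq_aff_release(struct kref *ref)
	{
		/* masks live in the probe-allocated array, nothing to free */
	}

	...
	intr->aff_notify.notify = ionic_irq_aff_notify;
	intr->aff_notify.release = ionic_irq_aff_release;
	irq_set_affinity_notifier(intr->vector, &intr->aff_notify);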
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-3-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Petr Machata says:
====================
mlxsw: Use page pool for Rx buffers allocation
Amit Cohen writes:
After using NAPI to process events from hardware, the next step is to
use page pool for Rx buffer allocation, which also enhances
performance.
To simplify this change, first use page pool to allocate one
contiguous buffer for each packet; later, memory consumption can be
improved by using fragmented buffers.
This set significantly enhances mlxsw driver performance; the CPU can
handle about 370% of the packets per second it previously handled.
The next planned improvement is using XDP to optimize telemetry.
Patch set overview:
Patches #1-#2 are small preparations for page pool usage
Patch #3 initializes page pool, but does not use it yet
Patch #4 converts the driver to use page pool for buffers allocations
Patch #5 is an optimization for buffer access
Patch #6 cleans up an unused structure
Patch #7 uses napi_consume_skb() as part of Tx completion
====================
Link: https://lore.kernel.org/r/cover.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As part of driver init, all Rx queues are filled with buffers for
hardware usage. Later, when a packet is received, a new buffer should be
allocated to be used by hardware instead of the received buffer.
A packet's processing time includes allocation time, which can be
improved using page pool.
Using page pool, DMA mapping is done only for the first allocation of
a buffer. As subsequent buffer allocations avoid DMA mapping, this
results in a performance improvement. The purpose of page pool is to
allocate pages fast from a cache, without locking. This lockless
guarantee naturally comes from running under a NAPI.
Use page pool to allocate the data buffer only, so hardware will use
it to fill the packet. At completion time, attach the data buffer (now
filled with packet payload) to a new SKB allocated around the received
buffer. Building the SKB at completion time prevents a cache miss for
each packet, as the SKB is allocated right before the packet is handed
to the networking stack.
A page pool for each Rx queue enhances Rx side performance by
reclaiming buffers back to each queue-specific pool. This change
significantly improves driver performance; the CPU can handle about
345% of the packets per second it previously handled.
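The completion-time flow, sketched with a single page per buffer (the
function name and headroom constant are illustrative):

	static struct sk_buff *mlxsw_pci_rdq_build_skb(struct page *page,
						       u16 byte_count)
	{
		struct sk_buff *skb;

		skb = napi_build_skb(page_address(page), PAGE_SIZE);
		if (!skb)
			return NULL;

		/* let kfree_skb() return the page to its pool */
		skb_mark_for_recycle(skb);
		skb_reserve(skb, MLXSW_PCI_SKB_HEADROOM);
		skb_put(skb, byte_count);
		return skb;
	}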
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/1cf788a8f43c70aae6d526018ef77becb27ad6d3.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The next patch will use page pool to allocate buffers for the RDQ.
Initialize a page pool for each CQ, which is mapped 1:1 to an RDQ. A
page pool for each Rx queue enhances Rx side performance by reclaiming
buffers back to each queue-specific pool.
When only one NAPI instance is the consumer of pages from a page pool,
it is recommended to pass it as part of 'page_pool_params'; page pool
API calls can then be made without special locks. The mlxsw driver
holds a NAPI instance per CQ, so add a page pool per CQ and use the
existing NAPI instance.
For now, pages are not allocated from the pool; the next patch will
use it.
Some notes regarding 'page_pool_params' (see the sketch after this
list):
* Use PP_FLAG_DMA_MAP to let page pool handle DMA mapping. For now, do
  not use the sync flag, as only the device writes to this memory and
  we read it only after the device finishes writing there. This will
  probably change when we support XDP.
* Define 'order' according to the maximum MTU, taking software
  overhead into account. Some rounding up is done, which means that we
  allocate more pages than we really need. This can be improved later
  by using fragmented buffers.
* Use pool_size = MLXSW_PCI_WQE_COUNT. This will be the size of the
  'ptr_ring' and should be the maximum number of packets that page
  pool will allocate memory for. In our case, this is the queue size,
  defined as MLXSW_PCI_WQE_COUNT.
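The sketch referenced above (values and the queue layout are
illustrative; the fields are the real 'struct page_pool_params'
members):

	struct page_pool_params pp_params = {
		.flags		= PP_FLAG_DMA_MAP,
		.order		= get_order(buf_len), /* MTU + SW overhead */
		.pool_size	= MLXSW_PCI_WQE_COUNT,
		.nid		= NUMA_NO_NODE,
		.dev		= &pdev->dev,
		.napi		= &q->u.cq.napi, /* single NAPI consumer */
		.dma_dir	= DMA_FROM_DEVICE,
	};

	q->u.cq.page_pool = page_pool_create(&pp_params);
	if (IS_ERR(q->u.cq.page_pool))
		return PTR_ERR(q->u.cq.page_pool);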
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/02e5856ae7c572d4293ce6bb92c286ee6cfec800.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Next patches will add support for page pool in the mlxsw driver. Page
pool will be used to allocate buffers for the RDQ and will use the
NAPI instance of the appropriate CQ (each RDQ is mapped 1:1 to a CQ).
To allow pool initialization as part of CQ init, when NAPI is
initialized, store the page_pool structure as part of the CQ
structure. Later, the allocations for the RDQ will be done from the
pool of the appropriate CQ. To allow access to the appropriate pool,
set a CQ pointer as part of RDQ initialization.
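Roughly (a simplified view, not the exact mlxsw_pci_queue definition):

	struct mlxsw_pci_queue {
		/* ... other members ... */
		union {
			struct {
				struct napi_struct napi;
				struct page_pool *page_pool;
			} cq;
			struct {
				/* back-pointer to the 1:1-mapped CQ, set
				 * at RDQ init, to reach napi and the pool
				 */
				struct mlxsw_pci_queue *cq;
			} rdq;
		} u;
	};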
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/d60918ca1e142a554af1df9c1152cdac83854a3b.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
mlxsw_pci_cq_napi_setup() includes both NAPI initialization and
enablement, and the teardown function is similar. Next patches will
add support for page pool in the mlxsw driver, which will use the NAPI
instance. Page pool initialization should be done before NAPI
enablement; likewise, page pool destruction should be done after NAPI
disablement.
As preparation, split NAPI setup/teardown into two steps, so that page
pool setup can be done between the two phases.
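After the split, the init path can look like this (the page pool
helper belongs to a later patch; its name is illustrative):

	mlxsw_pci_cq_napi_setup(q, mlxsw_pci);	/* netif_napi_add() only */

	err = mlxsw_pci_cq_page_pool_init(q);
	if (err)
		goto err_page_pool_init;

	napi_enable(&q->u.cq.napi);		/* enablement now separate */

Teardown runs in reverse: napi_disable(), then page pool destruction,
then netif_napi_del().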
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/8dbf37e859f07247498fca17109b8858ff2b0498.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With ARCH=m68k, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/a2065.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/ariadne.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/atarilance.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/hplance.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/7990.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/mvme147.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/sun3lance.o
Add the missing invocation of the MODULE_DESCRIPTION() macro to all
files which have a MODULE_LICENSE().
This includes drivers/net/ethernet/amd/lance.c which, although it did
not produce a warning with the m68k allmodconfig configuration, may
cause this warning with other configurations.
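Each file gains a one-liner next to its existing module tags, e.g.
(the description wording is illustrative):

	MODULE_LICENSE("GPL");
	MODULE_DESCRIPTION("Commodore A2065 Ethernet driver");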
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240618-md-m68k-drivers-net-ethernet-amd-v1-1-50ee7a9ad50e@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>