In the AXI SPI Engine driver, compiling the message is an expensive
operation. Previously, it was done per message transfer in the
prepare_message hook. This patch moves the message compile to the
optimize_message hook so that it is only done once per message in
cases where the peripheral driver calls spi_optimize_message().
This can be a significant performance improvement for some peripherals.
For example, the ad7380 driver saw a 13% improvement in throughput
when using the AXI SPI Engine driver with this patch.
Since we now need two message states, one for the optimization stage
that doesn't change for the lifetime of the message and one that is
reset on each transfer for managing the current transfer state, the old
msg->state is split into msg->opt_state and spi_engine->msg_state. The
latter is included in the driver struct now since there is only one
current message at a time that can ever use it and it is in a hot path
so avoiding allocating a new one on each message transfer saves a few
cpu cycles and lets us get rid of the prepare_message callback.
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://msgid.link/r/20240219-mainline-spi-precook-message-v2-4-4a762c6701b9@baylibre.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Splitting transfers is an expensive operation so we can potentially
optimize it by doing it only once per optimization of the message
instead of repeating each time the message is transferred.
The transfer splitting functions are currently the only user of
spi_res_alloc() so spi_res_release() can be safely moved at this time
from spi_finalize_current_message() to spi_unoptimize_message().
The doc comments of the public functions for splitting transfers are
also updated so that callers will know when it is safe to call them
to ensure proper resource management.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://msgid.link/r/20240219-mainline-spi-precook-message-v2-2-4a762c6701b9@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This adds a new spi_optimize_message() function that can be used to
optimize SPI messages that are used more than once. Peripheral drivers
that use the same message multiple times can use this API to perform SPI
message validation and controller-specific optimizations once and then
reuse the message while avoiding the overhead of revalidating the
message on each spi_(a)sync() call.
Internally, the SPI core will also call this function for each message
if the peripheral driver did not explicitly call it. This is done to so
that controller drivers don't have to have multiple code paths for
optimized and non-optimized messages.
A hook is provided for controller drivers to perform controller-specific
optimizations.
Suggested-by: Martin Sperl <kernel@martin.sperl.org>
Link: https://lore.kernel.org/linux-spi/39DEC004-10A1-47EF-9D77-276188D2580C@martin.sperl.org/
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://msgid.link/r/20240219-mainline-spi-precook-message-v2-1-4a762c6701b9@baylibre.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Change the maximum chip-select count in cadence-qspi to 4 instead of 16.
The value gets used as default ->num_chipselect when the num-cs DT
property isn't received from devicetree. It also determines the
cqspi->f_pdata array size.
Hardware only supports values up to 4; see cqspi_chipselect() that sets
CS using a one-bit-per-CS 4-bit register field.
Add a static_assert() call as a defensive measure to ensure we stay
under the SPI subsystem limit. It got set to 4 when introduced in
4d8ff6b099 ("spi: Add multi-cs memories support in SPI core") and
later increased to 16 in 2f8c7c3715 ("spi: Raise limit on number of
chip selects").
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Link: https://msgid.link/r/20240209-cdns-qspi-cs-v1-2-a4f9dfed9ab4@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Check each flash CS against the num-cs property from devicetree.
Fallback to the driver max supported value (CQSPI_MAX_CHIPSELECT) if
num-cs isn't present.
cqspi->num_chipselect is set in cqspi_of_get_pdata() to the num-cs
devicetree property, or to CQSPI_MAX_CHIPSELECT if num-cs is not set.
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Link: https://msgid.link/r/20240209-cdns-qspi-cs-v1-1-a4f9dfed9ab4@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Update the master/slave terminology wherever possible to adopt usage of
the controller/host/target. Some parts have been left untouched because
they were sysfs entries and will probably end up being inaccurate if
simply replaced here.
Signed-off-by: Dhruva Gole <d-gole@ti.com>
Link: https://msgid.link/r/20240215085404.1711976-1-d-gole@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The variable id len being initialized with a value that is never read,
it is being re-assigned later on in a for-loop. The initialization is
redundant and can be removed.
Cleans up clang scan build warning:
drivers/spi/spi-dw-dma.c:580:17: warning: Although the value stored
to 'len' is used in the enclosing expression, the value is never
actually read from 'len' [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://msgid.link/r/20240215131603.2062332-1-colin.i.king@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Merge series from Uwe Kleine-König <u.kleine-koenig@pengutronix.de>:
This series finishes off the removal of some of the legacy names for
SPI controllers and devices.
pci1xxxx_spi_transfer_with_dma adds DMA support to copy the data between
host cpu buffer and SPI IO Buffer.
On DMA Completion interrupt, the next SPI transaction is initiated in isr.
Helper functions pci1xxxx_spi_setup, pci1xxxx_spi_setup_dma_from_io,
pci1xxxx_spi_setup_dma_to_io and pci1xxxx_start_spi_xfer are added for
setting up spi, setting up dma operations, and to start spi transfer
respectively. In the existing implementation, codes are replaced with
helper functions wherever applicable.
Signed-off-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Link: https://lore.kernel.org/r/20240207080621.30742-3-thangaraj.s@microchip.com
Signed-off-by: Mark Brown <broonie@kernel.org>
In PCI1xxxx C0, support for DMA in PCIe endpoint is added
to enhance the SPI performance. With this support, the
performance is improved from 6Mbps to 17Mbps with 20Mhz clock.
- DMA Supports two Channels, 0 and 1
- SPI Instance 0 uses chan 0 and SPI Instance 1 uses chan 1
- DMA can be used only if SPI is mapped to PF0 in the multi
function endpoint and the MSI interrupt is supported
- MSI interrupt of one of the SPI instance is assigned to the DMA
and both channels 0 and 1 share the same irq, the MSI address and
MSI Data of the irq is obtained and stored in DMA registers to
generate interrupt
Signed-off-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Link: https://lore.kernel.org/r/20240207080621.30742-2-thangaraj.s@microchip.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The driver uses u32 and relies on an implicit inclusion of
<linux/types.h>.
It is good practice to directly include all headers used, it avoids
implicit dependencies and spurious breakage if someone rearranges
headers and causes the implicit include to vanish.
Include the missing header.
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://lore.kernel.org/r/20240207120431.2766269-5-tudor.ambarus@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>