Merge series from David Jander <david@protonic.nl>:
These patches optimize the spi_sync call for the common case that the
worker thread is idle and the queue is empty. It also opens the
possibility to potentially further optimize the async path also, since
it doesn't need to take into account the direct sync path anymore.
As an example for the performance gain, on an i.MX8MM SoC with a SPI CAN
controller attached (MCP2518FD), the time the interrupt line stays
active (which corresponds roughly with the time it takes to send 3
relatively short consecutive spi_sync messages) is reduced from 98us to
only 72us by this patch.
A note about message ordering:
This patch series should not change the behavior of message ordering when
coming from the same context. This means that if a client driver issues
one or more spi_async() messages immediately followed by a spi_sync()
message in the same context, it can still rely on these messages being
sent out in the order they were fired.
This fixes the sequence of dma_release_channel.
Since commit f52b03c707 ("spi: s3c64xx: requests spi-dma channel only
during data transfer"),
dma_release_channel has been located in the s3c64xx_spi_transfer_one
but this makes invalid return of can_dma callback.
__spi_unmap_msg will check whether the request is requested by dma or
not via can_dma callback. When it is calling to check it, the channels
will be already released at the end of s3c64xx_spi_transfer_one so the
callback function will return always "false". So, they can't be unmapped
from __spi_unmap_msg call. To fix this, we need to add
unprepare_transfer_hardware callback and move the dma_release_channel
from s3c64xx_spi_transfer_one to there.
Fixes: f52b03c707 ("spi: s3c64xx: requests spi-dma channel only during data transfer")
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Link: https://lore.kernel.org/r/20220627013845.138350-1-chanho61.park@samsung.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There are only a few drivers that do not call
spi_finalize_current_message() in the context of transfer_one_message(),
and even for those cases the completion ctlr->cur_msg_completion is not
needed always. The calls to complete() and wait_for_completion() each
take a spin-lock, which is costly. This patch makes it possible to avoid
those calls in the big majority of cases, by introducing two flags that
with the help of ordering via barriers can avoid using the completion
safely. In case of a race with the context calling
spi_finalize_current_message(), the scheme errs on the safe side and takes
the completion.
The impact of this patch is worth the effort: On a i.MX8MM SoC, the time
the SPI bus is idle between two consecutive calls to spi_sync(), is
reduced from 19.6us to 16.8us... roughly 15%.
Signed-off-by: David Jander <david@protonic.nl>
Link: https://lore.kernel.org/r/20220621061234.3626638-12-david@protonic.nl
Signed-off-by: Mark Brown <broonie@kernel.org>
This patch introduces a completion that is completed in
spi_finalize_current_message() and waited for in
__spi_pump_transfer_message(). This way all manipulation of ctlr->cur_msg
is done with the io_mutex held and strictly ordered:
__spi_pump_transfer_message() will not return until
spi_finalize_current_message() is done using ctlr->cur_msg, and its
calling context is only touching ctlr->cur_msg after returning.
Due to this, we can safely drop the spin-locks around ctlr->cur_msg.
Signed-off-by: David Jander <david@protonic.nl>
Link: https://lore.kernel.org/r/20220621061234.3626638-11-david@protonic.nl
Signed-off-by: Mark Brown <broonie@kernel.org>
The interaction with the controller message queue and its corresponding
auxiliary flags and variables requires the use of the queue_lock which is
costly. Since spi_sync will transfer the complete message anyway, and not
return until it is finished, there is no need to put the message into the
queue if the queue is empty. This can save a lot of overhead.
As an example of how significant this is, when using the MCP2518FD SPI CAN
controller on a i.MX8MM SoC, the time during which the interrupt line
stays active (during 3 relatively short spi_sync messages), is reduced
from 98us to 72us by this patch.
Signed-off-by: David Jander <david@protonic.nl>
Link: https://lore.kernel.org/r/20220621061234.3626638-3-david@protonic.nl
Signed-off-by: Mark Brown <broonie@kernel.org>
We deprecated open coding of the transfer queue back in 2017 so it's high
time we finished up converting drivers to use the standard message queue
code. The mpc52xx-psc driver is fairly straightforward so convert to use
transfer_one_message(), it looks like the driver would be a good fit for
transfer_one() with a little bit of updating but this smaller change seems
safer.
The driver seems like a good candidate for transfer_one() but the chip
select function is actually doing rather more than just updating the chip
select and both transfer_one() and transfer_one_message() are current APIs
so leave that refactoring for another day, ideally by someone with the
hardware.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220613121946.136193-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
We deprecated open coding of the transfer queue back in 2017 so it's high
time we finished up converting drivers to use the standard message queue
code. The SH driver is fairly straightforward so convert to use
transfer_one_message(), it looks like the driver would be a good fit for
transfer_one() with a little bit of updating but this smaller change seems
safer.
I'm not actually clear how the driver worked robustly previously, it
clears SSA and CR1 when queueing a transfer which looks like it would
interfere with any running transfer. This clearing has been moved to the
start of the message transfer function.
I'm also unclear how exactly the chip select is managed with this driver.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220610154649.1707851-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Currently if the source DMA device isn't ready to provide the channels
capable of the SPI DMA transfers, the DW SSI controller will be registered
with no DMA support. It isn't right since all what the driver needs to do
is to postpone the probe procedure until the DMA device is ready. Let's
fix that in the framework of the DWC SSI generic DMA implementation. First
we need to use the dma_request_chan() method instead of the
dma_request_slave_channel() function, because the later one is deprecated
and most importantly doesn't return the failure cause but the
NULL-pointer. Second we need to stop the DW SSI controller probe procedure
if the -EPROBE_DEFER error is returned on the DMA initialization. The
procedure will resume later when the channels are ready to be requested.
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220624210623.6383-1-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Mark Brown <broonie@kernel.org>
The topcliff-pch driver requires TX and RX buffers on all transfers, open
coding checks for this. Remove those open coded checks and instead rely on
the core functionality, which has the added bonus that it will fix up any
transfers submitted by drivers as needed rather than erroring out.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220615174138.4060912-1-broonie@kernel.org
Print a warning if the device mode doesn't match the requested mode.
The user doesn't enter the mode in hex so it isn't obvious when
setting the mode succeeds that the mode isn't the requested mode. The
kernel logs a message, it will be more visible if the test also prints
a warning. I was testing --quad, which is unsupported, but doesn't
cause the mode request to fail.
Signed-off-by: David Fries <David@Fries.net>
Link: https://lore.kernel.org/r/YqbNnSGtWHe/GG7w@spacedout.fries.net
Signed-off-by: Mark Brown <broonie@kernel.org>
Merge series from Claudiu Beznea <claudiu.beznea@microchip.com>:
The following series adds runtime PM support for atmel-quadspi driver.
clk_disable()/clk_enable() is called on proper
runtime_suspend()/runtime_resume() ops. Along with it 2 minor cleanups
were added (patches 2/3, 3/3).
On 32 bit systems, the following kernel BUG is hit:
BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
caller is debug_smp_processor_id+0x18/0x24
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc1-00001-g6ae0aec8a366 #181
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Backtrace:
dump_backtrace from show_stack+0x20/0x24
r7:81024ffd r6:00000000 r5:81024ffd r4:60000013
show_stack from dump_stack_lvl+0x60/0x78
dump_stack_lvl from dump_stack+0x14/0x1c
r7:81024ffd r6:80f652de r5:80bec180 r4:819a2500
dump_stack from check_preemption_disabled+0xc8/0xf0
check_preemption_disabled from debug_smp_processor_id+0x18/0x24
r8:8119b7e0 r7:81205534 r6:819f5c00 r5:819f4c00 r4:c083d724
debug_smp_processor_id from __spi_sync+0x78/0x220
__spi_sync from spi_sync+0x34/0x4c
r10:bb7bf4e0 r9:c083d724 r8:00000007 r7:81a068c0 r6:822a83c0 r5:c083d724
r4:819f4c00
spi_sync from spi_mem_exec_op+0x338/0x370
r5:000000b4 r4:c083d910
spi_mem_exec_op from spi_nor_read_id+0x98/0xdc
r10:bb7bf4e0 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:82358040
r4:819f7c40
spi_nor_read_id from spi_nor_detect+0x38/0x114
r7:82358040 r6:00000000 r5:819f7c40 r4:819f7c40
spi_nor_detect from spi_nor_scan+0x11c/0xbec
r10:bb7bf4e0 r9:00000000 r8:00000000 r7:c083da4c r6:00000000 r5:00010101
r4:819f7c40
spi_nor_scan from spi_nor_probe+0x10c/0x2d0
r10:bb7bf4e0 r9:bb7bf4d0 r8:00000000 r7:819f4c00 r6:00000000 r5:00000000
r4:819f7c40
per-cpu access needs to be guarded against preemption.
Fixes: 6598b91b5a ("spi: spi.c: Convert statistics to per-cpu u64_stats_t")
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David Jander <david@protonic.nl>
Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20220609121334.2984808-1-david@protonic.nl
Signed-off-by: Mark Brown <broonie@kernel.org>
Fixes these "make htmldocs" warnings:
include/linux/spi/spi.h:82: warning: Function parameter or member 'syncp' not described in 'spi_statistics'
include/linux/spi/spi.h:213: warning: Function parameter or member 'pcpu_statistics' not described in 'spi_device'
include/linux/spi/spi.h:676: warning: Function parameter or member 'pcpu_statistics' not described in 'spi_controller'
Fixes: 6598b91b5a ("spi: spi.c: Convert statistics to per-cpu u64_stats_t")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David Jander <david@protonic.nl>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220608153309.2899565-1-david@protonic.nl
Signed-off-by: Mark Brown <broonie@kernel.org>