The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Herve Codina <herve.codina@bootlin.com>
Link: https://lore.kernel.org/r/20240409091203.39062-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pavel Begunkov says:
====================
optimise local CPU skb_attempt_defer_free
Optimise the case when an skb comes to skb_attempt_defer_free()
on the same CPU it was allocated on. The patch 1 enables skb caches
and gives frags a chance to hit the page pool's fast path.
CPU bound benchmarking with perfect skb_attempt_defer_free()
gives around 1% of extra throughput.
====================
Link: https://lore.kernel.org/r/cover.1712711977.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Optimise skb_attempt_defer_free() when run by the same CPU the skb was
allocated on. Instead of __kfree_skb() -> kmem_cache_free() we can
disable softirqs and put the buffer into cpu local caches.
CPU bound TCP ping pong style benchmarking (i.e. netbench) showed a 1%
throughput increase (392.2 -> 396.4 Krps). Cross checking with profiles,
the total CPU share of skb_attempt_defer_free() dropped by 0.6%. Note,
I'd expect the win doubled with rx only benchmarks, as the optimisation
is for the receive path, but the test spends >55% of CPU doing writes.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/a887463fb219d973ec5ad275e31194812571f1f5.1712711977.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski says:
====================
selftests: move bpf-offload test from bpf to net
The test_offload.py test fits in networking and bpf equally
well. We started adding more Python tests in networking
and some of the code in test_offload.py can be reused,
so move it to networking. Looks like it bit rotted over
time and some fixes are needed.
Admittedly more code could be extracted but I only had
the time for a minor cleanup :(
====================
Link: https://lore.kernel.org/r/20240409031549.3531084-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Non-ancient ip (iproute2-5.15.0, libbpf 0.7.0) refuses to load
the sample with maps because we don't generate BTF:
libbpf: BTF is required, but is missing or corrupted.
ERROR: opening BPF object file failed
Enable BTF by adding -g to clang flags. With that done
neither of the programs load:
libbpf: prog 'func': error relocating .BTF.ext function info: -22
libbpf: prog 'func': failed to relocate calls: -22
libbpf: failed to load object 'ksft-net-drv/net/sample_ret0.bpf.o'
Andrii explains that this is because we don't specify
section names for the code. Add the section names, too.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240409031549.3531084-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
net/e1000e, igb, igc: Remove redundant runtime resume
Bjorn Helgaas says:
e1000e, igb, and igc all have code to runtime resume the device during
ethtool operations.
Since f32a213765 ("ethtool: runtime-resume netdev parent before ethtool
ioctl ops"), dev_ethtool() does this for us, so remove it from the
individual drivers.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
igc: Remove redundant runtime resume for ethtool ops
igb: Remove redundant runtime resume for ethtool_ops
e1000e: Remove redundant runtime resume for ethtool_ops
====================
Link: https://lore.kernel.org/r/20240408210849.3641172-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet says:
====================
bonding: remove RTNL from three sysfs files
First patch might fix a potential deadlock.
sysfs handlers should use rtnl_trylock() instead of rtnl_lock().
Following files can be read without acquiring RTNL :
- /sys/class/net/bonding_masters
- /sys/class/net/<name>/bonding/slaves
- /sys/class/net/<name>/bonding/queue_id
====================
Link: https://lore.kernel.org/r/20240408190437.2214473-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When constructing a heap, heapify operations are required on all
non-leaf nodes. Thus, determining the index of the first non-leaf node
is crucial. In a heap, the left child's index of node i is 2 * i + 1
and the right child's index is 2 * i + 2. Node CAKE_MAX_TINS *
CAKE_QUEUES / 2 has its left and right children at indexes
CAKE_MAX_TINS * CAKE_QUEUES + 1 and CAKE_MAX_TINS * CAKE_QUEUES + 2,
respectively, which are beyond the heap's range, indicating it as a
leaf node. Conversely, node CAKE_MAX_TINS * CAKE_QUEUES / 2 - 1 has a
left child at index CAKE_MAX_TINS * CAKE_QUEUES - 1, confirming its
non-leaf status. The loop should start from it since it's not a leaf
node.
By starting the loop from CAKE_MAX_TINS * CAKE_QUEUES / 2 - 1, we
minimize function calls and branch condition evaluations. This
adjustment theoretically reduces two function calls (one for
cake_heapify() and another for cake_heap_get_backlog()) and five branch
evaluations (one for iterating all non-leaf nodes, one within
cake_heapify()'s while loop, and three more within the while loop
with if conditions).
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://lore.kernel.org/r/20240408174716.751069-1-visitorckw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Horatiu Vultur says:
====================
net: phy: micrel: lan8814: Enable PTP_PF_PEROUT
Add support for PTP_PF_PEROUT to lan8814. First patch just enables
the LTC at probe time, such that it is not required to enable
timestamping to have the LTC enabled. While the second patch actually
adds support for PTP_PF_PEROUT.
====================
Link: https://lore.kernel.org/r/20240408064432.3881636-1-horatiu.vultur@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Lan8814 has 24 GPIOs but only 2 GPIOs (GPIO 0 and GPIO 1) can be
configured to generate period signals. And there are 2 events (EVENT_A
and EVENT_B) but these events are hardcoded to the GPIO 0 and GPIO 1.
These events are used to generate period signals. It is possible to
configure the length, the start time and the period of the signal by
configuring the event.
These events are generated by comparing the target time with the PHC
time. In case the PHC time is changed to a value bigger than the target
time + reload time, then it would generate only 1 event and then it
would stop because target time + reload time is smaller than PHC time.
Therefore it is required to change also the target time every time when
the PHC is changed. The same will apply also when the PHC time is
changed to a smaller value.
This was tested using:
testptp -i 1 -L 1,2
testptp -i 1 -p 1000000000 -w 200000000
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The LTC for lan8814 was enabled only if timestamping was enabled,
otherwise it would be stopped. Meaning that LTC will not increase by
itself. This might break other features that don't required timestamping
like generating 1PPS. Therefore enable the LTC at probe time.
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet says:
====================
tcp: fix ISN selection in TIMEWAIT -> SYN_RECV
TCP can transform a TIMEWAIT socket into a SYN_RECV one from
a SYN packet, and the ISN of the SYNACK packet is normally
generated using TIMEWAIT tw_snd_nxt.
This SYN packet also bypasses normal checks against listen queue
being full or not.
Unfortunately this has been broken almost one decade ago.
This series fixes the issue, in two patches.
First patch refactors code to add tcp_tw_isn as a parameter
to ->route_req(), to make the second patch smaller.
Second patch fixes the issue, by no longer using TCP_SKB_CB(skb)
to store the tcp_tw_isn.
Following packetdrill test passes after this series:
// Set up a server listening socket.
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
// Establish connection
+0 < S 0:0(0) win 32792 <mss 1460,nop,nop,sackOK>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK>
+.01 < . 1:1(0) ack 1 win 32792
+0 accept(3, ..., ...) = 4
// We close(), send a FIN, and get an ACK and FIN, in order to get into TIME_WAIT.
+.01 close(4) = 0
+0 > F. 1:1(0) ack 1
+.01 < F. 1:1(0) ack 2 win 32792
+0 > . 2:2(0) ack 2
// SYN hitting a TIME_WAIT -> should use an ISN based on TIMEWAIT tw_snd_nxt
+.01 < S 1000:1000(0) win 65535 <mss 1460,nop,nop,sackOK>
+0 > S. 65539:65539(0) ack 1001 <mss 1460,nop,nop,sackOK>
====================
Link: https://lore.kernel.org/r/20240407093322.3172088-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
TCP can transform a TIMEWAIT socket into a SYN_RECV one from
a SYN packet, and the ISN of the SYNACK packet is normally
generated using TIMEWAIT tw_snd_nxt :
tcp_timewait_state_process()
...
u32 isn = tcptw->tw_snd_nxt + 65535 + 2;
if (isn == 0)
isn++;
TCP_SKB_CB(skb)->tcp_tw_isn = isn;
return TCP_TW_SYN;
This SYN packet also bypasses normal checks against listen queue
being full or not.
tcp_conn_request()
...
__u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn;
...
/* TW buckets are converted to open requests without
* limitations, they conserve resources and peer is
* evidently real one.
*/
if ((syncookies == 2 || inet_csk_reqsk_queue_is_full(sk)) && !isn) {
want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name);
if (!want_cookie)
goto drop;
}
This was using TCP_SKB_CB(skb)->tcp_tw_isn field in skb.
Unfortunately this field has been accidentally cleared
after the call to tcp_timewait_state_process() returning
TCP_TW_SYN.
Using a field in TCP_SKB_CB(skb) for a temporary state
is overkill.
Switch instead to a per-cpu variable.
As a bonus, we do not have to clear tcp_tw_isn in TCP receive
fast path.
It is temporarily set then cleared only in the TCP_TW_SYN dance.
Fixes: 4ad19de877 ("net: tcp6: fix double call of tcp_v6_fill_cb()")
Fixes: eeea10b83a ("tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
tcp_v6_init_req() reads TCP_SKB_CB(skb)->tcp_tw_isn to find
out if the request socket is created by a SYN hitting a TIMEWAIT socket.
This has been buggy for a decade, lets directly pass the information
from tcp_conn_request().
This is a preparatory patch to make the following one easier to review.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon says:
====================
Add support for flower actions mirred and redirect
This series adds support for the two tc flower actions mirred and
redirect. Both actions are implemented by means of a port mask and a
mask mode. The mask mode controls how the mask is applied, and together
they are used by the switch to make a forwarding decision. Both actions
are configurable via the IS0 or IS2 VCAP's (ingress stage 0 and 2,
respectively).
Patch #1: adds support for tc flower mirred action.
Patch #2: adds support for tc flower redirect action.
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
====================
Link: https://lore.kernel.org/r/20240405-mirror-redirect-actions-v2-0-875d4c1927c8@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add support for the flower redirect action. Two VCAP actions are encoded
in the rule - one for the port mask, and one for the port mask mode.
When the rule is hit, the port mask is used as the final destination
set, replacing all other port masks.
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add support for tc flower mirred action. Two VCAP actions are encoded in
the rule - one for the port mask, and one for the port mask mode. When
the rule is hit, the destination mask is OR'ed with the port mask.
Also add new VCAP function for supporting 72-bit wide actions, and a tc
helper for setting the port forwarding mask.
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Diogo Ivo says:
====================
Support ICSSG-based Ethernet on AM65x SR1.0 devices
This series extends the current ICSSG-based Ethernet driver to support
AM65x Silicon Revision 1.0 devices.
Notable differences between the Silicon Revisions are that there is
no TX core in SR1.0 with this being handled by the firmware, requiring
extra DMA channels to manage communication with the firmware (with the
firmware being different as well) and in the packet classifier.
The motivation behind it is that a significant number of Siemens
devices containing SR1.0 silicon have been deployed in the field
and need to be supported and updated to newer kernel versions
without losing functionality.
This series is based on TI's 5.10 SDK [1].
The fifth version of this patch series can be found in [2].
Compared to the last version of the patch set there are only changes in
patch 05/10, where the fields of a struct are now explicitly declared as
__le32 so that we can properly interpret them.
Both of the problems mentioned in v4 have been addressed by disabling
those functionalities, meaning that this driver currently only supports
one TX queue and does not support a 100Mbit/s half-duplex connection.
The removal of these features has been commented in the appropriate
locations in the code.
[1]: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/?h=ti-linux-5.10.y
[2]: https://lore.kernel.org/netdev/20240326110709.26165-1-diogo.ivo@siemens.com/
====================
Link: https://lore.kernel.org/r/20240403104821.283832-1-diogo.ivo@siemens.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
In order to allow code sharing between Silicon Revisions 1.0 and 2.0
move all functions that can be shared into a common file. This commit
introduces no functional changes.
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
As these addresses can be useful outside of checking if an address
is a multicast address (for example in device drivers) make them
accessible to users of etherdevice.h to avoid code duplication.
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Silicon Revision 1.0 of the AM65x came with a slightly different ICSSG
support: Only 2 PRUs per slice are available and instead 2 additional
DMA channels are used for management purposes. We have no restrictions
on specified PRUs, but the DMA channels need to be adjusted.
Co-developed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8c5ad0dae9 ("igc: Add ethtool support") added ethtool_ops.begin() and
.complete(), which used pm_runtime_get_sync() to resume suspended devices
before any ethtool_ops callback and allow suspend after it completed.
Subsequently, f32a213765 ("ethtool: runtime-resume netdev parent before
ethtool ioctl ops") added pm_runtime_get_sync() in the dev_ethtool() path,
so the device is resumed before any ethtool_ops callback even if the driver
didn't supply a .begin() callback.
Remove the .begin() and .complete() callbacks, which are now redundant
because dev_ethtool() already resumes the device.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
749ab2cd12 ("igb: add basic runtime PM support") added
ethtool_ops.begin() and .complete(), which used pm_runtime_get_sync() to
resume suspended devices before any ethtool_ops callback and allow suspend
after it completed.
Subsequently, f32a213765 ("ethtool: runtime-resume netdev parent before
ethtool ioctl ops") added pm_runtime_get_sync() in the dev_ethtool() path,
so the device is resumed before any ethtool_ops callback even if the driver
didn't supply a .begin() callback.
Remove the .begin() and .complete() callbacks, which are now redundant
because dev_ethtool() already resumes the device.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>