Drivers could refuse to offload a LAG configuration for a variety of
reasons, mainly having to do with its TX type. Additionally, since DSA
masters may now also be LAG interfaces, and this will translate into a
call to port_lag_join on the CPU ports, there may be extra restrictions
there. Propagate the netlink extack to this DSA method in order for
drivers to give a meaningful error message back to the user.
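
As a rough illustration of what the propagated extack allows, the
user-space model below shows a driver callback rejecting a LAG based on
its TX type and filling in an error string. All types and names here
(model_extack, model_lag_info, model_port_lag_join) are hypothetical
stand-ins; in the kernel the corresponding pieces are struct
netlink_ext_ack, struct netdev_lag_upper_info and NL_SET_ERR_MSG_MOD().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; only the fields used here. */
enum model_lag_tx_type {
	MODEL_LAG_TX_ACTIVEBACKUP,
	MODEL_LAG_TX_ROUNDROBIN,
	MODEL_LAG_TX_HASH,
};

struct model_extack {
	const char *msg;	/* message reported back to the user */
};

struct model_lag_info {
	enum model_lag_tx_type tx_type;
};

/*
 * Model of a driver's port_lag_join(): this imaginary hardware can only
 * offload hash-based LAGs, so anything else is refused with a reason.
 */
static int model_port_lag_join(const struct model_lag_info *info,
			       struct model_extack *extack)
{
	if (info->tx_type != MODEL_LAG_TX_HASH) {
		if (extack)
			extack->msg = "Can only offload LAG using hash TX type";
		return -EOPNOTSUPP;
	}

	return 0;
}
```

The extack pointer is allowed to be NULL in this sketch, so a caller
with no netlink context can still use the callback.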
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
These don't work (print a harmless error about the operation failing)
and make little sense to have anyway, because when a LAG DSA master goes
away, we will introduce logic to move our CPU port back to the first
physical DSA master. So suppress these device links in preparation for
adding support for LAG DSA masters.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Similar to the discussion about tracking the admin/oper state of LAG DSA
masters, we have the problem here that struct dsa_port *cpu_dp caches a
single pair of orig_ethtool_ops and netdev_ops pointers.
So if we call dsa_master_setup(bond0, cpu_dp) where cpu_dp is also the
dev->dsa_ptr of one of the physical DSA masters, we'd effectively
overwrite what we cached from that physical netdev with what we
replaced from the bonding interface.
We don't need DSA ethtool stats on the bonding interface when used as
DSA master, it's good enough to have them just on the physical DSA
masters, so suppress this logic.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
We store information about the DSA master's state in
cpu_dp->master_admin_up and cpu_dp->master_oper_up, and this assumes a
bijective association between a CPU port and a DSA master.
However, when we have CPU ports in a LAG (and DSA masters in a LAG too),
the way in which we set up things is that the physical DSA masters still
have dev->dsa_ptr pointing to our cpu_dp, but the bonding/team device
itself also has its dev->dsa_ptr pointing towards one of the CPU port
structures (the first one).
So logically speaking, that first cpu_dp can't keep track of both the
physical master's admin/oper state, and of the bonding master's state.
This isn't even needed; the reason why we keep track of the DSA master's
state is to know when it is available for Ethernet-based register access.
For that use case, we don't even need LAG; we just need to decide upon
one of the physical DSA masters (if there is more than 1 available) and
use that.
This change suppresses dsa_tree_master_{admin,oper}_state_change() calls
on LAG DSA masters (which will be supported in a future change), to
allow the tracking of just physical DSA masters.
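
A minimal user-space sketch of the suppression described above, under
the assumption that a master can be tested for being a LAG. The
model_* types and the is_lag flag are hypothetical and only mirror the
fields named in this message.

```c
#include <stdbool.h>

/* Hypothetical model of a DSA master netdev. */
struct model_master {
	bool is_lag;	/* true for a bonding/team device */
	bool admin_up;
	bool oper_up;
};

/* Hypothetical model of the state cached in the CPU port. */
struct model_cpu_port {
	bool master_admin_up;
	bool master_oper_up;
};

/*
 * Record the master's state in the CPU port, but only for physical
 * masters. Returns true if the cached state was updated.
 */
static bool model_master_state_change(struct model_cpu_port *cpu_dp,
				      const struct model_master *master)
{
	if (master->is_lag)
		return false;	/* LAG masters are not tracked */

	cpu_dp->master_admin_up = master->admin_up;
	cpu_dp->master_oper_up = master->oper_up;
	return true;
}
```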
Link: https://lore.kernel.org/netdev/628cc94d.1c69fb81.15b0d.422d@mx.google.com/
Suggested-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Some DSA switches have multiple CPU ports, which can be used to improve
CPU termination throughput, but DSA, through dsa_tree_setup_cpu_ports(),
sets up only the first one, leading to suboptimal use of hardware.
The desire is to not change the default configuration but to permit the
user to create a dynamic mapping between individual user ports and the
CPU port that they are served by, configurable through rtnetlink. It is
also intended to permit load balancing between CPU ports, and in that
case, the foreseen model is for the DSA master to be a bonding interface
whose lowers are the physical DSA masters.
To that end, we create a struct rtnl_link_ops for DSA user ports with
the "dsa" kind. We expose the IFLA_DSA_MASTER link attribute that
contains the ifindex of the newly desired DSA master.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
There is a desire to support DSA masters in a LAG.
That configuration is intended to work by simply enslaving the master to
a bonding/team device. But the physical DSA master (the LAG slave) still
has a dev->dsa_ptr, and that cpu_dp still corresponds to the physical
CPU port.
However, we would like to be able to retrieve the LAG that's the upper
of the physical DSA master. In preparation for that, introduce a helper
called dsa_port_get_master() that replaces all occurrences of the
dp->cpu_dp->master pattern. The distinction between LAG and non-LAG will
be made later within the helper itself.
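
The shape of such a helper can be sketched in user space as follows.
The model_* types are hypothetical and carry only the two pointers
discussed here; the point is that callers stop open-coding the pointer
chain, so the LAG case can later be handled in one place.

```c
#include <stddef.h>

/* Hypothetical, reduced models of struct net_device and struct dsa_port. */
struct model_netdev {
	const char *name;
};

struct model_port {
	struct model_port *cpu_dp;	/* CPU port serving this port */
	struct model_netdev *master;	/* set on CPU ports only */
};

/*
 * Single accessor replacing the dp->cpu_dp->master pattern. For now it
 * returns the physical master; a LAG-aware lookup could be added here
 * without touching any caller.
 */
static struct model_netdev *model_port_get_master(const struct model_port *dp)
{
	return dp->cpu_dp->master;
}
```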
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Some network drivers use __dev_mc_sync()/__dev_uc_sync() and therefore
program the hardware only with addresses with a non-zero sync_cnt.
Some of the above drivers also need to save/restore the address
filtering lists when certain events happen, and they need to walk
through the struct net_device :: uc and struct net_device :: mc lists.
But these lists contain unsynced addresses too.
To keep the appearance of an elementary form of data encapsulation,
provide iterators through these lists that only look at entries with a
non-zero sync_cnt, instead of filtering entries out from device drivers.
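
One common way to build such a filtering iterator is to chain an "if"
onto an existing for-each macro. The sketch below does this over a
simplified singly linked list; the model_* names are hypothetical and
the real address lists use the kernel's list primitives instead.

```c
#include <stddef.h>

/* Hypothetical, simplified address list entry. */
struct model_hw_addr {
	unsigned char addr[6];
	int sync_cnt;			/* non-zero once synced to a lower device */
	struct model_hw_addr *next;
};

/* Walk every entry in the list. */
#define model_for_each_addr(ha, head) \
	for ((ha) = (head); (ha) != NULL; (ha) = (ha)->next)

/* Walk only the entries with a non-zero sync_cnt. */
#define model_for_each_synced_addr(ha, head) \
	model_for_each_addr(ha, head) \
		if ((ha)->sync_cnt)

/* Example user: count the addresses a driver would program to hardware. */
static int model_count_synced(struct model_hw_addr *head)
{
	struct model_hw_addr *ha;
	int n = 0;

	model_for_each_synced_addr(ha, head)
		n++;

	return n;
}
```

Because the macro ends in a bare "if", a caller must not follow the
loop body with an "else"; that is the usual caveat of this pattern.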
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tony Nguyen says:
====================
ice: L2TPv3 offload support
Wojciech Drewek says:
Add support for dissecting L2TPv3 session id in flow dissector. Add support
for this field in tc-flower and support offloading L2TPv3. Finally, add
support for hardware offload of L2TPv3 packets based on session id in
switchdev mode in ice driver.
Example filter:
# tc filter add dev $PF1 ingress prio 1 protocol ip \
flower \
ip_proto l2tp \
l2tpv3_sid 1234 \
skip_sw \
action mirred egress redirect dev $VF1_PR
Changes in iproute2 are required to use the new fields.
ICE COMMS DDP package is required to create a filter in ice.
COMMS DDP package contains profiles of more advanced protocols.
Without COMMS DDP package hw offload will not work, however
sw offload will still work.
====================
Link: https://lore.kernel.org/r/20220908171644.1282191-1-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add support for offloading packets based on L2TPv3 session id in switchdev
mode.
Example filter:
tc filter add dev $PF1 ingress prio 1 protocol ip flower ip_proto l2tp \
l2tpv3_sid 1234 skip_sw action mirred egress redirect dev $VF1_PR
Changes in iproute2 are required to be able to specify l2tpv3_sid.
ICE COMMS DDP package is required to create a filter as it contains L2TPv3
profiles.
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Allow offloading of L2TPv3 filters by adding flow_rule_match_l2tpv3.
Drivers can now extract L2TPv3-specific fields.
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add support for matching on L2TPv3 session ID.
The session ID can be specified only when ip_proto is
set to IPPROTO_L2TP.
Example filter:
# tc filter add dev $PF1 ingress prio 1 protocol ip \
flower \
ip_proto l2tp \
l2tpv3_sid 1234 \
skip_sw \
action mirred egress redirect dev $VF1_PR
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Allow dissecting the L2TPv3-specific field, which is:
- session ID (32 bits)
L2TPv3 can be transported over IP or over UDP;
this implementation covers only L2TPv3 over IP.
IP protocol carries L2TPv3 when ip_proto is
IPPROTO_L2TP (115).
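
For illustration, the extraction described above can be modelled in
user space as below. This is a sketch, not the flow dissector code: the
function name and the MODEL_IPPROTO_L2TP constant are hypothetical, the
payload pointer is assumed to point just past the IP header, and the
session ID is returned in host byte order for readability.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_IPPROTO_L2TP 115	/* same value as IPPROTO_L2TP */

/*
 * Read the 32-bit session ID that starts an L2TPv3-over-IP payload.
 * Returns false if the packet is not L2TPv3 over IP or is too short.
 */
static bool model_dissect_l2tpv3(uint8_t ip_proto, const uint8_t *payload,
				 size_t len, uint32_t *session_id)
{
	if (ip_proto != MODEL_IPPROTO_L2TP || len < 4)
		return false;

	/* The field is big-endian on the wire. */
	*session_id = ((uint32_t)payload[0] << 24) |
		      ((uint32_t)payload[1] << 16) |
		      ((uint32_t)payload[2] << 8) |
		      (uint32_t)payload[3];
	return true;
}
```

With a payload beginning 00 00 04 d2, the sketch yields session ID
1234, the value used in the tc-flower example filters.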
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
IPPROTO_L2TP is currently defined in l2tp.h, but most
IP protocols are defined in in.h. Move it there in order
to keep the code clean.
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Simon Wunderlich says:
====================
This cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- drop unused headers in trace.h, by Sven Eckelmann
- drop initialization of flexible ethtool_link_ksettings,
by Sven Eckelmann
- remove unused struct definitions, by Marek Lindner
* tag 'batadv-next-pullrequest-20220916' of git://git.open-mesh.org/linux-merge:
batman-adv: remove unused struct definitions
batman-adv: Drop initialization of flexible ethtool_link_ksettings
batman-adv: Drop unused headers in trace.h
batman-adv: Start new development cycle
====================
Link: https://lore.kernel.org/r/20220916161454.1413154-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Petr Machata says:
====================
mlxsw: Adjust QOS tests for Spectrum-4 testing
Amit writes:
Quality Of Service tests create congestion and verify the switch behavior.
To create congestion, they need to have more traffic than the port can
handle, so some of them force 1Gbps speed.
The tests assume that 1Gbps speed is supported. Spectrum-4 ASIC will not
support this speed in all ports, so to be able to run QOS tests there,
some adjustments are required.
Patch set overview:
Patch #1 adjusts qos_ets_strict, qos_mc_aware and sch_ets tests.
Patch #2 adjusts RED tests.
Patch #3 extends devlink_lib to support querying maximum pool size.
Patch #4 adds a test which can be used instead of qos_burst and does not
assume that 1Gbps speed is supported.
Patch #5 removes qos_burst test.
====================
Link: https://lore.kernel.org/r/cover.1663152826.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The previous patch added a test which can be used instead of qos_burst.sh.
Remove this test.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a test equivalent to qos_burst. The test's purpose is the same, but the
new test uses a simpler topology and does not require forcing low speed.
In addition, it can be run on Spectrum-2 and not only Spectrum-3+. The idea
is to use a shaper in order to limit the traffic and create congestion.
The qos_burst test uses a small pool, sends many small packets, and verifies
that packets are not dropped, which means that many descriptors can be handled.
This test should check the change that commit c864769add
("mlxsw: Configure descriptor buffers") pushed.
Instead, the new test tries to use more than 85% of maximum supported
descriptors. The idea is to use a big pool (as much as the ASIC supports),
such that the pool size does not limit the traffic, then send many small
packets, which means that many descriptors are used, and check how many
packets the switch can handle.
Using a shaper allows the test to run on all ASICs, regardless of
the CPU abilities, as it is able to create the congestion with a low rate
of packets.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The maximum pool size is exposed via 'devlink sb' command. The next
patch will add a test which increases some pools to the maximum size.
Add a function to query the value.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
QOS tests create congestion and verify the switch behavior. To create
congestion, they need to have more traffic than the port can handle, so
some of them force 1Gbps speed.
The tests assume that 1Gbps speed is supported; otherwise, they will fail.
Spectrum-4 ASIC will not support this speed in all ports, so to be able
to run the tests there, some adjustments are required. Use shapers to limit
the traffic instead of forcing speed. Note that for several ports, the
speed configuration is just for autoneg issues, so no shaper is needed
in its place.
The tests already use ETS qdisc as a root and RED qdiscs as children. Add
a new TBF shaper to limit the rate of traffic, and use it as a root qdisc,
then save the previous hierarchy of qdiscs under the new TBF root.
In some ASICs, the shapers do not limit the traffic as accurately as
forcing speed. To make the tests stable, allow the backlog size to be up to
+-10% of the threshold. The aim of the tests is to make sure that with
backlog << threshold, there are no drops, and that packets are dropped
somewhere in vicinity of the configured threshold.
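
The tolerance above amounts to a simple range check. The selftests
themselves are shell scripts; the C function below is only a sketch of
the arithmetic, with a hypothetical name, and it assumes pct <= 100 and
a threshold small enough that threshold * pct does not overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if backlog lies within +-pct percent of threshold,
 * e.g. pct = 10 accepts 900..1100 for a threshold of 1000.
 */
static bool model_backlog_near_threshold(uint64_t backlog, uint64_t threshold,
					 unsigned int pct)
{
	uint64_t margin = threshold * pct / 100;

	return backlog >= threshold - margin &&
	       backlog <= threshold + margin;
}
```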
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
QOS tests create congestion and verify the switch behavior. To create
congestion, they need to have more traffic than the port can handle, so
some of them force 1Gbps speed.
The tests assume that 1Gbps speed is supported; otherwise, they will fail.
Spectrum-4 ASIC will not support this speed in all ports, so to be able
to run QOS tests there, some adjustments are required. Use shapers to
limit the traffic instead of forcing speed. Note that for several ports,
the speed configuration is just for autoneg issues, so no shaper is needed
in its place.
In tests that already use shapers, set the existing shaper to be a child of
a new TBF shaper which is added as a root qdisc and acts as a port shaper.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean says:
====================
Remove label = "cpu" from DSA dt-bindings
As explained in more detail in patch 1/3, label = "cpu" is not part of
DSA's device tree bindings, yet we have some checks in the dt-schema for
mt7530 which are written as if it was.
Reformulate those checks, and remove all occurrences of this seemingly
used, but actually unused, property from the binding examples.
====================
Link: https://lore.kernel.org/r/20220912175058.280386-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The fact that some DSA device trees use 'label = "cpu"' for the CPU port
is nothing but blind cargo cult copying. The 'label' property was never
part of the DSA DT bindings for anything except the user ports, where it
provided a hint as to what name the created netdevs should use.
DSA does use the "cpu" port label to identify a CPU port in dsa_port_parse(),
but this is only for non-OF code paths (platform data).
The proper way to identify a CPU port is to look at whether the
'ethernet' phandle is present.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean says:
====================
Standardized ethtool counters for NXP ENETC
This is another preparation patch for the introduction of MAC Merge
Layer statistics, this time for the enetc driver (endpoint ports on the
NXP LS1028A). The same set of stats groups is supported as in the case
of the Felix DSA switch.
====================
Link: https://lore.kernel.org/r/20220909113800.55225-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Structure the code in such a way that it can be reused later for the
pMAC statistics, by just changing the "mac" argument to 1.
Usage:
ethtool --include-statistics --show-pause eno2
ethtool -S eno0 --groups eth-mac
ethtool -S eno0 --groups eth-ctrl
ethtool -S eno0 --groups rmon
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The ENETC has counters for the eMAC and for the pMAC exactly 0x1000
apart from each other. The driver only contains definitions for PM0,
the eMAC.
Rather than duplicating everything for PM1, modify the register
definitions such that they take the MAC as argument.
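
The parameterization can be pictured with the sketch below. The macro
and function names and the 0x8100 counter offset are hypothetical and
chosen only for illustration; the 0x1000 stride between the eMAC
(MAC 0) and the pMAC (MAC 1) is the one stated above.

```c
#include <stdint.h>

/* Distance between the eMAC (MAC 0) and pMAC (MAC 1) register blocks. */
#define MODEL_PMAC_OFFSET	0x1000u

/* Hypothetical counter, defined once and parameterized by MAC index. */
#define MODEL_PM_EXAMPLE_COUNTER(mac) \
	(0x8100u + MODEL_PMAC_OFFSET * (mac))

/* Equivalent function form: translate an eMAC offset for a given MAC. */
static inline uint32_t model_pm_reg(unsigned int mac, uint32_t emac_offset)
{
	return emac_offset + MODEL_PMAC_OFFSET * mac;
}
```

Code written against MAC 0 can then read the pMAC copy of the same
counter by passing 1 as the MAC argument.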
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Wei Fang says:
====================
Add FEC support on s32v234 platform
This patch series adds FEC support on the s32v234 platform.
1. Add compatible string and quirks for fsl,s32v234
2. Update Kconfig to also check for ARCH_S32.
====================
Link: https://lore.kernel.org/r/20220907095649.3101484-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Update Kconfig to also check for ARCH_S32.
Add compatible string and quirks for fsl,s32v234
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lee Jones says:
====================
Immutable branch between MFD, Net and Pinctrl due for the v6.0 merge window
* tag 'ib-mfd-net-pinctrl-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
mfd: ocelot: Add support for the vsc7512 chip via spi
dt-bindings: mfd: ocelot: Add bindings for VSC7512
resource: add define macro for register address resources
pinctrl: microchip-sgpio: add ability to be used in a non-mmio configuration
pinctrl: microchip-sgpio: allow sgpio driver to be used as a module
pinctrl: ocelot: add ability to be used in a non-mmio configuration
net: mdio: mscc-miim: add ability to be used in a non-mmio configuration
mfd: ocelot: Add helper to get regmap from a resource
====================
Link: https://lore.kernel.org/r/YxrjyHcceLOFlT/c@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>