linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-03 19:26:01 -04:00

Author	SHA1	Message	Date
Ido Schimmel	62199e3f16	selftests: net: Add VXLAN MDB test Add test cases for VXLAN MDB, testing the control and data paths. Two different sets of namespaces (i.e., ns{1,2}_v4 and ns{1,2}_v6) are used in order to test VXLAN MDB with both IPv4 and IPv6 underlays, respectively. Example truncated output: # ./test_vxlan_mdb.sh [...] Tests passed: 620 Tests failed: 0 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:05:50 +00:00
Ido Schimmel	08f876a7d7	vxlan: Enable MDB support Now that the VXLAN MDB control and data paths are in place we can expose the VXLAN MDB functionality to user space. Set the VXLAN MDB net device operations to the appropriate functions, thereby allowing the rtnetlink code to reach the VXLAN driver. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:05:50 +00:00
Ido Schimmel	0f83e69f44	vxlan: Add MDB data path support Integrate MDB support into the Tx path of the VXLAN driver, allowing it to selectively forward IP multicast traffic according to the matched MDB entry. If MDB entries are configured (i.e., 'VXLAN_F_MDB' is set) and the packet is an IP multicast packet, perform up to three different lookups according to the following priority: 1. For an (S, G) entry, using {Source VNI, Source IP, Destination IP}. 2. For a (*, G) entry, using {Source VNI, Destination IP}. 3. For the catchall MDB entry (0.0.0.0 or ::), using the source VNI. The catchall MDB entry is similar to the catchall FDB entry (00:00:00:00:00:00) that is currently used to transmit BUM (broadcast, unknown unicast and multicast) traffic. However, unlike the catchall FDB entry, this entry is only used to transmit unregistered IP multicast traffic that is not link-local. Therefore, when configured, the catchall FDB entry will only transmit BULL (broadcast, unknown unicast, link-local multicast) traffic. The catchall MDB entry is useful in deployments where inter-subnet multicast forwarding is used and not all the VTEPs in a tenant domain are members in all the broadcast domains. In such deployments it is advantageous to transmit BULL (broadcast, unknown unicast and link-local multicast) and unregistered IP multicast traffic on different tunnels. If the same tunnel was used, a VTEP only interested in IP multicast traffic would also pull all the BULL traffic and drop it as it is not a member in the originating broadcast domain [1]. If the packet did not match an MDB entry (or if the packet is not an IP multicast packet), return it to the Tx path, allowing it to be forwarded according to the FDB. If the packet did match an MDB entry, forward it to the associated remote VTEPs.
However, if the entry is a (*, G) entry and the associated remote is in INCLUDE mode, then skip over it as the source IP is not in its source list (otherwise the packet would have matched on an (S, G) entry). Similarly, if the associated remote is marked as BLOCKED (can only be set on (S, G) entries), then skip over it as well as the remote is in EXCLUDE mode and the source IP is in its source list. [1] https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-irb-mcast#section-2.6 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:05:50 +00:00
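The three-step lookup priority described in the message above can be sketched roughly as follows. This is a minimal illustration and not the driver's code: `struct mdb_entry`, `mdb_find()` and `mdb_lookup()` are hypothetical names, the table is a plain array, only IPv4 is handled, and the link-local exclusion and the INCLUDE/BLOCKED filtering of remotes are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified MDB entry (IPv4 only, host byte order). */
struct mdb_entry {
    uint32_t src_vni;
    uint32_t src_ip;   /* 0 means "any source", i.e. a (*, G) entry */
    uint32_t group_ip; /* 0 (0.0.0.0) marks the catchall entry */
};

/* Exact-match search over a flat table; returns NULL when nothing matches. */
const struct mdb_entry *mdb_find(const struct mdb_entry *tbl, size_t n,
                                 uint32_t vni, uint32_t src, uint32_t grp)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].src_vni == vni && tbl[i].src_ip == src &&
            tbl[i].group_ip == grp)
            return &tbl[i];
    }
    return NULL;
}

/* Try the lookups in the priority order given in the commit message. */
const struct mdb_entry *mdb_lookup(const struct mdb_entry *tbl, size_t n,
                                   uint32_t vni, uint32_t src, uint32_t grp)
{
    const struct mdb_entry *e;

    e = mdb_find(tbl, n, vni, src, grp); /* 1. (S, G) */
    if (e)
        return e;
    e = mdb_find(tbl, n, vni, 0, grp);   /* 2. (*, G) */
    if (e)
        return e;
    return mdb_find(tbl, n, vni, 0, 0);  /* 3. catchall for this VNI */
}
```

A NULL result corresponds to the case where the packet is handed back to the Tx path and forwarded according to the FDB.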
Ido Schimmel	bc6c6b013f	vxlan: mdb: Add an internal flag to indicate MDB usage Add an internal flag to indicate whether MDB entries are configured or not. Set the flag after installing the first MDB entry and clear it before deleting the last one. The flag will be consulted by the data path which will only perform an MDB lookup if the flag is set, thereby keeping the MDB overhead to a minimum when the MDB is not used. Another option would have been to use a static key, but it is global and not per-device, unlike the current approach. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:05:49 +00:00
Ido Schimmel	a3a48de5ea	vxlan: mdb: Add MDB control path support Implement MDB control path support, enabling the creation, deletion, replacement and dumping of MDB entries in a similar fashion to the bridge driver. Unlike the bridge driver, each entry stores a list of remote VTEPs to which matched packets need to be replicated to and not a list of bridge ports. The motivating use case is the installation of MDB entries by a user space control plane in response to received EVPN routes. As such, only allow permanent MDB entries to be installed and do not implement snooping functionality, avoiding a lot of unnecessary complexity. Since entries can only be modified by user space under RTNL, use RTNL as the write lock. Use RCU to ensure that MDB entries and remotes are not freed while being accessed from the data path during transmission. In terms of uAPI, reuse the existing MDB netlink interface, but add a few new attributes to request and response messages: * IP address of the destination VXLAN tunnel endpoint where the multicast receivers reside. * UDP destination port number to use to connect to the remote VXLAN tunnel endpoint. * VXLAN VNI Network Identifier to use to connect to the remote VXLAN tunnel endpoint. Required when Ingress Replication (IR) is used and the remote VTEP is not a member of originating broadcast domain (VLAN/VNI) [1]. * Source VNI Network Identifier the MDB entry belongs to. Used only when the VXLAN device is in external mode. * Interface index of the outgoing interface to reach the remote VXLAN tunnel endpoint. This is required when the underlay destination IP is multicast (P2MP), as the multicast routing tables are not consulted. All the new attributes are added under the 'MDBA_SET_ENTRY_ATTRS' nest which is strictly validated by the bridge driver, thereby automatically rejecting the new attributes. 
[1] https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-irb-mcast#section-3.2.2 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:05:49 +00:00
Ido Schimmel	6ab271aaad	vxlan: Expose vxlan_xmit_one() Given a packet and a remote destination, the function will take care of encapsulating the packet and transmitting it to the destination. Expose it so that it could be used in subsequent patches by the MDB code to transmit a packet to the remote destination(s) stored in the MDB entry. It will allow us to keep the MDB code self-contained, not exposing its data structures to the rest of the VXLAN driver. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:05:49 +00:00
Ido Schimmel	f307c8bf37	vxlan: Move address helpers to private headers Move the helpers out of the core C file to the private header so that they could be used by the upcoming MDB code. While at it, constify the second argument of vxlan_nla_get_addr(). Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:05:49 +00:00
Ido Schimmel	da654c80a0	rtnetlink: bridge: mcast: Relax group address validation in common code In the upcoming VXLAN MDB implementation, the 0.0.0.0 and :: MDB entries will act as catchall entries for unregistered IP multicast traffic in a similar fashion to the 00:00:00:00:00:00 VXLAN FDB entry that is used to transmit BUM traffic. In deployments where inter-subnet multicast forwarding is used, not all the VTEPs in a tenant domain are members in all the broadcast domains. It is therefore advantageous to transmit BULL (broadcast, unknown unicast and link-local multicast) and unregistered IP multicast traffic on different tunnels. If the same tunnel was used, a VTEP only interested in IP multicast traffic would also pull all the BULL traffic and drop it as it is not a member in the originating broadcast domain [1]. Prepare for this change by allowing the 0.0.0.0 group address in the common rtnetlink MDB code and forbid it in the bridge driver. A similar change is not needed for IPv6 because the common code only validates that the group address is not the all-nodes address. [1] https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-irb-mcast#section-2.6 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:05:49 +00:00
Ido Schimmel	cc7f5022f8	rtnetlink: bridge: mcast: Move MDB handlers out of bridge driver Currently, the bridge driver registers handlers for MDB netlink messages, making it impossible for other drivers to implement MDB support. As a preparation for VXLAN MDB support, move the MDB handlers out of the bridge driver to the core rtnetlink code. The rtnetlink code will call into individual drivers by invoking their previously added MDB net device operations. Note that while the diffstat is large, the change is mechanical. It moves code out of the bridge driver to rtnetlink code. Also note that a similar change was made in 2012 with commit `77162022ab` ("net: add generic PF_BRIDGE:RTM_ FDB hooks") that moved FDB handlers out of the bridge driver to the core rtnetlink code. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:05:48 +00:00
Ido Schimmel	c009de1061	bridge: mcast: Implement MDB net device operations Implement the previously added MDB net device operations in the bridge driver so that they could be invoked by core rtnetlink code in the next patch. The operations are identical to the existing br_mdb_{dump,add,del} functions. The '_new' suffix will be removed in the next patch. The functions are re-implemented in this patch to make the conversion in the next patch easier to review. Add dummy implementations when 'CONFIG_BRIDGE_IGMP_SNOOPING' is disabled, so that an error will be returned to user space when it is trying to add or delete an MDB entry. This is consistent with existing behavior where the bridge driver does not even register rtnetlink handlers for RTM_{NEW,DEL,GET}MDB messages when this Kconfig option is disabled. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:05:48 +00:00
Ido Schimmel	8c44fa12c8	net: Add MDB net device operations Add MDB net device operations that will be invoked by rtnetlink code in response to received RTM_{NEW,DEL,GET}MDB messages. Subsequent patches will implement these operations in the bridge and VXLAN drivers. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:05:48 +00:00
David S. Miller	ec47dcb489	Merge branch 'J784S4-CPSW9G-bindings' Siddharth Vadapalli says: ==================== Add J784S4 CPSW9G NET Bindings This series cleans up the bindings by reordering the compatibles, followed by adding the bindings for CPSW9G instance of CPSW Ethernet Switch on TI's J784S4 SoC. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:01:35 +00:00
Siddharth Vadapalli	e0c9c2a7dd	dt-bindings: net: ti: k3-am654-cpsw-nuss: Add J784S4 CPSW9G support Update bindings for TI K3 J784S4 SoC which contains 9 ports (8 external ports) CPSW9G module and add compatible for it. Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:01:34 +00:00
Siddharth Vadapalli	40235edead	dt-bindings: net: ti: k3-am654-cpsw-nuss: Fix compatible order Reorder compatibles to follow alphanumeric order. Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 08:01:34 +00:00
Shradha Gupta	bd7fc6e195	net: mana: Add new MANA VF performance counters for easier troubleshooting Extended performance counter stats in 'ethtool -S <interface>' output for MANA VF to facilitate troubleshooting. Tested-on: Ubuntu22 Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 07:58:18 +00:00
Mengyuan Lou	81dc07417f	net: wangxun: Implement the ndo change mtu interface Add ngbe and txgbe ndo_change_mtu support. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 07:52:41 +00:00
Luiz Angelo Daros de Luca	c36a77c33d	net: dsa: realtek: rtl8365mb: add change_mtu The rtl8365mb was using a fixed MTU size of 1536, which was probably inspired by the rtl8366rb's initial frame size. However, unlike that family, the rtl8365mb family can specify the max frame size in bytes, rather than in fixed steps. DSA calls change_mtu for the CPU port once the max MTU value among the ports changes. As the max frame size is defined globally, the switch is configured only when the call affects the CPU port. The available specifications do not directly define the max supported frame size, but it mentions a 16k limit. This driver will use the 0x3FFF limit as it is used in the vendor API code. However, the switch sets the max frame size to 16368 bytes (0x3FF0) after it resets. change_mtu uses MTU size, or ethernet payload size, while the switch works with frame size. The frame size is calculated considering the ethernet header (14 bytes), a possible 802.1Q tag (4 bytes), the payload size (MTU), and the Ethernet FCS (4 bytes). The CPU tag (8 bytes) is consumed before the switch enforces the limit. During setup, the driver will use the default 1500-byte MTU of DSA to set the maximum frame size. The current sum will be VLAN_ETH_HLEN+1500+ETH_FCS_LEN, which results in 1522 bytes. Although it is lower than the previous initial value of 1536 bytes, the driver will increase the frame size for a larger MTU. However, if something requires more space without increasing the MTU, such as QinQ, we would need to add the extra length to the rtl8365mb_port_change_mtu() formula. MTU was tested up to 2018 (with 802.1Q) as that is as far as mt7620 (where rtl8367s is stacked) can go. The register was manually manipulated byte-by-byte to ensure the MTU to frame size conversion was correct. For frames without 802.1Q tag, the frame size limit will be 4 bytes over the required size. There is a jumbo register, enabled by default at 6k frame size. 
However, the jumbo settings do not seem to limit nor expand the maximum tested MTU (2018), even when jumbo is disabled. More tests are needed with a device that can handle larger frames. Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 07:45:06 +00:00
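The MTU to frame size arithmetic in the message above works out as in this small sketch. The header sizes and the 0x3FFF limit are the ones quoted in the message; `mtu_to_frame_size()` is a hypothetical helper for illustration, not the driver's rtl8365mb_port_change_mtu().

```c
#include <stdint.h>

/* Sizes quoted in the commit message. */
#define ETH_HLEN        14 /* Ethernet header */
#define VLAN_HLEN        4 /* one possible 802.1Q tag */
#define VLAN_ETH_HLEN   (ETH_HLEN + VLAN_HLEN)
#define ETH_FCS_LEN      4 /* frame check sequence */

/* Limit the message says the driver takes from the vendor API code. */
#define MAX_FRAME_SIZE  0x3FFF

/*
 * Hypothetical helper: convert an MTU (Ethernet payload size) to the
 * frame size the switch enforces. Returns -1 when the MTU is negative
 * or the resulting frame would exceed the limit.
 */
int mtu_to_frame_size(int mtu)
{
    int frame_size;

    if (mtu < 0)
        return -1;
    frame_size = VLAN_ETH_HLEN + mtu + ETH_FCS_LEN;
    if (frame_size > MAX_FRAME_SIZE)
        return -1;
    return frame_size;
}
```

With the default 1500-byte MTU this gives 18 + 1500 + 4 = 1522 bytes, matching the figure in the message. An untagged frame of the same MTU needs only 1518 bytes, which is why the limit ends up 4 bytes over the required size in that case.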
Jakub Kicinski	b883d1ee98	Merge branch 'add-ptp-support-for-sama7g5' Durai Manickam says: ==================== Add PTP support for sama7g5 This patch series is intended to add PTP capability to the GEM and EMAC for sama7g5. ==================== Link: https://lore.kernel.org/r/20230315095053.53969-1-durai.manickamkr@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-16 17:21:52 -07:00
Durai Manickam KR	9bae0dd05e	net: macb: Add PTP support to EMAC for sama7g5 Add PTP capability to the Ethernet MAC. Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-16 17:21:51 -07:00
Durai Manickam KR	abc783a7b0	net: macb: Add PTP support to GEM for sama7g5 Add PTP capability to the Gigabit Ethernet MAC. Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-16 17:21:51 -07:00
Andy Shevchenko	d565263b7d	net: dsa: hellcreek: Get rid of custom led_init_default_state_get() LED core provides a helper to parse default state from firmware node. Use it instead of custom implementation. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://lore.kernel.org/r/20230314181824.56881-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-16 17:02:56 -07:00
Rob Herring	cc6d85c1cb	nfc: mrvl: Use of_property_read_bool() for boolean properties It is preferred to use typed property access functions (i.e. of_property_read_<type> functions) rather than low-level of_get_property/of_find_property functions for reading properties. Convert reading boolean properties to of_property_read_bool(). Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-16 17:43:14 +00:00
Rob Herring	053fdaa841	nfc: mrvl: Move platform_data struct into driver There are no users of nfcmrvl platform_data struct outside of the driver and none will be added, so move it into the driver. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-16 17:43:14 +00:00
Xu Liang	0ba13995be	net: phy: mxl-gpy: enhance delay time required by loopback disable function GPY2xx devices need 3 seconds to fully switch out of loopback mode before it can safely re-enter loopback mode. Implement timeout mechanism to guarantee 3 seconds waited before re-enter loopback mode. Signed-off-by: Xu Liang <lxu@maxlinear.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-16 17:30:45 +00:00
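One way to picture the 3-second guard described in the message above is a helper that reports how long the caller still has to wait. This is only a sketch under assumptions: `loopback_wait_ms()` and its millisecond timestamps are hypothetical, and the actual driver may track the deadline differently.

```c
#include <stdint.h>

/* Settling time after leaving loopback, per the commit message. */
#define LOOPBACK_EXIT_MS 3000u

/*
 * Hypothetical helper: given the current time and the time loopback was
 * last disabled (both monotonic millisecond counters), return how many
 * milliseconds must still pass before loopback may be re-entered.
 * Unsigned subtraction keeps the result correct across counter wraparound.
 */
uint32_t loopback_wait_ms(uint32_t now_ms, uint32_t disabled_at_ms)
{
    uint32_t elapsed = now_ms - disabled_at_ms;

    return elapsed >= LOOPBACK_EXIT_MS ? 0 : LOOPBACK_EXIT_MS - elapsed;
}
```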
Colin Ian King	9bdf4489a3	net: phy: micrel: Fix spelling mistake "minimim" -> "minimum" There is a spelling mistake in a pr_warn_ratelimited message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20230314082315.26532-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:17:40 -07:00
Jakub Kicinski	6873465c19	Merge branch 'nfp-flower-add-support-for-multi-zone-conntrack' Louis Peens says: ==================== nfp: flower: add support for multi-zone conntrack This series add changes to support offload of connection tracking across multiple zones. Previously the driver only supported offloading of a single goto_chain, spanning a single zone. This was implemented by merging a pre_ct rule, post_ct rule and the nft rule. This series provides updates to let the original post_ct rule act as the new pre_ct rule for a next set of merges if it contains another goto and conntrack action. In pseudo-tc rule format this adds support for: ingress chain 0 proto ip flower action ct zone 1 pipe action goto 1 ingress chain 1 proto ip flower ct_state +tr+new ct_zone 1 action ct_clear pipe action ct zone 2 pipe action goto 2 ingress chain 1 proto ip flower ct_state +tr+est ct_zone 1 action ct_clear pipe action ct zone 2 pipe action goto 2 ingress chain 2 proto ip flower ct_state +tr+new ct_zone 2 action mirred egress redirect dev ... ingress chain 2 proto ip flower ct_state +tr+est ct_zone 2 action mirred egress redirect dev ... This can continue for up to a maximum of 4 zone recirculations. The first few patches are some smaller preparation patches while the last one introduces the functionality. ==================== Link: https://lore.kernel.org/r/20230314063610.10544-1-louis.peens@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:16:26 -07:00
Wentao Jia	a87ceb3d42	nfp: flower: offload tc flows of multiple conntrack zones If goto_chain action present in the post ct flow rule, merge flow rules in this ct-zone, create a new pre_ct entry as the pre ct flow rule of next ct-zone, but do not offload merged flow rules to firmware. Repeat the process in the next ct-zone until no goto_chain action present in the post ct flow rule in a certain ct-zone, merged all the flow rules. Offload to firmware finally. Signed-off-by: Wentao Jia <wentao.jia@corigine.com> Acked-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:16:23 -07:00
Wentao Jia	46a83c85b6	nfp: flower: prepare for parameterisation of number of offload rules A fixed number of offload flow rules only supports the scenario of one ct zone. In the scenario of multiple ct zones, a dynamic and larger number of offload flow rules is required. In order to support the scenario of multiple ct zones, the parameter num_rules is added for offloading flow rules. Signed-off-by: Wentao Jia <wentao.jia@corigine.com> Acked-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:16:22 -07:00
Wentao Jia	3e44d19934	nfp: flower: add goto_chain_index for ct entry The chain_index has different means in pre ct entry and post ct entry. In pre ct entry, it means chain index, but in post ct entry, it means goto chain index, it is confused. chain_index and goto_chain_index may be present in one flow rule, It cannot be distinguished by one field chain_index, both chain_index and goto_chain_index are required in the follow-up patch to support multiple ct zones Another field goto_chain_index is added to record the goto chain index. If no goto action in post ct entry, goto_chain_index is 0. Signed-off-by: Wentao Jia <wentao.jia@corigine.com> Acked-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:16:22 -07:00
Wentao Jia	0b8d953cce	nfp: flower: refactor function "is_post_ct_flow" 'ct_clear' action only or no ct action is supported for 'post_ct_flow'. But in scenario of multiple ct zones, one non 'ct_clear' ct action or more ct actions, including 'ct_clear action', may be present in one flow rule. If ct state match key is 'ct_established', the flow rule is still expected to be classified as 'post_ct_flow'. Check ct status first in function "is_post_ct_flow" to achieve this. Signed-off-by: Wentao Jia <wentao.jia@corigine.com> Acked-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:16:22 -07:00
Wentao Jia	cee7b339d8	nfp: flower: refactor function "is_pre_ct_flow" In the scenario of multiple ct zones, ct state key match and ct action is present in one flow rule, the flow rule is classified to post_ct_flow in design. There is no ct state key match for pre ct flow, the judging condition is added to function "is_pre_ct_flow". Chain_index is another field for judging which flows are pre ct flow If chain_index not 0, the flow is not pre ct flow. Signed-off-by: Wentao Jia <wentao.jia@corigine.com> Acked-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:16:22 -07:00
Wentao Jia	8a8db7aeaa	nfp: flower: add get_flow_act_ct() for ct action CT action is a special case different from other actions, CT clear action is not required when get ct action, but this case is not considered. If CT clear action in the flow rule, skip the CT clear action when get ct action, return the first ct action that is not a CT clear action Signed-off-by: Wentao Jia <wentao.jia@corigine.com> Acked-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:16:22 -07:00
Jakub Kicinski	fabdc10075	Merge mlx5 updates 2023-03-13 Saeed Mahameed says: ==================== mlx5-updates-2023-03-13 1) Trivial cleanup patches 2) By Sandipan Patra: Implement thermal zone to report NIC temperature 3) Adham Faris, Improves devlink health diagnostics for netdev objects 4) From Maor, Enable TC offload for egress and ingress MACVLAN over bond 5) From Gal, add devlink hairpin queues parameters to replace debugfs as was discussed in [1]: [1] https://lore.kernel.org/all/20230111194608.7f15b9a1@kernel.org/ ==================== Link: https://lore.kernel.org/all/20230314054234.267365-1-saeed@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:12:09 -07:00
Maor Dickman	63b02048f9	net/mlx5e: Enable TC offload for egress MACVLAN over bond Support offloading of TC rules that mirror/redirect egress traffic to a MACVLAN device, which is attached to bond device which master mlx5 devices. Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-16-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:12:09 -07:00
Maor Dickman	d5d006bb27	net/mlx5e: Enable TC offload for ingress MACVLAN over bond Support offloading of TC rules that filter ingress traffic from a MACVLAN device, which is attached to bond device. Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-15-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:12:09 -07:00
Maor Dickman	244fd69820	net/mlx5e: TC, Extract indr setup block checks to function In preparation for next patch which will add new check if device block can be setup, extract all existing checks to function to make it more readable and maintainable. Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-14-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:12:09 -07:00
Gal Pressman	8a0594c096	net/mlx5e: Add more information to hairpin table dump Print the number of hairpin queues and size as part of the hairpin table dump. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-13-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:12:08 -07:00
Gal Pressman	1bffcea429	net/mlx5e: Add devlink hairpin queues parameters We refer to a TC NIC rule that involves forwarding as "hairpin". Hairpin queues are mlx5 hardware specific implementation for hardware forwarding of such packets. Per the discussion in [1], move the hairpin queues control (number and size) from debugfs to devlink. Expose two devlink params: - hairpin_num_queues: control the number of hairpin queues - hairpin_queue_size: control the size (in packets) of the hairpin queues [1] https://lore.kernel.org/all/20230111194608.7f15b9a1@kernel.org/ Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-12-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:12:08 -07:00
Gal Pressman	028522e284	net/mlx5: Move needed PTYS functions to core layer Downstream patches require devlink params to access the PTYS register, move the needed functions from mlx5e to the core layer. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-11-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:12:08 -07:00
Adham Faris	bb76d250e5	net/mlx5e: Add XSK RQ state flag for RQ devlink health diagnostics Currently RQ health diagnostics doesn't inform the user whether an RQ is an XSK RQ or not. Address this, by adding XSK state flag to RQ SW state enum in core/en.h. XSK will be '1' if current RQ is an XSK RQ, and it will be '0' if it's not. In this example below, it can be seen that XSK field value is '1' since xdpsock program have been attached to channel 0 before issuing the devlink query command: $ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx Output: ======================================================================= Common config: RQ: type: 2 stride size: 4096 size: 16 ts_format: FRC CQ: stride size: 64 size: 1024 RQs: channel ix: 0 rqn: 4236 HW state: 1 WQE counter: 15 posted WQEs: 15 cc: 15 SW State: enabled: 1 recovering: 0 am: 1 no_csum_complete: 1 csum_full: 0 mini_cqe_hw_stridx: 1 shampo: 0 mini_cqe_enhanced: 0 xsk: 1 CQ: cqn: 1085 HW status: 0 ci: 0 size: 1024 EQ: eqn: 7 irqn: 32 vecidx: 0 ci: 5 size: 2048 ICOSQ: sqn: 4229 HW state: 1 cc: 158 pc: 158 WQE size: 2048 CQ: cqn: 1080 cc: 1 size: 2048 Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-10-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:12:08 -07:00
Adham Faris	fc9d982a25	net/mlx5e: Expose SQ SW state as part of SQ health diagnostics Add SQ SW state textual representation to devlink health diagnostics for tx reporter. SQ SW state can be retrieved by issuing the devlink command below: $ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter tx Output ======================================================================= Common Config: SQ: stride size: 64 size: 1024 ts_format: FRC CQ: stride size: 64 size: 1024 SQs: channel ix: 0 tc: 0 txq ix: 0 sqn: 4170 HW state: 1 stopped: false cc: 0 pc: 0 SW State: enabled: 1 mpwqe: 1 recovering: 0 ipsec: 0 am: 1 vlan_need_l2_inline: 1 pending_xsk_tx: 0 pending_tls_rx_resync: 0 xdp_multibuf: 0 CQ: cqn: 1031 HW status: 0 ci: 0 size: 1024 EQ: eqn: 7 irqn: 32 vecidx: 0 ci: 2 size: 2048 Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-9-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:12:08 -07:00
Adham Faris	1fe7bc109e	net/mlx5e: Stringify RQ SW state in RQ devlink health diagnostics One of the parameters that is retrieved/printed as a response to devlink health diagnostics for rx reporter is the RQ SW state. It's printed as a bitmap decimal number. Printing it as bitmap is problematic and non informative. In addition User can't count on SW state without accessing the kernel sources (mlx5e rq state enum in en.h). This patch prints RQ SW state in a textual representation, as a key: value pairs, where disabled rq states will appear as '0' and enabled ones will appear as '1'. See below the generated output for rx health diagnostics devlink command: $ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx Before: ======================================================================= Common config: RQ: type: 2 stride size: 2048 size: 8 ts_format: FRC CQ: stride size: 64 size: 1024 RQs: channel ix: 0 rqn: 4172 HW state: 1 SW state: 37 WQE counter: 7 posted WQEs: 7 cc: 7 CQ: cqn: 1033 HW status: 0 ci: 0 size: 1024 EQ: eqn: 7 irqn: 32 vecidx: 0 ci: 2 size: 2048 ICOSQ: sqn: 4169 HW state: 1 cc: 74 pc: 74 WQE size: 128 CQ: cqn: 1030 cc: 1 size: 128 channel ix: 1 ... . . After: ======================================================================= Common config: RQ: type: 2 stride size: 2048 size: 8 ts_format: FRC CQ: stride size: 64 size: 1024 RQs: channel ix: 0 rqn: 4172 HW state: 1 WQE counter: 7 posted WQEs: 7 cc: 7 SW State: enabled: 1 recovering: 0 am: 1 no_csum_complete: 0 csum_full: 0 mini_cqe_hw_stridx: 1 shampo: 0 mini_cqe_enhanced: 0 CQ: cqn: 1033 HW status: 0 ci: 0 size: 1024 EQ: eqn: 7 irqn: 32 vecidx: 0 ci: 2 size: 2048 ICOSQ: sqn: 4169 HW state: 1 cc: 74 pc: 74 WQE size: 128 CQ: cqn: 1030 cc: 1 size: 128 channel: ix: 1 ... . . 
Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-8-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:12:08 -07:00
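The "Before" and "After" outputs in the entry above show the same RQ state in two forms: the decimal bitmap 37 and a list of key: value pairs. A minimal Python sketch of that decoding follows. It assumes that bit i of the bitmap corresponds to the i-th name in the "After" output; that ordering is consistent with the example (37 = 0b100101) but is not taken from en.h, and `decode_sw_state` is a hypothetical helper, not a driver function.

```python
# Flag names in the order they appear in the "After" output above.
# Assumption: bit i of the bitmap corresponds to the i-th name. This matches
# the example value 37 but is not read from the kernel's en.h enum.
RQ_STATE_NAMES = [
    "enabled",
    "recovering",
    "am",
    "no_csum_complete",
    "csum_full",
    "mini_cqe_hw_stridx",
    "shampo",
    "mini_cqe_enhanced",
]


def decode_sw_state(bitmap, names=RQ_STATE_NAMES):
    """Return a {name: 0 or 1} mapping, one entry per flag name."""
    return {name: (bitmap >> bit) & 1 for bit, name in enumerate(names)}


if __name__ == "__main__":
    # Print the decoded form of the "SW state: 37" value from the old output.
    for name, value in decode_sw_state(37).items():
        print(f"{name}: {value}")
```

Under that assumption, decoding 37 sets `enabled`, `am` and `mini_cqe_hw_stridx` to 1 and every other flag to 0, which matches the "After" output quoted in the commit message.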
Adham Faris	2b5bd5b161	net/mlx5e: Rename RQ/SQ adaptive moderation state flag Dynamic interrupt moderation RQ and SQ feature represented by MLX5E_RQ_STATE_AM and MLX5E_SQ_STATE_AM enums respectively, is not consistent with the feature naming in the driver, and with the formal feature and library names. Hence, change MLX5E_RQ_STATE_AM and MLX5E_SQ_STATE_AM enum type names in core/en.h to MLX5E_RQ_STATE_DIM and MLX5E_SQ_STATE_DIM respectively. Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-7-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:12:08 -07:00
Rahul Rameshbabu	aa98d15ea4	net/mlx5e: Utilize the entire fifo Previous check was comparing against the fifo mask. The mask is size of the fifo (power of two) minus one, so a less than or equal comparator should be used for checking if the fifo has room for the SKB. Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-6-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:11:50 -07:00
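The off-by-one described in the entry above can be illustrated with a small Python model. This is a sketch under stated assumptions, not the driver code: the function names are hypothetical, and the producer and consumer counters are modelled as free-running 16-bit values. Only the relationship stated in the commit message (mask = fifo size minus one, so the comparison should be "less than or equal") is taken from the source.

```python
def fifo_used(pc, cc):
    """Number of occupied slots, given producer and consumer counters.

    Assumption: both counters are free-running and wrap at 16 bits.
    """
    return (pc - cc) & 0xFFFF


def has_room_old(pc, cc, mask):
    """Previous check: strict comparison against the mask (size - 1)."""
    return fifo_used(pc, cc) < mask


def has_room_fixed(pc, cc, mask):
    """Fixed check: less-than-or-equal, so the last slot is usable."""
    return fifo_used(pc, cc) <= mask


if __name__ == "__main__":
    size = 8            # power of two
    mask = size - 1     # 7
    # Seven slots used, one still free.
    print(has_room_old(7, 0, mask), has_room_fixed(7, 0, mask))
    # All eight slots used.
    print(has_room_old(8, 0, mask), has_room_fixed(8, 0, mask))
```

With a fifo of size 8, the strict comparison reports no room once seven entries are in use, leaving one slot permanently idle. The fixed comparison accepts the eighth entry and reports the fifo as full only when all eight slots are occupied.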
Sandipan Patra	c1fef618d6	net/mlx5: Implement thermal zone Implement thermal zone support for mlx5 based HW. The NIC uses temperature sensor provided by ASIC to report current temperature to thermal core. Signed-off-by: Sandipan Patra <spatra@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-5-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:09:14 -07:00
Jiri Pirko	ceefcfb8a3	net/mlx5: Add comment to mlx5_devlink_params_register() Add a comment to the mlx5_devlink_params_register() function so it is clear that only driver init params should be registered here. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-4-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:09:14 -07:00
Moshe Shemesh	8ff38e730c	net/mlx5: Stop waiting for PCI up if teardown was triggered If driver teardown is called while PCI is turned off, there is a race between health recovery and teardown. If health recovery already started it will wait 60 sec trying to see if PCI gets back and it can recover, but actually there is no need to wait anymore once teardown was called. Use the MLX5_BREAK_FW_WAIT flag which is set on driver teardown to break waiting for PCI up. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-3-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:09:13 -07:00
Moshe Shemesh	c05d145abe	net/mlx5: remove redundant clear_bit When the shutdown or remove callbacks are called, the driver sets the MLX5_BREAK_FW_WAIT flag to stop waiting for FW, as teardown was called. There is no need to clear the bit: once shutdown or remove has been called there is no way back, and the driver is going down. Furthermore, if the flag is not cleared, it can also be used in other loops where we may wait after teardown was already called. Use test_bit() instead of test_and_clear_bit(), as there is no need to clear the flag. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-2-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 22:09:13 -07:00
Wolfram Sang	a57cc54d69	net: phy: micrel: drop superfluous use of temp variable 'temp' was used before commit `c0c99d0cd1` ("net: phy: micrel: remove the use of .ack_interrupt()") refactored the code. Now, we can simplify it a little. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20230314124928.44948-1-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 21:32:02 -07:00
Wolfram Sang	83456576a4	net: phy: update obsolete comment about PHY_STARTING Commit `899a3cbbf7` ("net: phy: remove states PHY_STARTING and PHY_PENDING") did not update a comment in phy_probe. Remove the superfluous "Description:" prefix while we are here. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/20230314124856.44878-1-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 21:31:54 -07:00


1169731 Commits