Commit Graph

1337900 Commits

Author SHA1 Message Date
Satish Kharat
bcb725c79c enic: enable rq extended cq support
Enables getting from hw all the supported rq cq sizes and
uses the highest supported cq size.

Co-developed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Co-developed-by: John Daley <johndale@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-4-85804263dad8@cisco.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11 10:21:15 +01:00
Satish Kharat
2be2eb7643 enic: enic rq extended cq defines
Adds the defines for 32 and 64 byte receive queue completion queue
descriptors.
Adds devcmd define to get rq cq descriptor size/s supported by hw.

Co-developed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Co-developed-by: John Daley <johndale@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-3-85804263dad8@cisco.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11 10:21:15 +01:00
Satish Kharat
eaa23db868 enic: enic rq code reorg
Separates enic rx path from generic vnic api. Removes some
complexity of doign enic callbacks through vnic api in rx.
This is in preparation for enabling enic extended cq which
applies only to enic rx path.

Co-developed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Co-developed-by: John Daley <johndale@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-2-85804263dad8@cisco.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11 10:21:15 +01:00
Satish Kharat
025cf93180 enic: Move function from header file to c file
Moves cq_enet_rq_desc_dec from cq_enet_desc.h to enic_rq.c.
This is in preparation for enic extended completion queue
enabling.

Co-developed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Co-developed-by: John Daley <johndale@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-1-85804263dad8@cisco.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11 10:21:15 +01:00
Jakub Kicinski
71ca3561c2 Merge branch 'mptcp-pm-code-reorganisation'
Matthieu Baerts says:

====================
mptcp: pm: code reorganisation

Before this series, the PM code was dispersed in different places:

- pm.c had common code for all PMs.

- pm_netlink.c was initially only about the in-kernel PM, but ended up
  also getting exported common helpers, callbacks used by the different
  PMs, NL events for PM userspace daemon, etc. quite confusing.

- pm_userspace.c had userspace PM only code, but it was using "specific"
  in-kernel PM helpers according to their names.

To clarify the code, a reorganisation is suggested here, only by moving
code around, and small helper renaming to avoid confusions:

- pm_netlink.c now only contains common PM generic Netlink code:
  - PM events: this code was already there
  - shared helpers around Netlink code that were already there as well
  - shared Netlink commands code from pm.c

- pm_kernel.c now contains only code that is specific to the in-kernel
  PM. Now all functions are either called from:
  - pm.c: events coming from the core, when this PM is being used
  - pm_netlink.c: for shared Netlink commands
  - mptcp_pm_gen.c: for Netlink commands specific to the in-kernel PM
  - sockopt.c: for the exported counters per netns

- pm.c got many code from pm_netlink.c:
  - helpers used from both PMs and not linked to Netlink
  - callbacks used by different PMs, e.g. ADD_ADDR management
  - some helpers have been renamed to remove the '_nl' prefix, and some
    have been marked as 'static'.

- protocol.h has been updated accordingly:
  - some helpers no longer need to be exported
  - new ones needed to be exported: they have been prefixed if needed.

The code around the PM is now less confusing, which should help for the
maintenance in the long term, and the introduction of a PM Ops.

This will certainly impact future backports, but because other cleanups
have already done recently, and more are coming to ease the addition of
a new path-manager controlled with BPF (struct_ops), doing that now
seems to be a good time. Also, many issues around the PM have been fixed
a few months ago while increasing the code coverage in the selftests, so
such big reorganisation can be done with more confidence now.

Note that checkpatch, when used with --max-line-length=80, will complain
 about lines being over the 80 limits, but these warnings were already
there before moving the code around.

Also, patch 1 is not directly related to the code reorganisation, but it
was a remaining cleanup that we didn't upstream before, because it was
conflicting with another patch that has been sent for inclusion to the
net tree.
====================

Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-0-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:36:18 -07:00
Matthieu Baerts (NGI0)
2e7e6e9cda mptcp: pm: move Netlink PM helpers to pm_netlink.c
Before this patch, the PM code was dispersed in different places:

- pm.c had common code for all PMs, but also Netlink specific code that
  will not be needed with the future BPF path-managers.

- pm_netlink.c had common Netlink code.

To clarify the code, a reorganisation is suggested here, only by moving
code around, and small helper renaming to avoid confusions:

- pm_netlink.c now only contains common PM Netlink code:
  - PM events: this code was already there
  - shared helpers around Netlink code that were already there as well
  - shared Netlink commands code from pm.c

- pm.c now no longer contain Netlink specific code.

- protocol.h has been updated accordingly:
  - mptcp_nl_fill_addr() no longer need to be exported.

The code around the PM is now less confusing, which should help for the
maintenance in the long term.

This will certainly impact future backports, but because other cleanups
have already done recently, and more are coming to ease the addition of
a new path-manager controlled with BPF (struct_ops), doing that now
seems to be a good time. Also, many issues around the PM have been fixed
a few months ago while increasing the code coverage in the selftests, so
such big reorganisation can be done with more confidence now.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-15-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:50 -07:00
Matthieu Baerts (NGI0)
8617e85e04 mptcp: pm: split in-kernel PM specific code
Before this patch, the PM code was dispersed in different places:

- pm.c had common code for all PMs

- pm_netlink.c was supposed to be about the in-kernel PM, but also had
  exported common Netlink helpers, NL events for PM userspace daemons,
  etc. quite confusing.

To clarify the code, a reorganisation is suggested here, only by moving
code around to avoid confusions:

- pm_netlink.c now only contains common PM Netlink code:
  - PM events: this code was already there
  - shared helpers around Netlink code that were already there as well
  - more shared Netlink commands code from pm.c will come after

- pm_kernel.c now contains only code that is specific to the in-kernel
  PM. Now all functions are either called from:
  - pm.c: events coming from the core, when this PM is being used
  - pm_netlink.c: for shared Netlink commands
  - mptcp_pm_gen.c: for Netlink commands specific to the in-kernel PM
  - sockopt.c: for the exported counters per netns
  - (while at it, a useless 'return;' spot by checkpatch at the end of
     mptcp_pm_nl_set_flags_all, has been removed)

The code around the PM is now less confusing, which should help for the
maintenance in the long term.

This will certainly impact future backports, but because other cleanups
have already done recently, and more are coming to ease the addition of
a new path-manager controlled with BPF (struct_ops), doing that now
seems to be a good time. Also, many issues around the PM have been fixed
a few months ago while increasing the code coverage in the selftests, so
such big reorganisation can be done with more confidence now.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-14-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:50 -07:00
Matthieu Baerts (NGI0)
e4c28e3d5c mptcp: pm: move generic PM helpers to pm.c
Before this patch, the PM code was dispersed in different places:

- pm.c had common code for all PMs

- pm_netlink.c was supposed to be about the in-kernel PM, but also had
  exported common helpers, callbacks used by the different PMs, NL
  events for PM userspace daemon, etc. quite confusing.

- pm_userspace.c had userspace PM only code, but using specific
  in-kernel PM helpers

To clarify the code, a reorganisation is suggested here, only by moving
code around, and (un)exporting functions:

- helpers used from both PMs and not linked to Netlink
- callbacks used by different PMs, e.g. ADD_ADDR management
- some helpers have been marked as 'static'
- protocol.h has been updated accordingly
- (while at it, a needless if before a kfree(), spot by checkpatch in
   mptcp_remove_anno_list_by_saddr(), has been removed)

The code around the PM is now less confusing, which should help for the
maintenance in the long term.

This will certainly impact future backports, but because other cleanups
have already done recently, and more are coming to ease the addition of
a new path-manager controlled with BPF (struct_ops), doing that now
seems to be a good time. Also, many issues around the PM have been fixed
a few months ago while increasing the code coverage in the selftests, so
such big reorganisation can be done with more confidence now.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-13-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:50 -07:00
Matthieu Baerts (NGI0)
bcc32640ad mptcp: pm: move generic helper at the top
In prevision to another change importing all generic PM helpers from
pm_netlink.c to there.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-12-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:50 -07:00
Matthieu Baerts (NGI0)
a146731272 mptcp: pm: export mptcp_remote_address
In a following commit, the 'remote_address' helper will need to be used
from different files.

It is then exported, and prefixed with 'mptcp_', similar to
'mptcp_local_address'.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-11-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:50 -07:00
Matthieu Baerts (NGI0)
a49eb8ae95 mptcp: pm: worker: split in-kernel and common tasks
To make it clear what actions are in-kernel PM specific and which ones
are not and done for all PMs, e.g. sending ADD_ADDR and close associated
subflows when a RM_ADDR is received.

The behavioural is changed a bit: MPTCP_PM_ADD_ADDR_RECEIVED is now
treated after MPTCP_PM_ADD_ADDR_SEND_ACK and MPTCP_PM_RM_ADDR_RECEIVED,
but that should not change anything in practice.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-10-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:49 -07:00
Matthieu Baerts (NGI0)
a17336b2b2 mptcp: pm: avoid calling PM specific code from core
When destroying an MPTCP socket, some userspace PM specific code was
called from mptcp_destroy_common() in protocol.c. That feels wrong, and
it is the only case.

Instead, the core now calls mptcp_pm_destroy() from pm.c which is now in
charge of cleaning the announced addresses list, and ask the different
PMs to do extra cleaning if needed, e.g. the userspace PM, if used, will
clean the local addresses list.

While at it, the userspace PM specific helper has been prefixed with
'mptcp_userspace_pm_' like the other ones.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-9-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:49 -07:00
Matthieu Baerts (NGI0)
40aa7409d3 mptcp: pm: kernel: add '_pm' to mptcp_nl_set_flags
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. Here, '_pm' was missing from 'mptcp_nl_set_flags'.

Add '_pm' to be similar to others, and add '_all' to avoid confusions
witih the global 'mptcp_pm_nl_set_flags'.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-8-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:49 -07:00
Matthieu Baerts (NGI0)
498d7d8b75 mptcp: pm: remove '_nl' from mptcp_pm_nl_is_init_remote_addr
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_is_init_remote_addr' is not
specific to this PM: it is called from pm.c for both the in-kernel and
userspace PMs.

To avoid confusions, the '_nl' bit has been removed from the name.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-7-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:49 -07:00
Matthieu Baerts (NGI0)
550c50bbc2 mptcp: pm: remove '_nl' from mptcp_pm_nl_subflow_chk_stale()
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_subflow_chk_stale' is not specific
to this PM: it is called from pm.c for both the in-kernel and userspace
PMs.

To avoid confusions, the '_nl' bit has been removed from the name.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-6-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:49 -07:00
Matthieu Baerts (NGI0)
6361139185 mptcp: pm: remove '_nl' from mptcp_pm_nl_rm_addr_received
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_rm_addr_received' is not specific
to this PM: it is called from the PM worker, and used by both the
in-kernel and userspace PMs. The helper has been renamed to
'mptcp_pm_rm_addr_recv' instead of '_received' to avoid confusions with
the one from pm.c.

mptcp_pm_nl_rm_addr_or_subflow', and 'mptcp_pm_nl_rm_subflow_received'
have been updated too for the same reason.

To avoid confusions, the '_nl' bit has been removed from the name.

While at it, the in-kernel PM specific code has been move from
mptcp_pm_rm_addr_or_subflow to a new dedicated helper, clearer.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-5-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:49 -07:00
Matthieu Baerts (NGI0)
551a9ad787 mptcp: pm: remove '_nl' from mptcp_pm_nl_work
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_work' is not specific to this PM:
it is called from the core to call helpers, some of them needed by both
the in-kernel and userspace PMs.

To avoid confusions, the '_nl' bit has been removed from the name.

Also used 'worker' instead of 'work', similar to protocol.c's worker.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-4-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:49 -07:00
Matthieu Baerts (NGI0)
d173498799 mptcp: pm: remove '_nl' from mptcp_pm_nl_mp_prio_send_ack
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_mp_prio_send_ack()' is not
specific to this PM: it is used by both the in-kernel and userspace PMs.

To avoid confusions, the '_nl' bit has been removed from the name.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-3-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:48 -07:00
Matthieu Baerts (NGI0)
fac7a6ddc7 mptcp: pm: remove '_nl' from mptcp_pm_nl_addr_send_ack
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_addr_send_ack()' is not specific
to this PM: it is used by both the in-kernel and userspace PMs.

To avoid confusions, the '_nl' bit has been removed from the name.

No behavioural changes intended.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-2-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:48 -07:00
Geliang Tang
7462fe22cc mptcp: pm: use addr entry for get_local_id
The following code in mptcp_userspace_pm_get_local_id() that assigns "skc"
to "new_entry" is not allowed in BPF if we use the same code to implement
the get_local_id() interface of a BFP path manager:

	memset(&new_entry, 0, sizeof(struct mptcp_pm_addr_entry));
	new_entry.addr = *skc;
	new_entry.addr.id = 0;
	new_entry.flags = MPTCP_PM_ADDR_FLAG_IMPLICIT;

To solve the issue, this patch moves this assignment to "new_entry" forward
to mptcp_pm_get_local_id(), and then passing "new_entry" as a parameter to
both mptcp_pm_nl_get_local_id() and mptcp_userspace_pm_get_local_id().

No behavioural changes intended.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-1-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:35:48 -07:00
Dan Carpenter
991a1b0992 eth: fbnic: fix memory corruption in fbnic_tlv_attr_get_string()
This code is trying to ensure that the last byte of the buffer is a NUL
terminator.  However, the problem is that attr->value[] is an array of
__le32, not char, so it zeroes out 4 bytes way beyond the end of the
buffer.  Cast the buffer to char to address this.

Fixes: e5cf5107c9 ("eth: fbnic: Update fbnic_tlv_attr_get_string() to work like nla_strscpy()")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Lee Trager <lee@trager.us>
Link: https://patch.msgid.link/2791d4be-ade4-4e50-9b12-33307d8410f6@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:17:33 -07:00
Heiner Kallweit
473367a5ff r8169: increase max jumbo packet size on RTL8125/RTL8126
Realtek confirmed that all RTL8125/RTL8126 chip versions support up to
16K jumbo packets. Reflect this in the driver.

Tested by Rui on RTL8125B with 12K jumbo packets.

Suggested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Tested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/396762ad-cc65-4e60-b01e-8847db89e98b@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:13:30 -07:00
Jakub Kicinski
feb2935e14 Merge branch 'follow-up-on-deduplicate-cookie-logic'
Willem de Bruijn says:

====================
follow-up on deduplicate cookie logic

1/3: I came across a leftover from cookie deduplication, due to UDP
having two code paths: lockless fast path and locked cork path.

3/3: Even though the leftover was in the fast path, this prompted me
to complete coverage to the cork path.

2/3: That uncovered a subtle API inconsistency in how dontfrag is
configured. It should not be possible to switch DF mid datagram.
====================

Link: https://patch.msgid.link/20250307033620.411611-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:13:06 -07:00
Willem de Bruijn
0922cb68ed selftests/net: expand cmsg_ip with MSG_MORE
UDP send with MSG_MORE takes a slightly different path than the
lockless fast path.

For completeness, add coverage to this case too.

Pass MSG_MORE on the initial sendmsg, then follow up with a zero byte
write to unplug the cork.

Unrelated: also add two missing endlines in usage().

Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250307033620.411611-4-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:13:04 -07:00
Willem de Bruijn
a18dfa9925 ipv6: save dontfrag in cork
When spanning datagram construction over multiple send calls using
MSG_MORE, per datagram settings are configured on the first send.

That is when ip(6)_setup_cork stores these settings for subsequent use
in __ip(6)_append_data and others.

The only flag that escaped this was dontfrag. As a result, a datagram
could be constructed with df=0 on the first sendmsg, but df=1 on a
next. Which is what cmsg_ip.sh does in an upcoming MSG_MORE test in
the "diff" scenario.

Changing datagram conditions in the middle of constructing an skb
makes this already complex code path even more convoluted. It is here
unintentional. Bring this flag in line with expected sockopt/cmsg
behavior.

And stop passing ipc6 to __ip6_append_data, to avoid such issues
in the future. This is already the case for __ip_append_data.

inet6_cork had a 6 byte hole, so the 1B flag has no impact.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250307033620.411611-3-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:13:04 -07:00
Willem de Bruijn
54580ccdd8 ipv6: remove leftover ip6 cookie initializer
As of the blamed commit ipc6.dontfrag is always initialized at the
start of udpv6_sendmsg, by ipcm6_init_sk, to either 0 or 1.

Later checks against -1 are no longer needed and the branches are now
dead code.

The blamed commit had removed those branches. But I had overlooked
this one case.

UDP has both a lockless fast path and a slower path for corked
requests. This branch remained in the fast path.

Fixes: 096208592b ("ipv6: replace ipcm6_init calls with ipcm6_init_sk")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250307033620.411611-2-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:13:03 -07:00
Jakub Kicinski
48c57a49c5 Merge branch 'virtio-net-link-queues-to-napis'
Joe Damato says:

====================
virtio-net: Link queues to NAPIs

Jakub recently commented [1] that I should not hold this series on
virtio-net linking queues to NAPIs behind other important work that is
on-going and suggested I re-spin, so here we are :)

As per the discussion on the v3 [2], now both RX and TX NAPIs use the
API to link queues to NAPIs. Since TX-only NAPIs don't have a NAPI ID,
commit 6597e8d358 ("netdev-genl: Elide napi_id when not present") now
correctly elides the TX-only NAPIs (instead of printing zero) when the
queues and NAPIs are linked.

As per the discussion on the v4 [3], patch 3 has been refactored to hold
RTNL only in the specific locations which need it as Jason requested.

As per the discussion on the v5 [4], patch 3 now leaves refill_work
as-is and does not use the API to unlink and relink queues and NAPIs. A
comment has been left as suggested by Jakub [5] for future work.

See the commit message of patch 3 for an example of how to get the NAPI
to queue mapping information.

See the commit message of patch 4 for an example of how NAPI IDs are
persistent despite queue count changes.

[1]: https://lore.kernel.org/20250221142650.3c74dcac@kernel.org
[2]: https://lore.kernel.org/20250127142400.24eca319@kernel.org
[3]: https://lore.kernel.org/CACGkMEv=ejJnOWDnAu7eULLvrqXjkMkTL4cbi-uCTUhCpKN_GA@mail.gmail.com
[4]: https://lore.kernel.org/Z8X15hxz8t-vXpPU@LQ3V64L9R2
[5]: https://lore.kernel.org/20250303160355.5f8d82d8@kernel.org

v5: https://lore.kernel.org/20250227185017.206785-1-jdamato@fastly.com
v4: https://lore.kernel.org/20250225020455.212895-1-jdamato@fastly.com
rfcv3: https://lore.kernel.org/20250121191047.269844-1-jdamato@fastly.com
v2: https://lore.kernel.org/20250116055302.14308-1-jdamato@fastly.com
v1: https://lore.kernel.org/20250110202605.429475-1-jdamato@fastly.com
====================

Link: https://patch.msgid.link/20250307011215.266806-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:09:26 -07:00
Joe Damato
d5d715207e virtio_net: Use persistent NAPI config
Use persistent NAPI config so that NAPI IDs are not renumbered as queue
counts change.

$ sudo ethtool -l ens4  | tail -5 | egrep -i '(current|combined)'
Current hardware settings:
Combined:       4

$ ./tools/net/ynl/pyynl/cli.py \
    --spec Documentation/netlink/specs/netdev.yaml \
    --dump queue-get --json='{"ifindex": 2}'
[{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'},
 {'id': 0, 'ifindex': 2, 'type': 'tx'},
 {'id': 1, 'ifindex': 2, 'type': 'tx'},
 {'id': 2, 'ifindex': 2, 'type': 'tx'},
 {'id': 3, 'ifindex': 2, 'type': 'tx'}]

Now adjust the queue count, note that the NAPI IDs are not renumbered:

$ sudo ethtool -L ens4 combined 1
$ ./tools/net/ynl/pyynl/cli.py \
    --spec Documentation/netlink/specs/netdev.yaml \
    --dump queue-get --json='{"ifindex": 2}'
[{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
 {'id': 0, 'ifindex': 2, 'type': 'tx'}]

$ sudo ethtool -L ens4 combined 8
$ ./tools/net/ynl/pyynl/cli.py \
    --spec Documentation/netlink/specs/netdev.yaml \
    --dump queue-get --json='{"ifindex": 2}'
[{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'},
 {'id': 4, 'ifindex': 2, 'napi-id': 8197, 'type': 'rx'},
 {'id': 5, 'ifindex': 2, 'napi-id': 8198, 'type': 'rx'},
 {'id': 6, 'ifindex': 2, 'napi-id': 8199, 'type': 'rx'},
 {'id': 7, 'ifindex': 2, 'napi-id': 8200, 'type': 'rx'},
 [...]

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://patch.msgid.link/20250307011215.266806-5-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:09:22 -07:00
Joe Damato
e7231f49d5 virtio-net: Map NAPIs to queues
Use netif_queue_set_napi to map NAPIs to queue IDs so that the mapping
can be accessed by user apps. Note that the netif_queue_set_napi
currently requires RTNL, so care must be taken to ensure RTNL is held on
paths where this API might be reached.

The paths in the driver where this API can be reached appear to be:

  - ndo_open, ndo_close, which hold RTNL so no driver change is needed.
  - rx_pause, rx_resume, tx_pause, tx_resume are reached either via
    an ethtool ioctl or via XSK - neither path requires a driver change.
  - power management paths (which call open and close), which have been
    updated to hold/release RTNL.

$ ethtool -i ens4 | grep driver
driver: virtio_net

$ sudo ethtool -L ens4 combined 4

$ ./tools/net/ynl/pyynl/cli.py \
       --spec Documentation/netlink/specs/netdev.yaml \
       --dump queue-get --json='{"ifindex": 2}'
[{'id': 0, 'ifindex': 2, 'napi-id': 8289, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8290, 'type': 'rx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8291, 'type': 'rx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8292, 'type': 'rx'},
 {'id': 0, 'ifindex': 2, 'type': 'tx'},
 {'id': 1, 'ifindex': 2, 'type': 'tx'},
 {'id': 2, 'ifindex': 2, 'type': 'tx'},
 {'id': 3, 'ifindex': 2, 'type': 'tx'}]

Note that virtio_net has TX-only NAPIs which do not have NAPI IDs, so
the lack of 'napi-id' in the above output is expected.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://patch.msgid.link/20250307011215.266806-4-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:09:21 -07:00
Joe Damato
986a930451 virtio-net: Refactor napi_disable paths
Create virtnet_napi_disable helper and refactor virtnet_napi_tx_disable
to take a struct send_queue.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://patch.msgid.link/20250307011215.266806-3-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:09:21 -07:00
Joe Damato
2af5adf962 virtio-net: Refactor napi_enable paths
Refactor virtnet_napi_enable and virtnet_napi_tx_enable to take a struct
receive_queue. Create a helper, virtnet_napi_do_enable, which contains
the logic to enable a NAPI.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://patch.msgid.link/20250307011215.266806-2-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:09:21 -07:00
Jakub Kicinski
8ef890df40 net: move misc netdev_lock flavors to a separate header
Move the more esoteric helpers for netdev instance lock to
a dedicated header. This avoids growing netdevice.h to infinity
and makes rebuilding the kernel much faster (after touching
the header with the helpers).

The main netdev_lock() / netdev_unlock() functions are used
in static inlines in netdevice.h and will probably be used
most commonly, so keep them in netdevice.h.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250307183006.2312761-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-08 09:06:50 -08:00
Eric Dumazet
9bfc9d65a1 hamradio: use netdev_lockdep_set_classes() helper
It is time to use netdev_lockdep_set_classes() in bpqether.c

List of related commits:

0bef512012 ("net: add netdev_lockdep_set_classes() to virtual drivers")
c74e103991 ("net: bridge: use netdev_lockdep_set_classes()")
9a3c93af54 ("vlan: use netdev_lockdep_set_classes()")
0d7dd798fd ("net: ipvlan: call netdev_lockdep_set_classes()")
24ffd75200 ("net: macvlan: call netdev_lockdep_set_classes()")
78e7a2ae87 ("net: vrf: call netdev_lockdep_set_classes()")
d3fff6c443 ("net: add netdev_lockdep_set_classes() helper")

syzbot reported:

WARNING: possible recursive locking detected
6.14.0-rc5-syzkaller-01064-g2525e16a2bae #0 Not tainted

dhcpcd/5501 is trying to acquire lock:
 ffff8880797e2d28 (&dev->lock){+.+.}-{4:4}, at: netdev_lock include/linux/netdevice.h:2765 [inline]
 ffff8880797e2d28 (&dev->lock){+.+.}-{4:4}, at: register_netdevice+0x12d8/0x1b70 net/core/dev.c:11008

but task is already holding lock:
 ffff88802e530d28 (&dev->lock){+.+.}-{4:4}, at: netdev_lock include/linux/netdevice.h:2765 [inline]
 ffff88802e530d28 (&dev->lock){+.+.}-{4:4}, at: netdev_lock_ops include/linux/netdevice.h:2804 [inline]
 ffff88802e530d28 (&dev->lock){+.+.}-{4:4}, at: dev_change_flags+0x120/0x270 net/core/dev_api.c:65

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&dev->lock);
  lock(&dev->lock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by dhcpcd/5501:
  #0: ffffffff8fed6848 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_net_lock include/linux/rtnetlink.h:130 [inline]
  #0: ffffffff8fed6848 (rtnl_mutex){+.+.}-{4:4}, at: devinet_ioctl+0x34c/0x1d80 net/ipv4/devinet.c:1121
  #1: ffff88802e530d28 (&dev->lock){+.+.}-{4:4}, at: netdev_lock include/linux/netdevice.h:2765 [inline]
  #1: ffff88802e530d28 (&dev->lock){+.+.}-{4:4}, at: netdev_lock_ops include/linux/netdevice.h:2804 [inline]
  #1: ffff88802e530d28 (&dev->lock){+.+.}-{4:4}, at: dev_change_flags+0x120/0x270 net/core/dev_api.c:65

stack backtrace:
CPU: 1 UID: 0 PID: 5501 Comm: dhcpcd Not tainted 6.14.0-rc5-syzkaller-01064-g2525e16a2bae #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
 <TASK>
  __dump_stack lib/dump_stack.c:94 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
  print_deadlock_bug+0x483/0x620 kernel/locking/lockdep.c:3039
  check_deadlock kernel/locking/lockdep.c:3091 [inline]
  validate_chain+0x15e2/0x5920 kernel/locking/lockdep.c:3893
  __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
  lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
  __mutex_lock_common kernel/locking/mutex.c:585 [inline]
  __mutex_lock+0x19c/0x1010 kernel/locking/mutex.c:730
  netdev_lock include/linux/netdevice.h:2765 [inline]
  register_netdevice+0x12d8/0x1b70 net/core/dev.c:11008
  bpq_new_device drivers/net/hamradio/bpqether.c:499 [inline]
  bpq_device_event+0x4b1/0x8d0 drivers/net/hamradio/bpqether.c:542
  notifier_call_chain+0x1a5/0x3f0 kernel/notifier.c:85
 __dev_notify_flags+0x207/0x400
  netif_change_flags+0xf0/0x1a0 net/core/dev.c:9442
  dev_change_flags+0x146/0x270 net/core/dev_api.c:66
  devinet_ioctl+0xea2/0x1d80 net/ipv4/devinet.c:1200
  inet_ioctl+0x3d7/0x4f0 net/ipv4/af_inet.c:1001
  sock_do_ioctl+0x158/0x460 net/socket.c:1190
  sock_ioctl+0x626/0x8e0 net/socket.c:1309
  vfs_ioctl fs/ioctl.c:51 [inline]
  __do_sys_ioctl fs/ioctl.c:906 [inline]
  __se_sys_ioctl+0xf5/0x170 fs/ioctl.c:892
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 7e4d784f58 ("net: hold netdev instance lock during rtnetlink operations")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250307160358.3153859-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-08 08:56:18 -08:00
Eric Dumazet
b3aaf3c13b udp: expand SKB_DROP_REASON_UDP_CSUM use
SKB_DROP_REASON_UDP_CSUM can be used in four locations
when dropping a packet because of a wrong UDP checksum.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250307102002.2095238-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-08 08:56:04 -08:00
Breno Leitao
248f6571fd netpoll: Optimize skb refilling on critical path
netpoll tries to refill the skb queue on every packet send, independently
if packets are being consumed from the pool or not. This was
particularly problematic while being called from printk(), where the
operation would be done while holding the console lock.

Introduce a more intelligent approach to skb queue management. Instead
of constantly attempting to refill the queue, the system now defers
refilling to a work queue and only triggers the workqueue when a buffer
is actually dequeued. This change significantly reduces operations with
the lock held.

Add a work_struct to the netpoll structure for asynchronous refilling,
updating find_skb() to schedule refill work only when necessary (skb is
dequeued).

These changes have demonstrated a 15% reduction in time spent during
netpoll_send_msg operations, especially when no SKBs are not consumed
from consumed from pool.

When SKBs are being dequeued, the improvement is even better, around
70%, mainly because refilling the SKB pool is now happening outside of
the critical patch (with console_owner lock held).

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250304-netpoll_refill_v2-v1-1-06e2916a4642@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:55:40 -08:00
Jakub Kicinski
fca9fe1aae Merge branch 'net-phy-tja11xx-add-support-for-tja1102s'
Dimitri Fedrau via says:

====================
net: phy: tja11xx: add support for TJA1102S

- add support for TJA1102S
- enable PHY in sleep mode for TJA1102S

v1: https://lore.kernel.org/20250303-tja1102s-support-v1-0-180e945396e0@liebherr.com
====================

Link: https://patch.msgid.link/20250304-tja1102s-support-v2-0-cd3e61ab920f@liebherr.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:51:08 -08:00
Dimitri Fedrau
5b3178c452 net: phy: tja11xx: enable PHY in sleep mode for TJA1102S
Due to pin strapping the PHY maybe disabled per default. TJA1102 devices
can be enabled by setting the PHY_EN bit. Support is provided for TJA1102S
devices but can be easily added for TJA1102 too.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Dimitri Fedrau <dimitri.fedrau@liebherr.com>
Link: https://patch.msgid.link/20250304-tja1102s-support-v2-2-cd3e61ab920f@liebherr.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:51:04 -08:00
Dimitri Fedrau
5d7610577f net: phy: tja11xx: add support for TJA1102S
NXPs TJA1102S is a single PHY version of the TJA1102 in which one of the
PHYs is disabled.

Signed-off-by: Dimitri Fedrau <dimitri.fedrau@liebherr.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250304-tja1102s-support-v2-1-cd3e61ab920f@liebherr.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:51:03 -08:00
Lukas Bulwahn
e2537326e3 net: ethernet: Remove accidental duplication in Kconfig file
Commit fb3dda82fd ("net: airoha: Move airoha_eth driver in a dedicated
folder") accidentally added the line:

  source "drivers/net/ethernet/mellanox/Kconfig"

in drivers/net/ethernet/Kconfig, so that this line is duplicated in that
file.

Remove this accidental duplication.

Fixes: fb3dda82fd ("net: airoha: Move airoha_eth driver in a dedicated folder")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250306094753.63806-1-lukas.bulwahn@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:42:26 -08:00
Lukas Bulwahn
730f8d1c61 MAINTAINERS: adjust entry in AIROHA ETHERNET DRIVER
Commit fb3dda82fd ("net: airoha: Move airoha_eth driver in a dedicated
folder") moves the driver to drivers/net/ethernet/airoha/, but misses to
adjust the AIROHA ETHERNET DRIVER section in MAINTAINERS. Hence,
./scripts/get_maintainer.pl --self-test=patterns complains about a broken
reference.

Adjust the file entry to the dedicated folder for this driver.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250306094636.63709-1-lukas.bulwahn@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:41:52 -08:00
Lorenzo Bianconi
e368d2a1e8 net: airoha: Fix dev->dsa_ptr check in airoha_get_dsa_tag()
Fix the following warning reported by Smatch static checker in
airoha_get_dsa_tag routine:

drivers/net/ethernet/airoha/airoha_eth.c:1722 airoha_get_dsa_tag()
warn: 'dp' isn't an ERR_PTR

dev->dsa_ptr can't be set to an error pointer, it can just be NULL.
Remove this check since it is already performed in netdev_uses_dsa().

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/netdev/Z8l3E0lGOcrel07C@lore-desk/T/#m54adc113fcdd8c5e6c5f65ffd60d8e8b1d483d90
Fixes: af3cf757d5 ("net: airoha: Move DSA tag in DMA descriptor")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250306-airoha-flowtable-fixes-v1-1-68d3c1296cdd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:40:23 -08:00
Jakub Kicinski
530581047d Merge branch 'tcp-ulp-diag-expose-more-to-non-net-admin-users'
Matthieu Baerts says:

====================
tcp: ulp: diag: expose more to non net admin users

Since its introduction in commit 61723b3932 ("tcp: ulp: add functions
to dump ulp-specific information"), the ULP diag info have been exported
only to users with CAP_NET_ADMIN capability.

Not everything is sensitive, and some info can be exported to all users
in order to ease the debugging from the userspace side without requiring
additional capabilities.

First, the ULP name can be easily exported. Then more depending on each
layer:

 - On kTLS side, it looks like everything can be exported to all users:
   version, cipher type, tx/rx user config type, plus some flags.

 - On MPTCP side, everything but the sequence numbers are exported to
   all non net admin users, similar to TCP.
====================

Link: https://patch.msgid.link/20250306-net-next-tcp-ulp-diag-net-admin-v1-0-06afdd860fc9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:39:57 -08:00
Matthieu Baerts (NGI0)
0d7336f8f0 tcp: ulp: diag: more info without CAP_NET_ADMIN
When introduced in commit 61723b3932 ("tcp: ulp: add functions to dump
ulp-specific information"), the whole ULP diag info has been exported
only if the requester had CAP_NET_ADMIN.

It looks like not everything is sensitive, and some info can be exported
to all users in order to ease the debugging from the userspace side
without requiring additional capabilities. Each layer should then decide
what can be exposed to everybody. The 'net_admin' boolean is then passed
to the different layers.

On kTLS side, it looks like there is nothing sensitive there: version,
cipher type, tx/rx user config type, plus some flags. So, only some
metadata about the configuration, no cryptographic info like keys, etc.
Then, everything can be exported to all users.

On MPTCP side, that's different. The MPTCP-related sequence numbers per
subflow should certainly not be exposed to everybody. For example, the
DSS mapping and ssn_offset would give all users on the system access to
narrow ranges of values for the subflow TCP sequence numbers and
MPTCP-level DSNs, and then ease packet injection. The TCP diag interface
doesn't expose the TCP sequence numbers for TCP sockets, so best to do
the same here. The rest -- token, IDs, flags -- can be exported to
everybody.

Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250306-net-next-tcp-ulp-diag-net-admin-v1-2-06afdd860fc9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:39:53 -08:00
Matthieu Baerts (NGI0)
f5afcb9fbb tcp: ulp: diag: always print the name if any
Since its introduction in commit 61723b3932 ("tcp: ulp: add functions
to dump ulp-specific information"), the ULP diag info have been exported
only if the requester had CAP_NET_ADMIN.

At least the ULP name can be exported without CAP_NET_ADMIN. This will
already help identifying which layer is being used, e.g. which TCP
connections are in fact MPTCP subflow.

Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250306-net-next-tcp-ulp-diag-net-admin-v1-1-06afdd860fc9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:39:53 -08:00
Jakub Kicinski
15933ad12c Merge branch 'eth-fbnic-support-ring-size-configuration'
Jakub Kicinski says:

====================
eth: fbnic: support ring size configuration

Support ethtool -g / -G and a couple other small tweaks.
====================

Link: https://patch.msgid.link/20250306145150.1757263-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:37:39 -08:00
Jakub Kicinski
6cbf18a05c eth: fbnic: support ring size configuration
Support ethtool -g / -G. Leverage the code added for -l / -L
to alloc / stop / start / free.

Check parameters against HW min/max but also our own min/max.
Min HW queue is 16 entries, we can't deal with TWQs that small
because of the queue waking logic. Add similar contraint on RCQ
for symmetry.

We need 3 sizes on Rx, as the NIC does header-data split two separate
buffer pools:
  (1) head page ring    - how many empty pages we post for headers
  (2) payload page ring - how many empty pages we post for payloads
  (3) completion ring   - where NIC produces the Rx descriptors

Acked-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250306145150.1757263-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:37:37 -08:00
Jakub Kicinski
bfb522f347 eth: fbnic: fix typo in compile assert
We should be validating the Rx count on the Rx struct,
not the Tx struct. There is no real change here, rx_stats
and tx_stats are instances of the same struct.

Acked-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250306145150.1757263-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:37:37 -08:00
Jakub Kicinski
c1aacad306 eth: fbnic: link NAPIs to page pools
The lifetime of page pools is tied to NAPI instances,
and they are destroyed before NAPI is deleted.
It's safe to link them up.

Acked-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250306145150.1757263-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:37:37 -08:00
Jakub Kicinski
a3cc3f424d Merge branch 'net-bcmgenet-revise-suspend-resume'
Doug Berger says:

====================
net: bcmgenet: revise suspend/resume

This commit set updates the GENET driver to reduce the delay to
resume the ethernet link when the Wake on Lan features are used.

In addition, the encoding of hardware versioning and features is
revised to avoid some redundancy and improve readability as well
as remove a warning that occurred for the BCM7712 device which
updated the device major version while maintaining compatibility
with the driver.

The assignment of hardware descriptor rings was modified to
simplify programming and to allow support for the hardware
RX_CLS_FLOW_DISC filter action.
====================

Link: https://patch.msgid.link/20250306192643.2383632-1-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:33:50 -08:00
Doug Berger
254f3239dd net: bcmgenet: revise suspend/resume
If the network interface is configured for Wake-on-LAN we should
avoid bringing the interface down and up since it slows the time
to reestablish network traffic on resume.

Redundant calls to phy_suspend() and phy_resume() are removed
since they are already invoked from within phy_stop() and
phy_start() called from bcmgenet_netif_stop() and
bcmgenet_netif_start().

Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250306192643.2383632-15-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:33:48 -08:00