Commit Graph

1264951 Commits

Author SHA1 Message Date
Pawel Dembicki
a9e4230d0b net: phy: marvell: implement cable-test for 88E308X/88E609X family
This commit implements VCT for the 88E308X/88E609X family.

It requires two workarounds with some magic configuration.
Regular use requires only one register configuration, but the open-circuit
case requires a second workaround.
As a result, fault length measurement is implemented in two phases.

Fast Ethernet PHYs implement a very simple version of VCT. It is
completely different from vct5 or vct7.

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240402201123.2961909-3-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:33:20 -07:00
Pawel Dembicki
9cc8a6e626 net: ethtool: Add impedance mismatch result code to cable test
Some PHYs can recognize during a cable test if the impedance in the cable
is okay. They can detect reflections caused by impedance discontinuity
between a regular 100 Ohm cable and an abnormal part with a higher or
lower impedance.

This commit introduces a new result code:
ETHTOOL_A_CABLE_RESULT_CODE_IMPEDANCE_MISMATCH,
which represents the results of a cable test indicating issues with
impedance integrity.
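
As a rough illustration of how a consumer of cable test results might
handle the new code, the sketch below classifies and labels results. The
enum is a local stand-in: its names and ordering are assumptions made for
this example, not the UAPI definitions.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Local stand-in for the cable test result codes. The names and the
 * ordering are illustrative assumptions, not the UAPI definitions.
 */
enum cable_result {
	CABLE_RESULT_UNSPEC,
	CABLE_RESULT_OK,
	CABLE_RESULT_OPEN,
	CABLE_RESULT_SAME_SHORT,
	CABLE_RESULT_CROSS_SHORT,
	CABLE_RESULT_IMPEDANCE_MISMATCH,
};

/* Human-readable label for a result code. */
static const char *cable_result_str(enum cable_result r)
{
	switch (r) {
	case CABLE_RESULT_OK:			return "ok";
	case CABLE_RESULT_OPEN:			return "open circuit";
	case CABLE_RESULT_SAME_SHORT:		return "short within pair";
	case CABLE_RESULT_CROSS_SHORT:		return "short across pairs";
	case CABLE_RESULT_IMPEDANCE_MISMATCH:	return "impedance mismatch";
	default:				return "unspecified";
	}
}

/* Anything other than "ok" or "unspecified" points at a cable problem. */
static bool cable_result_is_fault(enum cable_result r)
{
	return r != CABLE_RESULT_OK && r != CABLE_RESULT_UNSPEC;
}
```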

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240402201123.2961909-2-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:33:20 -07:00
Pawel Dembicki
ada9841e3e net: phy: marvell: add basic support of 88E308X/88E609X family
This patch implements only basic support.

It covers the PHY used in multiple ICs:
PHY: 88E3082, 88E3083
Switch: 88E6096, 88E6097

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20240402201123.2961909-1-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:33:20 -07:00
Christophe JAILLET
04af1d6437 net: fman: Remove some unused fields in some structure
In "struct muram_info", the 'size' field is unused.
In "struct memac_cfg", the 'fixed_link' field is unused.

Remove them.

Found with cppcheck, unusedStructMember.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/425222d4f6c584e8316ccb7b2ef415a85c96e455.1712084103.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:30:28 -07:00
Jakub Kicinski
1b39b79d26 Merge branch 'af_unix-remove-old-gc-leftovers'
Kuniyuki Iwashima says:

====================
af_unix: Remove old GC leftovers.

This is a follow-up series for commit 4090fa373f ("af_unix: Replace
garbage collection algorithm.") which introduced the new GC for AF_UNIX.

We no longer need two ugly tricks for the old GC, so let's remove them.
====================

Link: https://lore.kernel.org/r/20240401173125.92184-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:27:15 -07:00
Kuniyuki Iwashima
118f457da9 af_unix: Remove lock dance in unix_peek_fds().
In the previous GC implementation, the shape of the inflight socket
graph was not expected to change while GC was in progress.

MSG_PEEK was tricky because it could install inflight fd silently
and transform the graph.

Let's say we peeked a fd, which was a listening socket, and accept()ed
some embryo sockets from it.  The garbage collection algorithm would
have been confused because the set of sockets visited in scan_inflight()
would change within the same GC invocation.

That's why we placed spin_lock(&unix_gc_lock) and spin_unlock() in
unix_peek_fds() with a fat comment.

In the new GC implementation, we no longer garbage-collect the socket
if it exists in another queue, that is, if it has a bridge to another
SCC.  Also, accept() will require the lock if it has edges.

Thus, we need not do the complicated lock dance.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240401173125.92184-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:27:13 -07:00
Kuniyuki Iwashima
7c349ed090 af_unix: Remove scm_fp_dup() in unix_attach_fds().
When we passed fds, we used to bump each file's refcount twice
in scm_fp_copy() and scm_fp_dup() before linking the socket to
gc_inflight_list.

This is because we incremented the inflight count of the socket
and linked it to the list in advance before passing skb to the
destination socket.

Otherwise, the inflight socket could have been garbage-collected
in a small race window between linking the socket to the list and
queuing skb:

  CPU 1 : sendmsg(X) w/ A's fd     CPU 2 : close(A)
  -----                            -----
  /* Here A's refcount is 1, and inflight count is 0 */

  bump A's refcount to 2 in scm_fp_copy()
  bump A's inflight count to 1
  link A to gc_inflight_list
                                   decrement A's refcount to 1

  /* A's refcount == inflight count, thus A could be GC candidate */

                                   start GC
                                   mark A as candidate
                                   purge A's receive queue

  queue skb w/ A's fd to X

  /* A is queued, but all data has been lost */

After commit 4090fa373f ("af_unix: Replace garbage collection
algorithm."), we increment the inflight count and link the socket
to the global list only when queuing the skb.

The race no longer exists, so let's not clone the fd nor bump
the count in unix_attach_fds().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240401173125.92184-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:27:13 -07:00
Jakub Kicinski
d20bac353b Merge branch 'tcp-make-trace-of-reset-logic-complete'
Jason Xing says:

====================
tcp: make trace of reset logic complete

Before this, we missed some cases where the TCP layer could send an RST
that we could not trace. So I decided to complete it :)

Link: https://lore.kernel.org/all/20240329034243.7929-1-kerneljasonxing@gmail.com/
====================

Link: https://lore.kernel.org/r/20240401073605.37335-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:26:16 -07:00
Jason Xing
19822a980e trace: tcp: fully support trace_tcp_send_reset
Prior to this patch, enabling trace_tcp_send_reset only reports resets
in two circumstances:
1) active rst mode
2) non-active rst mode and based on the full socket

That means an inconsistency occurs if we use tcpdump and trace
simultaneously to see how rst happens.

We should also take other cases into consideration,
say:
1) time-wait socket
2) no socket
...

Only by parsing the incoming skb and reversing its 4-tuple can
we know the exact 'flow', which might not exist.
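
The reversal itself can be sketched as follows. This is a simplified
user-space illustration with a hypothetical IPv4-only struct; the patch
works on skb headers through tracing macros, which are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified IPv4 4-tuple as seen in a received packet. */
struct four_tuple {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/*
 * A reset travels in the opposite direction of the packet that
 * triggered it, so source and destination swap places.
 */
static struct four_tuple reverse_tuple(struct four_tuple in)
{
	struct four_tuple out = {
		.saddr = in.daddr,
		.daddr = in.saddr,
		.sport = in.dport,
		.dport = in.sport,
	};

	return out;
}
```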

Samples after applying this patch:
1. tcp_send_reset: skbaddr=XXX skaddr=XXX src=ip:port dest=ip:port
state=TCP_ESTABLISHED
2. tcp_send_reset: skbaddr=000...000 skaddr=XXX src=ip:port dest=ip:port
state=UNKNOWN
Note:
1) UNKNOWN means we cannot extract the right information from skb.
2) skbaddr/skaddr could be 0

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Link: https://lore.kernel.org/r/20240401073605.37335-3-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:26:14 -07:00
Jason Xing
9807080e21 trace: adjust TP_STORE_ADDR_PORTS_SKB() parameters
Introducing entry_saddr and entry_daddr parameters in this macro
for later use can help us record the reverse 4-tuple by analyzing
the 4-tuple of the incoming skb when receiving.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240401073605.37335-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:26:14 -07:00
Marcelo Tosatti
2f3c7195a7 net: enable timestamp static key if CPU
For systems that use CPU isolation (via nohz_full), creating or destroying
a socket with SO_TIMESTAMP, SO_TIMESTAMPNS or SO_TIMESTAMPING with flag
SOF_TIMESTAMPING_RX_SOFTWARE will cause a static key to be enabled/disabled.
This in turn causes undesired IPIs to isolated CPUs.

So enable the static key unconditionally, if CPU isolation is enabled,
thus avoiding the IPIs.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/ZgrUiLLtbEUf9SFn@tpad
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:14:53 -07:00
David S. Miller
34c58c89fe Merge branch 'gve-ring-size-changes'
Harshitha Ramamurthy says:

====================
gve: enable ring size changes

This series enables support to change ring size via ethtool
in gve.

The first three patches deal with some clean up, setting
default values for the ring sizes and related fields. The
last two patches enable ring size changes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03 11:11:15 +01:00
Harshitha Ramamurthy
834f9458f2 gve: add support to change ring size via ethtool
Allow the user to change ring size via ethtool if
supported by the device. The driver relies on the
ring size ranges queried from device to validate
ring sizes requested by the user.
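
A minimal sketch of such a range check, assuming the device limits are
kept in a small struct. The struct and function names are hypothetical,
and the actual driver may apply further checks that are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the limits reported by the device. */
struct ring_limits {
	uint16_t min_desc_cnt;
	uint16_t max_desc_cnt;
};

/* Accept a requested ring size only if it lies within the device range. */
static bool ring_size_valid(const struct ring_limits *lim, uint32_t requested)
{
	return requested >= lim->min_desc_cnt &&
	       requested <= lim->max_desc_cnt;
}
```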

Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03 11:11:15 +01:00
Harshitha Ramamurthy
ed4fb32694 gve: add support to read ring size ranges from the device
Add support to read ring size change capability and the
min and max descriptor counts from the device and store it
in the driver. Also accommodate a special case where the
device does not provide minimum ring size depending on the
version of the device. In that case, rely on default values
for the minimums.

Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03 11:11:15 +01:00
Harshitha Ramamurthy
b94d3703c1 gve: set page count for RX QPL for GQI and DQO queue formats
Fulfill the requirement that for GQI, the number of pages per
RX QPL is equal to the ring size. Set this value to be equal to
ring size. Because of this change, the rx_data_slot_cnt and
rx_pages_per_qpl fields stored in the priv structure are not
needed, so remove their usage. And for DQO, the number of pages
per RX QPL is more than ring size to account for out-of-order
completions. So set it to twice the RX ring size.
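
The rule described above fits in a few lines. The enum and function names
below are made up for the illustration and are not the driver's own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags for the two queue formats mentioned above. */
enum queue_format { QF_GQI, QF_DQO };

/*
 * GQI: one page per ring entry.
 * DQO: twice the ring size, leaving room for out-of-order completions.
 */
static uint32_t rx_pages_per_qpl(enum queue_format fmt, uint32_t rx_ring_size)
{
	return fmt == QF_DQO ? 2 * rx_ring_size : rx_ring_size;
}
```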

Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03 11:11:15 +01:00
Harshitha Ramamurthy
5dee3c702c gve: make the completion and buffer ring size equal for DQO
For the DQO queue format, the gve driver stores two ring sizes
for both TX and RX - one for completion queue ring and one for
data buffer ring. This is supposed to enable asymmetric sizes
for these two rings but that is not supported. Make both fields
reference the same single variable.

This change renders reading supported TX completion ring size
and RX buffer ring size for DQO from the device useless, so change
those fields to reserved and remove related code.

Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03 11:11:15 +01:00
Harshitha Ramamurthy
4cbc70f6ec gve: simplify setting descriptor count defaults
Combine the gve_set_desc_cnt and gve_set_desc_cnt_dqo into
one function which sets the counts after checking the queue
format. Both the functions in the previous code and the new
combined function never return an error so make the new
function void and remove the goto on error.

Also rename the new function to gve_set_default_desc_cnt to
be clearer about its intention.

Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03 11:11:15 +01:00
Sai Krishna
4c6ce450a8 octeontx2-pf: Reset MAC stats during probe
Reset CGX/RPM MAC HW statistics at the time of driver probe()

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sai Krishna <saikrishnag@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03 11:03:40 +01:00
Gustavo A. R. Silva
9748dbc9f2 net/smc: Avoid -Wflex-array-member-not-at-end warnings
-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting
ready to enable it globally.

There are currently a couple of objects in `struct smc_clc_msg_proposal_area`
that contain a couple of flexible structures:

struct smc_clc_msg_proposal_area {
	...
	struct smc_clc_v2_extension             pclc_v2_ext;
	...
	struct smc_clc_smcd_v2_extension        pclc_smcd_v2_ext;
	...
};

So, in order to avoid ending up with a couple of flexible-array members
in the middle of a struct, we use the `struct_group_tagged()` helper to
separate the flexible array from the rest of the members in the flexible
structure:

struct smc_clc_smcd_v2_extension {
        struct_group_tagged(smc_clc_smcd_v2_extension_fixed, fixed,
                            u8 system_eid[SMC_MAX_EID_LEN];
                            u8 reserved[16];
        );
        struct smc_clc_smcd_gid_chid gidchid[];
};

With the change described above, we now declare objects of the type of
the tagged struct without embedding flexible arrays in the middle of
another struct:

struct smc_clc_msg_proposal_area {
        ...
        struct smc_clc_v2_extension_fixed	pclc_v2_ext;
        ...
        struct smc_clc_smcd_v2_extension_fixed	pclc_smcd_v2_ext;
        ...
};

We also use `container_of()` when we need to retrieve a pointer to the
flexible structures.
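
For readers unfamiliar with the pattern, here is a user-space sketch of
the same idea. The union is written out by hand instead of using the
kernel macro, and all names, as well as the EID_LEN value, are
placeholders chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EID_LEN 32	/* placeholder length for this example */

struct gid_chid {
	uint64_t gid;
	uint16_t chid;
};

/*
 * Hand-written equivalent of a tagged struct group: the fixed members
 * are reachable directly and also as the named type "struct ext_fixed".
 */
struct ext {
	union {
		struct {
			uint8_t system_eid[EID_LEN];
			uint8_t reserved[16];
		};
		struct ext_fixed {
			uint8_t system_eid[EID_LEN];
			uint8_t reserved[16];
		} fixed;
	};
	struct gid_chid gidchid[];	/* flexible array stays last */
};

/* Embedding only the fixed part keeps flexible arrays out of the middle. */
struct proposal_area {
	int before;
	struct ext_fixed ext;
	int after;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the full flexible structure from a pointer to its fixed part. */
static struct ext *ext_from_fixed(struct ext_fixed *f)
{
	return container_of(f, struct ext, fixed);
}
```

The conversion is only meaningful when the fixed part really is the start
of a full flexible structure in memory.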

So, with these changes, fix the following warnings:

In file included from net/smc/af_smc.c:42:
net/smc/smc_clc.h:186:49: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
  186 |         struct smc_clc_v2_extension             pclc_v2_ext;
      |                                                 ^~~~~~~~~~~
net/smc/smc_clc.h:188:49: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
  188 |         struct smc_clc_smcd_v2_extension        pclc_smcd_v2_ext;
      |                                                 ^~~~~~~~~~~~~~~~

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03 11:01:30 +01:00
Johannes Berg
b1f81b9a53 netdevice: add DEFINE_FREE() for dev_put
For short netdev holds within a function there are still a lot of
users of dev_put() rather than netdev_put(). Add DEFINE_FREE() to
allow making those safer.
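
The mechanism behind this kind of helper is the compiler's cleanup
attribute (a GCC/Clang extension). Below is a user-space sketch of the
idea with a fake refcounted device; every name in it is hypothetical and
it does not reproduce the kernel macros.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device. */
struct fake_dev {
	int refcnt;
};

static void fake_dev_hold(struct fake_dev *d)
{
	d->refcnt++;
}

static void fake_dev_put(struct fake_dev *d)
{
	if (d)
		d->refcnt--;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void fake_dev_put_cleanup(struct fake_dev **p)
{
	fake_dev_put(*p);
}

static int use_dev_briefly(struct fake_dev *d)
{
	fake_dev_hold(d);
	struct fake_dev *held
		__attribute__((cleanup(fake_dev_put_cleanup))) = d;

	if (held->refcnt > 100)
		return -1;		/* reference is dropped here too */

	return held->refcnt;		/* read before the cleanup runs */
}
```

Every return path drops the reference, which is the safety benefit the
message refers to.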

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03 09:59:38 +01:00
Johannes Berg
464eb03c4a rtnetlink: add guard for RTNL
The new guard/scoped_guard can be useful for the RTNL as well,
so add a guard definition for it. It gets used like

 {
   guard(rtnl)();
   // RTNL held until end of block
 }

or

  scoped_guard(rtnl) {
    // RTNL held in this block
  }

as with any other guard/scoped_guard.
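
To show what "held until end of block" means mechanically, here is a
user-space sketch with a toy lock. It relies on the GCC/Clang cleanup
attribute; the names are invented for the example and the toy lock does
not block like a real one would.

```c
#include <assert.h>

/* Toy lock standing in for the RTNL; a real lock would block instead. */
static int fake_rtnl_held;

static void fake_rtnl_lock(void)
{
	assert(!fake_rtnl_held);
	fake_rtnl_held = 1;
}

/* Cleanup handlers receive a pointer to the guarded variable. */
static void fake_rtnl_unlock(int *token)
{
	(void)token;
	fake_rtnl_held = 0;
}

/* Take the lock now, release it when the enclosing scope ends. */
#define fake_rtnl_guard() \
	int fake_guard_token \
		__attribute__((cleanup(fake_rtnl_unlock), unused)) = \
		(fake_rtnl_lock(), 0)

static int held_inside_guard(void)
{
	fake_rtnl_guard();
	return fake_rtnl_held;	/* evaluated while the lock is still held */
}
```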

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03 09:59:38 +01:00
Jakub Kicinski
84c41dcaae Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-04-01 (ice)

This series contains updates to ice driver only.

Michal Schmidt changes flow for gettimex64 to use host-side spinlock
rather than hardware semaphore for lighter-weight locking.

Steven adds ability for switch recipes to be re-used when firmware
supports it.

Thorsten Blum removes unwanted newlines in netlink messaging.

Michal Swiatkowski and Piotr re-organize devlink related code; renaming,
moving, and consolidating it to a single location. Michal also
simplifies the devlink init and cleanup path to occur under a single
lock call.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: hold devlink lock for whole init/cleanup
  ice: move devlink port code to a separate file
  ice: move ice_devlink.[ch] to devlink folder
  ice: Remove newlines in NL_SET_ERR_MSG_MOD
  ice: Add switch recipe reusing feature
  ice: fold ice_ptp_read_time into ice_ptp_gettimex64
  ice: avoid the PTP hardware semaphore in gettimex64 path
  ice: add ice_adapter for shared data across PFs on the same NIC
====================

Link: https://lore.kernel.org/r/20240401172421.1401696-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 19:15:35 -07:00
Rob Herring
992c287d87 dt-bindings: net: snps,dwmac: Align 'snps,priority' type definition
'snps,priority' is also defined in dma/snps,dw-axi-dmac.yaml as a
uint32-array. It's preferred to have a single type for a given property
name, so update the type in snps,dwmac schema to match.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240401204422.1692359-2-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 19:11:22 -07:00
Jakub Kicinski
c1a6589faf Merge branch 'doc-netlink-add-a-yaml-spec-for-team'
Hangbin Liu says:

====================
doc/netlink: add a YAML spec for team

Add a YAML spec for team. As we need to link two objects together to form
the team module, rename team to team_core for linking.
====================

Link: https://lore.kernel.org/r/20240401031004.1159713-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:24:36 -07:00
Hangbin Liu
e57ba7e3d7 uapi: team: use header file generated from YAML spec
generated with:

 $ ./tools/net/ynl/ynl-gen-c.py --mode uapi \
 > --spec Documentation/netlink/specs/team.yaml \
 > --header -o include/uapi/linux/if_team.h

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240401031004.1159713-5-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:24:33 -07:00
Hangbin Liu
948dbafc15 net: team: use policy generated by YAML spec
generated with:

 $ ./tools/net/ynl/ynl-gen-c.py --mode kernel \
 > --spec Documentation/netlink/specs/team.yaml --source \
 > -o drivers/net/team/team_nl.c
 $ ./tools/net/ynl/ynl-gen-c.py --mode kernel \
 > --spec Documentation/netlink/specs/team.yaml --header \
 > -o drivers/net/team/team_nl.h

The TEAM_ATTR_LIST_PORT in team_nl_policy is removed as it is only in the
port list reply attributes.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240401031004.1159713-4-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:24:32 -07:00
Hangbin Liu
a0393e3e3d net: team: rename team to team_core for linking
Similar to commit 08d323234d ("net: fou: rename the source for linking"),
we'll need to link two objects together to form the team module.
This means the source can't be called team, because the build system
expects team.o to be the combined object.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240401031004.1159713-3-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:24:32 -07:00
Hangbin Liu
387724cbf4 Documentation: netlink: add a YAML spec for team
Add a YAML specification for team.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240401031004.1159713-2-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:24:32 -07:00
Jason Xing
9a79c65f00 tcp/dccp: complete lockless accesses to sk->sk_max_ack_backlog
Since commit 099ecf59f0 ("net: annotate lockless accesses to
sk->sk_max_ack_backlog") decided to handle the sk_max_ack_backlog
locklessly, there is one more function mostly called in TCP/DCCP
cases. So this patch completes it:)

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240331090521.71965-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:21:13 -07:00
Christophe JAILLET
f9a4506438 caif: Use UTILITY_NAME_LENGTH instead of hard-coding 16
UTILITY_NAME_LENGTH is 16. So better use the former when defining the
'utility_name' array. This makes the intent clearer when it is used around
line 260.

While at it, declare variable in reverse xmas tree style.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/8c1160501f69b64bb2d45ce9f26f746eec80ac77.1711787352.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:20:00 -07:00
Jakub Kicinski
5f0b6c94e3 Merge branch 'avoid-explicit-cpumask-var-allocation-on-stack'
Dawei Li says:

====================
Avoid explicit cpumask var allocation on stack

v1: https://lore.kernel.org/lkml/20240329105610.922675-1-dawei.li@shingroup.cn/
====================

Link: https://lore.kernel.org/r/20240331053441.1276826-1-dawei.li@shingroup.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:19:12 -07:00
Dawei Li
d33fe1714a net/dpaa2: Avoid explicit cpumask var allocation on stack
For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

Use *cpumask_var API(s) to address it.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Link: https://lore.kernel.org/r/20240331053441.1276826-3-dawei.li@shingroup.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:19:09 -07:00
Dawei Li
be4e130441 net/iucv: Avoid explicit cpumask var allocation on stack
For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

Use *cpumask_var API(s) to address it.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://lore.kernel.org/r/20240331053441.1276826-2-dawei.li@shingroup.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:19:09 -07:00
Krzysztof Kozlowski
ad6afdfc63 net: dsa: sja1105: drop driver owner assignment
Core in spi_register_driver() already sets the .owner, so driver
does not need to.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240330211023.100924-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:17:18 -07:00
Krzysztof Kozlowski
a343eb0343 net: dsa: microchip: drop driver owner assignment
Core in spi_register_driver() already sets the .owner, so driver
does not need to.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240330211023.100924-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:17:18 -07:00
Niklas Söderlund
8da891720c dt-bindings: net: renesas,ethertsn: Create child-node for MDIO bus
The bindings for Renesas Ethernet TSN were just merged in v6.9 and the
design for the bindings followed that of other Renesas Ethernet drivers
and thus did not force a child-node for the MDIO bus. As there
are no upstream drivers or users of this binding yet, take the
opportunity to correct this and force the usage of a child-node for the
MDIO bus.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20240330131228.1541227-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:16:32 -07:00
Jakub Kicinski
eb05529a10 Merge branch 'page_pool-allow-direct-bulk-recycling'
Alexander Lobakin says:

====================
page_pool: allow direct bulk recycling

Previously, there was no reliable way to check whether it's safe to use
direct PP cache. The drivers were passing @allow_direct to the PP
recycling functions and that was it. Bulk recycling is used by
xdp_return_frame_bulk() on .ndo_xdp_xmit() frames completion where
the page origin is unknown, thus the direct recycling has never been
tried.
Now that we have at least 2 ways of checking if we're allowed to perform
direct recycling -- pool->p.napi (Jakub) and pool->cpuid (Lorenzo), we
can use them when doing bulk recycling as well. Just move that logic
from the skb core to the PP core and call it before
__page_pool_put_page() every time @allow_direct is false.
Under high .ndo_xdp_xmit() traffic load, the win is 2-3% Pps assuming
the sending driver uses xdp_return_frame_bulk() on Tx completion.
====================

Link: https://lore.kernel.org/r/20240329165507.3240110-1-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:13:51 -07:00
Alexander Lobakin
39806b96c8 page_pool: try direct bulk recycling
Now that the checks for direct recycling possibility live inside the
Page Pool core, reuse them when performing bulk recycling.
page_pool_put_page_bulk() can be called from process context as well,
page_pool_napi_local() takes care of this at the very beginning.
Under high .ndo_xdp_xmit() traffic load, the win is 2-3% Pps assuming
the sending driver uses xdp_return_frame_bulk() on Tx completion.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20240329165507.3240110-3-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:13:49 -07:00
Alexander Lobakin
4a96a4e807 page_pool: check for PP direct cache locality later
Since we have pool->p.napi (Jakub) and pool->cpuid (Lorenzo) to check
whether it's safe to use direct recycling, we can use both globally for
each page instead of relying solely on @allow_direct argument.
Let's assume that @allow_direct means "I'm sure it's local, don't waste
time rechecking this" and when it's false, try the mentioned params to
still recycle the page directly. If neither is true, we'll lose some
CPU cycles, but then it surely won't be hotpath. On the other hand,
paths where it's possible to use direct cache, but not possible to
safely set @allow_direct, will benefit from this move.
The whole propagation of @napi_safe through a dozen skb freeing
functions can now go away, which saves us some stack space.
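
The decision described above can be sketched roughly as follows. This is
a deliberately reduced model: the struct is hypothetical, only a CPU id
check is shown, and the NAPI and execution-context checks of the real
code are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced pool: only the CPU it is pinned to (-1 if none). */
struct fake_pool {
	int cpuid;
};

static bool pool_is_local(const struct fake_pool *pool, int this_cpu)
{
	return pool->cpuid >= 0 && pool->cpuid == this_cpu;
}

static bool can_recycle_direct(const struct fake_pool *pool,
			       bool allow_direct, int this_cpu)
{
	/* The caller already vouches for locality: skip the recheck. */
	if (allow_direct)
		return true;

	/* Otherwise fall back to the pool's own locality information. */
	return pool_is_local(pool, this_cpu);
}
```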

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20240329165507.3240110-2-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:13:49 -07:00
Jonathan Neuschäfer
8db2509faa rhashtable: Improve grammar
Change "a" to "an" according to the usual rules, fix an "if" that
was mistyped as "in", improve grammar in "considerable slow" ->
"considerably slower".

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20240329-misc-rhashtable-v1-1-5862383ff798@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:03:32 -07:00
Jakub Kicinski
d6d647d7ba tools: ynl: add ynl_dump_empty() helper
Checking if dump is empty requires a couple of casts.
Add a convenient wrapper.

Add an example use in the netdev sample, loopback is always
present so an empty dump is an error.

Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20240329181651.319326-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 18:01:58 -07:00
Gustavo A. R. Silva
d88cabfd9a nfp: Avoid -Wflex-array-member-not-at-end warnings
-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting
ready to enable it globally.

There is currently an object (`tl`), at the beginning of multiple
structures, that contains a flexible structure (`struct nfp_dump_tl`),
for example:

struct nfp_dumpspec_csr {
        struct nfp_dump_tl tl;

        ...

        __be32 register_width;  /* in bits */
};

So, in order to avoid ending up with flexible-array members in the
middle of multiple other structs, we use the `struct_group_tagged()`
helper to separate the flexible array from the rest of the members
in the flexible structure:

struct nfp_dump_tl {
        struct_group_tagged(nfp_dump_tl_hdr, hdr,

        ... the rest of members

        );
        char data[];
};

With the change described above, we now declare objects of the type of
the tagged struct, in this case `struct nfp_dump_tl_hdr`, without
embedding flexible arrays in the middle of another struct:

struct nfp_dumpspec_csr {
        struct nfp_dump_tl_hdr tl;

	...

        __be32 register_width;  /* in bits */
};

Also, use `container_of()` whenever we need to retrieve a pointer to
the flexible structure, through which we can access the flexible
array if needed.

So, with these changes, fix 33 of the following warnings:
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:58:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:64:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:70:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:78:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:87:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:92:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Link: https://github.com/KSPP/linux/issues/202
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/ZgYWlkxdrrieDYIu@neat
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 17:58:50 -07:00
Paweł Owoc
c278ec6443 net: phy: aquantia: add support for AQR114C PHY ID
Add support for AQR114C PHY ID. This PHY advertises 10G speed:
SPEED(0x04): 0x6031
  capabilities: -400g +5g +2.5g -200g -25g -10g-xr -100g -40g -10g/1g -10
                +100 +1000 -10-ts -2-tl +10g
EXTABLE(0x0B): 0x40fc
  capabilities: -10g-cx4 -10g-lrm +10g-t +10g-kx4 +10g-kr +1000-t +1000-kx
                +100-tx -10-t -p2mp -40g/100g -1000/100-t1 -25g -200g/400g
                +2.5g/5g -1000-h

but supports only up to 5G speed (as with AQR111/111B0).
The AQR111 init config is used to limit the maximum speed to 5G.

Signed-off-by: Paweł Owoc <frut3k7@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240401145114.1699451-1-frut3k7@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-02 17:46:28 -07:00
Eric Dumazet
5fc68320c1 ipv6: remove RTNL protection from inet6_dump_fib()
No longer hold RTNL while calling inet6_dump_fib().

Also change return value for a completed dump,
so that NLMSG_DONE can be appended to current skb,
saving one recvmsg() system call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240329183053.644630-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-01 21:44:57 -07:00
Jakub Kicinski
edaa34e68c Merge branch 'genetlink-remove-linux-genetlink-h'
Jakub Kicinski says:

====================
genetlink: remove linux/genetlink.h

There are two genetlink headers: net/genetlink.h and linux/genetlink.h.
This is similar to netlink.h, but for netlink.h both contain a good
amount of code. For genetlink.h the linux/ version is a leftover
from before uAPI headers were split out; it has 10 lines of code.
Move those 10 lines into other appropriate headers and delete
linux/genetlink.h.

I occasionally open the wrong header in the editor when coding;
I guess I'm not the only one.

v2: https://lore.kernel.org/all/20240325173716.2390605-1-kuba@kernel.org/
v1: https://lore.kernel.org/all/20240309183458.3014713-1-kuba@kernel.org
====================

Link: https://lore.kernel.org/r/20240329175710.291749-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-01 21:44:36 -07:00
Jakub Kicinski
cd7209628c genetlink: remove linux/genetlink.h
genetlink.h is a shell of what used to be a combined uAPI
and kernel header over a decade ago. It has fewer than
10 lines of code. Merge it into net/genetlink.h.
In some ways it'd be better to keep the combined header
under linux/ but it would make looking through git history
harder.

Acked-by: Sven Eckelmann <sven@narfation.org>
Link: https://lore.kernel.org/r/20240329175710.291749-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-01 21:44:34 -07:00
Jakub Kicinski
f97c9b533a net: openvswitch: remove unnecessary linux/genetlink.h include
The only legit reason I could think of for net/genetlink.h
and linux/genetlink.h to be separate would be if one was
included by other headers and we wanted to keep it lightweight.
That is not the case, net/openvswitch/meter.h includes
linux/genetlink.h but for no apparent reason (for struct genl_family
perhaps? it's not necessary, types of externs do not need
to be known).
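
A small illustration of the point about externs, using hypothetical names (`demo_family`, `demo_fam`, `demo_get_family()`) that do not come from the kernel: a forward declaration is enough to declare an extern object and take its address, so a header does not need to pull in the full type definition just for that.

```c
/* Forward declaration only: the layout of the struct is not visible here. */
struct demo_family;

/* An extern object of incomplete type is a valid declaration. */
extern struct demo_family demo_fam;

/* Taking the address needs neither the size nor the layout of the type. */
static struct demo_family *demo_get_family(void)
{
	return &demo_fam;
}

/*
 * In a real tree the definition would live in another file that includes
 * the full header; it is placed here only to keep the sketch linkable.
 */
struct demo_family {
	int id;
};

struct demo_family demo_fam = { .id = 42 };
```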

Link: https://lore.kernel.org/r/20240329175710.291749-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-01 21:44:34 -07:00
Jakub Kicinski
5bc63d3a6f netlink: create a new header for internal genetlink symbols
There are things in linux/genetlink.h which are only used
under net/netlink/. Move them to a new local header.
A new header with just 2 externs isn't great, but the alternative
would be to include af_netlink.h in genetlink.c, which feels
even worse.

Link: https://lore.kernel.org/r/20240329175710.291749-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-01 21:44:34 -07:00
Jakub Kicinski
092ca10741 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-03-29 (net: intel)

This series contains updates to most Intel drivers.

Jesse moves the declaration of the pci_driver struct to remove the need
for forward declarations in igb and converts Intel drivers to use newer
power management ops.

Sasha reworks power management flow on igc to avoid using rtnl_lock()
during those flows.

Maciej reorganizes i40e_nvm file to avoid forward declarations.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  i40e: avoid forward declarations in i40e_nvm.c
  igc: Refactor runtime power management flow
  net: intel: implement modern PM ops declarations
  igb: simplify pci ops declaration
====================

Link: https://lore.kernel.org/r/20240329175632.211340-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-01 21:44:12 -07:00
Eric Dumazet
1eeb504357 tcp/dccp: do not care about families in inet_twsk_purge()
We lost the ability to unload the ipv6 module a long time ago.

Instead of calling expensive inet_twsk_purge() twice,
we can handle all families in one round.

Also remove an extra line added in my prior patch,
per Kuniyuki Iwashima's feedback.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/netdev/20240327192934.6843-1-kuniyu@amazon.com/
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240329153203.345203-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-01 21:27:58 -07:00