Commit Graph

1412675 Commits

Author SHA1 Message Date
Bastien Curutchet (Schneider Electric)
d99c1a01ac net: dsa: microchip: Use regs[] to access REG_PTP_SUBNANOSEC_RATE
Accesses to the PTP_SUBNANOSEC_RATE register are done through a
hardcoded address which doesn't match with the KSZ8463's register
layout.

Add a new entry for the PTP_SUBNANOSEC_RATE register in the regs[]
tables.
Use the regs[] table to retrieve the PTP_SUBNANOSEC_RATE register
address when accessing it.
Remove the macro defining the address to prevent further use.

Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20260105-ksz-rework-v1-7-a68df7f57375@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-08 13:01:16 +01:00
Bastien Curutchet (Schneider Electric)
5b1fe74fac net: dsa: microchip: Use regs[] to access REG_PTP_RTC_SUB_NANOSEC
Accesses to the PTP_RTC_SUB_NANOSEC register are done through a
hardcoded address which doesn't match with the KSZ8463's register
layout.

Add a new entry for the PTP_RTC_SUB_NANOSEC register in the regs[]
tables.
Use the regs[] table to retrieve the PTP_RTC_SUB_NANOSEC register
address when accessing it.
Remove the macro defining the address to prevent further use.

Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20260105-ksz-rework-v1-6-a68df7f57375@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-08 13:01:16 +01:00
Bastien Curutchet (Schneider Electric)
776ad30de0 net: dsa: microchip: Use regs[] to access REG_PTP_RTC_SEC
Accesses to the PTP_RTC_SEC register are done through a hardcoded
address which doesn't match with the KSZ8463's register layout.

Add a new entry for the PTP_RTC_SEC register in the regs[] tables.
Use the regs[] table to retrieve the PTP_RTC_SEC register address
when accessing it.
Remove the macro defining the address to prevent further use.

Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20260105-ksz-rework-v1-5-a68df7f57375@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-08 13:01:16 +01:00
Bastien Curutchet (Schneider Electric)
0ee0566fc2 net: dsa: microchip: Use regs[] to access REG_PTP_RTC_NANOSEC
Accesses to the PTP_RTC_NANOSEC register are done through a hardcoded
address which doesn't match with the KSZ8463's register layout.

Add a new entry for the PTP_RTC_NANOSEC register in the regs[] tables.
Use the regs[] table to retrieve the PTP_RTC_NANOSEC register address
when accessing it.
Remove the macro defining the address to prevent further use.

Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20260105-ksz-rework-v1-4-a68df7f57375@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-08 13:01:16 +01:00
Bastien Curutchet (Schneider Electric)
62382d6ffe net: dsa: microchip: Use regs[] to access REG_PTP_CLK_CTRL
Accesses to the PTP_CLK_CTRL register are done through a hardcoded
address which doesn't match with the KSZ8463's register layout.

Add a new entry for the PTP_CLK_CTRL register in the regs[] tables.
Use the regs[] table to retrieve the PTP_CLK_CTRL register address
when accessing it.
Remove the macro defining the address to prevent further use.

Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20260105-ksz-rework-v1-3-a68df7f57375@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-08 13:01:16 +01:00
Bastien Curutchet (Schneider Electric)
22bde912e8 net: dsa: microchip: Use dynamic irq offset
The PTP irq_chip operations use an hardcoded IRQ offset in the bit
logic. This IRQ offset isn't the same on KSZ8463 than on others switches
so it can't use the irq_chip operations.

Convey the interrupt bit offset through a new attribute in struct ksz_irq

Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20260105-ksz-rework-v1-2-a68df7f57375@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-08 13:01:16 +01:00
Bastien Curutchet (Schneider Electric)
813feab1ac net: dsa: microchip: Initialize IRQ's mask outside common_setup()
The IRQ logic of the KSZ8463 differs from that of other KSZ switches.
It doesn't have a 'mask' register but an 'enable' one instead. The
common IRQ framework can still be used though as soon as we reverse
the logic (using '1' to enable interrupts instead of '0') for KSZ8463
cases.

Move the initialization of the kirq->masked outside of
ksz_irq_common_setup() to keep this function truly common when
IRQ support for the KSZ8463 is added.

Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20260105-ksz-rework-v1-1-a68df7f57375@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-08 13:01:16 +01:00
Thomas Fourier
8e7148b560 atm: idt77252: Use sb_pool_remove()
Replacing the manual pool remove with the dedicated function.  This is
safer and more consistent with the rest of the code[1].

[1]; https://lore.kernel.org/all/20250625094013.GL1562@horms.kernel.org/

Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://patch.msgid.link/20260105212916.26678-2-fourier.thomas@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 18:04:59 -08:00
Willem de Bruijn
3b7a108c41 selftests/net: packetdrill: add minimal client and server tests
Introduce minimal tests. These can serve as simple illustrative
examples, and as templates when writing new tests.

When adding new cases, it can be easier to extend an existing base
test rather than start from scratch. The existing tests all focus on
real, often non-trivial, features. It is not obvious which to take as
starting point, and arguably none really qualify.

Add two tests
- the client test performs the active open and initial close
- the server test implements the passive open and final close

Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260105172529.3514786-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:38:26 -08:00
Daniel Sedlak
55ffb0b14a tcp: clarify tcp_congestion_ops functions comments
The optional and required hints in the tcp_congestion_ops are information
for the user of this interface to signalize its importance when
implementing these functions.

However, cong_avoid comment incorrectly tells that it is required,
in reality congestion control must provide one of either cong_avoid or
cong_control.

In addition, min_tso_segs has not had any comment optional/required
hints. So mark it as optional since it is used only in BBR.

Co-developed-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Daniel Sedlak <daniel.sedlak@cdn77.com>
Link: https://patch.msgid.link/20260105115533.1151442-1-daniel.sedlak@cdn77.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:35:17 -08:00
Vivian Wang
f66086798f net: spacemit: Remove broken flow control support
The current flow control implementation doesn't handle autonegotiation
and ethtool operations properly. Remove it for now so we don't claim
support for something that doesn't really work. A better implementation
will be sent in future patches.

Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260104-k1-ethernet-actually-remove-fc-v3-1-3871b055064c@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:15:36 -08:00
Breno Leitao
48b27ea623 net: gve: convert to use .get_rx_ring_count
Convert the Google Virtual Ethernet (GVE) driver to use the new
.get_rx_ring_count ethtool operation instead of handling
ETHTOOL_GRXRINGS in .get_rxnfc. This simplifies the code by moving the
ring count query to a dedicated callback.

The new callback provides the same functionality in a more direct way,
following the ongoing ethtool API modernization.

Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260105-gxring_google-v2-1-e7cfe924d429@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:15:00 -08:00
Eric Dumazet
27a01c1969 net: fully inline backlog_unlock_irq_restore()
Some arches (like x86) do not inline spin_unlock_irqrestore().

backlog_unlock_irq_restore() is in RPS/RFS critical path,
we prefer using spin_unlock() + local_irq_restore() for
optimal performance.

Also change backlog_unlock_irq_restore() second argument
to avoid a pointless dereference.

No difference in net/core/dev.o code size.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260105163054.13698-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:14:35 -08:00
Francesco Dolcini
3f049b6534 net: fec: Add stop mode support on i.MX8DX/i.MX8QP
Add additional machines that requires communication to the SC firmware
to set the GPR bit required for stop mode support.

NXP i.MX8DX (fsl,imx8dx) is a low end version of i.MX8QXP (fsl,imx8qxp),
while NXP i.MX8QP (fsl,imx8qp) is a low end version of i.MX8QM
(fsl,imx8qm).

Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260105152452.84338-1-francesco@dolcini.it
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:12:58 -08:00
Yeounsu Moon
b70c5c4923 net: dlink: replace printk() with netdev_{info,dbg}() in rio_probe1()
Replace rio_probe1() printk(KERN_INFO) messages with netdev_{info,dbg}().

Keep one netdev_info() line for device identification; move the rest to
netdev_dbg() to avoid spamming the kernel log.

Log rx_timeout on a separate line since netdev_*() prefixes each
message and the multi-line formatting looks broken otherwise.

No functional change intended.

Tested-on: D-Link DGE-550T Rev-A3
Signed-off-by: Yeounsu Moon <yyyynoom@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260105130552.8721-2-yyyynoom@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:11:38 -08:00
Lorenzo Bianconi
4d513329b8 net: airoha: Use gdm port enum value whenever possible
Use AIROHA_GDMx_IDX enum value whenever possible.
This patch is just cosmetic changes and does not introduce any logic one.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260105-airoha-use-port-idx-enum-v1-1-503ca5763858@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:06:40 -08:00
Eric Dumazet
e9cd04b281 udp: udplite is unlikely
Add some unlikely() annotations to speed up the fast path,
at least with clang compiler.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260105101719.2378881-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:06:03 -08:00
Lorenzo Bianconi
e4bc5dd53b net: airoha: npu: Dump fw version during probe
Dump firmware version running on the npu during module probe.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260105-airoha-npu-dump-fw-v1-1-36d8326975f8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:04:36 -08:00
Jiawen Wu
d362f44633 net: libwx: remove unused rx_buffer_pgcnt
The variable rx_buffer_pgcnt is redundant, just remove it.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://patch.msgid.link/F0907C8394B2D4A8+20260105071158.49929-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:04:07 -08:00
Gustavo A. R. Silva
c86af46b9c ipv4/inet_sock.h: Avoid thousands of -Wflex-array-member-not-at-end warnings
Use DEFINE_RAW_FLEX() to avoid thousands of -Wflex-array-member-not-at-end
warnings.

Remove struct ip_options_data, and adjust the rest of the code so that
flexible-array member struct ip_options_rcu::opt.__data[] ends last
in struct icmp_bxm.

Compensate for this by using the DEFINE_RAW_FLEX() helper to define each
on-stack struct instance that contained struct ip_options_data as a member,
and to define struct ip_options_rcu with a fixed on-stack size for its
nested flexible-array member opt.__data[].

Also, add a couple of code comments to prevent people from adding members
to a struct after another member that contains a flexible array.

With these changes, fix 2600 warnings of the following type:

include/net/inet_sock.h:65:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://patch.msgid.link/aVteBadWA6AbTp7X@kspp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:02:52 -08:00
Slark Xiao
915a5f60ad net: wwan: mhi: Add network support for Foxconn T99W760
T99W760 is designed based on Qualcomm SDX35 chip. It use similar
architecture with SDX72/SDX75 chip. So we need to assign initial
link id for this device to make sure network available.

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Link: https://patch.msgid.link/20260105022646.10630-1-slark_xiao@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:00:55 -08:00
Jakub Kicinski
c4df070a57 selftests: hw-net: rss-input-xfrm: try to enable the xfrm at the start
The test currently SKIPs if the symmetric RSS xfrm is not enabled
by default. This leads to spurious SKIPs in the Intel CI reporting
results to NIPA.

Testing on CX7:

 # ./drivers/net/hw/rss_input_xfrm.py
  TAP version 13
  1..2
  ok 1 rss_input_xfrm.test_rss_input_xfrm_ipv4 # SKIP Test requires IPv4 connectivity
  # Sym input xfrm already enabled: {'sym-or-xor'}
  ok 2 rss_input_xfrm.test_rss_input_xfrm_ipv6
  # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:1 error:0

 # ethtool -X eth0 xfrm none

 # ./drivers/net/hw/rss_input_xfrm.py
  TAP version 13
  1..2
  ok 1 rss_input_xfrm.test_rss_input_xfrm_ipv4 # SKIP Test requires IPv4 connectivity
  # Sym input xfrm configured: {'sym-or-xor'}
  ok 2 rss_input_xfrm.test_rss_input_xfrm_ipv6
  # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:1 error:0

Link: https://patch.msgid.link/20260104184600.795280-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 16:54:10 -08:00
Yumei Huang
cb3de96eea ipv6: preserve insertion order for same-scope addresses
IPv6 addresses with the same scope are returned in reverse insertion
order, unlike IPv4. For example, when adding a -> b -> c, the list is
reported as c -> b -> a, while IPv4 preserves the original order.

This behavior causes:

a. When using `ip -6 a save` and `ip -6 a restore`, addresses are restored
   in the opposite order from which they were saved. See example below
   showing addresses added as 1::1, 1::2, 1::3 but displayed and saved
   in reverse order.

   # ip -6 a a 1::1 dev x
   # ip -6 a a 1::2 dev x
   # ip -6 a a 1::3 dev x
   # ip -6 a s dev x
   2: x: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
       inet6 1::3/128 scope global tentative
       valid_lft forever preferred_lft forever
       inet6 1::2/128 scope global tentative
       valid_lft forever preferred_lft forever
       inet6 1::1/128 scope global tentative
       valid_lft forever preferred_lft forever
   # ip -6 a save > dump
   # ip -6 a d 1::1 dev x
   # ip -6 a d 1::2 dev x
   # ip -6 a d 1::3 dev x
   # ip a d ::1 dev lo
   # ip a restore < dump
   # ip -6 a s dev x
   2: x: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
       inet6 1::1/128 scope global tentative
       valid_lft forever preferred_lft forever
       inet6 1::2/128 scope global tentative
       valid_lft forever preferred_lft forever
       inet6 1::3/128 scope global tentative
       valid_lft forever preferred_lft forever
   # ip a showdump < dump
    if1:
        inet6 ::1/128 scope host proto kernel_lo
        valid_lft forever preferred_lft forever
    if2:
        inet6 1::3/128 scope global tentative
        valid_lft forever preferred_lft forever
    if2:
        inet6 1::2/128 scope global tentative
        valid_lft forever preferred_lft forever
    if2:
        inet6 1::1/128 scope global tentative
        valid_lft forever preferred_lft forever

b. Addresses in pasta to appear in reversed order compared to host
   addresses.

The ipv6 addresses were added in reverse order by commit e55ffac601
("[IPV6]: order addresses by scope"), then it was changed by commit
502a2ffd73 ("ipv6: convert idev_list to list macros"), and restored by
commit b54c9b98bb ("ipv6: Preserve pervious behavior in
ipv6_link_dev_addr()."). However, this reverse ordering within the same
scope causes inconsistency with IPv4 and the issues described above.

This patch aligns IPv6 address ordering with IPv4 for consistency
by changing the comparison from >= to > when inserting addresses
into the address list. Also updates the ioam6 selftest to reflect
the new address ordering behavior. Combine these two changes into
one patch for bisectability.

Link: https://bugs.passt.top/show_bug.cgi?id=175
Suggested-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Yumei Huang <yuhuang@redhat.com>
Acked-by: Justin Iurman <justin.iurman@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260104032357.38555-1-yuhuang@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 16:41:35 -08:00
Jakub Kicinski
956f569c90 Merge branch 'rust-net-replace-kernel-c_str-with-c-strings'
Tamir Duberstein says:

====================
rust: net: replace `kernel::c_str!` with C-Strings

C-String literals were added in Rust 1.77. Replace instances of
`kernel::c_str!` with C-String literals where possible.
====================

Link: https://patch.msgid.link/20260103-cstr-net-v2-0-8688f504b85d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 16:32:41 -08:00
Tamir Duberstein
5a69d30f30 drivers: net: replace kernel::c_str! with C-Strings
C-String literals were added in Rust 1.77. Replace instances of
`kernel::c_str!` with C-String literals where possible.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Benno Lossin <lossin@kernel.org>
Reviewed-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://patch.msgid.link/20260103-cstr-net-v2-2-8688f504b85d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 16:32:39 -08:00
Tamir Duberstein
7a8461a2a8 rust: net: replace kernel::c_str! with C-Strings
C-String literals were added in Rust 1.77. Replace instances of
`kernel::c_str!` with C-String literals where possible.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Benno Lossin <lossin@kernel.org>
Reviewed-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://patch.msgid.link/20260103-cstr-net-v2-1-8688f504b85d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 16:32:39 -08:00
Alok Tiwari
32291cb036 net: marvell: prestera: correct return type of prestera_ldr_wait_buf()
prestera_ldr_wait_buf() returns the result of readl_poll_timeout(),
which is 0 on success or -ETIMEDOUT on failure. Its current return
type is u32.

Assigning this u32 value to an int variable works today because the
bit pattern of (u32)-ETIMEDOUT (0xffffff92) is correctly interpreted
as -ETIMEDOUT when stored in an int. However, keeping the function
return type as u32 is misleading and fragile.

Change the return type to int to reflect the actual semantic of
returning negative error codes and to avoid potential issues with
unsigned casts. There is no functional change.

Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260103174313.1172197-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 16:18:03 -08:00
Aleksander Jan Bajkowski
2e22977154 net: phy: mediatek: enable interrupts on AN7581
Interrupts work just like on MT7988.

Suggested-by: Benjamin Larsson <benjamin.larsson@genexis.eu>
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260102113222.3519900-1-olek2@wp.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 16:16:53 -08:00
Clara Engler
48a4aa9d9c ipv4: Improve martian logs
At the current moment, the logs for martian packets are as follows:
```
martian source {DST} from {SRC}, on dev {DEV}
martian destination {DST} from {SRC}, dev {DEV}
```

These messages feel rather hard to understand in production, especially
the "martian source" one, mostly because it is grammatically ambitious
to parse which part is now the source address and which part is the
destination address.  For example, "{DST}" may there be interpreted as
the actual source address due to following the word "source", thereby
implying the actual source address to be the destination one.

Personally, I discovered this bug while toying around with TUN
interfaces and using them as a tunnel (receiving packets via a TUN
interface and sending them over a TCP stream; receiving packets from a
TCP stream and writing them to a TUN).[^1]

When these IP addresses contained local IPs (i.e. 10.0.0.0/8 in source
and destination), everything worked fine.  However, sending them to a
real routable IP address on the internet led to them being treated as a
martian packet, obviously.  Using a few sysctl(8) and iptables(8)
settings[^2] fixed it, but while debugging I found the log message
starting with "martian source" rather confusing, as I was unsure on
whether the packet that gets dropped was the packet originating from me
or the response from the endpoint, as "martian source <ROUTABLE IP>"
could also be falsely interpreted as the response packet being martian,
due to the word "source" followed by the routable IP address, implying
the source address of that packet is set to this IP, as explained above.
In the end, I had to look into the source code of the kernel on where
this error message gets generated, which is usually an indicator of
there being room for improvement with regard to this error message.

In terms of improvement, this commit changes the error messages for
martian source and martian destination packets as follows:
```
martian source (src={SRC}, dst={DST}, dev={DEV})
martian destination (src={SRC}, dst={DST}, dev={DEV})
```

These new wordings leave pretty much no room for ambiguity as all
parameters are prefixed with a respective key explaining their semantic
meaning.

See also the following thread on LKML.[^3]

[^1]: <https://backreference.org/2010/03/26/tuntap-interface-tutorial>
[^2]: sysctl net.ipv4.ip_forward=1 && \
      iptables -A INPUT -i tun0 -j ACCEPT && \
      iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
[^3]: <https://lore.kernel.org/all/aSd4Xj8rHrh-krjy@4944566b5c925f79/>

Signed-off-by: Clara Engler <cve@cve.cx>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260101125114.2608-1-cve@cve.cx
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 16:15:50 -08:00
Robert Marko
c303e8b86d dt-bindings: net: mscc-miim: add microchip,lan9691-miim
Document Microchip LAN969x MIIM compatible.

Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20251229184004.571837-11-robert.marko@sartura.hr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-04 09:58:51 -08:00
Linus Torvalds
dbf8fe85a1 Merge tag 'net-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from Bluetooth and WiFi. Notably this includes the fix
  for the iwlwifi issue you reported.

  Current release - regressions:

   - core: avoid prefetching NULL pointers

   - wifi:
      - iwlwifi: implement settime64 as stub for MVM/MLD PTP
      - mac80211: fix list iteration in ieee80211_add_virtual_monitor()

   - handshake: fix null-ptr-deref in handshake_complete()

   - eth: mana: fix use-after-free in reset service rescan path

  Previous releases - regressions:

   - openvswitch: avoid needlessly taking the RTNL on vport destroy

   - dsa: properly keep track of conduit reference

   - ipv4:
      - fix error route reference count leak with nexthop objects
      - fib: restore ECMP balance from loopback

   - mptcp: ensure context reset on disconnect()

   - bluetooth: fix potential UaF in btusb

   - nfc: fix deadlock between nfc_unregister_device and
     rfkill_fop_write

   - eth:
      - gve: defer interrupt enabling until NAPI registration
      - i40e: fix scheduling in set_rx_mode
      - macb: relocate mog_init_rings() callback from macb_mac_link_up()
        to macb_open()
      - rtl8150: fix memory leak on usb_submit_urb() failure

   - wifi: 8192cu: fix tid out of range in rtl92cu_tx_fill_desc()

  Previous releases - always broken:

   - ip6_gre: make ip6gre_header() robust

   - ipv6: fix a BUG in rt6_get_pcpu_route() under PREEMPT_RT

   - af_unix: don't post cmsg for SO_INQ unless explicitly asked for

   - phy: mediatek: fix nvmem cell reference leak in
     mt798x_phy_calibration

   - wifi: mac80211: discard beacon frames to non-broadcast address

   - eth:
      - iavf: fix off-by-one issues in iavf_config_rss_reg()
      - stmmac: fix the crash issue for zero copy XDP_TX action
      - team: fix check for port enabled when priority changes"

* tag 'net-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits)
  ipv6: fix a BUG in rt6_get_pcpu_route() under PREEMPT_RT
  net: rose: fix invalid array index in rose_kill_by_device()
  net: enetc: do not print error log if addr is 0
  net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
  selftests: fib_test: Add test case for ipv4 multi nexthops
  net: fib: restore ECMP balance from loopback
  selftests: fib_nexthops: Add test cases for error routes deletion
  ipv4: Fix reference count leak when using error routes with nexthop objects
  net: usb: sr9700: fix incorrect command used to write single register
  ipv6: BUG() in pskb_expand_head() as part of calipso_skbuff_setattr()
  usbnet: avoid a possible crash in dql_completed()
  gve: defer interrupt enabling until NAPI registration
  net: stmmac: fix the crash issue for zero copy XDP_TX action
  octeontx2-pf: fix "UBSAN: shift-out-of-bounds error"
  af_unix: don't post cmsg for SO_INQ unless explicitly asked for
  net: mana: Fix use-after-free in reset service rescan path
  net: avoid prefetching NULL pointers
  net: bridge: Describe @tunnel_hash member in net_bridge_vlan_group struct
  net: nfc: fix deadlock between nfc_unregister_device and rfkill_fop_write
  net: usb: asix: validate PHY address before use
  ...
2025-12-30 08:45:58 -08:00
Jiayuan Chen
1adaea51c6 ipv6: fix a BUG in rt6_get_pcpu_route() under PREEMPT_RT
On PREEMPT_RT kernels, after rt6_get_pcpu_route() returns NULL, the
current task can be preempted. Another task running on the same CPU
may then execute rt6_make_pcpu_route() and successfully install a
pcpu_rt entry. When the first task resumes execution, its cmpxchg()
in rt6_make_pcpu_route() will fail because rt6i_pcpu is no longer
NULL, triggering the BUG_ON(prev). It's easy to reproduce it by adding
mdelay() after rt6_get_pcpu_route().

Using preempt_disable/enable is not appropriate here because
ip6_rt_pcpu_alloc() may sleep.

Fix this by handling the cmpxchg() failure gracefully on PREEMPT_RT:
free our allocation and return the existing pcpu_rt installed by
another task. The BUG_ON is replaced by WARN_ON_ONCE for non-PREEMPT_RT
kernels where such races should not occur.

Link: https://syzkaller.appspot.com/bug?extid=9b35e9bc0951140d13e6
Fixes: d2d6422f8b ("x86: Allow to enable PREEMPT_RT.")
Reported-by: syzbot+9b35e9bc0951140d13e6@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6918cd88.050a0220.1c914e.0045.GAE@google.com/T/
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20251223051413.124687-1-jiayuan.chen@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-30 12:04:36 +01:00
Pwnverse
6595beb40f net: rose: fix invalid array index in rose_kill_by_device()
rose_kill_by_device() collects sockets into a local array[] and then
iterates over them to disconnect sockets bound to a device being brought
down.

The loop mistakenly indexes array[cnt] instead of array[i]. For cnt <
ARRAY_SIZE(array), this reads an uninitialized entry; for cnt ==
ARRAY_SIZE(array), it is an out-of-bounds read. Either case can lead to
an invalid socket pointer dereference and also leaks references taken
via sock_hold().

Fix the index to use i.

Fixes: 64b8bc7d5f ("net/rose: fix races in rose_kill_by_device()")
Co-developed-by: Fatma Alwasmi <falwasmi@purdue.edu>
Signed-off-by: Fatma Alwasmi <falwasmi@purdue.edu>
Signed-off-by: Pwnverse <stanksal@purdue.edu>
Link: https://patch.msgid.link/20251222212227.4116041-1-ritviktanksalkar@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-30 11:45:51 +01:00
Wei Fang
5939b6dbcd net: enetc: do not print error log if addr is 0
A value of 0 for addr indicates that the IEB_LBCR register does not
need to be configured, as its default value is 0. However, the driver
will print an error log if addr is 0, so this issue needs to be fixed.

Fixes: 50bfd9c06f ("net: enetc: set external PHY address in IERB for i.MX94 ENETC")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20251222022628.4016403-1-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-30 11:30:41 +01:00
Xiaolei Wang
99537d5c47 net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
In the non-RT kernel, local_bh_disable() merely disables preemption,
whereas it maps to an actual spin lock in the RT kernel. Consequently,
when attempting to refill RX buffers via netdev_alloc_skb() in
macb_mac_link_up(), a deadlock scenario arises as follows:

   WARNING: possible circular locking dependency detected
   6.18.0-08691-g2061f18ad76e #39 Not tainted
   ------------------------------------------------------
   kworker/0:0/8 is trying to acquire lock:
   ffff00080369bbe0 (&bp->lock){+.+.}-{3:3}, at: macb_start_xmit+0x808/0xb7c

   but task is already holding lock:
   ffff000803698e58 (&queue->tx_ptr_lock){+...}-{3:3}, at: macb_start_xmit
   +0x148/0xb7c

   which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

   -> #3 (&queue->tx_ptr_lock){+...}-{3:3}:
          rt_spin_lock+0x50/0x1f0
          macb_start_xmit+0x148/0xb7c
          dev_hard_start_xmit+0x94/0x284
          sch_direct_xmit+0x8c/0x37c
          __dev_queue_xmit+0x708/0x1120
          neigh_resolve_output+0x148/0x28c
          ip6_finish_output2+0x2c0/0xb2c
          __ip6_finish_output+0x114/0x308
          ip6_output+0xc4/0x4a4
          mld_sendpack+0x220/0x68c
          mld_ifc_work+0x2a8/0x4f4
          process_one_work+0x20c/0x5f8
          worker_thread+0x1b0/0x35c
          kthread+0x144/0x200
          ret_from_fork+0x10/0x20

   -> #2 (_xmit_ETHER#2){+...}-{3:3}:
          rt_spin_lock+0x50/0x1f0
          sch_direct_xmit+0x11c/0x37c
          __dev_queue_xmit+0x708/0x1120
          neigh_resolve_output+0x148/0x28c
          ip6_finish_output2+0x2c0/0xb2c
          __ip6_finish_output+0x114/0x308
          ip6_output+0xc4/0x4a4
          mld_sendpack+0x220/0x68c
          mld_ifc_work+0x2a8/0x4f4
          process_one_work+0x20c/0x5f8
          worker_thread+0x1b0/0x35c
          kthread+0x144/0x200
          ret_from_fork+0x10/0x20

   -> #1 ((softirq_ctrl.lock)){+.+.}-{3:3}:
          lock_release+0x250/0x348
          __local_bh_enable_ip+0x7c/0x240
          __netdev_alloc_skb+0x1b4/0x1d8
          gem_rx_refill+0xdc/0x240
          gem_init_rings+0xb4/0x108
          macb_mac_link_up+0x9c/0x2b4
          phylink_resolve+0x170/0x614
          process_one_work+0x20c/0x5f8
          worker_thread+0x1b0/0x35c
          kthread+0x144/0x200
          ret_from_fork+0x10/0x20

   -> #0 (&bp->lock){+.+.}-{3:3}:
          __lock_acquire+0x15a8/0x2084
          lock_acquire+0x1cc/0x350
          rt_spin_lock+0x50/0x1f0
          macb_start_xmit+0x808/0xb7c
          dev_hard_start_xmit+0x94/0x284
          sch_direct_xmit+0x8c/0x37c
          __dev_queue_xmit+0x708/0x1120
          neigh_resolve_output+0x148/0x28c
          ip6_finish_output2+0x2c0/0xb2c
          __ip6_finish_output+0x114/0x308
          ip6_output+0xc4/0x4a4
          mld_sendpack+0x220/0x68c
          mld_ifc_work+0x2a8/0x4f4
          process_one_work+0x20c/0x5f8
          worker_thread+0x1b0/0x35c
          kthread+0x144/0x200
          ret_from_fork+0x10/0x20

   other info that might help us debug this:

   Chain exists of:
     &bp->lock --> _xmit_ETHER#2 --> &queue->tx_ptr_lock

    Possible unsafe locking scenario:

          CPU0                    CPU1
          ----                    ----
     lock(&queue->tx_ptr_lock);
                                  lock(_xmit_ETHER#2);
                                  lock(&queue->tx_ptr_lock);
     lock(&bp->lock);

    *** DEADLOCK ***

   Call trace:
    show_stack+0x18/0x24 (C)
    dump_stack_lvl+0xa0/0xf0
    dump_stack+0x18/0x24
    print_circular_bug+0x28c/0x370
    check_noncircular+0x198/0x1ac
    __lock_acquire+0x15a8/0x2084
    lock_acquire+0x1cc/0x350
    rt_spin_lock+0x50/0x1f0
    macb_start_xmit+0x808/0xb7c
    dev_hard_start_xmit+0x94/0x284
    sch_direct_xmit+0x8c/0x37c
    __dev_queue_xmit+0x708/0x1120
    neigh_resolve_output+0x148/0x28c
    ip6_finish_output2+0x2c0/0xb2c
    __ip6_finish_output+0x114/0x308
    ip6_output+0xc4/0x4a4
    mld_sendpack+0x220/0x68c
    mld_ifc_work+0x2a8/0x4f4
    process_one_work+0x20c/0x5f8
    worker_thread+0x1b0/0x35c
    kthread+0x144/0x200
    ret_from_fork+0x10/0x20

Notably, invoking the mog_init_rings() callback upon link establishment
is unnecessary. Instead, we can exclusively call mog_init_rings() within
the ndo_open() callback. This adjustment resolves the deadlock issue.
Furthermore, since MACB_CAPS_MACB_IS_EMAC cases do not use mog_init_rings()
when opening the network interface via at91ether_open(), moving
mog_init_rings() to macb_open() also eliminates the MACB_CAPS_MACB_IS_EMAC
check.

Fixes: 633e98a711 ("net: macb: use resolved link config in mac_link_up()")
Cc: stable@vger.kernel.org
Suggested-by: Kevin Hao <kexin.hao@windriver.com>
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Link: https://patch.msgid.link/20251222015624.1994551-1-xiaolei.wang@windriver.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-30 11:27:45 +01:00
Vadim Fedorenko
3be42c3b3d selftests: fib_test: Add test case for ipv4 multi nexthops
The test checks that with multi nexthops route the preferred route is the
one which matches source ip. In case when source ip is on dummy
interface, it checks that the routes are balanced.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251221192639.3911901-2-vadim.fedorenko@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-30 11:07:38 +01:00
Vadim Fedorenko
6e17474aa9 net: fib: restore ECMP balance from loopback
Preference of nexthop with source address broke ECMP for packets with
source addresses which are not in the broadcast domain, but rather added
to loopback/dummy interfaces. Original behaviour was to balance over
nexthops while now it uses the latest nexthop from the group. To fix the
issue introduce next hop scoring system where next hops with source
address equal to requested will always have higher priority.

For the case with 198.51.100.1/32 assigned to dummy0 and routed using
192.0.2.0/24 and 203.0.113.0/24 networks:

2: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether d6:54:8a:ff:78:f5 brd ff:ff:ff:ff:ff:ff
    inet 198.51.100.1/32 scope global dummy0
       valid_lft forever preferred_lft forever
7: veth1@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 06:ed:98:87:6d:8a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.0.2.2/24 scope global veth1
       valid_lft forever preferred_lft forever
    inet6 fe80::4ed:98ff:fe87:6d8a/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
9: veth3@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether ae:75:23:38:a0:d2 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 203.0.113.2/24 scope global veth3
       valid_lft forever preferred_lft forever
    inet6 fe80::ac75:23ff:fe38:a0d2/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever

~ ip ro list:
default
	nexthop via 192.0.2.1 dev veth1 weight 1
	nexthop via 203.0.113.1 dev veth3 weight 1
192.0.2.0/24 dev veth1 proto kernel scope link src 192.0.2.2
203.0.113.0/24 dev veth3 proto kernel scope link src 203.0.113.2

before:
   for i in {1..255} ; do ip ro get 10.0.0.$i; done | grep veth | awk ' {print $(NF-2)}' | sort | uniq -c:
    255 veth3

after:
   for i in {1..255} ; do ip ro get 10.0.0.$i; done | grep veth | awk ' {print $(NF-2)}' | sort | uniq -c:
    122 veth1
    133 veth3

Fixes: 32607a332c ("ipv4: prefer multipath nexthop that matches source address")
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20251221192639.3911901-1-vadim.fedorenko@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-30 11:07:38 +01:00
Ido Schimmel
44741e9de2 selftests: fib_nexthops: Add test cases for error routes deletion
Add test cases that check that error routes (e.g., blackhole) are
deleted when their nexthop is deleted.

Output without "ipv4: Fix reference count leak when using error routes
with nexthop objects":

 # ./fib_nexthops.sh -t "ipv4_fcnal ipv6_fcnal"

 IPv4 functional
 ----------------------
 [...]
       WARNING: Unexpected route entry
 TEST: Error route removed on nexthop deletion                       [FAIL]

 IPv6
 ----------------------
 [...]
 TEST: Error route removed on nexthop deletion                       [ OK ]

 Tests passed:  20
 Tests failed:   1
 Tests skipped:  0

Output with "ipv4: Fix reference count leak when using error routes
with nexthop objects":

 # ./fib_nexthops.sh -t "ipv4_fcnal ipv6_fcnal"

 IPv4 functional
 ----------------------
 [...]
 TEST: Error route removed on nexthop deletion                       [ OK ]

 IPv6
 ----------------------
 [...]
 TEST: Error route removed on nexthop deletion                       [ OK ]

 Tests passed:  21
 Tests failed:   0
 Tests skipped:  0

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20251221144829.197694-2-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-30 10:39:22 +01:00
Ido Schimmel
ac782f4e3b ipv4: Fix reference count leak when using error routes with nexthop objects
When a nexthop object is deleted, it is marked as dead and then
fib_table_flush() is called to flush all the routes that are using the
dead nexthop.

The current logic in fib_table_flush() is to only flush error routes
(e.g., blackhole) when it is called as part of network namespace
dismantle (i.e., with flush_all=true). Therefore, error routes are not
flushed when their nexthop object is deleted:

 # ip link add name dummy1 up type dummy
 # ip nexthop add id 1 dev dummy1
 # ip route add 198.51.100.1/32 nhid 1
 # ip route add blackhole 198.51.100.2/32 nhid 1
 # ip nexthop del id 1
 # ip route show
 blackhole 198.51.100.2 nhid 1 dev dummy1

As such, they keep holding a reference on the nexthop object which in
turn holds a reference on the nexthop device, resulting in a reference
count leak:

 # ip link del dev dummy1
 [   70.516258] unregister_netdevice: waiting for dummy1 to become free. Usage count = 2

Fix by flushing error routes when their nexthop is marked as dead.

IPv6 does not suffer from this problem.

Fixes: 493ced1ac4 ("ipv4: Allow routes to use nexthop objects")
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Closes: https://lore.kernel.org/netdev/d943f806-4da6-4970-ac28-b9373b0e63ac@I-love.SAKURA.ne.jp/
Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20251221144829.197694-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-30 10:39:22 +01:00
Ethan Nelson-Moore
fa0b198be1 net: usb: sr9700: fix incorrect command used to write single register
This fixes the device failing to initialize with "error reading MAC
address" for me, probably because the incorrect write of NCR_RST to
SR_NCR is not actually resetting the device.

Fixes: c9b37458e9 ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support")
Cc: stable@vger.kernel.org
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Link: https://patch.msgid.link/20251221082400.50688-1-enelsonmoore@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-30 10:31:54 +01:00
Linus Torvalds
8640b74557 Merge tag 'kbuild-fixes-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nicolas Schier:

 - Revert commit "scripts/clang-tools: Handle included .c files in
   gen_compile_commands" which is reported to cause false entries for
   some files.

 - Fix compilation of dtb specified on command-line without make rule

 - mcb: Add missing modpost build support

* tag 'kbuild-fixes-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
  mcb: Add missing modpost build support
  kbuild: fix compilation of dtb specified on command-line without make rule
  Revert "scripts/clang-tools: Handle included .c files in gen_compile_commands"
2025-12-29 12:06:44 -08:00
Linus Torvalds
0b34fd0fea Merge tag 'mm-hotfixes-stable-2025-12-28-21-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "27 hotfixes.  12 are cc:stable, 18 are MM.

  There's a patch series from Jiayuan Chen which fixes some
  issues with KASAN and vmalloc. Apart from that it's the usual
  shower of singletons - please see the respective changelogs
  for details"

* tag 'mm-hotfixes-stable-2025-12-28-21-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (27 commits)
  mm/ksm: fix pte_unmap_unlock of wrong address in break_ksm_pmd_entry
  mm/page_owner: fix memory leak in page_owner_stack_fops->release()
  mm/memremap: fix spurious large folio warning for FS-DAX
  MAINTAINERS: notify the "Device Memory" community of memory hotplug changes
  sparse: update MAINTAINERS info
  mm/page_alloc: report 1 as zone_batchsize for !CONFIG_MMU
  mm: consider non-anon swap cache folios in folio_expected_ref_count()
  rust: maple_tree: rcu_read_lock() in destructor to silence lockdep
  mm: memcg: fix unit conversion for K() macro in OOM log
  mm: fixup pfnmap memory failure handling to use pgoff
  tools/mm/page_owner_sort: fix timestamp comparison for stable sorting
  selftests/mm: fix thread state check in uffd-unit-tests
  kernel/kexec: fix IMA when allocation happens in CMA area
  kernel/kexec: change the prototype of kimage_map_segment()
  MAINTAINERS: add ABI headers to KHO and LIVE UPDATE
  .mailmap: remove one of the entries for WangYuli
  mm/damon/vaddr: fix missing pte_unmap_unlock in damos_va_migrate_pmd_entry()
  MAINTAINERS: update one straggling entry for Bartosz Golaszewski
  mm/page_alloc: change all pageblocks migrate type on coalescing
  mm: leafops.h: correct kernel-doc function param. names
  ...
2025-12-29 11:40:38 -08:00
Will Rosenberg
58fc7342b5 ipv6: BUG() in pskb_expand_head() as part of calipso_skbuff_setattr()
There exists a kernel oops caused by a BUG_ON(nhead < 0) at
net/core/skbuff.c:2232 in pskb_expand_head().
This bug is triggered as part of the calipso_skbuff_setattr()
routine when skb_cow() is passed headroom > INT_MAX
(i.e. (int)(skb_headroom(skb) + len_delta) < 0).

The root cause of the bug is due to an implicit integer cast in
__skb_cow(). The check (headroom > skb_headroom(skb)) is meant to ensure
that delta = headroom - skb_headroom(skb) is never negative, otherwise
we will trigger a BUG_ON in pskb_expand_head(). However, if
headroom > INT_MAX and delta <= -NET_SKB_PAD, the check passes, delta
becomes negative, and pskb_expand_head() is passed a negative value for
nhead.

Fix the trigger condition in calipso_skbuff_setattr(). Avoid passing
"negative" headroom sizes to skb_cow() within calipso_skbuff_setattr()
by only using skb_cow() to grow headroom.

PoC:
	Using `netlabelctl` tool:

        netlabelctl map del default
        netlabelctl calipso add pass doi:7
        netlabelctl map add default address:0::1/128 protocol:calipso,7

        Then run the following PoC:

        int fd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP);

        // setup msghdr
        int cmsg_size = 2;
        int cmsg_len = 0x60;
        struct msghdr msg;
        struct sockaddr_in6 dest_addr;
        struct cmsghdr * cmsg = (struct cmsghdr *) calloc(1,
                        sizeof(struct cmsghdr) + cmsg_len);
        msg.msg_name = &dest_addr;
        msg.msg_namelen = sizeof(dest_addr);
        msg.msg_iov = NULL;
        msg.msg_iovlen = 0;
        msg.msg_control = cmsg;
        msg.msg_controllen = cmsg_len;
        msg.msg_flags = 0;

        // setup sockaddr
        dest_addr.sin6_family = AF_INET6;
        dest_addr.sin6_port = htons(31337);
        dest_addr.sin6_flowinfo = htonl(31337);
        dest_addr.sin6_addr = in6addr_loopback;
        dest_addr.sin6_scope_id = 31337;

        // setup cmsghdr
        cmsg->cmsg_len = cmsg_len;
        cmsg->cmsg_level = IPPROTO_IPV6;
        cmsg->cmsg_type = IPV6_HOPOPTS;
        char * hop_hdr = (char *)cmsg + sizeof(struct cmsghdr);
        hop_hdr[1] = 0x9; //set hop size - (0x9 + 1) * 8 = 80

        sendmsg(fd, &msg, 0);

Fixes: 2917f57b6b ("calipso: Allow the lsm to label the skbuff directly.")
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Will Rosenberg <whrosenb@asu.edu>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://patch.msgid.link/20251219173637.797418-1-whrosenb@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-29 19:36:45 +01:00
Eric Dumazet
e34f0df3d8 usbnet: avoid a possible crash in dql_completed()
syzbot reported a crash [1] in dql_completed() after recent usbnet
BQL adoption.

The reason for the crash is that netdev_reset_queue() is called too soon.

It should be called after cancel_work_sync(&dev->bh_work) to make
sure no more TX completion can happen.

[1]
kernel BUG at lib/dynamic_queue_limits.c:99 !
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 5197 Comm: udevd Tainted: G             L      syzkaller #0 PREEMPT(full)
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
 RIP: 0010:dql_completed+0xbe1/0xbf0 lib/dynamic_queue_limits.c:99
Call Trace:
 <IRQ>
  netdev_tx_completed_queue include/linux/netdevice.h:3864 [inline]
  netdev_completed_queue include/linux/netdevice.h:3894 [inline]
  usbnet_bh+0x793/0x1020 drivers/net/usb/usbnet.c:1601
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
  bh_worker+0x2b1/0x600 kernel/workqueue.c:3611
  tasklet_action+0xc/0x70 kernel/softirq.c:952
  handle_softirqs+0x27d/0x850 kernel/softirq.c:622
  __do_softirq kernel/softirq.c:656 [inline]
  invoke_softirq kernel/softirq.c:496 [inline]
  __irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:723
  irq_exit_rcu+0x9/0x30 kernel/softirq.c:739

Fixes: 7ff14c5204 ("usbnet: Add support for Byte Queue Limits (BQL)")
Reported-by: syzbot+5b55e49f8bbd84631a9c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6945644f.a70a0220.207337.0113.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Simon Schippers <simon.schippers@tu-dortmund.de>
Link: https://patch.msgid.link/20251219144459.692715-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-29 19:23:34 +01:00
Ankit Garg
3d970eda00 gve: defer interrupt enabling until NAPI registration
Currently, interrupts are automatically enabled immediately upon
request. This allows interrupt to fire before the associated NAPI
context is fully initialized and cause failures like below:

[    0.946369] Call Trace:
[    0.946369]  <IRQ>
[    0.946369]  __napi_poll+0x2a/0x1e0
[    0.946369]  net_rx_action+0x2f9/0x3f0
[    0.946369]  handle_softirqs+0xd6/0x2c0
[    0.946369]  ? handle_edge_irq+0xc1/0x1b0
[    0.946369]  __irq_exit_rcu+0xc3/0xe0
[    0.946369]  common_interrupt+0x81/0xa0
[    0.946369]  </IRQ>
[    0.946369]  <TASK>
[    0.946369]  asm_common_interrupt+0x22/0x40
[    0.946369] RIP: 0010:pv_native_safe_halt+0xb/0x10

Use the `IRQF_NO_AUTOEN` flag when requesting interrupts to prevent auto
enablement and explicitly enable the interrupt in NAPI initialization
path (and disable it during NAPI teardown).

This ensures that interrupt lifecycle is strictly coupled with
readiness of NAPI context.

Cc: stable@vger.kernel.org
Fixes: 1dfc2e4611 ("gve: Refactor napi add and remove functions")
Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Jordan Rhee <jordanrhee@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20251219102945.2193617-1-hramamurthy@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-29 19:09:49 +01:00
Wei Fang
a48e232210 net: stmmac: fix the crash issue for zero copy XDP_TX action
There is a crash issue when running zero copy XDP_TX action, the crash
log is shown below.

[  216.122464] Unable to handle kernel paging request at virtual address fffeffff80000000
[  216.187524] Internal error: Oops: 0000000096000144 [#1]  SMP
[  216.301694] Call trace:
[  216.304130]  dcache_clean_poc+0x20/0x38 (P)
[  216.308308]  __dma_sync_single_for_device+0x1bc/0x1e0
[  216.313351]  stmmac_xdp_xmit_xdpf+0x354/0x400
[  216.317701]  __stmmac_xdp_run_prog+0x164/0x368
[  216.322139]  stmmac_napi_poll_rxtx+0xba8/0xf00
[  216.326576]  __napi_poll+0x40/0x218
[  216.408054] Kernel panic - not syncing: Oops: Fatal exception in interrupt

For XDP_TX action, the xdp_buff is converted to xdp_frame by
xdp_convert_buff_to_frame(). The memory type of the resulting xdp_frame
depends on the memory type of the xdp_buff. For page pool based xdp_buff
it produces xdp_frame with memory type MEM_TYPE_PAGE_POOL. For zero copy
XSK pool based xdp_buff it produces xdp_frame with memory type
MEM_TYPE_PAGE_ORDER0. However, stmmac_xdp_xmit_back() does not check the
memory type and always uses the page pool type, this leads to invalid
mappings and causes the crash. Therefore, check the xdp_buff memory type
in stmmac_xdp_xmit_back() to fix this issue.

Fixes: bba2556efa ("net: stmmac: Enable RX via AF_XDP zero-copy")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20251204071332.1907111-1-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-29 17:35:36 +01:00
Paolo Abeni
a6694b7e39 Merge tag 'wireless-2025-12-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:

====================
Various fixes all over, most are recent regressions but
also some long-standing issues:

 - cfg80211:
    - fix an issue with overly long SSIDs

 - mac80211:
    - long-standing beacon protection issue on some devices
    - for for a multi-BSSID AP-side issue
    - fix a syzbot warning on OCB (not really used in practice)
    - remove WARN on connections using disabled channels,
      as that can happen due to changes in the disable flag
    - fix monitor mode list iteration

 - iwlwifi:
    - fix firmware loading on certain (really old) devices
    - add settime64 to PTP clock to avoid a warning and clock
      registration failure, but it's not actually supported

 - rtw88:
    - remove WQ_UNBOUND since it broke USB adapters
      (because it can't be used with WQ_BH)
    - fix SDIO issues with certain devices

 - rtl8192cu: fix TID array out-of-bounds (since 6.9)

 - wlcore (TI): add missing skb push headroom increase

* tag 'wireless-2025-12-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: iwlwifi: Implement settime64 as stub for MVM/MLD PTP
  wifi: iwlwifi: Fix firmware version handling
  wifi: mac80211: ocb: skip rx_no_sta when interface is not joined
  wifi: mac80211: do not use old MBSSID elements
  wifi: mac80211: don't WARN for connections on invalid channels
  wifi: wlcore: ensure skb headroom before skb_push
  wifi: cfg80211: sme: store capped length in __cfg80211_connect_result()
  wifi: mac80211: fix list iteration in ieee80211_add_virtual_monitor()
  wifi: mac80211: Discard Beacon frames to non-broadcast address
  Revert "wifi: rtw88: add WQ_UNBOUND to alloc_workqueue users"
  wifi: rtlwifi: 8192cu: fix tid out of range in rtl92cu_tx_fill_desc()
  wifi: rtw88: limit indirect IO under powered off for RTL8822CS
====================

Link: https://patch.msgid.link/20251217201441.59876-3-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-29 17:04:01 +01:00
Anshumali Gaur
85f4b0c650 octeontx2-pf: fix "UBSAN: shift-out-of-bounds error"
This patch ensures that the RX ring size (rx_pending) is not
set below the permitted length. This avoids UBSAN
shift-out-of-bounds errors when users passes small or zero
ring sizes via ethtool -G.

Fixes: d45d897984 ("octeontx2-pf: Add basic ethtool support")
Signed-off-by: Anshumali Gaur <agaur@marvell.com>
Link: https://patch.msgid.link/20251219062226.524844-1-agaur@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-29 16:46:19 +01:00
Linus Torvalds
7839932417 Merge tag 'sched_ext-for-6.19-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:

 - Fix uninitialized @ret on alloc_percpu() failure leading to
   ERR_PTR(0)

 - Fix PREEMPT_RT warning when bypass load balancer sends IPI to offline
   CPU by using resched_cpu() instead of resched_curr()

 - Fix comment referring to renamed function

 - Update scx_show_state.py for scx_root and scx_aborting changes

* tag 'sched_ext-for-6.19-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  tools/sched_ext: update scx_show_state.py for scx_aborting change
  tools/sched_ext: fix scx_show_state.py for scx_root change
  sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node()
  sched_ext: Fix some comments in ext.c
  sched_ext: fix uninitialized ret on alloc_percpu() failure
2025-12-28 17:21:36 -08:00
Linus Torvalds
bba0b6a1c4 Merge tag 'cgroup-for-6.19-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:

 - Fix a spurious cpuset warning when disabling remote partition after
   CPU hotplug leaves subpartitions_cpus empty. Guard the warning and
   invalidate affected partitions.

* tag 'cgroup-for-6.19-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: fix warning when disabling remote partition
2025-12-28 17:19:09 -08:00