Commit Graph

1296588 Commits

Author SHA1 Message Date
Breno Leitao
157f29152b netkit: Assign missing bpf_net_context
During the introduction of struct bpf_net_context handling for
XDP-redirect, the netkit driver has been missed, which also requires it
because NETKIT_REDIRECT invokes skb_do_redirect() which is accessing the
per-CPU variables. Otherwise we see the following crash:

	BUG: kernel NULL pointer dereference, address: 0000000000000038
	bpf_redirect()
	netkit_xmit()
	dev_hard_start_xmit()

Set the bpf_net_context before invoking netkit_xmit() program within the
netkit driver.

Fixes: 401cb7dae8 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.")
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20240912155620.1334587-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-13 19:49:45 -07:00
Maciej Fijalkowski
4144a1059b xsk: fix batch alloc API on non-coherent systems
In cases when synchronizing DMA operations is necessary,
xsk_buff_alloc_batch() returns a single buffer instead of the requested
count. This puts the pressure on drivers that use batch API as they have
to check for this corner case on their side and take care of allocations
by themselves, which feels counter productive. Let us improve the core
by looping over xp_alloc() @max times when slow path needs to be taken.

Another issue with current interface, as spotted and fixed by Dries, was
that when driver called xsk_buff_alloc_batch() with @max == 0, for slow
path case it still allocated and returned a single buffer, which should
not happen. By introducing the logic from first paragraph we kill two
birds with one stone and address this problem as well.

Fixes: 47e4075df3 ("xsk: Batched buffer allocation for the pool")
Reported-and-tested-by: Dries De Winter <ddewinter@synamedia.com>
Co-developed-by: Dries De Winter <ddewinter@synamedia.com>
Signed-off-by: Dries De Winter <ddewinter@synamedia.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://patch.msgid.link/20240911191019.296480-1-maciej.fijalkowski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-13 17:36:16 -07:00
Jakub Kicinski
1f2e900ac2 Merge branch 'bareudp-pull-inner-ip-header-on-xmit-recv'
Guillaume Nault says:

====================
bareudp: Pull inner IP header on xmit/recv.

Bareudp accesses the inner IP header in its xmit and recv paths.
However it doesn't ensure that this header is part of skb->head.

Both vxlan and geneve have received fixes for similar problems
in the past. This series fixes bareudp using the same approach.
====================

Link: https://patch.msgid.link/cover.1726046181.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-12 20:30:33 -07:00
Guillaume Nault
c471236b23 bareudp: Pull inner IP header on xmit.
Both bareudp_xmit_skb() and bareudp6_xmit_skb() read their skb's inner
IP header to get its ECN value (with ip_tunnel_ecn_encap()). Therefore
we need to ensure that the inner IP header is part of the skb's linear
data.

Fixes: 571912c69f ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/267328222f0a11519c6de04c640a4f87a38ea9ed.1726046181.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-12 20:30:15 -07:00
Guillaume Nault
45fa29c851 bareudp: Pull inner IP header in bareudp_udp_encap_recv().
Bareudp reads the inner IP header to get the ECN value. Therefore, it
needs to ensure that it's part of the skb's linear data.

This is similar to the vxlan and geneve fixes for that same problem:
  * commit f778941913 ("vxlan: Pull inner IP header in vxlan_rcv().")
  * commit 1ca1ba465e ("geneve: make sure to pull inner header in
    geneve_rx()")

Fixes: 571912c69f ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/5205940067c40218a70fbb888080466b2fc288db.1726046181.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-12 20:30:15 -07:00
Linus Torvalds
5abfdfd402 Merge tag 'net-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter.

  There is a recently notified BT regression with no fix yet. I do not
  think a fix will land in the next week.

  Current release - regressions:

   - core: tighten bad gso csum offset check in virtio_net_hdr

   - netfilter: move nf flowtable bpf initialization in
     nf_flow_table_module_init()

   - eth: ice: stop calling pci_disable_device() as we use pcim

   - eth: fou: fix null-ptr-deref in GRO.

  Current release - new code bugs:

   - hsr: prevent NULL pointer dereference in hsr_proxy_announce()

  Previous releases - regressions:

   - hsr: remove seqnr_lock

   - netfilter: nft_socket: fix sk refcount leaks

   - mptcp: pm: fix uaf in __timer_delete_sync

   - phy: dp83822: fix NULL pointer dereference on DP83825 devices

   - eth: revert "virtio_net: rx enable premapped mode by default"

   - eth: octeontx2-af: Modify SMQ flush sequence to drop packets

  Previous releases - always broken:

   - eth: mlx5: fix bridge mode operations when there are no VFs

   - eth: igb: Always call igb_xdp_ring_update_tail() under Tx lock"

* tag 'net-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
  net: netfilter: move nf flowtable bpf initialization in nf_flow_table_module_init()
  net: tighten bad gso csum offset check in virtio_net_hdr
  netlink: specs: mptcp: fix port endianness
  net: dpaa: Pad packets to ETH_ZLEN
  mptcp: pm: Fix uaf in __timer_delete_sync
  net: libwx: fix number of Rx and Tx descriptors
  net: dsa: felix: ignore pending status of TAS module when it's disabled
  net: hsr: prevent NULL pointer dereference in hsr_proxy_announce()
  selftests: mptcp: include net_helper.sh file
  selftests: mptcp: include lib.sh file
  selftests: mptcp: join: restrict fullmesh endp on 1st sf
  netfilter: nft_socket: make cgroupsv2 matching work with namespaces
  netfilter: nft_socket: fix sk refcount leaks
  MAINTAINERS: Add ethtool pse-pd to PSE NETWORK DRIVER
  dt-bindings: net: tja11xx: fix the broken binding
  selftests: net: csum: Fix checksums for packets with non-zero padding
  net: phy: dp83822: Fix NULL pointer dereference on DP83825 devices
  virtio_net: disable premapped mode by default
  Revert "virtio_net: big mode skip the unmap check"
  Revert "virtio_net: rx remove premapped failover code"
  ...
2024-09-12 12:45:24 -07:00
Linus Torvalds
42c5b51949 Merge tag 'platform-drivers-x86-v6.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:

 - asus-wmi: Disable OOBE that interferes with backlight control

 - panasonic-laptop: Two fixes to SINF array handling

* tag 'platform-drivers-x86-v6.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: asus-wmi: Disable OOBE experience on Zenbook S 16
  platform/x86: panasonic-laptop: Allocate 1 entry extra in the sinf array
  platform/x86: panasonic-laptop: Fix SINF array out of bounds accesses
2024-09-12 12:34:39 -07:00
Linus Torvalds
79a61cc3fc mm: avoid leaving partial pfn mappings around in error case
As Jann points out, PFN mappings are special, because unlike normal
memory mappings, there is no lifetime information associated with the
mapping - it is just a raw mapping of PFNs with no reference counting of
a 'struct page'.

That's all very much intentional, but it does mean that it's easy to
mess up the cleanup in case of errors.  Yes, a failed mmap() will always
eventually clean up any partial mappings, but without any explicit
lifetime in the page table mapping itself, it's very easy to do the
error handling in the wrong order.

In particular, it's easy to mistakenly free the physical backing store
before the page tables are actually cleaned up and (temporarily) have
stale dangling PTE entries.

To make this situation less error-prone, just make sure that any partial
pfn mapping is torn down early, before any other error handling.

Reported-and-tested-by: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-12 12:10:00 -07:00
Lorenzo Bianconi
3e705251d9 net: netfilter: move nf flowtable bpf initialization in nf_flow_table_module_init()
Move nf flowtable bpf initialization in nf_flow_table module load
routine since nf_flow_table_bpf is part of nf_flow_table module and not
nf_flow_table_inet one. This patch allows to avoid the following kernel
warning running the reproducer below:

$modprobe nf_flow_table_inet
$rmmod nf_flow_table_inet
$modprobe nf_flow_table_inet
modprobe: ERROR: could not insert 'nf_flow_table_inet': Invalid argument

[  184.081501] ------------[ cut here ]------------
[  184.081527] WARNING: CPU: 0 PID: 1362 at kernel/bpf/btf.c:8206 btf_populate_kfunc_set+0x23c/0x330
[  184.081550] CPU: 0 UID: 0 PID: 1362 Comm: modprobe Kdump: loaded Not tainted 6.11.0-0.rc5.22.el10.x86_64 #1
[  184.081553] Hardware name: Red Hat OpenStack Compute, BIOS 1.14.0-1.module+el8.4.0+8855+a9e237a9 04/01/2014
[  184.081554] RIP: 0010:btf_populate_kfunc_set+0x23c/0x330
[  184.081558] RSP: 0018:ff22cfb38071fc90 EFLAGS: 00010202
[  184.081559] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000
[  184.081560] RDX: 000000000000006e RSI: ffffffff95c00000 RDI: ff13805543436350
[  184.081561] RBP: ffffffffc0e22180 R08: ff13805543410808 R09: 000000000001ec00
[  184.081562] R10: ff13805541c8113c R11: 0000000000000010 R12: ff13805541b83c00
[  184.081563] R13: ff13805543410800 R14: 0000000000000001 R15: ffffffffc0e2259a
[  184.081564] FS:  00007fa436c46740(0000) GS:ff1380557ba00000(0000) knlGS:0000000000000000
[  184.081569] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  184.081570] CR2: 000055e7b3187000 CR3: 0000000100c48003 CR4: 0000000000771ef0
[  184.081571] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  184.081572] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  184.081572] PKRU: 55555554
[  184.081574] Call Trace:
[  184.081575]  <TASK>
[  184.081578]  ? show_trace_log_lvl+0x1b0/0x2f0
[  184.081580]  ? show_trace_log_lvl+0x1b0/0x2f0
[  184.081582]  ? __register_btf_kfunc_id_set+0x199/0x200
[  184.081585]  ? btf_populate_kfunc_set+0x23c/0x330
[  184.081586]  ? __warn.cold+0x93/0xed
[  184.081590]  ? btf_populate_kfunc_set+0x23c/0x330
[  184.081592]  ? report_bug+0xff/0x140
[  184.081594]  ? handle_bug+0x3a/0x70
[  184.081596]  ? exc_invalid_op+0x17/0x70
[  184.081597]  ? asm_exc_invalid_op+0x1a/0x20
[  184.081601]  ? btf_populate_kfunc_set+0x23c/0x330
[  184.081602]  __register_btf_kfunc_id_set+0x199/0x200
[  184.081605]  ? __pfx_nf_flow_inet_module_init+0x10/0x10 [nf_flow_table_inet]
[  184.081607]  do_one_initcall+0x58/0x300
[  184.081611]  do_init_module+0x60/0x230
[  184.081614]  __do_sys_init_module+0x17a/0x1b0
[  184.081617]  do_syscall_64+0x7d/0x160
[  184.081620]  ? __count_memcg_events+0x58/0xf0
[  184.081623]  ? handle_mm_fault+0x234/0x350
[  184.081626]  ? do_user_addr_fault+0x347/0x640
[  184.081630]  ? clear_bhb_loop+0x25/0x80
[  184.081633]  ? clear_bhb_loop+0x25/0x80
[  184.081634]  ? clear_bhb_loop+0x25/0x80
[  184.081637]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  184.081639] RIP: 0033:0x7fa43652e4ce
[  184.081647] RSP: 002b:00007ffe8213be18 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  184.081649] RAX: ffffffffffffffda RBX: 000055e7b3176c20 RCX: 00007fa43652e4ce
[  184.081650] RDX: 000055e7737fde79 RSI: 0000000000003990 RDI: 000055e7b3185380
[  184.081651] RBP: 000055e7737fde79 R08: 0000000000000007 R09: 000055e7b3179bd0
[  184.081651] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000040000
[  184.081652] R13: 000055e7b3176fa0 R14: 0000000000000000 R15: 000055e7b3179b80

Fixes: 391bb6594f ("netfilter: Add bpf_xdp_flow_lookup kfunc")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://patch.msgid.link/20240911-nf-flowtable-bpf-modprob-fix-v1-1-f9fc075aafc3@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-12 15:41:03 +02:00
Paolo Abeni
8700970971 Merge tag 'nf-24-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following batch contains two fixes from Florian Westphal:

Patch #1 fixes a sk refcount leak in nft_socket on mismatch.

Patch #2 fixes cgroupsv2 matching from containers due to incorrect
	 level in subtree.

netfilter pull request 24-09-12

* tag 'nf-24-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_socket: make cgroupsv2 matching work with namespaces
  netfilter: nft_socket: fix sk refcount leaks
====================

Link: https://patch.msgid.link/20240911222520.3606-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-12 15:26:19 +02:00
Willem de Bruijn
6513eb3d31 net: tighten bad gso csum offset check in virtio_net_hdr
The referenced commit drops bad input, but has false positives.
Tighten the check to avoid these.

The check detects illegal checksum offload requests, which produce
csum_start/csum_off beyond end of packet after segmentation.

But it is based on two incorrect assumptions:

1. virtio_net_hdr_to_skb with VIRTIO_NET_HDR_GSO_TCP[46] implies GSO.
True in callers that inject into the tx path, such as tap.
But false in callers that inject into rx, like virtio-net.
Here, the flags indicate GRO, and CHECKSUM_UNNECESSARY or
CHECKSUM_NONE without VIRTIO_NET_HDR_F_NEEDS_CSUM is normal.

2. TSO requires checksum offload, i.e., ip_summed == CHECKSUM_PARTIAL.
False, as tcp[46]_gso_segment will fix up csum_start and offset for
all other ip_summed by calling __tcp_v4_send_check.

Because of 2, we can limit the scope of the fix to virtio_net_hdr
that do try to set these fields, with a bogus value.

Link: https://lore.kernel.org/netdev/20240909094527.GA3048202@port70.net/
Fixes: 89add40066 ("net: drop bad gso csum_start and offset in virtio_net_hdr")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20240910213553.839926-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 20:43:07 -07:00
Asbjørn Sloth Tønnesen
09a45a5553 netlink: specs: mptcp: fix port endianness
The MPTCP port attribute is in host endianness, but was documented
as big-endian in the ynl specification.

Below are two examples from net/mptcp/pm_netlink.c showing that the
attribute is converted to/from host endianness for use with netlink.

Import from netlink:
  addr->port = htons(nla_get_u16(tb[MPTCP_PM_ADDR_ATTR_PORT]))

Export to netlink:
  nla_put_u16(skb, MPTCP_PM_ADDR_ATTR_PORT, ntohs(addr->port))

Where addr->port is defined as __be16.

No functional change intended.

Fixes: bc8aeb2045 ("Documentation: netlink: add a YAML spec for mptcp")
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240911091003.1112179-1-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 20:42:47 -07:00
Sean Anderson
cbd7ec0834 net: dpaa: Pad packets to ETH_ZLEN
When sending packets under 60 bytes, up to three bytes of the buffer
following the data may be leaked. Avoid this by extending all packets to
ETH_ZLEN, ensuring nothing is leaked in the padding. This bug can be
reproduced by running

	$ ping -s 11 destination

Fixes: 9ad1a37493 ("dpaa_eth: add support for DPAA Ethernet")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240910143144.1439910-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 16:01:25 -07:00
Edward Adam Davis
b4cd80b033 mptcp: pm: Fix uaf in __timer_delete_sync
There are two paths to access mptcp_pm_del_add_timer, result in a race
condition:

     CPU1				CPU2
     ====                               ====
     net_rx_action
     napi_poll                          netlink_sendmsg
     __napi_poll                        netlink_unicast
     process_backlog                    netlink_unicast_kernel
     __netif_receive_skb                genl_rcv
     __netif_receive_skb_one_core       netlink_rcv_skb
     NF_HOOK                            genl_rcv_msg
     ip_local_deliver_finish            genl_family_rcv_msg
     ip_protocol_deliver_rcu            genl_family_rcv_msg_doit
     tcp_v4_rcv                         mptcp_pm_nl_flush_addrs_doit
     tcp_v4_do_rcv                      mptcp_nl_remove_addrs_list
     tcp_rcv_established                mptcp_pm_remove_addrs_and_subflows
     tcp_data_queue                     remove_anno_list_by_saddr
     mptcp_incoming_options             mptcp_pm_del_add_timer
     mptcp_pm_del_add_timer             kfree(entry)

In remove_anno_list_by_saddr(running on CPU2), after leaving the critical
zone protected by "pm.lock", the entry will be released, which leads to the
occurrence of uaf in the mptcp_pm_del_add_timer(running on CPU1).

Keeping a reference to add_timer inside the lock, and calling
sk_stop_timer_sync() with this reference, instead of "entry->add_timer".

Move list_del(&entry->list) to mptcp_pm_del_add_timer and inside the pm lock,
do not directly access any members of the entry outside the pm lock, which
can avoid similar "entry->x" uaf.

Fixes: 00cfd77b90 ("mptcp: retransmit ADD_ADDR when timeout")
Cc: stable@vger.kernel.org
Reported-and-tested-by: syzbot+f3a31fb909db9b2a5c4d@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f3a31fb909db9b2a5c4d
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/tencent_7142963A37944B4A74EF76CD66EA3C253609@qq.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 16:00:11 -07:00
Jiawen Wu
077ee7e6b1 net: libwx: fix number of Rx and Tx descriptors
The number of transmit and receive descriptors must be a multiple of 128
due to the hardware limitation. If it is set to a multiple of 8 instead of
a multiple 128, the queues will easily be hung.

Cc: stable@vger.kernel.org
Fixes: 883b5984a5 ("net: wangxun: add ethtool_ops for ring parameters")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240910095629.570674-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 15:59:43 -07:00
Xiaoliang Yang
70654f4c21 net: dsa: felix: ignore pending status of TAS module when it's disabled
The TAS module could not be configured when it's running in pending
status. We need disable the module and configure it again. However, the
pending status is not cleared after the module disabled. TC taprio set
will always return busy even it's disabled.

For example, a user uses tc-taprio to configure Qbv and a future
basetime. The TAS module will run in a pending status. There is no way
to reconfigure Qbv, it always returns busy.

Actually the TAS module can be reconfigured when it's disabled. So it
doesn't need to check the pending status if the TAS module is disabled.

After the patch, user can delete the tc taprio configuration to disable
Qbv and reconfigure it again.

Fixes: de143c0e27 ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload")
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Link: https://patch.msgid.link/20240906093550.29985-1-xiaoliang.yang_1@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 15:54:51 -07:00
Jeongjun Park
a7789fd4ca net: hsr: prevent NULL pointer dereference in hsr_proxy_announce()
In the function hsr_proxy_annouance() added in the previous commit
5f703ce5c9 ("net: hsr: Send supervisory frames to HSR network
with ProxyNodeTable data"), the return value of the hsr_port_get_hsr()
function is not checked to be a NULL pointer, which causes a NULL
pointer dereference.

To solve this, we need to add code to check whether the return value
of hsr_port_get_hsr() is NULL.

Reported-by: syzbot+02a42d9b1bd395cbcab4@syzkaller.appspotmail.com
Fixes: 5f703ce5c9 ("net: hsr: Send supervisory frames to HSR network with ProxyNodeTable data")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Lukasz Majewski <lukma@denx.de>
Link: https://patch.msgid.link/20240907190341.162289-1-aha310510@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 15:49:41 -07:00
Jakub Kicinski
6254031777 Merge branch 'selftests-mptcp-misc-small-fixes'
Matthieu Baerts says:

====================
selftests: mptcp: misc. small fixes

Here are some various fixes for the MPTCP selftests.

Patch 1 fixes a recently modified test to continue to work as expected
on older kernels. This is a fix for a recent fix that can be backported
up to v5.15.

Patch 2 and 3 include dependences when exporting or installing the
tests. Two fixes for v6.11-rc1.
====================

Link: https://patch.msgid.link/20240910-net-selftests-mptcp-fix-install-v1-0-8f124aa9156d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 15:18:23 -07:00
Matthieu Baerts (NGI0)
c66c08e51b selftests: mptcp: include net_helper.sh file
Similar to the previous commit, the net_helper.sh file from the parent
directory is used by the MPTCP selftests and it needs to be present when
running the tests.

This file then needs to be listed in the Makefile to be included when
exporting or installing the tests, e.g. with:

  make -C tools/testing/selftests \
          TARGETS=net/mptcp \
          install INSTALL_PATH=$KSFT_INSTALL_PATH

  cd $KSFT_INSTALL_PATH
  ./run_kselftest.sh -c net/mptcp

Fixes: 1af3bc912e ("selftests: mptcp: lib: use wait_local_port_listen helper")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240910-net-selftests-mptcp-fix-install-v1-3-8f124aa9156d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 15:18:20 -07:00
Matthieu Baerts (NGI0)
1a5a2d19e8 selftests: mptcp: include lib.sh file
The lib.sh file from the parent directory is used by the MPTCP selftests
and it needs to be present when running the tests.

This file then needs to be listed in the Makefile to be included when
exporting or installing the tests, e.g. with:

  make -C tools/testing/selftests \
          TARGETS=net/mptcp \
          install INSTALL_PATH=$KSFT_INSTALL_PATH

  cd $KSFT_INSTALL_PATH
  ./run_kselftest.sh -c net/mptcp

Fixes: f265d3119a ("selftests: mptcp: lib: use setup/cleanup_ns helpers")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240910-net-selftests-mptcp-fix-install-v1-2-8f124aa9156d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 15:18:19 -07:00
Matthieu Baerts (NGI0)
49ac6f05ac selftests: mptcp: join: restrict fullmesh endp on 1st sf
A new endpoint using the IP of the initial subflow has been recently
added to increase the code coverage. But it breaks the test when using
old kernels not having commit 86e39e0448 ("mptcp: keep track of local
endpoint still available for each msk"), e.g. on v5.15.

Similar to commit d4c81bbb86 ("selftests: mptcp: join: support local
endpoint being tracked or not"), it is possible to add the new endpoint
conditionally, by checking if "mptcp_pm_subflow_check_next" is present
in kallsyms: this is not directly linked to the commit introducing this
symbol but for the parent one which is linked anyway. So we can know in
advance what will be the expected behaviour, and add the new endpoint
only when it makes sense to do so.

Fixes: 4878f9f842 ("selftests: mptcp: join: validate fullmesh endp on 1st sf")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240910-net-selftests-mptcp-fix-install-v1-1-8f124aa9156d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 15:18:19 -07:00
Florian Westphal
7f3287db65 netfilter: nft_socket: make cgroupsv2 matching work with namespaces
When running in container environmment, /sys/fs/cgroup/ might not be
the real root node of the sk-attached cgroup.

Example:

In container:
% stat /sys//fs/cgroup/
Device: 0,21    Inode: 2214  ..
% stat /sys/fs/cgroup/foo
Device: 0,21    Inode: 2264  ..

The expectation would be for:

  nft add rule .. socket cgroupv2 level 1 "foo" counter

to match traffic from a process that got added to "foo" via
"echo $pid > /sys/fs/cgroup/foo/cgroup.procs".

However, 'level 3' is needed to make this work.

Seen from initial namespace, the complete hierarchy is:

% stat /sys/fs/cgroup/system.slice/docker-.../foo
  Device: 0,21    Inode: 2264 ..

i.e. hierarchy is
0    1               2              3
/ -> system.slice -> docker-1... -> foo

... but the container doesn't know that its "/" is the "docker-1.."
cgroup.  Current code will retrieve the 'system.slice' cgroup node
and store its kn->id in the destination register, so compare with
2264 ("foo" cgroup id) will not match.

Fetch "/" cgroup from ->init() and add its level to the level we try to
extract.  cgroup root-level is 0 for the init-namespace or the level
of the ancestor that is exposed as the cgroup root inside the container.

In the above case, cgrp->level of "/" resolved in the container is 2
(docker-1...scope/) and request for 'level 1' will get adjusted
to fetch the actual level (3).

v2: use CONFIG_SOCK_CGROUP_DATA, eval function depends on it.
    (kernel test robot)

Cc: cgroups@vger.kernel.org
Fixes: e0bb96db96 ("netfilter: nft_socket: add support for cgroupsv2")
Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-12 00:16:58 +02:00
Florian Westphal
8b26ff7af8 netfilter: nft_socket: fix sk refcount leaks
We must put 'sk' reference before returning.

Fixes: 039b1f4f24 ("netfilter: nft_socket: fix erroneous socket assignment")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-12 00:16:54 +02:00
Linus Torvalds
77f5878967 Merge tag 'arm-fixes-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "The bulk of the changes this time are for device tree files in the
  rockchips platform, addressing correctness issues on individual
  boards, plus one change in the rk356x SoC file to make it match the
  binding.

  The only other changes that came in are

   - a CPU frequencey scaling fix for JH7110 (RISC-V)

   - a build fix for the cznic hwrandom driver

   - a fix for a deadlock in qualcomm uefi secure application firmware
     driver"

* tag 'arm-fixes-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  platform: cznic: turris-omnia-mcu: fix HW_RANDOM dependency
  riscv: dts: starfive: jh7110-common: Fix lower rate of CPUfreq by setting PLL0 rate to 1.5GHz
  firmware: qcom: uefisecapp: Fix deadlock in qcuefi_acquire()
  arm64: dts: rockchip: Fix compatibles for RK3588 VO{0,1}_GRF
  dt-bindings: soc: rockchip: Fix compatibles for RK3588 VO{0,1}_GRF
  arm64: dts: rockchip: override BIOS_DISABLE signal via GPIO hog on RK3399 Puma
  arm64: dts: rockchip: fix eMMC/SPI corruption when audio has been used on RK3399 Puma
  arm64: dts: rockchip: fix PMIC interrupt pin in pinctrl for ROCK Pi E
  arm64: dts: rockchip: Remove broken tsadc pinctrl binding for rk356x
2024-09-11 11:26:56 -07:00
Linus Torvalds
3857c7b041 Merge tag 'for-6.11/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mikulas Patocka:

 - fix a race condition in dm-integrity

* tag 'for-6.11/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm-integrity: fix a race condition when accessing recalc_sector
2024-09-11 11:21:50 -07:00
Linus Torvalds
914413e3ee Merge tag 'printk-for-6.11-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fix from Petr Mladek:

 - Fix build of serial_core as a module

* tag 'printk-for-6.11-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Export match_devname_and_update_preferred_console()
2024-09-11 11:13:20 -07:00
Lorenzo Stoakes
7c6a3a65ac minmax: reduce min/max macro expansion in atomisp driver
Avoid unnecessary nested min()/max() which results in egregious macro
expansion.

Use clamp_t() as this introduces the least possible expansion, and turn
the {s,u}DIGIT_FITTING() macros into inline functions to avoid the
nested expansion.

This resolves an issue with slackware 15.0 32-bit compilation as
reported by Richard Narron.

Presumably the min/max fixups would be difficult to backport, this patch
should be easier and fix's Richard's problem in 5.15.

Reported-by: Richard Narron <richard@aaazen.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Closes: https://lore.kernel.org/all/4a5321bd-b1f-1832-f0c-cea8694dc5aa@aaazen.com/
Fixes: 867046cc70 ("minmax: relax check to allow comparison between unsigned arguments and signed constants")
Cc: stable@vger.kernel.org
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-11 11:07:47 -07:00
Arnd Bergmann
0e7af99aef Merge tag 'riscv-soc-fixes-for-v6.11-final' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes
RISC-V soc fixes for v6.11-final

StarFive:
A fix to return one of the clocks on the JH7110 from 1 GHz to 1.5 GHz

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-soc-fixes-for-v6.11-final' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: starfive: jh7110-common: Fix lower rate of CPUfreq by setting PLL0 rate to 1.5GHz

Link: https://lore.kernel.org/r/20240909-hybrid-groovy-601a33b5b309@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-09-11 08:54:37 +00:00
Arnd Bergmann
b97acde6f9 platform: cznic: turris-omnia-mcu: fix HW_RANDOM dependency
There is still a build failure when the rwrng support is in a loadable
module but the mcu driver is built-in:

arm-linux-gnueabi-ld: drivers/platform/cznic/turris-omnia-mcu-trng.o: in function `omnia_mcu_register_trng':
turris-omnia-mcu-trng.c:(.text.omnia_mcu_register_trng+0x11c): undefined reference to `devm_hwrng_register'

Change the dependency to explicitly disallow the broken
configuration.

Fixes: 41bb142a40 ("platform: cznic: turris-omnia-mcu: Add support for MCU provided TRNG")
Reviewed-by: Marek Behún <kabel@kernel.org>
Link: https://lore.kernel.org/r/20240909110417.247453-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-09-11 08:54:21 +00:00
Petr Mladek
2c83ded8ae Merge branch 'for-6.11-fixup' into for-linus 2024-09-11 09:30:22 +02:00
Jakub Kicinski
d1aaaa2e0a Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-09-09 (ice, igb)

This series contains updates to ice and igb drivers.

Martyna moves LLDP rule removal to the proper uninitialization function
for ice.

Jake corrects accounting logic for FWD_TO_VSI_LIST switch filters on
ice.

Przemek removes incorrect, explicit calls to pci_disable_device() for
ice.

Michal Schmidt stops incorrect use of VSI list for VLAN use on ice.

Sriram Yagnaraman adjusts igb_xdp_ring_update_tail() to be called under
Tx lock on igb.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igb: Always call igb_xdp_ring_update_tail() under Tx lock
  ice: fix VSI lists confusion when adding VLANs
  ice: stop calling pci_disable_device() as we use pcim
  ice: fix accounting for filters shared by multiple VSIs
  ice: Fix lldp packets dropping after changing the number of channels
====================

Link: https://patch.msgid.link/20240909203842.3109822-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 20:15:10 -07:00
Jakub Kicinski
3d731dc9b1 Merge tag 'mlx5-fixes-2024-09-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5 fixes 2024-09-09

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2024-09-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Fix bridge mode operations when there are no VFs
  net/mlx5: Verify support for scheduling element and TSAR type
  net/mlx5: Add missing masks and QoS bit masks for scheduling elements
  net/mlx5: Explicitly set scheduling element and TSAR type
  net/mlx5e: Add missing link mode to ptys2ext_ethtool_map
  net/mlx5e: Add missing link modes to ptys2ethtool_map
  net/mlx5: Update the list of the PCI supported devices
====================

Link: https://patch.msgid.link/20240909194505.69715-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 20:11:40 -07:00
Kory Maincent
330dadacc5 MAINTAINERS: Add ethtool pse-pd to PSE NETWORK DRIVER
Add net/ethtool/pse-pd.c to PSE NETWORK DRIVER to receive emails concerning
modifications to the ethtool part.

Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909114336.362174-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:31:23 -07:00
Wei Fang
2f9caba9b2 dt-bindings: net: tja11xx: fix the broken binding
As Rob pointed in another mail thread [1], the binding of tja11xx PHY
is completely broken, the schema cannot catch the error in the DTS. A
compatiable string must be needed if we want to add a custom propety.
So extract known PHY IDs from the tja11xx PHY drivers and convert them
into supported compatible string list to fix the broken binding issue.

Fixes: 52b2fe4535 ("dt-bindings: net: tja11xx: add nxp,refclk_in property")
Link: https://lore.kernel.org/31058f49-bac5-49a9-a422-c43b121bf049@kernel.org  # [1]
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20240909012152.431647-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 17:12:58 -07:00
Sean Anderson
e8a63d473b selftests: net: csum: Fix checksums for packets with non-zero padding
Padding is not included in UDP and TCP checksums. Therefore, reduce the
length of the checksummed data to include only the data in the IP
payload. This fixes spurious reported checksum failures like

rx: pkt: sport=33000 len=26 csum=0xc850 verify=0xf9fe
pkt: bad csum

Technically it is possible for there to be trailing bytes after the UDP
data but before the Ethernet padding (e.g. if sizeof(ip) + sizeof(udp) +
udp.len < ip.len). However, we don't generate such packets.

Fixes: 91a7de8560 ("selftests/net: add csum offload test")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240906210743.627413-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:33:31 -07:00
Tomas Paukrt
3f62ea572b net: phy: dp83822: Fix NULL pointer dereference on DP83825 devices
The probe() function is only used for DP83822 and DP83826 PHY,
leaving the private data pointer uninitialized for the DP83825 models
which causes a NULL pointer dereference in the recently introduced/changed
functions dp8382x_config_init() and dp83822_set_wol().

Add the dp8382x_probe() function, so all PHY models will have a valid
private data pointer to fix this issue and also prevent similar issues
in the future.

Fixes: 9ef9ecfa9e ("net: phy: dp8382x: keep WOL settings across suspends")
Signed-off-by: Tomas Paukrt <tomaspaukrt@email.cz>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/66w.ZbGt.65Ljx42yHo5.1csjxu@seznam.cz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:20:15 -07:00
Linus Torvalds
8d8d276ba2 Merge tag 'trace-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:

 - Move declaration of interface_lock outside of CONFIG_TIMERLAT_TRACER

   The fix to some locking races moved the declaration of the
   interface_lock up in the file, but also moved it into the
   CONFIG_TIMERLAT_TRACER #ifdef block, breaking the build when that
   wasn't set. Move it further up and out of that #ifdef block.

 - Remove unused function run_tracer_selftest() stub

   When CONFIG_FTRACE_STARTUP_TEST is not set the stub function
   run_tracer_selftest() is not used and clang is warning about it.
   Remove the function stub as it is not needed.

* tag 'trace-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Drop unused helper function to fix the build
  tracing/osnoise: Fix build when timerlat is not enabled
2024-09-10 09:05:20 -07:00
Jakub Kicinski
48aa361c5d Merge branch 'revert-virtio_net-rx-enable-premapped-mode-by-default'
Xuan Zhuo says:

====================
Revert "virtio_net: rx enable premapped mode by default"

Regression: http://lore.kernel.org/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com
====================

Link: https://patch.msgid.link/20240906123137.108741-1-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 09:01:08 -07:00
Xuan Zhuo
111fc9f517 virtio_net: disable premapped mode by default
Now, the premapped mode encounters some problem.

    http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com

So we disable the premapped mode by default.
We can re-enable it in the future.

Fixes: f9dac92ba9 ("virtio_ring: enable premapped mode whatever use_dma_api")
Reported-by: "Si-Wei Liu" <si-wei.liu@oracle.com>
Closes: http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Takero Funaki <flintglass@gmail.com>
Link: https://patch.msgid.link/20240906123137.108741-4-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 09:01:06 -07:00
Xuan Zhuo
38eef112a8 Revert "virtio_net: big mode skip the unmap check"
This reverts commit a377ae542d.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Takero Funaki <flintglass@gmail.com>
Link: https://patch.msgid.link/20240906123137.108741-3-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 09:01:06 -07:00
Xuan Zhuo
dc4547fbba Revert "virtio_net: rx remove premapped failover code"
This reverts commit defd28aa5a.

Recover the code to disable premapped mode.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Takero Funaki <flintglass@gmail.com>
Link: https://patch.msgid.link/20240906123137.108741-2-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 09:01:06 -07:00
Bas Nieuwenhuizen
d6de45e3c6 platform/x86: asus-wmi: Disable OOBE experience on Zenbook S 16
The OOBE experience fades the keyboard backlight in & out continuously,
and make the backlight uncontrollable using its device.

Workaround taken from
https://wiki.archlinux.org/index.php?title=ASUS_Zenbook_UM5606&diff=next&oldid=815547

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Luke D. Jones <luke@ljones.dev>
Link: https://lore.kernel.org/r/20240909223503.1445779-1-bas@basnieuwenhuizen.nl
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-09-10 15:43:58 +03:00
Jacky Chou
fef2843bb4 net: ftgmac100: Enable TX interrupt to avoid TX timeout
Currently, the driver only enables RX interrupt to handle RX
packets and TX resources. Sometimes there is not RX traffic,
so the TX resource needs to wait for RX interrupt to free.
This situation will toggle the TX timeout watchdog when the MAC
TX ring has no more resources to transmit packets.
Therefore, enable TX interrupt to release TX resources at any time.

When I am verifying iperf3 over UDP, the network hangs.
Like the log below.

root# iperf3 -c 192.168.100.100 -i1 -t10 -u -b0
Connecting to host 192.168.100.100, port 5201
[  4] local 192.168.100.101 port 35773 connected to 192.168.100.100 port 5201
[ ID] Interval           Transfer     Bandwidth       Total Datagrams
[  4]   0.00-20.42  sec   160 KBytes  64.2 Kbits/sec  20
[  4]  20.42-20.42  sec  0.00 Bytes  0.00 bits/sec  0
[  4]  20.42-20.42  sec  0.00 Bytes  0.00 bits/sec  0
[  4]  20.42-20.42  sec  0.00 Bytes  0.00 bits/sec  0
[  4]  20.42-20.42  sec  0.00 Bytes  0.00 bits/sec  0
[  4]  20.42-20.42  sec  0.00 Bytes  0.00 bits/sec  0
[  4]  20.42-20.42  sec  0.00 Bytes  0.00 bits/sec  0
[  4]  20.42-20.42  sec  0.00 Bytes  0.00 bits/sec  0
[  4]  20.42-20.42  sec  0.00 Bytes  0.00 bits/sec  0
[  4]  20.42-20.42  sec  0.00 Bytes  0.00 bits/sec  0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval          Transfer    Bandwidth      Jitter   Lost/Total Datagrams
[  4]   0.00-20.42  sec  160 KBytes 64.2 Kbits/sec 0.000 ms 0/20 (0%)
[  4] Sent 20 datagrams
iperf3: error - the server has terminated

The network topology is FTGMAC connects directly to a PC.
UDP does not need to wait for ACK, unlike TCP.
Therefore, FTGMAC needs to enable TX interrupt to release TX resources instead
of waiting for the RX interrupt.

Fixes: 10cbd64076 ("ftgmac100: Rework NAPI & interrupts handling")
Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com>
Link: https://patch.msgid.link/20240906062831.2243399-1-jacky_chou@aspeedtech.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-10 12:50:07 +02:00
Naveen Mamindlapalli
019aba04f0 octeontx2-af: Modify SMQ flush sequence to drop packets
The current implementation of SMQ flush sequence waits for the packets
in the TM pipeline to be transmitted out of the link. This sequence
doesn't succeed in HW when there is any issue with link such as lack of
link credits, link down or any other traffic that is fully occupying the
link bandwidth (QoS). This patch modifies the SMQ flush sequence to
drop the packets after TL1 level (SQM) instead of polling for the packets
to be sent out of RPM/CGX link.

Fixes: 5d9b976d44 ("octeontx2-af: Support fixed transmit scheduler topology")
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Reviewed-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Link: https://patch.msgid.link/20240906045838.1620308-1-naveenm@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-10 12:16:13 +02:00
Muhammad Usama Anjum
4c80022771 fou: fix initialization of grc
The grc must be initialize first. There can be a condition where if
fou is NULL, goto out will be executed and grc would be used
uninitialized.

Fixes: 7e41969350 ("fou: Fix null-ptr-deref in GRO.")
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240906102839.202798-1-usama.anjum@collabora.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-09 17:21:47 -07:00
Andy Shevchenko
4e378158e5 tracing: Drop unused helper function to fix the build
A helper function defined but not used. This, in particular,
prevents kernel builds with clang, `make W=1` and CONFIG_WERROR=y:

kernel/trace/trace.c:2229:19: error: unused function 'run_tracer_selftest' [-Werror,-Wunused-function]
 2229 | static inline int run_tracer_selftest(struct tracer *type)
      |                   ^~~~~~~~~~~~~~~~~~~

Fix this by dropping unused functions.

See also commit 6863f5643d ("kbuild: allow Clang to find unused static
inline functions for W=1 build").

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/20240909105314.928302-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-09-09 16:04:25 -04:00
Steven Rostedt
af17814334 tracing/osnoise: Fix build when timerlat is not enabled
To fix some critical section races, the interface_lock was added to a few
locations. One of those locations was above where the interface_lock was
declared, so the declaration was moved up before that usage.
Unfortunately, where it was placed was inside a CONFIG_TIMERLAT_TRACER
ifdef block. As the interface_lock is used outside that config, this broke
the build when CONFIG_OSNOISE_TRACER was enabled but
CONFIG_TIMERLAT_TRACER was not.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Helena Anna" <helena.anna.dubel@intel.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Cc: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/20240909103231.23a289e2@gandalf.local.home
Fixes: e6a53481da ("tracing/timerlat: Only clear timer if a kthread exists")
Reported-by: "Bityutskiy, Artem" <artem.bityutskiy@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-09-09 16:03:57 -04:00
Benjamin Poirier
b1d305abef net/mlx5: Fix bridge mode operations when there are no VFs
Currently, trying to set the bridge mode attribute when numvfs=0 leads to a
crash:

bridge link set dev eth2 hwmode vepa

[  168.967392] BUG: kernel NULL pointer dereference, address: 0000000000000030
[...]
[  168.969989] RIP: 0010:mlx5_add_flow_rules+0x1f/0x300 [mlx5_core]
[...]
[  168.976037] Call Trace:
[  168.976188]  <TASK>
[  168.978620]  _mlx5_eswitch_set_vepa_locked+0x113/0x230 [mlx5_core]
[  168.979074]  mlx5_eswitch_set_vepa+0x7f/0xa0 [mlx5_core]
[  168.979471]  rtnl_bridge_setlink+0xe9/0x1f0
[  168.979714]  rtnetlink_rcv_msg+0x159/0x400
[  168.980451]  netlink_rcv_skb+0x54/0x100
[  168.980675]  netlink_unicast+0x241/0x360
[  168.980918]  netlink_sendmsg+0x1f6/0x430
[  168.981162]  ____sys_sendmsg+0x3bb/0x3f0
[  168.982155]  ___sys_sendmsg+0x88/0xd0
[  168.985036]  __sys_sendmsg+0x59/0xa0
[  168.985477]  do_syscall_64+0x79/0x150
[  168.987273]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  168.987773] RIP: 0033:0x7f8f7950f917

(esw->fdb_table.legacy.vepa_fdb is null)

The bridge mode is only relevant when there are multiple functions per
port. Therefore, prevent setting and getting this setting when there are no
VFs.

Note that after this change, there are no settings to change on the PF
interface using `bridge link` when there are no VFs, so the interface no
longer appears in the `bridge link` output.

Fixes: 4b89251de0 ("net/mlx5: Support ndo bridge_setlink and getlink")
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09 12:39:57 -07:00
Carolina Jubran
861cd9b9cb net/mlx5: Verify support for scheduling element and TSAR type
Before creating a scheduling element in a NIC or E-Switch scheduler,
ensure that the requested element type is supported. If the element is
of type Transmit Scheduling Arbiter (TSAR), also verify that the
specific TSAR type is supported.

Fixes: 214baf2287 ("net/mlx5e: Support HTB offload")
Fixes: 85c5f7c920 ("net/mlx5: E-switch, Create QoS on demand")
Fixes: 0fe132eac3 ("net/mlx5: E-switch, Allow to add vports to rate groups")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09 12:39:57 -07:00
Carolina Jubran
452ef7f860 net/mlx5: Add missing masks and QoS bit masks for scheduling elements
Add the missing masks for supported element types and Transmit
Scheduling Arbiter (TSAR) types in scheduling elements.

Also, add the corresponding bit masks for these types in the QoS
capabilities of a NIC scheduler.

Fixes: 214baf2287 ("net/mlx5e: Support HTB offload")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09 12:39:57 -07:00