Commit Graph

1140889 Commits

Author SHA1 Message Date
Hariprasad Kelam
b9d0fedc62 octeontx2-af: cn10kb: Add RPM_USX MAC support
OcteonTx2's next gen platform the CN10KB has RPM_USX MAC which has a
different serdes when compared to RPM MAC. Though the underlying
HW is different, the CSR interface has been designed largely inline
with RPM MAC, with few exceptions though. So we are using the same
CGX driver for RPM_USX MAC as well and will have a different set of APIs
for RPM_USX where ever necessary.

The RPM and RPM_USX blocks support a different number of LMACS.
RPM_USX support 8 LMACS per MAC block whereas legacy RPM supports only 4
LMACS per MAC. with this RPM_USX support double the number of DMAC filters
and fifo size.

This patch adds initial support for CN10KB's RPM_USX  MAC i.e registering
the driver and defining MAC operations (mac_ops). Adds the logic to
configure internal loopback and pause frames and assign FIFO length to
LMACS.

Kernel reads lmac features like lmac type, autoneg, etc from shared
firmware data this structure only supports 4 lmacs per MAC, this patch
extends this structure to accommodate 8 lmacs.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 12:24:29 +01:00
Rakesh Babu Saladi
f2e664ad50 octeontx2-af: Support variable number of lmacs
Most of the code in CGX/RPM driver assumes that max lmacs per
given MAC as always, 4 and the number of MAC blocks also as 4.
With this assumption, the max number of interfaces supported is
hardcoded to 16. This creates a problem as next gen CN10KB silicon
MAC supports 8 lmacs per MAC block.

This patch solves the problem by using "max lmac per MAC block"
value from constant csrs and uses cgx_cnt_max value which is
populated based number of MAC blocks supported by silicon.

Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 12:24:29 +01:00
Yuan Can
4fad22a128 dpaa2-switch: Fix memory leak in dpaa2_switch_acl_entry_add() and dpaa2_switch_acl_entry_remove()
The cmd_buff needs to be freed when error happened in
dpaa2_switch_acl_entry_add() and dpaa2_switch_acl_entry_remove().

Fixes: 1110318d83 ("dpaa2-switch: add tc flower hardware offload on ingress traffic")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Link: https://lore.kernel.org/r/20221205061515.115012-1-yuancan@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 12:14:12 +01:00
Paolo Abeni
f82389eecd Merge branch 'net-dsa-microchip-add-mtu-support-for-ksz8-series'
Oleksij Rempel says:

====================
net: dsa: microchip: add MTU support for KSZ8 series
====================

Link: https://lore.kernel.org/r/20221205052232.2834166-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:58:00 +01:00
Oleksij Rempel
55a952eef7 net: dsa: microchip: ksz8: move all DSA configurations to one location
To make the code more comparable to KSZ9477 code, move DSA
configurations to the same location.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:57:58 +01:00
Oleksij Rempel
6b30cfa86e net: dsa: microchip: enable MTU normalization for KSZ8795 and KSZ9477 compatible switches
KSZ8795 and KSZ9477 compatible series of switches use global max frame
size configuration register. So, enable MTU normalization for this reason.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:57:58 +01:00
Oleksij Rempel
29d1e85f45 net: dsa: microchip: ksz8: add MTU configuration support
Make MTU configurable on KSZ87xx and KSZ88xx series of switches.

Before this patch, pre-configured behavior was different on different
switch series, due to opposite meaning of the same bit:
- KSZ87xx: Reg 4, Bit 1 - if 1, max frame size is 1532; if 0 - 1514
- KSZ88xx: Reg 4, Bit 1 - if 1, max frame size is 1514; if 0 - 1532

Since the code was telling "... SW_LEGAL_PACKET_DISABLE, true)", I
assume, the idea was to set max frame size to 1532.

With this patch, by setting MTU size 1500, both switch series will be
configured to the 1532 frame limit.

This patch was tested on KSZ8873.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:57:58 +01:00
Oleksij Rempel
6f1b986a43 net: dsa: microchip: add ksz_rmw8() function
Add ksz_rmw8(), it will be used in the next patch.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:57:58 +01:00
Oleksij Rempel
1d0a1a6d0d net: dsa: microchip: do not store max MTU for all ports
If we have global MTU configuration, it is enough to configure it on CPU
port only.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:57:58 +01:00
Oleksij Rempel
838c19f894 net: dsa: microchip: move max mtu to one location
There are no HW specific registers, so we can process all of them
in one location.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Tested-by: Arun Ramadoss <arun.ramadoss@microchip.com> (KSZ9893 and LAN937x)
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:57:58 +01:00
Yuan Can
e22dcbc9aa net: ethernet: mtk_wed: Fix missing of_node_put() in mtk_wed_wo_hardware_init()
The np needs to be released through of_node_put() in the error handling
path of mtk_wed_wo_hardware_init().

Fixes: 799684448e ("net: ethernet: mtk_wed: introduce wed wo support")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20221205034339.112163-1-yuancan@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:51:06 +01:00
Zhang Changzhong
063a932b64 ethernet: aeroflex: fix potential skb leak in greth_init_rings()
The greth_init_rings() function won't free the newly allocated skb when
dma_mapping_error() returns error, so add dev_kfree_skb() to fix it.

Compile tested only.

Fixes: d4c41139df ("net: Add Aeroflex Gaisler 10/100/1G Ethernet MAC driver")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/1670134149-29516-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:45:52 +01:00
Xin Long
88956177db tipc: call tipc_lxc_xmit without holding node_read_lock
When sending packets between nodes in netns, it calls tipc_lxc_xmit() for
peer node to receive the packets where tipc_sk_mcast_rcv()/tipc_sk_rcv()
might be called, and it's pretty much like in tipc_rcv().

Currently the local 'node rw lock' is held during calling tipc_lxc_xmit()
to protect the peer_net not being freed by another thread. However, when
receiving these packets, tipc_node_add_conn() might be called where the
peer 'node rw lock' is acquired. Then a dead lock warning is triggered by
lockdep detector, although it is not a real dead lock:

    WARNING: possible recursive locking detected
    --------------------------------------------
    conn_server/1086 is trying to acquire lock:
    ffff8880065cb020 (&n->lock#2){++--}-{2:2}, \
                     at: tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]

    but task is already holding lock:
    ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
                     at: tipc_node_xmit+0x285/0xb30 [tipc]

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(&n->lock#2);
      lock(&n->lock#2);

     *** DEADLOCK ***

     May be due to missing lock nesting notation

    4 locks held by conn_server/1086:
     #0: ffff8880036d1e40 (sk_lock-AF_TIPC){+.+.}-{0:0}, \
                          at: tipc_accept+0x9c0/0x10b0 [tipc]
     #1: ffff8880036d5f80 (sk_lock-AF_TIPC/1){+.+.}-{0:0}, \
                          at: tipc_accept+0x363/0x10b0 [tipc]
     #2: ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
                          at: tipc_node_xmit+0x285/0xb30 [tipc]
     #3: ffff888012e13370 (slock-AF_TIPC){+...}-{2:2}, \
                          at: tipc_sk_rcv+0x2da/0x1b40 [tipc]

    Call Trace:
     <TASK>
     dump_stack_lvl+0x44/0x5b
     __lock_acquire.cold.77+0x1f2/0x3d7
     lock_acquire+0x1d2/0x610
     _raw_write_lock_bh+0x38/0x80
     tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]
     tipc_sk_finish_conn+0x21e/0x640 [tipc]
     tipc_sk_filter_rcv+0x147b/0x3030 [tipc]
     tipc_sk_rcv+0xbb4/0x1b40 [tipc]
     tipc_lxc_xmit+0x225/0x26b [tipc]
     tipc_node_xmit.cold.82+0x4a/0x102 [tipc]
     __tipc_sendstream+0x879/0xff0 [tipc]
     tipc_accept+0x966/0x10b0 [tipc]
     do_accept+0x37d/0x590

This patch avoids this warning by not holding the 'node rw lock' before
calling tipc_lxc_xmit(). As to protect the 'peer_net', rcu_read_lock()
should be enough, as in cleanup_net() when freeing the netns, it calls
synchronize_rcu() before the free is continued.

Also since tipc_lxc_xmit() is like the RX path in tipc_rcv(), it makes
sense to call it under rcu_read_lock(). Note that the right lock order
must be:

   rcu_read_lock();
   tipc_node_read_lock(n);
   tipc_node_read_unlock(n);
   tipc_lxc_xmit();
   rcu_read_unlock();

instead of:

   tipc_node_read_lock(n);
   rcu_read_lock();
   tipc_node_read_unlock(n);
   tipc_lxc_xmit();
   rcu_read_unlock();

and we have to call tipc_node_read_lock/unlock() twice in
tipc_node_xmit().

Fixes: f73b12812a ("tipc: improve throughput between nodes in netns")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/5bdd1f8fee9db695cfff4528a48c9b9d0523fb00.1670110641.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:32:04 +01:00
Frank Jungclaus
918ee4911f can: esd_usb: Allow REC and TEC to return to zero
We don't get any further EVENT from an esd CAN USB device for changes
on REC or TEC while those counters converge to 0 (with ecc == 0). So
when handling the "Back to Error Active"-event force txerr = rxerr =
0, otherwise the berr-counters might stay on values like 95 forever.

Also, to make life easier during the ongoing development a
netdev_dbg() has been introduced to allow dumping error events send by
an esd CAN USB device.

Fixes: 96d8e90382 ("can: Add driver for esd CAN-USB/2 device")
Signed-off-by: Frank Jungclaus <frank.jungclaus@esd.eu>
Link: https://lore.kernel.org/all/20221130202242.3998219-2-frank.jungclaus@esd.eu
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-07 10:32:48 +01:00
Max Staudt
f4a4d121eb can: can327: flush TX_work on ldisc .close()
Additionally, remove it from .ndo_stop().

This ensures that the worker is not called after being freed, and that
the UART TX queue remains active to send final commands when the
netdev is stopped.

Thanks to Jiri Slaby for finding this in slcan:

  https://lore.kernel.org/linux-can/20221201073426.17328-1-jirislaby@kernel.org/

A variant of this patch for slcan, with the flush in .ndo_stop() still
present, has been tested successfully on physical hardware:

  https://bugzilla.suse.com/show_bug.cgi?id=1205597

Fixes: 43da2f0762 ("can: can327: CAN/ldisc driver for ELM327 based OBD-II adapters")
Cc: "Jiri Slaby (SUSE)" <jirislaby@kernel.org>
Cc: Max Staudt <max@enpas.org>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-can@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Max Staudt <max@enpas.org>
Link: https://lore.kernel.org/all/20221202160148.282564-1-max@enpas.org
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-07 10:32:36 +01:00
Jiri Slaby (SUSE)
fb855e9f3b can: slcan: fix freed work crash
The LTP test pty03 is causing a crash in slcan:
  BUG: kernel NULL pointer dereference, address: 0000000000000008
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 0 PID: 348 Comm: kworker/0:3 Not tainted 6.0.8-1-default #1 openSUSE Tumbleweed 9d20364b934f5aab0a9bdf84e8f45cfdfae39dab
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
  Workqueue:  0x0 (events)
  RIP: 0010:process_one_work (/home/rich/kernel/linux/kernel/workqueue.c:706 /home/rich/kernel/linux/kernel/workqueue.c:2185)
  Code: 49 89 ff 41 56 41 55 41 54 55 53 48 89 f3 48 83 ec 10 48 8b 06 48 8b 6f 48 49 89 c4 45 30 e4 a8 04 b8 00 00 00 00 4c 0f 44 e0 <49> 8b 44 24 08 44 8b a8 00 01 00 00 41 83 e5 20 f6 45 10 04 75 0e
  RSP: 0018:ffffaf7b40f47e98 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffff9d644e1b8b48 RCX: ffff9d649e439968
  RDX: 00000000ffff8455 RSI: ffff9d644e1b8b48 RDI: ffff9d64764aa6c0
  RBP: ffff9d649e4335c0 R08: 0000000000000c00 R09: ffff9d64764aa734
  R10: 0000000000000007 R11: 0000000000000001 R12: 0000000000000000
  R13: ffff9d649e4335e8 R14: ffff9d64490da780 R15: ffff9d64764aa6c0
  FS:  0000000000000000(0000) GS:ffff9d649e400000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000008 CR3: 0000000036424000 CR4: 00000000000006f0
  Call Trace:
   <TASK>
  worker_thread (/home/rich/kernel/linux/kernel/workqueue.c:2436)
  kthread (/home/rich/kernel/linux/kernel/kthread.c:376)
  ret_from_fork (/home/rich/kernel/linux/arch/x86/entry/entry_64.S:312)

Apparently, the slcan's tx_work is freed while being scheduled. While
slcan_netdev_close() (netdev side) calls flush_work(&sl->tx_work),
slcan_close() (tty side) does not. So when the netdev is never set UP,
but the tty is stuffed with bytes and forced to wakeup write, the work
is scheduled, but never flushed.

So add an additional flush_work() to slcan_close() to be sure the work
is flushed under all circumstances.

The Fixes commit below moved flush_work() from slcan_close() to
slcan_netdev_close(). What was the rationale behind it? Maybe we can
drop the one in slcan_netdev_close()?

I see the same pattern in can327. So it perhaps needs the very same fix.

Fixes: cfcb4465e9 ("can: slcan: remove legacy infrastructure")
Link: https://bugzilla.suse.com/show_bug.cgi?id=1205597
Reported-by: Richard Palethorpe <richard.palethorpe@suse.com>
Tested-by: Petr Vorel <petr.vorel@suse.com>
Cc: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-can@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: Max Staudt <max@enpas.org>
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Reviewed-by: Max Staudt <max@enpas.org>
Link: https://lore.kernel.org/all/20221201073426.17328-1-jirislaby@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-07 10:32:24 +01:00
Oliver Hartkopp
0acc442309 can: af_can: fix NULL pointer dereference in can_rcv_filter
Analogue to commit 8aa59e3559 ("can: af_can: fix NULL pointer
dereference in can_rx_register()") we need to check for a missing
initialization of ml_priv in the receive path of CAN frames.

Since commit 4e096a1886 ("net: introduce CAN specific pointer in the
struct net_device") the check for dev->type to be ARPHRD_CAN is not
sufficient anymore since bonding or tun netdevices claim to be CAN
devices but do not initialize ml_priv accordingly.

Fixes: 4e096a1886 ("net: introduce CAN specific pointer in the struct net_device")
Reported-by: syzbot+2d7f58292cb5b29eb5ad@syzkaller.appspotmail.com
Reported-by: Wei Chen <harperchen1110@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/all/20221206201259.3028-1-socketcan@hartkopp.net
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-07 10:30:47 +01:00
Jakub Kicinski
1799c1b85e Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-12-05 (i40e)

Michal clears XPS init flag on reset to allow for updated values to be
written.

Sylwester adds sleep to VF reset to resolve issue of VFs not getting
resources.

Przemyslaw rejects filters for raw IPv4 or IPv6 l4_4_bytes filters as they
 are not supported.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  i40e: Disallow ip4 and ip6 l4_4_bytes
  i40e: Fix for VF MAC address 0
  i40e: Fix not setting default xps_cpus after reset
====================

Link: https://lore.kernel.org/r/20221205212523.3197565-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:46:33 -08:00
Lorenzo Bianconi
ed883bec67 net: ethernet: mtk_wed: add reset to rx_ring_setup callback
This patch adds reset parameter to mtk_wed_rx_ring_setup signature
in order to align rx_ring_setup callback to tx_ring_setup one introduced
in 'commit 23dca7a900 ("net: ethernet: mtk_wed: add reset to
tx_ring_setup callback")'

Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/29c6e7a5469e784406cf3e2920351d1207713d05.1670239984.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:40:53 -08:00
zhang songyi
e3bd74c3d1 net: microchip: vcap: Remove unneeded semicolons
Semicolons after "}" are not needed.

Signed-off-by: zhang songyi <zhang.songyi@zte.com.cn>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/202212051422158113766@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:38:57 -08:00
ye xingchen
1ab586f517 sfc: use sysfs_emit() to instead of scnprintf()
Follow the advice of the Documentation/filesystems/sysfs.rst and show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the
value to be returned to user space.

Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/202212051021451139126@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:37:38 -08:00
Zhengchao Shao
78a9ea43fc net: dsa: sja1105: fix memory leak in sja1105_setup_devlink_regions()
When dsa_devlink_region_create failed in sja1105_setup_devlink_regions(),
priv->regions is not released.

Fixes: bf425b8205 ("net: dsa: sja1105: expose static config as devlink region")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221205012132.2110979-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:36:38 -08:00
Jakub Kicinski
e40febfb9c Merge branch 'ipv4-two-bug-fixes'
Ido Schimmel says:

====================
ipv4: Two small fixes for bugs in IPv4 routing code.

A variation of the second bug was reported by an FRR 5.0 (released
06/18) user as this version was setting a table ID of 0 for the
default VRF, unlike iproute2 and newer FRR versions.

The first bug was discovered while fixing the second.

Both bugs are not regressions (never worked) and are not critical
in my opinion, so the fixes can be applied to net-next, if desired.

No regressions in other tests:

 # ./fib_tests.sh
 ...
 Tests passed: 191
 Tests failed:   0
====================

Link: https://lore.kernel.org/r/20221204075045.3780097-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:34:46 -08:00
Ido Schimmel
c0d999348e ipv4: Fix incorrect route flushing when table ID 0 is used
Cited commit added the table ID to the FIB info structure, but did not
properly initialize it when table ID 0 is used. This can lead to a route
in the default VRF with a preferred source address not being flushed
when the address is deleted.

Consider the following example:

 # ip address add dev dummy1 192.0.2.1/28
 # ip address add dev dummy1 192.0.2.17/28
 # ip route add 198.51.100.0/24 via 192.0.2.2 src 192.0.2.17 metric 100
 # ip route add table 0 198.51.100.0/24 via 192.0.2.2 src 192.0.2.17 metric 200
 # ip route show 198.51.100.0/24
 198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 100
 198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 200

Both routes are installed in the default VRF, but they are using two
different FIB info structures. One with a metric of 100 and table ID of
254 (main) and one with a metric of 200 and table ID of 0. Therefore,
when the preferred source address is deleted from the default VRF,
the second route is not flushed:

 # ip address del dev dummy1 192.0.2.17/28
 # ip route show 198.51.100.0/24
 198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 200

Fix by storing a table ID of 254 instead of 0 in the route configuration
structure.

Add a test case that fails before the fix:

 # ./fib_tests.sh -t ipv4_del_addr

 IPv4 delete address route tests
     Regular FIB info
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Identical FIB info with different table ID
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Table ID 0
     TEST: Route removed in default VRF when source address deleted      [FAIL]

 Tests passed:   8
 Tests failed:   1

And passes after:

 # ./fib_tests.sh -t ipv4_del_addr

 IPv4 delete address route tests
     Regular FIB info
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Identical FIB info with different table ID
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Table ID 0
     TEST: Route removed in default VRF when source address deleted      [ OK ]

 Tests passed:   9
 Tests failed:   0

Fixes: 5a56a0b3a4 ("net: Don't delete routes in different VRFs")
Reported-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:34:43 -08:00
Ido Schimmel
f96a3d7455 ipv4: Fix incorrect route flushing when source address is deleted
Cited commit added the table ID to the FIB info structure, but did not
prevent structures with different table IDs from being consolidated.
This can lead to routes being flushed from a VRF when an address is
deleted from a different VRF.

Fix by taking the table ID into account when looking for a matching FIB
info. This is already done for FIB info structures backed by a nexthop
object in fib_find_info_nh().

Add test cases that fail before the fix:

 # ./fib_tests.sh -t ipv4_del_addr

 IPv4 delete address route tests
     Regular FIB info
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Identical FIB info with different table ID
     TEST: Route removed from VRF when source address deleted            [FAIL]
     TEST: Route in default VRF not removed                              [ OK ]
 RTNETLINK answers: File exists
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [FAIL]

 Tests passed:   6
 Tests failed:   2

And pass after:

 # ./fib_tests.sh -t ipv4_del_addr

 IPv4 delete address route tests
     Regular FIB info
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Identical FIB info with different table ID
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]

 Tests passed:   8
 Tests failed:   0

Fixes: 5a56a0b3a4 ("net: Don't delete routes in different VRFs")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:34:43 -08:00
Rasmus Villemoes
7e6303567c net: fec: properly guard irq coalesce setup
Prior to the Fixes: commit, the initialization code went through the
same fec_enet_set_coalesce() function as used by ethtool, and that
function correctly checks whether the current variant has support for
irq coalescing.

Now that the initialization code instead calls fec_enet_itr_coal_set()
directly, that call needs to be guarded by a check for the
FEC_QUIRK_HAS_COALESCE bit.

Fixes: df727d4547 (net: fec: don't reset irq coalesce settings to defaults on "ip link up")
Reported-by: Greg Ungerer <gregungerer@westnet.com.au>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221205204604.869853-1-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:22:34 -08:00
Hangbin Liu
1f154f3b56 bonding: get correct NA dest address
In commit 4d633d1b46 ("bonding: fix ICMPv6 header handling when receiving
IPv6 messages"), there is a copy/paste issue for NA daddr. I found that
in my testing and fixed it in my local branch. But I forgot to re-format
the patch and sent the wrong mail.

Fix it by reading the correct dest address.

Fixes: 4d633d1b46 ("bonding: fix ICMPv6 header handling when receiving IPv6 messages")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Link: https://lore.kernel.org/r/20221206032055.7517-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:20:38 -08:00
Christophe JAILLET
e9b4aeed56 net: xsk: Don't include <linux/rculist.h>
There is no need to include <linux/rculist.h> here.

Prefer the less invasive <linux/types.h> which is needed for 'hlist_head'.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/88d6a1d88764cca328610854f890a9ca1f4b029e.1670086246.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-12-06 20:04:34 -08:00
Alexei Starovoitov
f0c5a2d9f2 Merge branch 'Refactor verifier prune and jump point handling'
Andrii Nakryiko says:

====================

Disentangle prune and jump points in BPF verifier code. They are conceptually
independent but currently coupled together. This small patch set refactors
related code and make it possible to have some instruction marked as pruning
or jump point independently.

Besides just conceptual cleanliness, this allows to remove unnecessary jump
points (saving a tiny bit of performance and memory usage, potentially), and
even more importantly it allows for clean extension of special pruning points,
similarly to how it's done for BPF_FUNC_timer_set_callback. This will be used
by future patches implementing open-coded BPF iterators.

v1->v2:
  - clarified path #3 commit message and a comment in the code (John);
  - added back mark_jmp_point() to right after subprog call to record
    non-linear implicit jump from BPF_EXIT to right after CALL <subprog>.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-12-06 19:14:39 -08:00
Andrii Nakryiko
618945fbed bpf: remove unnecessary prune and jump points
Don't mark some instructions as jump points when there are actually no
jumps and instructions are just processed sequentially. Such case is
handled naturally by precision backtracking logic without the need to
update jump history. See get_prev_insn_idx(). It goes back linearly by
one instruction, unless current top of jmp_history is pointing to
current instruction. In such case we use `st->jmp_history[cnt - 1].prev_idx`
to find instruction from which we jumped to the current instruction
non-linearly.

Also remove both jump and prune point marking for instruction right
after unconditional jumps, as program flow can get to the instruction
right after unconditional jump instruction only if there is a jump to
that instruction from somewhere else in the program. In such case we'll
mark such instruction as prune/jump point because it's a destination of
a jump.

This change has no changes in terms of number of instructions or states
processes across Cilium and selftests programs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20221206233345.438540-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-12-06 19:14:38 -08:00
Andrii Nakryiko
a095f42105 bpf: mostly decouple jump history management from is_state_visited()
Jump history updating and state equivalence checks are conceptually
independent, so move push_jmp_history() out of is_state_visited(). Also
make a decision whether to perform state equivalence checks or not one
layer higher in do_check(), keeping is_state_visited() unconditionally
performing state checks.

push_jmp_history() should be performed after state checks. There is just
one small non-uniformity. When is_state_visited() finds already
validated equivalent state, it propagates precision marks to current
state's parent chain. For this to work correctly, jump history has to be
updated, so is_state_visited() is doing that internally.

But if no equivalent verified state is found, jump history has to be
updated in a newly cloned child state, so is_jmp_point()
+ push_jmp_history() is performed after is_state_visited() exited with
zero result, which means "proceed with validation".

This change has no functional changes. It's not strictly necessary, but
feels right to decouple these two processes.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20221206233345.438540-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-12-06 19:14:38 -08:00
Andrii Nakryiko
bffdeaa8a5 bpf: decouple prune and jump points
BPF verifier marks some instructions as prune points. Currently these
prune points serve two purposes.

It's a point where verifier tries to find previously verified state and
check current state's equivalence to short circuit verification for
current code path.

But also currently it's a point where jump history, used for precision
backtracking, is updated. This is done so that non-linear flow of
execution could be properly backtracked.

Such coupling is coincidental and unnecessary. Some prune points are not
part of some non-linear jump path, so don't need update of jump history.
On the other hand, not all instructions which have to be recorded in
jump history necessarily are good prune points.

This patch splits prune and jump points into independent flags.
Currently all prune points are marked as jump points to minimize amount
of changes in this patch, but next patch will perform some optimization
of prune vs jmp point placement.

No functional changes are intended.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20221206233345.438540-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-12-06 19:14:38 -08:00
Russell King (Oracle)
15309fb26b net: sfp: clean up i2c-bus property parsing
We currently have some complicated code in sfp_probe() which gets the
I2C bus depending on whether the sfp node is DT or ACPI, and we use
completely separate lookup functions.

This could do with being in a separate function to make the code more
readable, so move it to a new function, sfp_i2c_get(). We can also use
fwnode_find_reference() to lookup the I2C bus fwnode before then
decending into fwnode-type specific parsing.

A future cleanup would be to move the fwnode-type specific parsing into
the i2c layer, which is where it really should be.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1p1WGJ-0098wS-4w@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 18:54:14 -08:00
Dave Marchevsky
d8939cb0a0 bpf: Loosen alloc obj test in verifier's reg_btf_record
btf->struct_meta_tab is populated by btf_parse_struct_metas in btf.c.
There, a BTF record is created for any type containing a spin_lock or
any next-gen datastructure node/head.

Currently, for non-MAP_VALUE types, reg_btf_record will only search for
a record using struct_meta_tab if the reg->type exactly matches
(PTR_TO_BTF_ID | MEM_ALLOC). This exact match is too strict: an
"allocated obj" type - returned from bpf_obj_new - might pick up other
flags while working its way through the program.

Loosen the check to be exact for base_type and just use MEM_ALLOC mask
for type_flag.

This patch is marked Fixes as the original intent of reg_btf_record was
unlikely to have been to fail finding btf_record for valid alloc obj
types with additional flags, some of which (e.g. PTR_UNTRUSTED)
are valid register type states for alloc obj independent of this series.
However, I didn't find a specific broken repro case outside of this
series' added functionality, so it's possible that nothing was
triggering this logic error before.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Fixes: 4e814da0d5 ("bpf: Allow locking bpf_spin_lock in allocated objects")
Link: https://lore.kernel.org/r/20221206231000.3180914-2-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-12-06 18:40:25 -08:00
Kees Cook
b93884eea2 net/ncsi: Silence runtime memcpy() false positive warning
The memcpy() in ncsi_cmd_handler_oem deserializes nca->data into a
flexible array structure that overlapping with non-flex-array members
(mfr_id) intentionally. Since the mem_to_flex() API is not finished,
temporarily silence this warning, since it is a false positive, using
unsafe_memcpy().

Reported-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/netdev/CACPK8Xdfi=OJKP0x0D1w87fQeFZ4A2DP2qzGCRcuVbpU-9=4sQ@mail.gmail.com/
Cc: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221202212418.never.837-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 17:29:14 -08:00
David Vernet
156ed20d22 bpf: Don't use rcu_users to refcount in task kfuncs
A series of prior patches added some kfuncs that allow struct
task_struct * objects to be used as kptrs. These kfuncs leveraged the
'refcount_t rcu_users' field of the task for performing refcounting.
This field was used instead of 'refcount_t usage', as we wanted to
leverage the safety provided by RCU for ensuring a task's lifetime.

A struct task_struct is refcounted by two different refcount_t fields:

1. p->usage:     The "true" refcount field which task lifetime. The
		 task is freed as soon as this refcount drops to 0.

2. p->rcu_users: An "RCU users" refcount field which is statically
		 initialized to 2, and is co-located in a union with
		 a struct rcu_head field (p->rcu). p->rcu_users
		 essentially encapsulates a single p->usage
		 refcount, and when p->rcu_users goes to 0, an RCU
		 callback is scheduled on the struct rcu_head which
		 decrements the p->usage refcount.

Our logic was that by using p->rcu_users, we would be able to use RCU to
safely issue refcount_inc_not_zero() a task's rcu_users field to
determine if a task could still be acquired, or was exiting.
Unfortunately, this does not work due to p->rcu_users and p->rcu sharing
a union. When p->rcu_users goes to 0, an RCU callback is scheduled to
drop a single p->usage refcount, and because the fields share a union,
the refcount immediately becomes nonzero again after the callback is
scheduled.

If we were to split the fields out of the union, this wouldn't be a
problem. Doing so should also be rather non-controversial, as there are
a number of places in struct task_struct that have padding which we
could use to avoid growing the structure by splitting up the fields.

For now, so as to fix the kfuncs to be correct, this patch instead
updates bpf_task_acquire() and bpf_task_release() to use the p->usage
field for refcounting via the get_task_struct() and put_task_struct()
functions. Because we can no longer rely on RCU, the change also guts
the bpf_task_acquire_not_zero() and bpf_task_kptr_get() functions
pending a resolution on the above problem.

In addition, the task fixes the kfunc and rcu_read_lock selftests to
expect this new behavior.

Fixes: 90660309b0 ("bpf: Add kfuncs for storing struct task_struct * as a kptr")
Fixes: fca1aa7551 ("bpf: Handle MEM_RCU type properly")
Reported-by: Matus Jokay <matus.jokay@stuba.sk>
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20221206210538.597606-1-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-12-06 16:40:16 -08:00
Andrii Nakryiko
235d2ef22c Merge branch 'BPF selftests fixes'
Daan De Meyer says:

====================

This patch series fixes a few issues I've found while integrating the
bpf selftests into systemd's mkosi development environment.
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-12-06 16:33:16 -08:00
Daan De Meyer
d0c0b48c87 selftests/bpf: Use CONFIG_TEST_BPF=m instead of CONFIG_TEST_BPF=y
CONFIG_TEST_BPF can only be a module, so let's indicate it as such in
the selftests config.

Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221205131618.1524337-4-daan.j.demeyer@gmail.com
2022-12-06 16:33:16 -08:00
Daan De Meyer
efe7fadbd5 selftests/bpf: Use "is not set" instead of "=n"
"=n" is not valid kconfig syntax. Use "is not set" instead to indicate
the option should be disabled.

Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221205131618.1524337-3-daan.j.demeyer@gmail.com
2022-12-06 16:33:16 -08:00
Daan De Meyer
d68ae4982c selftests/bpf: Install all required files to run selftests
When installing the selftests using
"make -C tools/testing/selftests install", we need to make sure
all the required files to run the selftests are installed. Let's
make sure this is the case.

Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221205131618.1524337-2-daan.j.demeyer@gmail.com
2022-12-06 16:33:11 -08:00
Timo Hunziker
c21dc529ba libbpf: Parse usdt args without offset on x86 (e.g. 8@(%rsp))
Parse USDT arguments like "8@(%rsp)" on x86. These are emmited by
SystemTap. The argument syntax is similar to the existing "memory
dereference case" but the offset left out as it's zero (i.e. read
the value from the address in the register). We treat it the same
as the the "memory dereference case", but set the offset to 0.

I've tested that this fixes the "unrecognized arg #N spec: 8@(%rsp).."
error I've run into when attaching to a probe with such an argument.
Attaching and reading the correct argument values works.

Something similar might be needed for the other supported
architectures.

  [0] Closes: https://github.com/libbpf/libbpf/issues/559

Signed-off-by: Timo Hunziker <timo.hunziker@gmx.ch>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221203123746.2160-1-timo.hunziker@eclipso.ch
2022-12-06 16:16:50 -08:00
Anders Roxell
d95d140e83 ata: libahci_platform: ahci_platform_find_clk: oops, NULL pointer
When booting a arm 32-bit kernel with config CONFIG_AHCI_DWC enabled on
a am57xx-evm board. This happens when the clock references are unnamed
in DT, the strcmp() produces a NULL pointer dereference, see the
following oops, NULL pointer dereference:

[    4.673950] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[    4.682098] [00000000] *pgd=00000000
[    4.685699] Internal error: Oops: 5 [#1] SMP ARM
[    4.690338] Modules linked in:
[    4.693420] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc7 #1
[    4.699615] Hardware name: Generic DRA74X (Flattened Device Tree)
[    4.705749] PC is at strcmp+0x0/0x34
[    4.709350] LR is at ahci_platform_find_clk+0x3c/0x5c
[    4.714416] pc : [<c130c494>]    lr : [<c0c230e0>]    psr: 20000013
[    4.720703] sp : f000dda8  ip : 00000001  fp : c29b1840
[    4.725952] r10: 00000020  r9 : c1b23380  r8 : c1b23368
[    4.731201] r7 : c1ab4cc4  r6 : 00000001  r5 : c3c66040  r4 : 00000000
[    4.737762] r3 : 00000080  r2 : 00000080  r1 : c1ab4cc4  r0 : 00000000
[...]
[    4.998870]  strcmp from ahci_platform_find_clk+0x3c/0x5c
[    5.004302]  ahci_platform_find_clk from ahci_dwc_probe+0x1f0/0x54c
[    5.010589]  ahci_dwc_probe from platform_probe+0x64/0xc0
[    5.016021]  platform_probe from really_probe+0xe8/0x41c
[    5.021362]  really_probe from __driver_probe_device+0xa4/0x204
[    5.027313]  __driver_probe_device from driver_probe_device+0x38/0xc8
[    5.033782]  driver_probe_device from __driver_attach+0xb4/0x1ec
[    5.039825]  __driver_attach from bus_for_each_dev+0x78/0xb8
[    5.045532]  bus_for_each_dev from bus_add_driver+0x17c/0x220
[    5.051300]  bus_add_driver from driver_register+0x90/0x124
[    5.056915]  driver_register from do_one_initcall+0x48/0x1e8
[    5.062591]  do_one_initcall from kernel_init_freeable+0x1cc/0x234
[    5.068817]  kernel_init_freeable from kernel_init+0x20/0x13c
[    5.074584]  kernel_init from ret_from_fork+0x14/0x2c
[    5.079681] Exception stack(0xf000dfb0 to 0xf000dff8)
[    5.084747] dfa0:                                     00000000 00000000 00000000 00000000
[    5.092956] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    5.101165] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    5.107818] Code: e5e32001 e3520000 1afffffb e12fff1e (e4d03001)
[    5.114013] ---[ end trace 0000000000000000 ]---

Add an extra check in the if-statement if hpriv-clks[i].id.

Fixes: 6ce73f3a6f ("ata: libahci_platform: Add function returning a clock-handle by id")
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-12-07 08:36:37 +09:00
Martin KaFai Lau
aa67961f32 selftests/bpf: Allow building bpf tests with CONFIG_XFRM_INTERFACE=[m|n]
It is useful to use vmlinux.h in the xfrm_info test like other kfunc
tests do.  In particular, it is common for kfunc bpf prog that requires
to use other core kernel structures in vmlinux.h

Although vmlinux.h is preferred, it needs a ___local flavor of
struct bpf_xfrm_info in order to build the bpf selftests
when CONFIG_XFRM_INTERFACE=[m|n].

Cc: Eyal Birger <eyal.birger@gmail.com>
Fixes: 90a3a05eb3 ("selftests/bpf: add xfrm_info tests")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20221206193554.1059757-1-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-12-06 12:42:38 -08:00
Miaoqian Lin
fa55ef14ef bpftool: Fix memory leak in do_build_table_cb
strdup() allocates memory for path. We need to release the memory in the
following error path. Add free() to avoid memory leak.

Fixes: 8f184732b6 ("bpftool: Switch to libbpf's hashmap for pinned paths of BPF objects")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221206071906.806384-1-linmq006@gmail.com
2022-12-06 21:20:42 +01:00
Pu Lehui
b54b600361 riscv, bpf: Emit fixed-length instructions for BPF_PSEUDO_FUNC
For BPF_PSEUDO_FUNC instruction, verifier will refill imm with
correct addresses of bpf_calls and then run last pass of JIT.
Since the emit_imm of RV64 is variable-length, which will emit
appropriate length instructions accorroding to the imm, it may
broke ctx->offset, and lead to unpredictable problem, such as
inaccurate jump. So let's fix it with fixed-length instructions.

Fixes: 69c087ba62 ("bpf: Add bpf_for_each_map_elem() helper")
Suggested-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20221206091410.1584784-1-pulehui@huaweicloud.com
2022-12-06 20:59:27 +01:00
Linus Torvalds
8ed710da28 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
 "Revert the dropping of the cache invalidation from the arm64
  arch_dma_prep_coherent() as it caused a regression in the
  qcom_q6v5_mss remoteproc driver.

  The driver is already buggy but the original arm64 change made
  the problem obvious. The change will be re-introduced once the
  driver is fixed"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"
2022-12-06 11:03:03 -08:00
Linus Torvalds
5b3e0cd872 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "Unless anything comes from the ARM side, this should be the last pull
  request for this release - and it's mostly documentation:

   - Document the interaction between KVM_CAP_HALT_POLL and halt_poll_ns

   - s390: fix multi-epoch extension in nested guests

   - x86: fix uninitialized variable on nested triple fault"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Document the interaction between KVM_CAP_HALT_POLL and halt_poll_ns
  KVM: Move halt-polling documentation into common directory
  KVM: x86: fix uninitialized variable use on KVM_REQ_TRIPLE_FAULT
  KVM: s390: vsie: Fix the initialization of the epoch extension (epdx) field
2022-12-06 10:49:19 -08:00
Linus Torvalds
b71101d6ae Merge tag 'for-linus-xsa-6.1-rc9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
 "Two zero-day fixes for the xen-netback driver (XSA-423 and XSA-424)"

* tag 'for-linus-xsa-6.1-rc9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/netback: don't call kfree_skb() with interrupts disabled
  xen/netback: Ensure protocol headers don't fall in the non-linear area
2022-12-06 10:19:05 -08:00
Will Deacon
b7d9aae404 Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"
This reverts commit c44094eee3.

Although the semantics of the DMA API require only a clean operation
here, it turns out that the Qualcomm 'qcom_q6v5_mss' remoteproc driver
(ab)uses the DMA API for transferring the modem firmware to the secure
world via calls to Trustzone [1].

Once the firmware buffer has changed hands, _any_ access from the
non-secure side (i.e. Linux) will be detected on the bus and result in a
full system reset [2]. Although this is possible even with this revert
in place (due to speculative reads via the cacheable linear alias of
memory), anecdotally the problem occurs considerably more frequently
when the lines have not been invalidated, assumedly due to some
micro-architectural interactions with the cache hierarchy.

Revert the offending change for now, along with a comment, so that the
Qualcomm developers have time to fix the driver [3] to use a firmware
buffer which does not have a cacheable alias in the linear map.

Link: https://lore.kernel.org/r/20221114110329.68413-1-manivannan.sadhasivam@linaro.org [1]
Link: https://lore.kernel.org/r/CAMi1Hd3H2k1J8hJ6e-Miy5+nVDNzv6qQ3nN-9929B0GbHJkXEg@mail.gmail.com/ [2]
Link: https://lore.kernel.org/r/20221206092152.GD15486@thinkpad [2]
Reported-by: Amit Pundir <amit.pundir@linaro.org>
Reported-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Sibi Sankar <quic_sibis@quicinc.com>
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20221206103403.646-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-12-06 17:30:39 +00:00
Juergen Gross
74e7e1efda xen/netback: don't call kfree_skb() with interrupts disabled
It is not allowed to call kfree_skb() from hardware interrupt
context or with interrupts being disabled. So remove kfree_skb()
from the spin_lock_irqsave() section and use the already existing
"drop" label in xenvif_start_xmit() for dropping the SKB. At the
same time replace the dev_kfree_skb() call there with a call of
dev_kfree_skb_any(), as xenvif_start_xmit() can be called with
disabled interrupts.

This is XSA-424 / CVE-2022-42328 / CVE-2022-42329.

Fixes: be81992f90 ("xen/netback: don't queue unlimited number of packages")
Reported-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-12-06 16:00:33 +01:00