Commit Graph

1157473 Commits

Author SHA1 Message Date
Eric Dumazet
ec993edf05 ipv6: icmp6: add drop reason support to ndisc_redirect_rcv()
Change ndisc_redirect_rcv() to return a drop reason.

For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED
and values from icmpv6_notify().

More reasons are added later.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20 08:54:23 +00:00
Eric Dumazet
2f326d9d9f ipv6: icmp6: add drop reason support to ndisc_router_discovery()
Change ndisc_router_discovery() to return a drop reason.

For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED
and SKB_CONSUMED.

More reasons are added later.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20 08:54:23 +00:00
Eric Dumazet
243e37c642 ipv6: icmp6: add drop reason support to ndisc_recv_rs()
Change ndisc_recv_rs() to return a drop reason.

For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED
or SKB_CONSUMED. More reasons are added later.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20 08:54:23 +00:00
Eric Dumazet
3009f9ae21 ipv6: icmp6: add drop reason support to ndisc_recv_na()
Change ndisc_recv_na() to return a drop reason.

For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED
or SKB_CONSUMED. More reasons are added later.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20 08:54:23 +00:00
Eric Dumazet
7c9c8913f4 ipv6: icmp6: add drop reason support to ndisc_recv_ns()
Change ndisc_recv_ns() to return a drop reason.

For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED
or SKB_CONSUMED.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20 08:54:23 +00:00
Eric Dumazet
dd1b527831 net: add location to trace_consume_skb()
kfree_skb() includes the location, it makes sense
to add it to consume_skb() as well.

After patch:

 taskd_EventMana  8602 [004]   420.406239: skb:consume_skb: skbaddr=0xffff893a4a6d0500 location=unix_stream_read_generic
         swapper     0 [011]   422.732607: skb:consume_skb: skbaddr=0xffff89597f68cee0 location=mlx4_en_free_tx_desc
      discipline  9141 [043]   423.065653: skb:consume_skb: skbaddr=0xffff893a487e9c00 location=skb_consume_udp
         swapper     0 [010]   423.073166: skb:consume_skb: skbaddr=0xffff8949ce9cdb00 location=icmpv6_rcv
         borglet  8672 [014]   425.628256: skb:consume_skb: skbaddr=0xffff8949c42e9400 location=netlink_dump
         swapper     0 [028]   426.263317: skb:consume_skb: skbaddr=0xffff893b1589dce0 location=net_rx_action
            wget 14339 [009]   426.686380: skb:consume_skb: skbaddr=0xffff893a51b552e0 location=tcp_rcv_state_process

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20 08:28:49 +00:00
Xuan Zhuo
9f78bf330a xsk: support use vaddr as ring
When we try to start AF_XDP on some machines with long running time, due
to the machine's memory fragmentation problem, there is no sufficient
contiguous physical memory that will cause the start failure.

If the size of the queue is 8 * 1024, then the size of the desc[] is
8 * 1024 * 8 = 16 * PAGE, but we also add struct xdp_ring size, so it is
16page+. This is necessary to apply for a 4-order memory. If there are a
lot of queues, it is difficult to these machine with long running time.

Here, that we actually waste 15 pages. 4-Order memory is 32 pages, but
we only use 17 pages.

This patch replaces __get_free_pages() by vmalloc() to allocate memory
to solve these problems.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20 08:22:12 +00:00
Paolo Abeni
b148d400f8 Merge branch 'taprio-queuemaxsdu-fixes'
Vladimir Oltean says:

====================
taprio queueMaxSDU fixes

This fixes 3 issues noticed while attempting to reoffload the
dynamically calculated queueMaxSDU values. These are:
- Dynamic queueMaxSDU is not calculated correctly due to a lost patch
- Dynamically calculated queueMaxSDU needs to be clamped on the low end
- Dynamically calculated queueMaxSDU needs to be clamped on the high end
====================

Link: https://lore.kernel.org/r/20230215224632.2532685-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:46:59 +01:00
Vladimir Oltean
64cb6aad12 net/sched: taprio: dynamic max_sdu larger than the max_mtu is unlimited
It makes no sense to keep randomly large max_sdu values, especially if
larger than the device's max_mtu. These are visible in "tc qdisc show".
Such a max_sdu is practically unlimited and will cause no packets for
that traffic class to be dropped on enqueue.

Just set max_sdu_dynamic to U32_MAX, which in the logic below causes
taprio to save a max_frm_len of U32_MAX and a max_sdu presented to user
space of 0 (unlimited).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:46:57 +01:00
Vladimir Oltean
bdf366bd86 net/sched: taprio: don't allow dynamic max_sdu to go negative after stab adjustment
The overhead specified in the size table comes from the user. With small
time intervals (or gates always closed), the overhead can be larger than
the max interval for that traffic class, and their difference is
negative.

What we want to happen is for max_sdu_dynamic to have the smallest
non-zero value possible (1) which means that all packets on that traffic
class are dropped on enqueue. However, since max_sdu_dynamic is u32, a
negative is represented as a large value and oversized dropping never
happens.

Use max_t with int to force a truncation of max_frm_len to no smaller
than dev->hard_header_len + 1, which in turn makes max_sdu_dynamic no
smaller than 1.

Fixes: fed87cc671 ("net/sched: taprio: automatically calculate queueMaxSDU based on TC gate durations")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:46:57 +01:00
Vladimir Oltean
09dbdf28f9 net/sched: taprio: fix calculation of maximum gate durations
taprio_calculate_gate_durations() depends on netdev_get_num_tc() and
this returns 0. So it calculates the maximum gate durations for no
traffic class.

I had tested the blamed commit only with another patch in my tree, one
which in the end I decided isn't valuable enough to submit ("net/sched:
taprio: mask off bits in gate mask that exceed number of TCs").

The problem is that having this patch threw off my testing. By moving
the netdev_set_num_tc() call earlier, we implicitly gave to
taprio_calculate_gate_durations() the information it needed.

Extract only the portion from the unsubmitted change which applies the
mqprio configuration to the netdev earlier.

Link: https://patchwork.kernel.org/project/netdevbpf/patch/20230130173145.475943-15-vladimir.oltean@nxp.com/
Fixes: a306a90c8f ("net/sched: taprio: calculate tc gate durations")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:46:57 +01:00
David Howells
c078381856 rxrpc: Fix overproduction of wakeups to recvmsg()
Fix three cases of overproduction of wakeups:

 (1) rxrpc_input_split_jumbo() conditionally notifies the app that there's
     data for recvmsg() to collect if it queues some data - and then its
     only caller, rxrpc_input_data(), goes and wakes up recvmsg() anyway.

     Fix the rxrpc_input_data() to only do the wakeup in failure cases.

 (2) If a DATA packet is received for a call by the I/O thread whilst
     recvmsg() is busy draining the call's rx queue in the app thread, the
     call will left on the recvmsg() queue for recvmsg() to pick up, even
     though there isn't any data on it.

     This can cause an unexpected recvmsg() with a 0 return and no MSG_EOR
     set after the reply has been posted to a service call.

     Fix this by discarding pending calls from the recvmsg() queue that
     don't need servicing yet.

 (3) Not-yet-completed calls get requeued after having data read from them,
     even if they have no data to read.

     Fix this by only requeuing them if they have data waiting on them; if
     they don't, the I/O thread will requeue them when data arrives or they
     fail.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/3386149.1676497685@warthog.procyon.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:33:25 +01:00
Paolo Abeni
d269ac136e Merge branch 'net-final-gsi-register-updates'
Alex Elder says:

====================
net: final GSI register updates

I believe this is the last set of changes required to allow IPA v5.0
to be supported.  There is a little cleanup work remaining, but that
can happen in the next Linux release cycle.  Otherwise we just need
config data and register definitions for IPA v5.0 (and DTS updates).
These are ready but won't be posted without further testing.

The first patch in this series fixes a minor bug in a patch just
posted, which I found too late.  The second eliminates the GSI
memory "adjustment"; this was done previously to avoid/delay the
need to implement a more general way to define GSI register offsets.
Note that this patch causes "checkpatch" warnings due to indentation
that aligns with an open parenthesis.

The third patch makes use of the newly-defined register offsets, to
eliminate the need for a function that hid a few details.  The next
modifies a different helper function to work properly for IPA v5.0+.
The fifth patch changes the way the event ring size is specified
based on how it's now done for IPA v5.0+.  And the last defines a
new register required for IPA v5.0+.
====================

Link: https://lore.kernel.org/r/20230215195352.755744-1-elder@linaro.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:14:22 +01:00
Alex Elder
f651334e1e net: ipa: add HW_PARAM_4 GSI register
Starting at IPA v5.0, the number of event rings per EE is defined
in a field in a new HW_PARAM_4 GSI register rather than HW_PARAM_2.
Define this new register and its fields, and update the code that
checks the number of rings supported by hardware to use the proper
field based on IPA version.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:14:20 +01:00
Alex Elder
37cd29ec84 net: ipa: support different event ring encoding
Starting with IPA v5.0, a channel's event ring index is encoded in
a field in the CH_C_CNTXT_1 GSI register rather than CH_C_CNTXT_0.
Define a new field ID for the former register and encode the event
ring in the appropriate register.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:14:20 +01:00
Alex Elder
62747512eb net: ipa: avoid setting an undefined field
The GSI channel protocol field in the CH_C_CNTXT_0 GSI register is
widened starting IPA v5.0, making the CHTYPE_PROTOCOL_MSB field
added in IPA v4.5 unnecessary.  Update the code to reflect this.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:14:20 +01:00
Alex Elder
f75f44ddd4 net: ipa: kill ev_ch_e_cntxt_1_length_encode()
Now that we explicitly define each register field width there is no
need to have a special encoding function for the event ring length.
Add a field for this to the EV_CH_E_CNTXT_1 GSI register, and use it
in place of ev_ch_e_cntxt_1_length_encode() (which can be removed).

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:14:20 +01:00
Alex Elder
59b12b1d27 net: ipa: kill gsi->virt_raw
Starting at IPA v4.5, almost all GSI registers had their offsets
changed by a fixed amount (shifted downward by 0xd000).  Rather than
defining offsets for all those registers dependent on version, an
adjustment was applied for most register accesses.  This was
implemented in commit cdeee49f3e ("net: ipa: adjust GSI register
addresses").  It was later modified to be a bit more obvious about
the adjusment, in commit 571b1e7e58 ("net: ipa: use a separate
pointer for adjusted GSI memory").

We now are able to define every GSI register with its own offset, so
there's no need to implement this special adjustment.

So get rid of the "virt_raw" pointer, and just maintain "virt" as
the (non-adjusted) base address of I/O mapped GSI register memory.

Redefine the offsets of all GSI registers (other than the INTER_EE
ones, which were not subject to the adjustment) for IPA v4.5+,
subtracting 0xd000 from their defined offsets instead.

Move the ERROR_LOG and ERROR_LOG_CLR definitions further down in the
register definition files so all registers are defined in order of
their offset.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:14:20 +01:00
Alex Elder
ecfa80ce3b net: ipa: fix an incorrect assignment
I spotted an error in a patch posted this week, unfortunately just
after it got accepted.  The effect of the bug is that time-based
interrupt moderation is disabled.  This is not technically a bug,
but it is not what is intended.  The problem is that a |= assignment
got implemented as a simple assignment, so the previously assigned
value was ignored.

Fixes: edc6158b18 ("net: ipa: define fields for event-ring related registers")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 08:14:20 +01:00
Lorenzo Bianconi
1c93e48cc3 net: dpaa2-eth: do not always set xsk support in xdp_features flag
Do not always add NETDEV_XDP_ACT_XSK_ZEROCOPY bit in xdp_features flag
but check if the NIC really supports it.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/3dba6ea42dc343a9f2d7d1a6a6a6c173235e1ebf.1676471386.git.lorenzo@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-20 07:49:44 +01:00
David S. Miller
675f176b4d Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net
Some of the devlink bits were tricky, but I think I got it right.

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-17 11:06:39 +00:00
Linus Torvalds
ec35307e18 Merge tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Just a final collection of misc fixes, the biggest disables the
  recently added dynamic debugging support, it has a regression that
  needs some bigger fixes.

  Otherwise a bunch of fixes across the board, vc4, amdgpu and vmwgfx
  mostly, with some smaller i915 and ast fixes.

  drm:
   - dynamic debug disable for now

  fbdev:
   - deferred i/o device close fix

  amdgpu:
   - Fix GC11.x suspend warning
   - Fix display warning

  vc4:
   - YUV planes fix
   - hdmi display fix
   - crtc reduced blanking fix

  ast:
   - fix start address computation

  vmwgfx:
   - fix bo/handle races

  i915:
   - gen11 WA fix"

* tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/display: Fail atomic_check early on normalize_zpos error
  drm/amd/amdgpu: fix warning during suspend
  drm/vmwgfx: Do not drop the reference to the handle too soon
  drm/vmwgfx: Stop accessing buffer objects which failed init
  drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list
  drm: Disable dynamic debug as broken
  drm/ast: Fix start address computation
  fbdev: Fix invalid page access after closing deferred I/O devices
  drm/vc4: crtc: Increase setup cost in core clock calculation to handle extreme reduced blanking
  drm/vc4: hdmi: Always enable GCP with AVMUTE cleared
  drm/vc4: Fix YUV plane handling when planes are in different buffers
2023-02-16 20:23:32 -08:00
Dave Airlie
f7597e3c58 Merge tag 'drm-intel-fixes-2023-02-16' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Moving gen11 hw wa to the right place. (Matt)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y+47eUvwbafER35/@intel.com
2023-02-17 09:50:06 +10:00
Dave Airlie
a2a04b5155 Merge tag 'drm-misc-fixes-2023-02-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Multiple fixes in vc4 to address issues with YUV planes, HDMI and CRTC;
an invalid page access fix for fbdev, mark dynamic debug as broken, a
double free and refcounting fix for vmwgfx.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20230216091905.i5wswy4dd74x4br5@houat
2023-02-17 09:24:05 +10:00
Dave Airlie
caa068c9bb Merge tag 'amd-drm-fixes-6.2-2023-02-15' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.2-2023-02-15:

amdgpu:
- Fix GC11.x suspend warning
- Fix display warning

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230216041122.7714-1-alexander.deucher@amd.com
2023-02-17 07:34:59 +10:00
Linus Torvalds
3ac88fa460 Merge tag 'net-6.2-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Fixes from the main networking tree only, probably because all
  sub-trees have backed off and haven't submitted their changes.

  None of the fixes here are particularly scary and no outstanding
  regressions. In an ideal world the "current release" sections would be
  empty at this stage but that never happens.

  Current release - regressions:

   - fix unwanted sign extension in netdev_stats_to_stats64()

  Current release - new code bugs:

   - initialize net->notrefcnt_tracker earlier

   - devlink: fix netdev notifier chain corruption

   - nfp: make sure mbox accesses in IPsec code are atomic

   - ice: fix check for weight and priority of a scheduling node

  Previous releases - regressions:

   - ice: xsk: fix cleaning of XDP_TX frame, prevent inf loop

   - igb: fix I2C bit banging config with external thermal sensor

  Previous releases - always broken:

   - sched: tcindex: update imperfect hash filters respecting rcu

   - mpls: fix stale pointer if allocation fails during device rename

   - dccp/tcp: avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions

   - remove WARN_ON_ONCE(sk->sk_forward_alloc) from
     sk_stream_kill_queues()

   - af_key: fix heap information leak

   - ipv6: fix socket connection with DSCP (correct interpretation of
     the tclass field vs fib rule matching)

   - tipc: fix kernel warning when sending SYN message

   - vmxnet3: read RSS information from the correct descriptor (eop)"

* tag 'net-6.2-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits)
  devlink: Fix netdev notifier chain corruption
  igb: conditionalize I2C bit banging on external thermal sensor support
  net: mpls: fix stale pointer if allocation fails during device rename
  net/sched: tcindex: search key must be 16 bits
  tipc: fix kernel warning when sending SYN message
  igb: Fix PPS input and output using 3rd and 4th SDP
  net: use a bounce buffer for copying skb->mark
  ixgbe: add double of VLAN header when computing the max MTU
  i40e: add double of VLAN header when computing the max MTU
  ixgbe: allow to increase MTU to 3K with XDP enabled
  net: stmmac: Restrict warning on disabling DMA store and fwd mode
  net/sched: act_ctinfo: use percpu stats
  net: stmmac: fix order of dwmac5 FlexPPS parametrization sequence
  ice: fix lost multicast packets in promisc mode
  ice: Fix check for weight and priority of a scheduling node
  bnxt_en: Fix mqprio and XDP ring checking logic
  net: Fix unwanted sign extension in netdev_stats_to_stats64()
  net/usb: kalmia: Don't pass act_len in usb_bulk_msg error path
  net: openvswitch: fix possible memory leak in ovs_meter_cmd_set()
  af_key: Fix heap information leak
  ...
2023-02-16 12:13:58 -08:00
Linus Torvalds
d3d6f0eb08 Merge tag 'block-6.2-2023-02-16' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
 "Just a few NVMe fixes that should go into the 6.2 release, adding a
  quirk and fixing two issues introduced in this release:

   - NVMe fixes via Christoph:
       - Always return an ERR_PTR from nvme_pci_alloc_dev (Irvin Cote)
       - Add bogus ID quirk for ADATA SX6000PNP (Daniel Wagner)
       - Set the DMA mask earlier (Christoph Hellwig)"

* tag 'block-6.2-2023-02-16' of git://git.kernel.dk/linux:
  nvme-pci: always return an ERR_PTR from nvme_pci_alloc_dev
  nvme-pci: set the DMA mask earlier
  nvme-pci: add bogus ID quirk for ADATA SX6000PNP
2023-02-16 12:05:33 -08:00
Linus Torvalds
b5596f1d54 Merge tag 'spi-v6.2-rc8-abi' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fix from Mark Brown:
 "One more last minute patch for v6.2 updating the parsing of the newly
  added spi-cs-setup-delay-ns.

  It's been pointed out that due to the way DT parsing works the change
  in property size is ABI visible so let's not let a release go out
  without it being fixed. The change got split from some earlier ABI
  related fixes to the property since the first version sent had a build
  error"

* tag 'spi-v6.2-rc8-abi' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: Use a 32-bit DT property for spi-cs-setup-delay-ns
2023-02-16 12:01:46 -08:00
Linus Torvalds
18902059e0 Merge tag 'gpio-fixes-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - fix a potential Kconfig issue with gpio-mlxbf2 not selecting
   GPIOLIB_IRQCHIP

 - another immutable irqchip conversion, this time for gpio-vf610

 - fix a wakeup issue on Clevo NH5xAx

* tag 'gpio-fixes-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: mlxbf2: select GPIOLIB_IRQCHIP
  gpiolib: acpi: Add a ignore wakeup quirk for Clevo NH5xAx
  gpio: vf610: make irq_chip immutable
  gpiolib: acpi: remove redundant declaration
2023-02-16 11:57:43 -08:00
Christoph Hellwig
88d355832e stop mainaining UUID
The uuid code is very low maintainance now that the major overhaul
has completed, and doesn't need it's own tree.  All the recent work
has been done by Andy who'd like to stay on as a reviewer without an
explicit tree.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-02-16 11:53:01 -08:00
Christoph Hellwig
a8cd2990b6 orphan sysvfs
This code has been stale for years and I have no way to test it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-02-16 11:53:01 -08:00
Jakub Kicinski
84cb1b53cd Merge branch 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Leon Romanovsky says:

====================
mlx5-next changes

Following previous conversations [1] and our clear commitment to do
the TC work [2], please pull mlx5-next shared branch, which includes
low-level steering logic to allow RoCEv2 traffic to be encrypted/
decrypted through IPsec.

[1] https://lore.kernel.org/all/20230126230815.224239-1-saeed@kernel.org/
[2] https://lore.kernel.org/all/Y+Z7lVVWqnRBiPh2@nvidia.com/

* 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Configure IPsec steering for egress RoCEv2 traffic
  net/mlx5: Configure IPsec steering for ingress RoCEv2 traffic
  net/mlx5: Add IPSec priorities in RDMA namespaces
  net/mlx5: Implement new destination type TABLE_TYPE
  net/mlx5: Introduce new destination type TABLE_TYPE
====================

Link: https://lore.kernel.org/r/20230215095624.1365200-1-leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-16 11:36:14 -08:00
Jakub Kicinski
ca0df43d21 Merge tag 'wireless-next-2023-03-16' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:

====================
Major stack changes:
 * EHT channel puncturing support (client & AP)
 * some support for AP MLD without mac80211
 * fixes for A-MSDU on mesh connections

Major driver changes:

iwlwifi
 * EHT rate reporting
 * Bump FW API to 74 for AX devices
 * STEP equalizer support: transfer some STEP (connection to radio
   on platforms with integrated wifi) related parameters from the
   BIOS to the firmware

mt76
 * switch to using page pool allocator
 * mt7996 EHT (Wi-Fi 7) support
 * Wireless Ethernet Dispatch (WED) reset support

libertas
 * WPS enrollee support

brcmfmac
 * Rename Cypress 89459 to BCM4355
 * BCM4355 and BCM4377 support

mwifiex
 * SD8978 chipset support

rtl8xxxu
 * LED support

ath12k
 * new driver for Qualcomm Wi-Fi 7 devices

ath11k
 * IPQ5018 support
 * Fine Timing Measurement (FTM) responder role support
 * channel 177 support

ath10k
 * store WLAN firmware version in SMEM image table

* tag 'wireless-next-2023-03-16' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (207 commits)
  wifi: brcmfmac: p2p: Introduce generic flexible array frame member
  wifi: mac80211: add documentation for amsdu_mesh_control
  wifi: cfg80211: remove gfp parameter from cfg80211_obss_color_collision_notify description
  wifi: mac80211: always initialize link_sta with sta
  wifi: mac80211: pass 'sta' to ieee80211_rx_data_set_sta()
  wifi: cfg80211: Set SSID if it is not already set
  wifi: rtw89: move H2C of del_pkt_offload before polling FW status ready
  wifi: rtw89: use readable return 0 in rtw89_mac_cfg_ppdu_status()
  wifi: rtw88: usb: drop now unnecessary URB size check
  wifi: rtw88: usb: send Zero length packets if necessary
  wifi: rtw88: usb: Set qsel correctly
  wifi: mac80211: fix off-by-one link setting
  wifi: mac80211: Fix for Rx fragmented action frames
  wifi: mac80211: avoid u32_encode_bits() warning
  wifi: mac80211: Don't translate MLD addresses for multicast
  wifi: cfg80211: call reg_notifier for self managed wiphy from driver hint
  wifi: cfg80211: get rid of gfp in cfg80211_bss_color_notify
  wifi: nl80211: Allow authentication frames and set keys on NAN interface
  wifi: mac80211: fix non-MLO station association
  wifi: mac80211: Allow NSS change only up to capability
  ...
====================

Link: https://lore.kernel.org/r/20230216105406.208416-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-16 11:33:10 -08:00
Bartosz Golaszewski
b8b3b0bfb7 Merge tag 'intel-gpio-v6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-gpio-intel into gpio/for-current
intel-gpio for v6.2-2

* Ignore spurious wakeup by touchpad on Clevo NH5xAx
* Miscellaneous fix(es)
2023-02-16 13:31:42 +01:00
Paolo Abeni
40967f77df Merge branch 'seg6-add-psp-flavor-support-for-srv6-end-behavior'
Andrea Mayer says:

====================
seg6: add PSP flavor support for SRv6 End behavior

Segment Routing for IPv6 (SRv6 in short) [1] is the instantiation of the
Segment Routing (SR) [2] architecture on the IPv6 dataplane.
In SRv6, the segment identifiers (SID) are IPv6 addresses and the segment list
(SID List) is carried in the Segment Routing Header (SRH). A segment may be
bound to a specific packet processing operation called "behavior". The RFC8986
[3] defines and standardizes the most common/relevant behaviors for network
operators, e.g., End, End.X and End.T and so on.

The RFC8986 also introduces the "flavors" framework aiming to modify or extend
the capabilities of SRv6 End, End.X and End.T behaviors. Specifically, these
behaviors support the following flavors (either individually or in
combinations):
 - Penultimate Segment Pop (PSP);
 - Ultimate Segment Pop (USP);
 - Ultimate Segment Decapsulation (USD).

Such flavors enable an End/End.X/End.T behavior to pop the SRH on the
penultimate/ultimate SR endpoint node listed in the SID List or to perform a
full decapsulation.

Currently, the Linux kernel supports a large subset of behaviors described in
RFC8986, including the End, End.X and End.T. However, PSP, USP and USD flavors
have not yet been implemented.

In this patchset, we extend the SRv6 subsystem to implement the PSP flavor in
the SRv6 End behavior. To accomplish this task, we leverage the flavor
framework previously introduced by another patchset required for supporting the
efficient representation of the SID List through the NEXT-C-SID mechanism [4].

In details, the patchset is made of:
 - patch 1/3: seg6: factor out End lookup nexthop processing to a dedicated
              function
 - patch 2/3: seg6: add PSP flavor support for SRv6 End behavior
 - patch 3/3: selftests: seg6: add selftest for PSP flavor in SRv6 End
              behavior

From the user space perspective, we do not need to change the iproute2 code to
support the PSP flavor. However, we provide the man page for the PSP flavor in
a separate patch.

Comments, improvements and suggestions are always appreciated.

[1] - RFC8754: https://datatracker.ietf.org/doc/html/rfc8754
[2] - RFC8402: https://datatracker.ietf.org/doc/html/rfc8402
[3] - RFC8986: https://datatracker.ietf.org/doc/html/rfc8986
[4] - https://datatracker.ietf.org/doc/html/draft-ietf-spring-srv6-srh-compression
====================

Link: https://lore.kernel.org/r/20230215134659.7613-1-andrea.mayer@uniroma2.it
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 13:30:06 +01:00
Andrea Mayer
5198cb408f selftests: seg6: add selftest for PSP flavor in SRv6 End behavior
This selftest is designed for testing the PSP flavor in SRv6 End behavior.
It instantiates a virtual network composed of several nodes: hosts and
SRv6 routers. Each node is realized using a network namespace that is
properly interconnected to others through veth pairs.
The test makes use of the SRv6 End behavior and of the PSP flavor needed
for removing the SRH from the IPv6 header at the penultimate node.

The correct execution of the behavior is verified through reachability
tests carried out between hosts.

Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: Paolo Lungaroni <paolo.lungaroni@uniroma2.it>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 13:18:06 +01:00
Andrea Mayer
bdf3c0b9c1 seg6: add PSP flavor support for SRv6 End behavior
The "flavors" framework defined in RFC8986 [1] represents additional
operations that can modify or extend a subset of existing behaviors such as
SRv6 End, End.X and End.T. We report these flavors hereafter:
 - Penultimate Segment Pop (PSP);
 - Ultimate Segment Pop (USP);
 - Ultimate Segment Decapsulation (USD).

Depending on how the Segment Routing Header (SRH) has to be handled, an
SRv6 End* behavior can support these flavors either individually or in
combinations.
In this patch, we only consider the PSP flavor for the SRv6 End behavior.

A PSP enabled SRv6 End behavior is used by the Source/Ingress SR node
(i.e., the one applying the SRv6 Policy) when it needs to instruct the
penultimate SR Endpoint node listed in the SID List (carried by the SRH) to
remove the SRH from the IPv6 header.

Specifically, a PSP enabled SRv6 End behavior processes the SRH by:
   i) decreasing the Segment Left (SL) from 1 to 0;
  ii) copying the Last Segment IDentifier (SID) into the IPv6 Destination
      Address (DA);
 iii) removing (i.e., popping) the outer SRH from the extension headers
      following the IPv6 header.

It is important to note that PSP operation (steps i, ii, iii) takes place
only at a penultimate SR Segment Endpoint node (i.e., when the SL=1) and
does not happen at non-penultimate Endpoint nodes. Indeed, when a SID of
PSP flavor is processed at a non-penultimate SR Segment Endpoint node, the
PSP operation is not performed because it would not be possible to decrease
the SL from 1 to 0.

                                                 SL=2 SL=1 SL=0
                                                   |    |    |
For example, given the SRv6 policy (SID List := <  X,   Y,   Z  >):
 - a PSP enabled SRv6 End behavior bound to SID "Y" will apply the PSP
   operation as Segment Left (SL) is 1, corresponding to the Penultimate
   Segment of the SID List;
 - a PSP enabled SRv6 End behavior bound to SID "X" will *NOT* apply the
   PSP operation as the Segment Left is 2. This behavior instance will
   apply the "standard" End packet processing, ignoring the configured PSP
   flavor at all.

[1] - RFC8986: https://datatracker.ietf.org/doc/html/rfc8986

Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 13:18:06 +01:00
Andrea Mayer
525c65ff56 seg6: factor out End lookup nexthop processing to a dedicated function
The End nexthop lookup/input operations are moved into a new helper
function named input_action_end_finish(). This avoids duplicating the
code needed to compute the nexthop in the different flavors of the End
behavior.

Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 13:18:06 +01:00
Lukas Bulwahn
f5b12be342 net: dsa: ocelot: fix selecting MFD_OCELOT
Commit 3d7316ac81 ("net: dsa: ocelot: add external ocelot switch
control") adds config NET_DSA_MSCC_OCELOT_EXT, which selects the
non-existing config MFD_OCELOT_CORE.

Replace this select with the intended and existing MFD_OCELOT.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Colin Foster <colin.foster@in-advantage.com>
Link: https://lore.kernel.org/r/20230215104631.31568-1-lukas.bulwahn@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 13:03:15 +01:00
Paolo Abeni
fa15072b65 Merge branch 'sfc-devlink-support-for-ef100'
Alejandro Lucero says:

====================
sfc: devlink support for ef100

This patchset adds devlink port support for ef100 allowing setting VFs
mac addresses through the VF representor devlink ports.

Basic devlink infrastructure is first introduced, then support for info
command. Next changes for enumerating MAE ports which will be used for
devlink port creation when netdevs are registered.

Adding support for devlink port_function_hw_addr_get requires changes in
the ef100 driver for getting the mac address based on a client handle.
This allows to obtain VFs mac addresses during netdev initialization as
well what is included in patch 6.

Such client handle is used in patches 7 and 8 for getting and setting
devlink port addresses.
====================

Link: https://lore.kernel.org/r/20230215090828.11697-1-alejandro.lucero-palau@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:15 +01:00
Alejandro Lucero
3b6096c9b3 sfc: add support for devlink port_function_hw_addr_set in ef100
Using the builtin client handle id infrastructure, add support for
setting the mac address linked to mports in ef100. This implies to
execute an MCDI command for giving the address to the firmware for
the specific devlink port.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:13 +01:00
Alejandro Lucero
fa78b01718 sfc: add support for devlink port_function_hw_addr_get in ef100
Using the builtin client handle id infrastructure, add support for
obtaining the mac address linked to mports in ef100. This implies
to execute an MCDI command for getting the data from the firmware
for each devlink port.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:13 +01:00
Alejandro Lucero
7e056e2360 sfc: obtain device mac address based on firmware handle for ef100
Getting device mac address is currently based on a specific MCDI command
only available for the PF. This patch changes the MCDI command to a
generic one for PFs and VFs based on a client handle. This allows both
PFs and VFs to ask for their mac address during initialization using the
CLIENT_HANDLE_SELF.

Moreover, the patch allows other client handles which will be used by
the PF to ask for mac addresses linked to VFs. This is necessary for
suporting the port_function_hw_addr_get devlink function in further
patches.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Alejandro Lucero
25414b2a64 sfc: add devlink port support for ef100
Using the data when enumerating mports, create devlink ports just before
netdevs are registered and remove those devlink ports after netdev has
been unregistered.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Alejandro Lucero
5227adff37 sfc: add mport lookup based on driver's mport data
Obtaining mport id is based on asking the firmware about it. This is
still needed for mport initialization itself, but once the mport data is
now kept by the driver, further mport id request can be satisfied
internally without firmware interaction.

Previous function is just modified in name making clear the firmware
interaction. The new function uses the old name and looks for the data
in the mport data structure.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Alejandro Lucero
a6a15aca42 sfc: enumerate mports in ef100
MAE ports (mports) are the ports on the EF100 embedded switch such
as networking PCIe functions, the physical port, and potentially
others.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Alejandro Lucero
14743ddd24 sfc: add devlink info support for ef100
Add devlink info support for ef100. The information reported is obtained
through the MCDI interface with the specific meaning defined in new
documentation file.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Alejandro Lucero
fa34a5140a sfc: add devlink support for ef100
Add devlink infrastructure support. Further patches add devlink
info and devlink port support.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Ido Schimmel
b20b8aec6f devlink: Fix netdev notifier chain corruption
Cited commit changed devlink to register its netdev notifier block on
the global netdev notifier chain instead of on the per network namespace
one.

However, when changing the network namespace of the devlink instance,
devlink still tries to unregister its notifier block from the chain of
the old namespace and register it on the chain of the new namespace.
This results in corruption of the notifier chains, as the same notifier
block is registered on two different chains: The global one and the per
network namespace one. In turn, this causes other problems such as the
inability to dismantle namespaces due to netdev reference count issues.

Fix by preventing devlink from moving its notifier block between
namespaces.

Reproducer:

 # echo "10 1" > /sys/bus/netdevsim/new_device
 # ip netns add test123
 # devlink dev reload netdevsim/netdevsim10 netns test123
 # ip netns del test123
 [   71.935619] unregister_netdevice: waiting for lo to become free. Usage count = 2
 [   71.938348] leaked reference.

Fixes: 565b4824c3 ("devlink: change port event netdev notifier from per-net to global")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230215073139.1360108-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 11:53:47 +01:00
Paolo Abeni
e9ab2559e2 Merge branch 'net-sched-transition-actions-to-pcpu-stats-and-rcu'
Pedro Tammela says:

====================
net/sched: transition actions to pcpu stats and rcu

Following the work done for act_pedit[0], transition the remaining tc
actions to percpu stats and rcu, whenever possible.
Percpu stats make updating the action stats very cheap, while combining
it with rcu action parameters makes it possible to get rid of the per
action lock in the datapath.

For act_connmark and act_nat we run the following tests:
- tc filter add dev ens2f0 ingress matchall action connmark
- tc filter add dev ens2f0 ingress matchall action nat ingress any 10.10.10.10

Our setup consists of a 26 cores Intel CPU and a 25G NIC.
We use TRex to shoot 10mpps TCP packets and take perf measurements.
Both actions improved performance as expected since the datapath lock disappeared.

For act_pedit we move the drop counter to percpu, when available.
For act_gate we move the counters to percpu, when available.

[0] https://lore.kernel.org/all/20230131145149.3776656-1-pctammela@mojatatu.com/
====================

Link: https://lore.kernel.org/r/20230214211534.735718-1-pctammela@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 10:39:30 +01:00