Commit Graph

142599 Commits

Author SHA1 Message Date
Michal Jaron
e753df8fbc ice: Add support Flex RXD
Add new VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC flag, opcode
VIRTCHNL_OP_GET_SUPPORTED_RXDIDS and add member rxdid
in struct virtchnl_rxq_info to support AVF Flex RXD
extension.

Add support to allow VF to query flexible descriptor RXDIDs supported
by DDP package and configure Rx queues with selected RXDID for IAVF.

Add code to allow VIRTCHNL_OP_GET_SUPPORTED_RXDIDS message to be
processed. Add necessary macros for registers.

Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Signed-off-by: Xu Ting <ting.xu@intel.com>
Signed-off-by: Michal Jaron <michalx.jaron@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20221025161252.1952939-1-jacob.e.keller@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-27 13:23:43 +02:00
Tariq Toukan
28581b9c2c bond: Disable TLS features indication
Bond agnostically interacts with TLS device-offload requests via the
.ndo_sk_get_lower_dev operation. Return value is true iff bond
guarantees fixed mapping between the TLS connection and a lower netdev.

Due to this nature, the bond TLS device offload features are not
explicitly controllable in the bond layer. As of today, these are
read-only values based on the evaluation of bond_sk_check().  However,
this indication might be incorrect and misleading, when the feature bits
are "fixed" by some dependency features.  For example,
NETIF_F_HW_TLS_TX/RX are forcefully cleared in case the corresponding
checksum offload is disabled. But in fact the bond ability to still
offload TLS connections to the lower device is not hurt.

This means that these bits can not be trusted, and hence better become
unused.

This patch revives some old discussion [1] and proposes a much simpler
solution: Clear the bond's TLS features bits. Everyone should stop
reading them.

[1] https://lore.kernel.org/netdev/20210526095747.22446-1-tariqt@nvidia.com/

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20221025105300.4718-1-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-27 12:45:38 +02:00
David S. Miller
34e0b94520 Merge tag 'ieee802154-for-net-next-2022-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next
Stefan Schmidt says:

====================

==
One of the biggest cycles for ieee802154 in a long time. We are landing the
first pieces of a big enhancements in managing PAN's. We might have another pull
request ready for this cycle later on, but I want to get this one out first.

Miquel Raynal added support for sending frames synchronously as a dependency
to handle MLME commands. Also introducing more filtering levels to match with
the needs of a device when scanning or operating as a pan coordinator.
To support development and testing the hwsim driver for ieee802154 was also
enhanced for the new filtering levels and to update the PIB attributes.

Alexander Aring fixed quite a few bugs spotted during reviewing changes. He
also added support for TRAC in the atusb driver to have better failure
handling if the firmware provides the needed information.

Jilin Yuan fixed a comment with a repeated word in it.
==================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-26 15:24:36 +01:00
Kees Cook
b5f0de6df6 net: dev: Convert sa_data to flexible array in struct sockaddr
One of the worst offenders of "fake flexible arrays" is struct sockaddr,
as it is the classic example of why GCC and Clang have been traditionally
forced to treat all trailing arrays as fake flexible arrays: in the
distant misty past, sa_data became too small, and code started just
treating it as a flexible array, even though it was fixed-size. The
special case by the compiler is specifically that sizeof(sa->sa_data)
and FORTIFY_SOURCE (which uses __builtin_object_size(sa->sa_data, 1))
do not agree (14 and -1 respectively), which makes FORTIFY_SOURCE treat
it as a flexible array.

However, the coming -fstrict-flex-arrays compiler flag will remove
these special cases so that FORTIFY_SOURCE can gain coverage over all
the trailing arrays in the kernel that are _not_ supposed to be treated
as a flexible array. To deal with this change, convert sa_data to a true
flexible array. To keep the structure size the same, move sa_data into
a union with a newly introduced sa_data_min with the original size. The
result is that FORTIFY_SOURCE can continue to have no idea how large
sa_data may actually be, but anything using sizeof(sa->sa_data) must
switch to sizeof(sa->sa_data_min).

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: Dylan Yudaken <dylany@fb.com>
Cc: Yajun Deng <yajun.deng@linux.dev>
Cc: Petr Machata <petrm@nvidia.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: syzbot <syzkaller@googlegroups.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221018095503.never.671-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-25 11:44:20 -07:00
Kuniyuki Iwashima
b261eda84e soreuseport: Fix socket selection for SO_INCOMING_CPU.
Kazuho Oku reported that setsockopt(SO_INCOMING_CPU) does not work
with setsockopt(SO_REUSEPORT) since v4.6.

With the combination of SO_REUSEPORT and SO_INCOMING_CPU, we could
build a highly efficient server application.

setsockopt(SO_INCOMING_CPU) associates a CPU with a TCP listener
or UDP socket, and then incoming packets processed on the CPU will
likely be distributed to the socket.  Technically, a socket could
even receive packets handled on another CPU if no sockets in the
reuseport group have the same CPU receiving the flow.

The logic exists in compute_score() so that a socket will get a higher
score if it has the same CPU with the flow.  However, the score gets
ignored after the blamed two commits, which introduced a faster socket
selection algorithm for SO_REUSEPORT.

This patch introduces a counter of sockets with SO_INCOMING_CPU in
a reuseport group to check if we should iterate all sockets to find
a proper one.  We increment the counter when

  * calling listen() if the socket has SO_INCOMING_CPU and SO_REUSEPORT

  * enabling SO_INCOMING_CPU if the socket is in a reuseport group

Also, we decrement it when

  * detaching a socket out of the group to apply SO_INCOMING_CPU to
    migrated TCP requests

  * disabling SO_INCOMING_CPU if the socket is in a reuseport group

When the counter reaches 0, we can get back to the O(1) selection
algorithm.

The overall changes are negligible for the non-SO_INCOMING_CPU case,
and the only notable thing is that we have to update sk_incomnig_cpu
under reuseport_lock.  Otherwise, the race prevents transitioning to
the O(n) algorithm and results in the wrong socket selection.

 cpu1 (setsockopt)               cpu2 (listen)
+-----------------+             +-------------+

lock_sock(sk1)                  lock_sock(sk2)

reuseport_update_incoming_cpu(sk1, val)
.
|  /* set CPU as 0 */
|- WRITE_ONCE(sk1->incoming_cpu, val)
|
|                               spin_lock_bh(&reuseport_lock)
|                               reuseport_grow(sk2, reuse)
|                               .
|                               |- more_socks_size = reuse->max_socks * 2U;
|                               |- if (more_socks_size > U16_MAX &&
|                               |       reuse->num_closed_socks)
|                               |  .
|                               |  |- RCU_INIT_POINTER(sk1->sk_reuseport_cb, NULL);
|                               |  `- __reuseport_detach_closed_sock(sk1, reuse)
|                               |     .
|                               |     `- reuseport_put_incoming_cpu(sk1, reuse)
|                               |        .
|                               |        |  /* Read shutdown()ed sk1's sk_incoming_cpu
|                               |        |   * without lock_sock().
|                               |        |   */
|                               |        `- if (sk1->sk_incoming_cpu >= 0)
|                               |           .
|                               |           |  /* decrement not-yet-incremented
|                               |           |   * count, which is never incremented.
|                               |           |   */
|                               |           `- __reuseport_put_incoming_cpu(reuse);
|                               |
|                               `- spin_lock_bh(&reuseport_lock)
|
|- spin_lock_bh(&reuseport_lock)
|
|- reuse = rcu_dereference_protected(sk1->sk_reuseport_cb, ...)
|- if (!reuse)
|  .
|  |  /* Cannot increment reuse->incoming_cpu. */
|  `- goto out;
|
`- spin_unlock_bh(&reuseport_lock)

Fixes: e32ea7e747 ("soreuseport: fast reuseport UDP socket selection")
Fixes: c125e80b88 ("soreuseport: fast reuseport TCP socket selection")
Reported-by: Kazuho Oku <kazuhooku@gmail.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-25 11:35:16 +02:00
Amritha Nambiar
4a6a676f8c act_skbedit: skbedit queue mapping for receive queue
Add support for skbedit queue mapping action on receive
side. This is supported only in hardware, so the skip_sw
flag is enforced. This enables offloading filters for
receive queue selection in the hardware using the
skbedit action. Traffic arrives on the Rx queue requested
in the skbedit action parameter. A new tc action flag
TCA_ACT_FLAGS_AT_INGRESS is introduced to identify the
traffic direction the action queue_mapping is requested
on during filter addition. This is used to disallow
offloading the skbedit queue mapping action on transmit
side.

Example:
$tc filter add dev $IFACE ingress protocol ip flower dst_ip $DST_IP\
 action skbedit queue_mapping $rxq_id skip_sw

Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-25 10:32:40 +02:00
Russell King (Oracle)
3989004984 net: sfp: provide a definition for the power level select bit
Provide a named definition for the power level select bit in the
extended status register, rather than using BIT(0) in the code.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24 21:06:16 -07:00
Jakub Kicinski
96917bb3a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/linux/net.h
  a5ef058dc4 ("net: introduce and use custom sockopt socket flag")
  e993ffe3da ("net: flag sockets supporting msghdr originated zerocopy")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24 13:44:11 -07:00
Linus Torvalds
337a0a0b63 Merge tag 'net-6.1-rc3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf.

  The net-memcg fix stands out, the rest is very run-off-the-mill. Maybe
  I'm biased.

  Current release - regressions:

   - eth: fman: re-expose location of the MAC address to userspace,
     apparently some udev scripts depended on the exact value

  Current release - new code bugs:

   - bpf:
       - wait for busy refill_work when destroying bpf memory allocator
       - allow bpf_user_ringbuf_drain() callbacks to return 1
       - fix dispatcher patchable function entry to 5 bytes nop

  Previous releases - regressions:

   - net-memcg: avoid stalls when under memory pressure

   - tcp: fix indefinite deferral of RTO with SACK reneging

   - tipc: fix a null-ptr-deref in tipc_topsrv_accept

   - eth: macb: specify PHY PM management done by MAC

   - tcp: fix a signed-integer-overflow bug in tcp_add_backlog()

  Previous releases - always broken:

   - eth: amd-xgbe: SFP fixes and compatibility improvements

  Misc:

   - docs: netdev: offer performance feedback to contributors"

* tag 'net-6.1-rc3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
  net-memcg: avoid stalls when under memory pressure
  tcp: fix indefinite deferral of RTO with SACK reneging
  tcp: fix a signed-integer-overflow bug in tcp_add_backlog()
  net: lantiq_etop: don't free skb when returning NETDEV_TX_BUSY
  net: fix UAF issue in nfqnl_nf_hook_drop() when ops_init() failed
  docs: netdev: offer performance feedback to contributors
  kcm: annotate data-races around kcm->rx_wait
  kcm: annotate data-races around kcm->rx_psock
  net: fman: Use physical address for userspace interfaces
  net/mlx5e: Cleanup MACsec uninitialization routine
  atlantic: fix deadlock at aq_nic_stop
  nfp: only clean `sp_indiff` when application firmware is unloaded
  amd-xgbe: add the bit rate quirk for Molex cables
  amd-xgbe: fix the SFP compliance codes check for DAC cables
  amd-xgbe: enable PLL_CTL for fixed PHY modes only
  amd-xgbe: use enums for mailbox cmd and sub_cmds
  amd-xgbe: Yellow carp devices do not need rrc
  bpf: Use __llist_del_all() whenever possbile during memory draining
  bpf: Wait for busy refill_work when destroying bpf memory allocator
  MAINTAINERS: add keyword match on PTP
  ...
2022-10-24 12:43:51 -07:00
Jakub Kicinski
720ca52bce net-memcg: avoid stalls when under memory pressure
As Shakeel explains the commit under Fixes had the unintended
side-effect of no longer pre-loading the cached memory allowance.
Even tho we previously dropped the first packet received when
over memory limit - the consecutive ones would get thru by using
the cache. The charging was happening in batches of 128kB, so
we'd let in 128kB (truesize) worth of packets per one drop.

After the change we no longer force charge, there will be no
cache filling side effects. This causes significant drops and
connection stalls for workloads which use a lot of page cache,
since we can't reclaim page cache under GFP_NOWAIT.

Some of the latency can be recovered by improving SACK reneg
handling but nowhere near enough to get back to the pre-5.15
performance (the application I'm experimenting with still
sees 5-10x worst latency).

Apply the suggested workaround of using GFP_ATOMIC. We will now
be more permissive than previously as we'll drop _no_ packets
in softirq when under pressure. But I can't think of any good
and simple way to address that within networking.

Link: https://lore.kernel.org/all/20221012163300.795e7b86@kernel.org/
Suggested-by: Shakeel Butt <shakeelb@google.com>
Fixes: 4b1327be9f ("net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem()")
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Link: https://lore.kernel.org/r/20221021160304.1362511-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24 10:35:09 -07:00
Jakub Kicinski
e28c44450b Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2022-10-23

We've added 7 non-merge commits during the last 18 day(s) which contain
a total of 8 files changed, 69 insertions(+), 5 deletions(-).

The main changes are:

1) Wait for busy refill_work when destroying bpf memory allocator, from Hou.

2) Allow bpf_user_ringbuf_drain() callbacks to return 1, from David.

3) Fix dispatcher patchable function entry to 5 bytes nop, from Jiri.

4) Prevent decl_tag from being referenced in func_proto, from Stanislav.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Use __llist_del_all() whenever possbile during memory draining
  bpf: Wait for busy refill_work when destroying bpf memory allocator
  bpf: Fix dispatcher patchable function entry to 5 bytes nop
  bpf: prevent decl_tag from being referenced in func_proto
  selftests/bpf: Add reproducer for decl_tag in func_proto return type
  selftests/bpf: Make bpf_user_ringbuf_drain() selftest callback return 1
  bpf: Allow bpf_user_ringbuf_drain() callbacks to return 1
====================

Link: https://lore.kernel.org/r/20221023192244.81137-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24 10:32:01 -07:00
Yunsheng Lin
4727bab4e9 net: skb: move skb_pp_recycle() to skbuff.c
skb_pp_recycle() is only used by skb_free_head() in
skbuff.c, so move it to skbuff.c.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24 13:03:43 +01:00
xu xin
233baf9a1b net: remove useless parameter of __sock_cmsg_send
The parameter 'msg' has never been used by __sock_cmsg_send, so we can remove it
safely.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Reviewed-by: Zhang Yunkai <zhang.yunkai@zte.com.cn>
Acked-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24 12:43:46 +01:00
Eric Dumazet
0cafd77dcd net: add a refcount tracker for kernel sockets
Commit ffa84b5ffb ("net: add netns refcount tracker to struct sock")
added a tracker to sockets, but did not track kernel sockets.

We still have syzbot reports hinting about netns being destroyed
while some kernel TCP sockets had not been dismantled.

This patch tracks kernel sockets, and adds a ref_tracker_dir_print()
call to net_free() right before the netns is freed.

Normally, each layer is responsible for properly releasing its
kernel sockets before last call to net_free().

This debugging facility is enabled with CONFIG_NET_NS_REFCNT_TRACKER=y

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Tested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24 11:04:43 +01:00
Paolo Abeni
8a3854c7b8 udp: track the forward memory release threshold in an hot cacheline
When the receiver process and the BH runs on different cores,
udp_rmem_release() experience a cache miss while accessing sk_rcvbuf,
as the latter shares the same cacheline with sk_forward_alloc, written
by the BH.

With this patch, UDP tracks the rcvbuf value and its update via custom
SOL_SOCKET socket options, and copies the forward memory threshold value
used by udp_rmem_release() in a different cacheline, already accessed by
the above function and uncontended.

Since the UDP socket init operation grown a bit, factor out the common
code between v4 and v6 in a shared helper.

Overall the above give a 10% peek throughput increase under UDP flood.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24 10:52:50 +01:00
Paolo Abeni
a5ef058dc4 net: introduce and use custom sockopt socket flag
We will soon introduce custom setsockopt for UDP sockets, too.
Instead of doing even more complex arbitrary checks inside
sock_use_custom_sol_socket(), add a new socket flag and set it
for the relevant socket types (currently only MPTCP).

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24 10:52:50 +01:00
Amit Cohen
404c76783f ethtool: Add support for 800Gbps link modes
Add support for 800Gbps speed, link modes of 100Gbps per lane.
As mentioned in slide 21 in IEEE documentation [1], all adopted 802.3df
copper and optical PMDs baselines using 100G/lane will be supported.

Add the relevant PMDs which are mentioned in slide 5 in IEEE
documentation [1] and were approved on 10-2022 [2]:
BP - KR8
Cu Cable - CR8
MMF 50m - VR8
MMF 100m - SR8
SMF 500m - DR8
SMF 2km - DR8-2

[1]: https://www.ieee802.org/3/df/public/22_10/22_1004/shrikhande_3df_01a_221004.pdf
[2]: https://ieee802.org/3/df/KeyMotions_3df_221005.pdf

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24 10:43:39 +01:00
Kuniyuki Iwashima
1f8c4eeb94 inet6: Remove inet6_destroy_sock().
The last user of inet6_destroy_sock() is its wrapper inet6_cleanup_sock().
Let's rename inet6_destroy_sock() to inet6_cleanup_sock().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24 09:40:39 +01:00
Linus Torvalds
05b4ebd2c7 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "RISC-V:

   - Fix compilation without RISCV_ISA_ZICBOM

   - Fix kvm_riscv_vcpu_timer_pending() for Sstc

  ARM:

   - Fix a bug preventing restoring an ITS containing mappings for very
     large and very sparse device topology

   - Work around a relocation handling error when compiling the nVHE
     object with profile optimisation

   - Fix for stage-2 invalidation holding the VM MMU lock for too long
     by limiting the walk to the largest block mapping size

   - Enable stack protection and branch profiling for VHE

   - Two selftest fixes

  x86:

   - add compat implementation for KVM_X86_SET_MSR_FILTER ioctl

  selftests:

   - synchronize includes between include/uapi and tools/include/uapi"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  tools: include: sync include/api/linux/kvm.h
  KVM: x86: Add compat handler for KVM_X86_SET_MSR_FILTER
  KVM: x86: Copy filter arg outside kvm_vm_ioctl_set_msr_filter()
  kvm: Add support for arch compat vm ioctls
  RISC-V: KVM: Fix kvm_riscv_vcpu_timer_pending() for Sstc
  RISC-V: Fix compilation without RISCV_ISA_ZICBOM
  KVM: arm64: vgic: Fix exit condition in scan_its_table()
  KVM: arm64: nvhe: Fix build with profile optimization
  KVM: selftests: Fix number of pages for memory slot in memslot_modification_stress_test
  KVM: arm64: selftests: Fix multiple versions of GIC creation
  KVM: arm64: Enable stack protection and branch profiling for VHE
  KVM: arm64: Limit stage2_apply_range() batch size to largest block
  KVM: arm64: Work out supported block level at compile time
2022-10-23 15:00:43 -07:00
Linus Torvalds
52826d3b2d kernel/utsname_sysctl.c: Fix hostname polling
Commit bfca3dd3d0 ("kernel/utsname_sysctl.c: print kernel arch") added
a new entry to the uts_kern_table[] array, but didn't update the
UTS_PROC_xyz enumerators of older entries, breaking anything that used
them.

Which is admittedly not many cases: it's really just the two uses of
uts_proc_notify() in kernel/sys.c.  But apparently journald-systemd
actually uses this to detect hostname changes.

Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Fixes: bfca3dd3d0 ("kernel/utsname_sysctl.c: print kernel arch")
Link: https://lore.kernel.org/lkml/0c2b92a6-0f25-9538-178f-eee3b06da23f@secunet.com/
Link: https://linux-regtracking.leemhuis.info/regzbot/regression/0c2b92a6-0f25-9538-178f-eee3b06da23f@secunet.com/
Cc: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-23 12:01:01 -07:00
Linus Torvalds
a703852408 Merge tag 'perf_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:

 - Fix raw data handling when perf events are used in bpf

 - Rework how SIGTRAPs get delivered to events to address a bunch of
   problems with it. Add a selftest for that too

* tag 'perf_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  bpf: Fix sample_flags for bpf_perf_event_output
  selftests/perf_events: Add a SIGTRAP stress test with disables
  perf: Fix missing SIGTRAPs
2022-10-23 10:14:45 -07:00
Linus Torvalds
295dad10bf Merge tag 'x86_urgent_for_v6.0_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
 "As usually the case, right after a major release, the tip urgent
  branches accumulate a couple more fixes than normal. And here is the
  x86, a bit bigger, urgent pile.

   - Use the correct CPU capability clearing function on the error path
     in Intel perf LBR

   - A CFI fix to ftrace along with a simplification

   - Adjust handling of zero capacity bit mask for resctrl cache
     allocation on AMD

   - A fix to the AMD microcode loader to attempt patch application on
     every logical thread

   - A couple of topology fixes to handle CPUID leaf 0x1f enumeration
     info properly

   - Drop a -mabi=ms compiler option check as both compilers support it
     now anyway

   - A couple of fixes to how the initial, statically allocated FPU
     buffer state is setup and its interaction with dynamic states at
     runtime"

* tag 'x86_urgent_for_v6.0_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Fix copy_xstate_to_uabi() to copy init states correctly
  perf/x86/intel/lbr: Use setup_clear_cpu_cap() instead of clear_cpu_cap()
  ftrace,kcfi: Separate ftrace_stub() and ftrace_stub_graph()
  x86/ftrace: Remove ftrace_epilogue()
  x86/resctrl: Fix min_cbm_bits for AMD
  x86/microcode/AMD: Apply the patch early on every logical thread
  x86/topology: Fix duplicated core ID within a package
  x86/topology: Fix multiple packages shown on a single-package system
  hwmon/coretemp: Handle large core ID value
  x86/Kconfig: Drop check for -mabi=ms for CONFIG_EFI_STUB
  x86/fpu: Exclude dynamic states from init_fpstate
  x86/fpu: Fix the init_fpstate size check with the actual size
  x86/fpu: Configure init_fpstate attributes orderly
2022-10-23 10:01:34 -07:00
Linus Torvalds
942e01ab90 Merge tag 'io_uring-6.1-2022-10-22' of git://git.kernel.dk/linux
Pull io_uring follow-up from Jens Axboe:
 "Currently the zero-copy has automatic fallback to normal transmit, and
  it was decided that it'd be cleaner to return an error instead if the
  socket type doesn't support it.

  Zero-copy does work with UDP and TCP, it's more of a future proofing
  kind of thing (eg for samba)"

* tag 'io_uring-6.1-2022-10-22' of git://git.kernel.dk/linux:
  io_uring/net: fail zc sendmsg when unsupported by socket
  io_uring/net: fail zc send when unsupported by socket
  net: flag sockets supporting msghdr originated zerocopy
2022-10-23 09:55:50 -07:00
Linus Torvalds
3272eb1ace Merge tag 'media/v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull missed media updates from Mauro Carvalho Chehab:
 "It seems I screwed-up my previous pull request: it ends up that only
  half of the media patches that were in linux-next got merged in -rc1.

  The script which creates the signed tags silently failed due to
  5.19->6.0 so it ended generating a tag with incomplete stuff.

  So here are the missing parts:

   - a DVB core security fix

   - lots of fixes and cleanups for atomisp staging driver

   - old drivers that are VB1 are being moved to staging to be
     deprecated

   - several driver updates - mostly for embedded systems, but there are
     also some things addressing issues with some PC webcams, in the UVC
     video driver"

* tag 'media/v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (163 commits)
  media: sun6i-csi: Move csi buffer definition to main header file
  media: sun6i-csi: Introduce and use video helper functions
  media: sun6i-csi: Add media ops with link notify callback
  media: sun6i-csi: Remove controls handler from the driver
  media: sun6i-csi: Register the media device after creation
  media: sun6i-csi: Pass and store csi device directly in video code
  media: sun6i-csi: Tidy up video code
  media: sun6i-csi: Tidy up v4l2 code
  media: sun6i-csi: Tidy up Kconfig
  media: sun6i-csi: Use runtime pm for clocks and reset
  media: sun6i-csi: Define and use variant to get module clock rate
  media: sun6i-csi: Always set exclusive module clock rate
  media: sun6i-csi: Tidy up platform code
  media: sun6i-csi: Refactor main driver data structures
  media: sun6i-csi: Define and use driver name and (reworked) description
  media: cedrus: Add a Kconfig dependency on RESET_CONTROLLER
  media: sun8i-rotate: Add a Kconfig dependency on RESET_CONTROLLER
  media: sun8i-di: Add a Kconfig dependency on RESET_CONTROLLER
  media: sun4i-csi: Add a Kconfig dependency on RESET_CONTROLLER
  media: sun6i-csi: Add a Kconfig dependency on RESET_CONTROLLER
  ...
2022-10-22 15:30:15 -07:00
Pavel Begunkov
e993ffe3da net: flag sockets supporting msghdr originated zerocopy
We need an efficient way in io_uring to check whether a socket supports
zerocopy with msghdr provided ubuf_info. Add a new flag into the struct
socket flags fields.

Cc: <stable@vger.kernel.org> # 6.0
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/3dafafab822b1c66308bb58a0ac738b1e3f53f74.1666346426.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-22 08:42:58 -06:00
Alexander Graf
ed51862f2f kvm: Add support for arch compat vm ioctls
We will introduce the first architecture specific compat vm ioctl in the
next patch. Add all necessary boilerplate to allow architectures to
override compat vm ioctls when necessary.

Signed-off-by: Alexander Graf <graf@amazon.com>
Message-Id: <20221017184541.2658-2-graf@amazon.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22 05:15:23 -04:00
Linus Torvalds
9d6e681d33 Merge tag 'acpi-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
 "These fix issues introduced during this merge window (ACPI/PCI, device
  enumeration and documentation) and some other ones found recently.

  Specifics:

   - Add missing device reference counting to acpi_get_pci_dev() after
     changing it recently (Rafael Wysocki)

   - Fix resource list walk in acpi_dma_get_range() (Robin Murphy)

   - Add IRQ override quirk for LENOVO IdeaPad and extend the IRQ
     override warning message (Jiri Slaby)

   - Fix integer overflow in ghes_estatus_pool_init() (Ashish Kalra)

   - Fix multiple error records handling in one of the ACPI extlog
     driver code paths (Tony Luck)

   - Prune DSDT override documentation from index after dropping it
     (Bagas Sanjaya)"

* tag 'acpi-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: scan: Fix DMA range assignment
  ACPI: PCI: Fix device reference counting in acpi_get_pci_dev()
  ACPI: resource: note more about IRQ override
  ACPI: resource: do IRQ override on LENOVO IdeaPad
  ACPI: extlog: Handle multiple records
  ACPI: APEI: Fix integer overflow in ghes_estatus_pool_init()
  Documentation: ACPI: Prune DSDT override documentation from index
2022-10-21 18:08:30 -07:00
Linus Torvalds
ec4cf5dbb1 Merge tag 'efi-fixes-for-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:

 - fixes for the EFI variable store refactor that landed in v6.0

 - fixes for issues that were introduced during the merge window

 - back out some changes related to EFI zboot signing - we'll add a
   better solution for this during the next cycle

* tag 'efi-fixes-for-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: runtime: Don't assume virtual mappings are missing if VA == PA == 0
  efi: libstub: Fix incorrect payload size in zboot header
  efi: libstub: Give efi_main() asmlinkage qualification
  efi: efivars: Fix variable writes without query_variable_store()
  efi: ssdt: Don't free memory if ACPI table was loaded successfully
  efi: libstub: Remove zboot signing from build options
2022-10-21 18:02:36 -07:00
Linus Torvalds
e97eace635 Merge tag 'iommu-fixes-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
 "Intel VT-d fixes:

   - Fix a lockdep splat issue in intel_iommu_init()

   - Allow NVS regions to pass RMRR check

   - Domain cleanup in error path"

* tag 'iommu-fixes-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Clean up si_domain in the init_dmars() error path
  iommu/vt-d: Allow NVS regions in arch_rmrr_sanity_check()
  iommu/vt-d: Use rcu_lock in get_resv_regions
  iommu: Add gfp parameter to iommu_alloc_resv_region
2022-10-21 17:47:39 -07:00
Linus Torvalds
4f1e0c18bc Merge tag 'linux-watchdog-6.1-rc2' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:

 - Add tracing events for the most common watchdog events

* tag 'linux-watchdog-6.1-rc2' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: Add tracing events for the most usual watchdog events
2022-10-21 12:25:39 -07:00
Rafael J. Wysocki
3f8deab61e Merge branches 'acpi-scan', 'acpi-resource', 'acpi-apei', 'acpi-extlog' and 'acpi-docs'
Merge assorted ACPI fixes for 6.1-rc2:

 - Fix resource list walk in acpi_dma_get_range() (Robin Murphy).

 - Add IRQ override quirk for LENOVO IdeaPad and extend the IRQ
   override warning message (Jiri Slaby).

 - Fix integer overflow in ghes_estatus_pool_init() (Ashish Kalra).

 - Fix multiple error records handling in one of the ACPI extlog driver
   code paths (Tony Luck).

 - Prune DSDT override documentation from index after dropping it (Bagas
   Sanjaya).

* acpi-scan:
  ACPI: scan: Fix DMA range assignment

* acpi-resource:
  ACPI: resource: note more about IRQ override
  ACPI: resource: do IRQ override on LENOVO IdeaPad

* acpi-apei:
  ACPI: APEI: Fix integer overflow in ghes_estatus_pool_init()

* acpi-extlog:
  ACPI: extlog: Handle multiple records

* acpi-docs:
  Documentation: ACPI: Prune DSDT override documentation from index
2022-10-21 20:07:41 +02:00
Ard Biesheuvel
8a254d90a7 efi: efivars: Fix variable writes without query_variable_store()
Commit bbc6d2c6ef ("efi: vars: Switch to new wrapper layer")
refactored the efivars layer so that the 'business logic' related to
which UEFI variables affect the boot flow in which way could be moved
out of it, and into the efivarfs driver.

This inadvertently broke setting variables on firmware implementations
that lack the QueryVariableInfo() boot service, because we no longer
tolerate a EFI_UNSUPPORTED result from check_var_size() when calling
efivar_entry_set_get_size(), which now ends up calling check_var_size()
a second time inadvertently.

If QueryVariableInfo() is missing, we support writes of up to 64k -
let's move that logic into check_var_size(), and drop the redundant
call.

Cc: <stable@vger.kernel.org> # v6.0
Fixes: bbc6d2c6ef ("efi: vars: Switch to new wrapper layer")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-21 11:09:40 +02:00
Lu Baolu
0251d0107c iommu: Add gfp parameter to iommu_alloc_resv_region
Add gfp parameter to iommu_alloc_resv_region() for the callers to specify
the memory allocation behavior. Thus iommu_alloc_resv_region() could also
be available in critical contexts.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20220927053109.4053662-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-10-21 10:49:32 +02:00
Alexey Kodanev
6fdfdef7fd sctp: remove unnecessary NULL check in sctp_association_init()
'&asoc->ulpq' passed to sctp_ulpq_init() as the first argument,
then sctp_qlpq_init() initializes it and eventually returns the
address of the struct member back. Therefore, in this case, the
return pointer cannot be NULL.

Moreover, it seems sctp_ulpq_init() has always been used only in
sctp_association_init(), so there's really no need to return ulpq
anymore.

Detected using the static analysis tool - Svace.
Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20221019180735.161388-1-aleksei.kodanev@bell-sw.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-20 21:43:10 -07:00
Jiri Olsa
dbe69b2998 bpf: Fix dispatcher patchable function entry to 5 bytes nop
The patchable_function_entry(5) might output 5 single nop
instructions (depends on toolchain), which will clash with
bpf_arch_text_poke check for 5 bytes nop instruction.

Adding early init call for dispatcher that checks and change
the patchable entry into expected 5 nop instruction if needed.

There's no need to take text_mutex, because we are using it
in early init call which is called at pre-smp time.

Fixes: ceea991a01 ("bpf: Move bpf_dispatcher function out of ftrace locations")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221018075934.574415-1-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-20 18:57:51 -07:00
Linus Torvalds
e35184f321 Merge tag 'drm-fixes-2022-10-21' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Usual fixes for the week.

  The amdgpu contains fixes for two regressions, one reported in
  response to rc1 which broke on SI GPUs, and one gfx9 APU regression.

  Otherwise it's mostly fixes for new IP, and some GPU reset fixes. vc4
  is just HDMI fixes, and panfrost has some mnor types fixes.

  Core:
   - fix connector DDC pointer
   - fix buffer overflow in format_helper_test

  amdgpu:
   - Mode2 reset fixes for Sienna Cichlid
   - Revert broken fan speed sensor fix
   - SMU 13.x fixes
   - GC 11.x fixes
   - RAS fixes
   - SR-IOV fixes
   - Fix BO move breakage on SI
   - Misc compiler fixes
   - Fix gfx9 APU regression caused by PCI AER fix

  vc4:
   - HDMI fixes

  panfrost:
   - compiler fixes"

* tag 'drm-fixes-2022-10-21' of git://anongit.freedesktop.org/drm/drm: (35 commits)
  drm/amdgpu: fix sdma doorbell init ordering on APUs
  drm/panfrost: replace endian-specific types with native ones
  drm/panfrost: Remove type name from internal structs
  drm/connector: Set DDC pointer in drmm_connector_init
  drm: tests: Fix a buffer overflow in format_helper_test
  drm/amdgpu: use DRM_SCHED_FENCE_DONT_PIPELINE for VM updates
  drm/sched: add DRM_SCHED_FENCE_DONT_PIPELINE flag
  drm/amdgpu: Fix for BO move issue
  drm/amdgpu: dequeue mes scheduler during fini
  drm/amd/pm: enable thermal alert on smu_v13_0_10
  drm/amdgpu: Program GC registers through RLCG interface in gfx_v11/gmc_v11
  drm/amdkfd: Fix type of reset_type parameter in hqd_destroy() callback
  drm/amd/display: Increase frame size limit for display_mode_vba_util_32.o
  drm/amd/pm: add SMU IP v13.0.4 IF version define to V7
  drm/amd/pm: update SMU IP v13.0.4 driver interface version
  drm/amd/pm: Init pm_attr_list when dpm is disabled
  drm/amd/pm: disable cstate feature for gpu reset scenario
  drm/amd/pm: fulfill SMU13.0.7 cstate control interface
  drm/amd/pm: fulfill SMU13.0.0 cstate control interface
  drm/amdgpu: Add sriov vf ras support in amdgpu_ras_asic_supported
  ...
2022-10-20 18:28:32 -07:00
Jakub Kicinski
94adb5e29e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-20 17:49:10 -07:00
Linus Torvalds
6d36c728bc Merge tag 'net-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter.

  Current release - regressions:

   - revert "net: fix cpu_max_bits_warn() usage in
     netif_attrmask_next{,_and}"

   - revert "net: sched: fq_codel: remove redundant resource cleanup in
     fq_codel_init()"

   - dsa: uninitialized variable in dsa_slave_netdevice_event()

   - eth: sunhme: uninitialized variable in happy_meal_init()

  Current release - new code bugs:

   - eth: octeontx2: fix resource not freed after malloc

  Previous releases - regressions:

   - sched: fix return value of qdisc ingress handling on success

   - sched: fix race condition in qdisc_graft()

   - udp: update reuse->has_conns under reuseport_lock.

   - tls: strp: make sure the TCP skbs do not have overlapping data

   - hsr: avoid possible NULL deref in skb_clone()

   - tipc: fix an information leak in tipc_topsrv_kern_subscr

   - phylink: add mac_managed_pm in phylink_config structure

   - eth: i40e: fix DMA mappings leak

   - eth: hyperv: fix a RX-path warning

   - eth: mtk: fix memory leaks

  Previous releases - always broken:

   - sched: cake: fix null pointer access issue when cake_init() fails"

* tag 'net-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
  net: phy: dp83822: disable MDI crossover status change interrupt
  net: sched: fix race condition in qdisc_graft()
  net: hns: fix possible memory leak in hnae_ae_register()
  wwan_hwsim: fix possible memory leak in wwan_hwsim_dev_new()
  sfc: include vport_id in filter spec hash and equal()
  genetlink: fix kdoc warnings
  selftests: add selftest for chaining of tc ingress handling to egress
  net: Fix return value of qdisc ingress handling on success
  net: sched: sfb: fix null pointer access issue when sfb_init() fails
  Revert "net: sched: fq_codel: remove redundant resource cleanup in fq_codel_init()"
  net: sched: cake: fix null pointer access issue when cake_init() fails
  ethernet: marvell: octeontx2 Fix resource not freed after malloc
  netfilter: nf_tables: relax NFTA_SET_ELEM_KEY_END set flags requirements
  netfilter: rpfilter/fib: Set ->flowic_uid correctly for user namespaces.
  ionic: catch NULL pointer issue on reconfig
  net: hsr: avoid possible NULL deref in skb_clone()
  bnxt_en: fix memory leak in bnxt_nvm_test()
  ip6mr: fix UAF issue in ip6mr_sk_done() when addrconf_init_net() failed
  udp: Update reuse->has_conns under reuseport_lock.
  net: ethernet: mediatek: ppe: Remove the unused function mtk_foe_entry_usable()
  ...
2022-10-20 17:24:59 -07:00
Peter Zijlstra
883bbbffa5 ftrace,kcfi: Separate ftrace_stub() and ftrace_stub_graph()
Different function signatures means they needs to be different
functions; otherwise CFI gets upset.

As triggered by the ftrace boot tests:

  [] CFI failure at ftrace_return_to_handler+0xac/0x16c (target: ftrace_stub+0x0/0x14; expected type: 0x0a5d5347)

Fixes: 3c516f89e1 ("x86: Add support for CONFIG_CFI_CLANG")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lkml.kernel.org/r/Y06dg4e1xF6JTdQq@hirez.programming.kicks-ass.net
2022-10-20 17:10:27 +02:00
Steven Price
72655fb942 drm/panfrost: replace endian-specific types with native ones
__le32 and __le64 types aren't portable and are not available on
FreeBSD (which uses the same uAPI).

Instead of attempting to always output little endian, just use native
endianness in the dumps. Tools can detect the endianness in use by
looking at the 'magic' field, but equally we don't expect big-endian to
be used with Mali (there are no known implementations out there).

Bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7252
Fixes: 730c2bf4ad ("drm/panfrost: Add support for devcoredump")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017104602.142992-3-steven.price@arm.com
2022-10-20 11:02:11 +01:00
Steven Price
7228d9d792 drm/panfrost: Remove type name from internal structs
The two structs internal to struct panfrost_dump_object_header were
named, but sadly that is incompatible with C++, causing an error: "an
anonymous union may only have public non-static data members".

However nothing refers to struct pan_reg_hdr and struct pan_bomap_hdr
and there's no need to export these definitions, so lets drop them. This
fixes the C++ build error with the minimum change in userspace API.

Reported-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: 730c2bf4ad ("drm/panfrost: Add support for devcoredump")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017104602.142992-2-steven.price@arm.com
2022-10-20 11:02:01 +01:00
Thomas Zimmermann
1aca5ce036 Merge drm/drm-fixes into drm-misc-fixes
Backmerging to get v6.1-rc1.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2022-10-20 09:09:00 +02:00
Jakub Kicinski
a1a824f448 genetlink: fix kdoc warnings
Address a bunch of kdoc warnings:

include/net/genetlink.h:81: warning: Function parameter or member 'module' not described in 'genl_family'
include/net/genetlink.h:243: warning: expecting prototype for struct genl_info. Prototype was for struct genl_dumpit_info instead
include/net/genetlink.h:419: warning: Function parameter or member 'net' not described in 'genlmsg_unicast'
include/net/genetlink.h:438: warning: expecting prototype for gennlmsg_data(). Prototype was for genlmsg_data() instead
include/net/genetlink.h:244: warning: Function parameter or member 'op' not described in 'genl_dumpit_info'

Link: https://lore.kernel.org/r/20221018231310.1040482-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-19 13:08:30 -07:00
Edward Cree
51c352bdbc netlink: add support for formatted extack messages
Include an 80-byte buffer in struct netlink_ext_ack that can be used
 for scnprintf()ed messages.  This does mean that the resulting string
 can't be enumerated, translated etc. in the way NL_SET_ERR_MSG() was
 designed to allow.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-19 13:00:04 -07:00
Russell King (Oracle)
f392a18464 net: phylink: provide phylink_validate_mask_caps() helper
Provide a helper that restricts the link modes according to the
phylink capabilities.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
[rebased on net-next/master and added documentation]
Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-19 13:25:09 +01:00
Christian König
7b476affcc drm/sched: add DRM_SCHED_FENCE_DONT_PIPELINE flag
Setting this flag on a scheduler fence prevents pipelining of jobs
depending on this fence. In other words we always insert a full CPU
round trip before dependent jobs are pushed to the pipeline.

Signed-off-by: Christian König <christian.koenig@amd.com>
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2113#note_1579296
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014081553.114899-1-christian.koenig@amd.com
2022-10-19 12:42:51 +02:00
Arnd Bergmann
a2fd08448f net: remove smc911x driver
This driver was used on Arm and SH machines until 2009, when the
last platforms moved to the smsc911x driver for the same hardware.

Time to retire this version.

Link: https://lore.kernel.org/netdev/1232010482-3744-1-git-send-email-steve.glendinning@smsc.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20221017121900.3520108-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-18 19:33:51 -07:00
Jakub Kicinski
3566a79c9e Merge tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-10-18

We've added 33 non-merge commits during the last 14 day(s) which contain
a total of 31 files changed, 874 insertions(+), 538 deletions(-).

The main changes are:

1) Add RCU grace period chaining to BPF to wait for the completion
   of access from both sleepable and non-sleepable BPF programs,
   from Hou Tao & Paul E. McKenney.

2) Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
   values. In the wild we have seen OS vendors doing buggy backports
   where helper call numbers mismatched. This is an attempt to make
   backports more foolproof, from Andrii Nakryiko.

3) Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions,
   from Roberto Sassu.

4) Fix libbpf's BTF dumper for structs with padding-only fields,
   from Eduard Zingerman.

5) Fix various libbpf bugs which have been found from fuzzing with
   malformed BPF object files, from Shung-Hsi Yu.

6) Clean up an unneeded check on existence of SSE2 in BPF x86-64 JIT,
   from Jie Meng.

7) Fix various ASAN bugs in both libbpf and selftests when running
   the BPF selftest suite on arm64, from Xu Kuohai.

8) Fix missing bpf_iter_vma_offset__destroy() call in BPF iter selftest
   and use in-skeleton link pointer to remove an explicit bpf_link__destroy(),
   from Jiri Olsa.

9) Fix BPF CI breakage by pointing to iptables-legacy instead of relying
   on symlinked iptables which got upgraded to iptables-nft,
   from Martin KaFai Lau.

10) Minor BPF selftest improvements all over the place, from various others.

* tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (33 commits)
  bpf/docs: Update README for most recent vmtest.sh
  bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
  bpf: Use rcu_trace_implies_rcu_gp() in local storage map
  bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
  rcu-tasks: Provide rcu_trace_implies_rcu_gp()
  selftests/bpf: Use sys_pidfd_open() helper when possible
  libbpf: Fix null-pointer dereference in find_prog_by_sec_insn()
  libbpf: Deal with section with no data gracefully
  libbpf: Use elf_getshdrnum() instead of e_shnum
  selftest/bpf: Fix error usage of ASSERT_OK in xdp_adjust_tail.c
  selftests/bpf: Fix error failure of case test_xdp_adjust_tail_grow
  selftest/bpf: Fix memory leak in kprobe_multi_test
  selftests/bpf: Fix memory leak caused by not destroying skeleton
  libbpf: Fix memory leak in parse_usdt_arg()
  libbpf: Fix use-after-free in btf_dump_name_dups
  selftests/bpf: S/iptables/iptables-legacy/ in the bpf_nf and xdp_synproxy test
  selftests/bpf: Alphabetize DENYLISTs
  selftests/bpf: Add tests for _opts variants of bpf_*_get_fd_by_id()
  libbpf: Introduce bpf_link_get_fd_by_id_opts()
  libbpf: Introduce bpf_btf_get_fd_by_id_opts()
  ...
====================

Link: https://lore.kernel.org/r/20221018210631.11211-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-18 18:56:43 -07:00
Paul E. McKenney
e6c86c513f rcu-tasks: Provide rcu_trace_implies_rcu_gp()
As an accident of implementation, an RCU Tasks Trace grace period also
acts as an RCU grace period.  However, this could change at any time.
This commit therefore creates an rcu_trace_implies_rcu_gp() that currently
returns true to codify this accident.  Code relying on this accident
must call this function to verify that this accident is still happening.

Reported-by: Hou Tao <houtao@huaweicloud.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Link: https://lore.kernel.org/r/20221014113946.965131-2-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-18 10:27:02 -07:00
Kuniyuki Iwashima
69421bf984 udp: Update reuse->has_conns under reuseport_lock.
When we call connect() for a UDP socket in a reuseport group, we have
to update sk->sk_reuseport_cb->has_conns to 1.  Otherwise, the kernel
could select a unconnected socket wrongly for packets sent to the
connected socket.

However, the current way to set has_conns is illegal and possible to
trigger that problem.  reuseport_has_conns() changes has_conns under
rcu_read_lock(), which upgrades the RCU reader to the updater.  Then,
it must do the update under the updater's lock, reuseport_lock, but
it doesn't for now.

For this reason, there is a race below where we fail to set has_conns
resulting in the wrong socket selection.  To avoid the race, let's split
the reader and updater with proper locking.

 cpu1                               cpu2
+----+                             +----+

__ip[46]_datagram_connect()        reuseport_grow()
.                                  .
|- reuseport_has_conns(sk, true)   |- more_reuse = __reuseport_alloc(more_socks_size)
|  .                               |
|  |- rcu_read_lock()
|  |- reuse = rcu_dereference(sk->sk_reuseport_cb)
|  |
|  |                               |  /* reuse->has_conns == 0 here */
|  |                               |- more_reuse->has_conns = reuse->has_conns
|  |- reuse->has_conns = 1         |  /* more_reuse->has_conns SHOULD BE 1 HERE */
|  |                               |
|  |                               |- rcu_assign_pointer(reuse->socks[i]->sk_reuseport_cb,
|  |                               |                     more_reuse)
|  `- rcu_read_unlock()            `- kfree_rcu(reuse, rcu)
|
|- sk->sk_state = TCP_ESTABLISHED

Note the likely(reuse) in reuseport_has_conns_set() is always true,
but we put the test there for ease of review.  [0]

For the record, usually, sk_reuseport_cb is changed under lock_sock().
The only exception is reuseport_grow() & TCP reqsk migration case.

  1) shutdown() TCP listener, which is moved into the latter part of
     reuse->socks[] to migrate reqsk.

  2) New listen() overflows reuse->socks[] and call reuseport_grow().

  3) reuse->max_socks overflows u16 with the new listener.

  4) reuseport_grow() pops the old shutdown()ed listener from the array
     and update its sk->sk_reuseport_cb as NULL without lock_sock().

shutdown()ed TCP sk->sk_reuseport_cb can be changed without lock_sock(),
but, reuseport_has_conns_set() is called only for UDP under lock_sock(),
so likely(reuse) never be false in reuseport_has_conns_set().

[0]: https://lore.kernel.org/netdev/CANn89iLja=eQHbsM_Ta2sQF0tOGU8vAGrh_izRuuHjuO1ouUag@mail.gmail.com/

Fixes: acdcecc612 ("udp: correct reuseport selection with connected sockets")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20221014182625.89913-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-18 10:17:18 +02:00