Commit Graph

996549 Commits

Author SHA1 Message Date
Ravi Bangoria
56901d483b selftests/bpf: Use nanosleep() syscall instead of sleep() in get_cgroup_id
Glibc's sleep() switched to clock_nanosleep() from nanosleep(), and thus
syscalls:sys_enter_nanosleep tracepoint is not hitting which is causing
testcase failure. Instead of depending on glibc sleep(), call nanosleep()
systemcall directly.

Before:

  # ./get_cgroup_id_user
  ...
  main:FAIL:compare_cgroup_id kern cgid 0 user cgid 483

After:

  # ./get_cgroup_id_user
  ...
  main:PASS:compare_cgroup_id

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210316153048.136447-1-ravi.bangoria@linux.ibm.com
2021-03-17 00:16:59 +01:00
Jiapeng Chong
ebda107e5f selftests/bpf: Fix warning comparing pointer to 0
Fix the following coccicheck warnings:

  ./tools/testing/selftests/bpf/progs/fexit_test.c:77:15-16: WARNING
  comparing pointer to 0.

  ./tools/testing/selftests/bpf/progs/fexit_test.c:68:12-13: WARNING
  comparing pointer to 0.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1615881577-3493-1-git-send-email-jiapeng.chong@linux.alibaba.com
2021-03-16 23:52:16 +01:00
Alexei Starovoitov
5531939a4d Merge branch 'Build BPF selftests and its libbpf, bpftool in debug mode'
Andrii Nakryiko says:

====================

Build BPF selftests and libbpf and bpftool, that are used as part of
selftests, in debug mode (specifically, -Og). This makes it much simpler and
nicer to do development and/or bug fixing. See patch #4 for some unscientific
measurements.

This patch set fixes new maybe-unitialized warnings produced in -Og build
mode. Patch #1 fixes the blocker which was causing some XDP selftests failures
due to non-zero padding in bpf_xdp_set_link_opts, which only happened in debug
mode.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-03-16 12:26:50 -07:00
Andrii Nakryiko
252e3cbf2b selftests/bpf: Build everything in debug mode
Build selftests, bpftool, and libbpf in debug mode with DWARF data to
facilitate easier debugging.

In terms of impact on building and running selftests. Build is actually faster
now:

BEFORE: make -j60  380.21s user 37.87s system 1466% cpu 28.503 total
AFTER:  make -j60  345.47s user 37.37s system 1599% cpu 23.939 total

test_progs runtime seems to be the same:

BEFORE:
real    1m5.139s
user    0m1.600s
sys     0m43.977s

AFTER:
real    1m3.799s
user    0m1.721s
sys     0m42.420s

Huge difference is being able to debug issues throughout test_progs, bpftool,
and libbpf without constantly updating 3 Makefiles by hand (including GDB
seeing the source code without any extra incantations).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210313210920.1959628-5-andrii@kernel.org
2021-03-16 12:26:49 -07:00
Andrii Nakryiko
105b842ba4 selftests/bpf: Fix maybe-uninitialized warning in xdpxceiver test
xsk_ring_prod__reserve() doesn't necessarily set idx in some conditions, so
from static analysis point of view compiler is right about the problems like:

In file included from xdpxceiver.c:92:
xdpxceiver.c: In function ‘xsk_populate_fill_ring’:
/data/users/andriin/linux/tools/testing/selftests/bpf/tools/include/bpf/xsk.h:119:20: warning: ‘idx’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  return &addrs[idx & fill->mask];
                ~~~~^~~~~~~~~~~~
xdpxceiver.c:300:6: note: ‘idx’ was declared here
  u32 idx;
      ^~~
xdpxceiver.c: In function ‘tx_only’:
xdpxceiver.c:596:30: warning: ‘idx’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   struct xdp_desc *tx_desc = xsk_ring_prod__tx_desc(&xsk->tx, idx + i);
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix two warnings reported by compiler by pre-initializing variable.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210313210920.1959628-4-andrii@kernel.org
2021-03-16 12:26:49 -07:00
Andrii Nakryiko
4bbb358368 bpftool: Fix maybe-uninitialized warnings
Somehow when bpftool is compiled in -Og mode, compiler produces new warnings
about possibly uninitialized variables. Fix all the reported problems.

Fixes: 2119f2189d ("bpftool: add C output format option to btf dump subcommand")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210313210920.1959628-3-andrii@kernel.org
2021-03-16 12:26:49 -07:00
Andrii Nakryiko
dde7b3f5f2 libbpf: Add explicit padding to bpf_xdp_set_link_opts
Adding such anonymous padding fixes the issue with uninitialized portions of
bpf_xdp_set_link_opts when using LIBBPF_DECLARE_OPTS macro with inline field
initialization:

DECLARE_LIBBPF_OPTS(bpf_xdp_set_link_opts, opts, .old_fd = -1);

When such code is compiled in debug mode, compiler is generating code that
leaves padding bytes uninitialized, which triggers error inside libbpf APIs
that do strict zero initialization checks for OPTS structs.

Adding anonymous padding field fixes the issue.

Fixes: bd5ca3ef93 ("libbpf: Add function to set link XDP fd while specifying old program")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210313210920.1959628-2-andrii@kernel.org
2021-03-16 12:26:49 -07:00
Wei Yongjun
4d0b93896f bpf: Make symbol 'bpf_task_storage_busy' static
The sparse tool complains as follows:

kernel/bpf/bpf_task_storage.c:23:1: warning:
 symbol '__pcpu_scope_bpf_task_storage_busy' was not declared. Should it be static?

This symbol is not used outside of bpf_task_storage.c, so this
commit marks it static.

Fixes: bc235cdb42 ("bpf: Prevent deadlock from recursive bpf_task_storage_[get|delete]")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210311131505.1901509-1-weiyongjun1@huawei.com
2021-03-16 12:24:20 -07:00
Liu xuzhi
6bd45f2e78 kernel/bpf/: Fix misspellings using codespell tool
A typo is found out by codespell tool in 34th lines of hashtab.c:

$ codespell ./kernel/bpf/
./hashtab.c:34 : differrent ==> different

Fix a typo found by codespell.

Signed-off-by: Liu xuzhi <liu.xuzhi@zte.com.cn>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210311123103.323589-1-liu.xuzhi@zte.com.cn
2021-03-16 12:22:20 -07:00
Ilya Leoshkevich
ba3b86b9ce s390/bpf: Implement new atomic ops
Implement BPF_AND, BPF_OR and BPF_XOR as the existing BPF_ADD. Since
the corresponding machine instructions return the old value, BPF_FETCH
happens by itself, the only additional thing that is required is
zero-extension.

There is no single instruction that implements BPF_XCHG on s390, so use
a COMPARE AND SWAP loop.

BPF_CMPXCHG, on the other hand, can be implemented by a single COMPARE
AND SWAP. Zero-extension is automatically inserted by the verifier.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210304233002.149096-1-iii@linux.ibm.com
2021-03-16 12:18:49 -07:00
Pedro Tammela
23f50b5ac3 bpf: selftests: Remove unused 'nospace_err' in tests for batched ops in array maps
This seems to be a reminiscent from the hashmap tests.

Signed-off-by: Pedro Tammela <pctammela@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210315132954.603108-1-pctammela@gmail.com
2021-03-15 22:19:33 -07:00
Masanari Iida
d94436a5d1 samples: bpf: Fix a spelling typo in do_hbm_test.sh
This patch fixes a spelling typo in do_hbm_test.sh

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210315124454.1744594-1-standby24x7@gmail.com
2021-03-15 22:17:35 -07:00
Pedro Tammela
0205e9de42 libbpf: Avoid inline hint definition from 'linux/stddef.h'
Linux headers might pull 'linux/stddef.h' which defines
'__always_inline' as the following:

   #ifndef __always_inline
   #define __always_inline inline
   #endif

This becomes an issue if the program picks up the 'linux/stddef.h'
definition as the macro now just hints inline to clang.

This change now enforces the proper definition for BPF programs
regardless of the include order.

Signed-off-by: Pedro Tammela <pctammela@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210314173839.457768-1-pctammela@gmail.com
2021-03-15 22:11:17 -07:00
Manu Bretelle
6503b9f29a bpf: Add getter and setter for SO_REUSEPORT through bpf_{g,s}etsockopt
Augment the current set of options that are accessible via
bpf_{g,s}etsockopt to also support SO_REUSEPORT.

Signed-off-by: Manu Bretelle <chantra@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210310182305.1910312-1-chantra@fb.com
2021-03-15 17:22:22 +01:00
Andrii Nakryiko
1211f4e9ae Merge branch 'libbpf/xsk cleanups'
Björn Töpel says:

====================

This series removes a header dependency from xsk.h, and moves
libbpf_util.h into xsk.h.

More details in each commit!

Thank you,
Björn
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-10 13:45:25 -08:00
Björn Töpel
7e8bbe24cb libbpf: xsk: Move barriers from libbpf_util.h to xsk.h
The only user of libbpf_util.h is xsk.h. Move the barriers to xsk.h,
and remove libbpf_util.h. The barriers are used as an implementation
detail, and should not be considered part of the stable API.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210310080929.641212-3-bjorn.topel@gmail.com
2021-03-10 13:45:16 -08:00
Björn Töpel
2882c48bf8 libbpf: xsk: Remove linux/compiler.h header
In commit 291471dd15 ("libbpf, xsk: Add libbpf_smp_store_release
libbpf_smp_load_acquire") linux/compiler.h was added as a dependency
to xsk.h, which is the user-facing API. This makes it harder for
userspace application to consume the library. Here the header
inclusion is removed, and instead {READ,WRITE}_ONCE() is added
explicitly.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210310080929.641212-2-bjorn.topel@gmail.com
2021-03-10 13:38:07 -08:00
Jiapeng Chong
a9c80b03e5 bpf: Fix warning comparing pointer to 0
Fix the following coccicheck warning:

./tools/testing/selftests/bpf/progs/fentry_test.c:67:12-13: WARNING
comparing pointer to 0.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1615360714-30381-1-git-send-email-jiapeng.chong@linux.alibaba.com
2021-03-10 13:37:33 -08:00
Jiapeng Chong
04ea63e34a selftests/bpf: Fix warning comparing pointer to 0
Fix the following coccicheck warning:

./tools/testing/selftests/bpf/progs/test_global_func10.c:17:12-13:
WARNING comparing pointer to 0.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1615357366-97612-1-git-send-email-jiapeng.chong@linux.alibaba.com
2021-03-10 13:37:11 -08:00
David S. Miller
c1acda9807 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-03-09

The following pull-request contains BPF updates for your *net-next* tree.

We've added 90 non-merge commits during the last 17 day(s) which contain
a total of 114 files changed, 5158 insertions(+), 1288 deletions(-).

The main changes are:

1) Faster bpf_redirect_map(), from Björn.

2) skmsg cleanup, from Cong.

3) Support for floating point types in BTF, from Ilya.

4) Documentation for sys_bpf commands, from Joe.

5) Support for sk_lookup in bpf_prog_test_run, form Lorenz.

6) Enable task local storage for tracing programs, from Song.

7) bpf_for_each_map_elem() helper, from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 18:07:05 -08:00
Linus Torvalds
05a59d7979 Merge git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix transmissions in dynamic SMPS mode in ath9k, from Felix Fietkau.

 2) TX skb error handling fix in mt76 driver, also from Felix.

 3) Fix BPF_FETCH atomic in x86 JIT, from Brendan Jackman.

 4) Avoid double free of percpu pointers when freeing a cloned bpf prog.
    From Cong Wang.

 5) Use correct printf format for dma_addr_t in ath11k, from Geert
    Uytterhoeven.

 6) Fix resolve_btfids build with older toolchains, from Kun-Chuan
    Hsieh.

 7) Don't report truncated frames to mac80211 in mt76 driver, from
    Lorenzop Bianconi.

 8) Fix watcdog timeout on suspend/resume of stmmac, from Joakim Zhang.

 9) mscc ocelot needs NET_DEVLINK selct in Kconfig, from Arnd Bergmann.

10) Fix sign comparison bug in TCP_ZEROCOPY_RECEIVE getsockopt(), from
    Arjun Roy.

11) Ignore routes with deleted nexthop object in mlxsw, from Ido
    Schimmel.

12) Need to undo tcp early demux lookup sometimes in nf_nat, from
    Florian Westphal.

13) Fix gro aggregation for udp encaps with zero csum, from Daniel
    Borkmann.

14) Make sure to always use imp*_ndo_send when necessaey, from Jason A.
    Donenfeld.

15) Fix TRSCER masks in sh_eth driver from Sergey Shtylyov.

16) prevent overly huge skb allocationsd in qrtr, from Pavel Skripkin.

17) Prevent rx ring copnsumer index loss of sync in enetc, from Vladimir
    Oltean.

18) Make sure textsearch copntrol block is large enough, from Wilem de
    Bruijn.

19) Revert MAC changes to r8152 leading to instability, from Hates Wang.

20) Advance iov in 9p even for empty reads, from Jissheng Zhang.

21) Double hook unregister in nftables, from PabloNeira Ayuso.

22) Fix memleak in ixgbe, fropm Dinghao Liu.

23) Avoid dups in pkt scheduler class dumps, from Maximilian Heyne.

24) Various mptcp fixes from Florian Westphal, Paolo Abeni, and Geliang
    Tang.

25) Fix DOI refcount bugs in cipso, from Paul Moore.

26) One too many irqsave in ibmvnic, from Junlin Yang.

27) Fix infinite loop with MPLS gso segmenting via virtio_net, from
    Balazs Nemeth.

* git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net: (164 commits)
  s390/qeth: fix notification for pending buffers during teardown
  s390/qeth: schedule TX NAPI on QAOB completion
  s390/qeth: improve completion of pending TX buffers
  s390/qeth: fix memory leak after failed TX Buffer allocation
  net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0
  net: check if protocol extracted by virtio_net_hdr_set_proto is correct
  net: dsa: xrs700x: check if partner is same as port in hsr join
  net: lapbether: Remove netif_start_queue / netif_stop_queue
  atm: idt77252: fix null-ptr-dereference
  atm: uPD98402: fix incorrect allocation
  atm: fix a typo in the struct description
  net: qrtr: fix error return code of qrtr_sendmsg()
  mptcp: fix length of ADD_ADDR with port sub-option
  net: bonding: fix error return code of bond_neigh_init()
  net: enetc: allow hardware timestamping on TX queues with tc-etf enabled
  net: enetc: set MAC RX FIFO to recommended value
  net: davicom: Use platform_get_irq_optional()
  net: davicom: Fix regulator not turned off on driver removal
  net: davicom: Fix regulator not turned off on failed probe
  net: dsa: fix switchdev objects on bridge master mistakenly being applied on ports
  ...
2021-03-09 17:15:56 -08:00
Linus Torvalds
6a30bedfdf Merge git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "Fix opcode filtering for exceptions, and clean up defconfig"

* git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc:
  sparc: sparc64_defconfig: remove duplicate CONFIGs
  sparc64: Fix opcode filtering in handling of no fault loads
2021-03-09 17:08:41 -08:00
Corentin Labbe
69264b4a43 sparc: sparc64_defconfig: remove duplicate CONFIGs
After my patch there is CONFIG_ATA defined twice.
Remove the duplicate one.
Same problem for CONFIG_HAPPYMEAL, except I added as builtin for boot
test with NFS.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: a57cdeb369 ("sparc: sparc64_defconfig: add necessary configs for qemu")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:22:40 -08:00
Rob Gardner
e5e8b80d35 sparc64: Fix opcode filtering in handling of no fault loads
is_no_fault_exception() has two bugs which were discovered via random
opcode testing with stress-ng. Both are caused by improper filtering
of opcodes.

The first bug can be triggered by a floating point store with a no-fault
ASI, for instance "sta %f0, [%g0] #ASI_PNF", opcode C1A01040.

The code first tests op3[5] (0x1000000), which denotes a floating
point instruction, and then tests op3[2] (0x200000), which denotes a
store instruction. But these bits are not mutually exclusive, and the
above mentioned opcode has both bits set. The intent is to filter out
stores, so the test for stores must be done first in order to have
any effect.

The second bug can be triggered by a floating point load with one of
the invalid ASI values 0x8e or 0x8f, which pass this check in
is_no_fault_exception():
     if ((asi & 0xf2) == ASI_PNF)

An example instruction is "ldqa [%l7 + %o7] #ASI 0x8f, %f38",
opcode CF95D1EF. Asi values greater than 0x8b (ASI_SNFL) are fatal
in handle_ldf_stq(), and is_no_fault_exception() must not allow these
invalid asi values to make it that far.

In both of these cases, handle_ldf_stq() reacts by calling
sun4v_data_access_exception() or spitfire_data_access_exception(),
which call is_no_fault_exception() and results in an infinite
recursion.

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Tested-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:21:10 -08:00
David S. Miller
8515455720 Merge branch 's390-qeth-fixes'
Julian Wiedmann says:

====================
s390/qeth: fixes 2021-03-09

please apply the following patch series to netdev's net tree.

This brings one fix for a memleak in an error path of the setup code.
Also several fixes for dealing with pending TX buffers - two for old
bugs in their completion handling, and one recent regression in a
teardown path.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:14:54 -08:00
Julian Wiedmann
7eefda7f35 s390/qeth: fix notification for pending buffers during teardown
The cited commit reworked the state machine for pending TX buffers.
In qeth_iqd_tx_complete() it turned PENDING into a transient state, and
uses NEED_QAOB for buffers that get parked while waiting for their QAOB
completion.

But it missed to adjust the check in qeth_tx_complete_buf(). So if
qeth_tx_complete_pending_bufs() is called during teardown to drain
the parked TX buffers, we no longer raise a notification for af_iucv.

Instead of updating the checked state, just move this code into
qeth_tx_complete_pending_bufs() itself. This also gets rid of the
special-case in the common TX completion path.

Fixes: 8908f36d20 ("s390/qeth: fix af_iucv notification race")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:14:54 -08:00
Julian Wiedmann
3e83d467a0 s390/qeth: schedule TX NAPI on QAOB completion
When a QAOB notifies us that a pending TX buffer has been delivered, the
actual TX completion processing by qeth_tx_complete_pending_bufs()
is done within the context of a TX NAPI instance. We shouldn't rely on
this instance being scheduled by some other TX event, but just do it
ourselves.

qeth_qdio_handle_aob() is called from qeth_poll(), ie. our main NAPI
instance. To avoid touching the TX queue's NAPI instance
before/after it is (un-)registered, reorder the code in qeth_open()
and qeth_stop() accordingly.

Fixes: 0da9581ddb ("qeth: exploit asynchronous delivery of storage blocks")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:14:54 -08:00
Julian Wiedmann
c20383ad16 s390/qeth: improve completion of pending TX buffers
The current design attaches a pending TX buffer to a custom
single-linked list, which is anchored at the buffer's slot on the
TX ring. The buffer is then checked for final completion whenever
this slot is processed during a subsequent TX NAPI poll cycle.

But if there's insufficient traffic on the ring, we might never make
enough progress to get back to this ring slot and discover the pending
buffer's final TX completion. In particular if this missing TX
completion blocks the application from sending further traffic.

So convert the custom single-linked list code to a per-queue list_head,
and scan this list on every TX NAPI cycle.

Fixes: 0da9581ddb ("qeth: exploit asynchronous delivery of storage blocks")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:14:54 -08:00
Julian Wiedmann
e7a36d27f6 s390/qeth: fix memory leak after failed TX Buffer allocation
When qeth_alloc_qdio_queues() fails to allocate one of the buffers that
back an Output Queue, the 'out_freeoutqbufs' path will free all
previously allocated buffers for this queue. But it misses to free the
half-finished queue struct itself.

Move the buffer allocation into qeth_alloc_output_queue(), and deal with
such errors internally.

Fixes: 0da9581ddb ("qeth: exploit asynchronous delivery of storage blocks")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:14:53 -08:00
David S. Miller
b005c9ef5a Merge branch 'virtio_net-infinite-loop'
Balazs Nemeth says:

====================
net: prevent infinite loop caused by incorrect proto from virtio_net_hdr_set_proto

These patches prevent an infinite loop for gso packets with a protocol
from virtio net hdr that doesn't match the protocol in the packet.
Note that packets coming from a device without
header_ops->parse_protocol being implemented will not be caught by
the check in virtio_net_hdr_to_skb, but the infinite loop will still
be prevented by the check in the gso layer.

Changes from v2 to v3:
  - Remove unused *eth.
  - Use MPLS_HLEN to also check if the MPLS header length is a multiple
    of four.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:12:20 -08:00
Balazs Nemeth
d348ede32e net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0
A packet with skb_inner_network_header(skb) == skb_network_header(skb)
and ETH_P_MPLS_UC will prevent mpls_gso_segment from pulling any headers
from the packet. Subsequently, the call to skb_mac_gso_segment will
again call mpls_gso_segment with the same packet leading to an infinite
loop. In addition, ensure that the header length is a multiple of four,
which should hold irrespective of the number of stacked labels.

Signed-off-by: Balazs Nemeth <bnemeth@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:12:20 -08:00
Balazs Nemeth
924a9bc362 net: check if protocol extracted by virtio_net_hdr_set_proto is correct
For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't
set) based on the type in the virtio net hdr, but the skb could contain
anything since it could come from packet_snd through a raw socket. If
there is a mismatch between what virtio_net_hdr_set_proto sets and
the actual protocol, then the skb could be handled incorrectly later
on.

An example where this poses an issue is with the subsequent call to
skb_flow_dissect_flow_keys_basic which relies on skb->protocol being set
correctly. A specially crafted packet could fool
skb_flow_dissect_flow_keys_basic preventing EINVAL to be returned.

Avoid blindly trusting the information provided by the virtio net header
by checking that the protocol in the packet actually matches the
protocol set by virtio_net_hdr_set_proto. Note that since the protocol
is only checked if skb->dev implements header_ops->parse_protocol,
packets from devices without the implementation are not checked at this
stage.

Fixes: 9274124f02 ("net: stricter validation of untrusted gso packets")
Signed-off-by: Balazs Nemeth <bnemeth@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:12:20 -08:00
Daniel Borkmann
32f91529e2 Merge branch 'bpf-xdp-redirect'
Björn Töpel says:

====================
This two patch series contain two optimizations for the
bpf_redirect_map() helper and the xdp_do_redirect() function.

The bpf_redirect_map() optimization is about avoiding the map lookup
dispatching. Instead of having a switch-statement and selecting the
correct lookup function, we let bpf_redirect_map() be a map operation,
where each map has its own bpf_redirect_map() implementation. This way
the run-time lookup is avoided.

The xdp_do_redirect() patch restructures the code, so that the map
pointer indirection can be avoided.

Performance-wise I got 4% improvement for XSKMAP
(sample:xdpsock/rx-drop), and 8% (sample:xdp_redirect_map) on my
machine.

v5->v6:  Removed REDIR enum, and instead use map_id and map_type. (Daniel)
         Applied Daniel's fixups on patch 1. (Daniel)
v4->v5:  Renamed map operation to map_redirect. (Daniel)
v3->v4:  Made bpf_redirect_map() a map operation. (Daniel)
v2->v3:  Fix build when CONFIG_NET is not set. (lkp)
v1->v2:  Removed warning when CONFIG_BPF_SYSCALL was not set. (lkp)
         Cleaned up case-clause in xdp_do_generic_redirect_map(). (Toke)
         Re-added comment. (Toke)
rfc->v1: Use map_id, and remove bpf_clear_redirect_map(). (Toke)
         Get rid of the macro and use __always_inline. (Jesper)
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2021-03-10 01:07:21 +01:00
Björn Töpel
ee75aef23a bpf, xdp: Restructure redirect actions
The XDP_REDIRECT implementations for maps and non-maps are fairly
similar, but obviously need to take different code paths depending on
if the target is using a map or not. Today, the redirect targets for
XDP either uses a map, or is based on ifindex.

Here, the map type and id are added to bpf_redirect_info, instead of
the actual map. Map type, map item/ifindex, and the map_id (if any) is
passed to xdp_do_redirect().

For ifindex-based redirect, used by the bpf_redirect() XDP BFP helper,
a special map type/id are used. Map type of UNSPEC together with map id
equal to INT_MAX has the special meaning of an ifindex based
redirect. Note that valid map ids are 1 inclusive, INT_MAX exclusive
([1,INT_MAX[).

In addition to making the code easier to follow, using explicit type
and id in bpf_redirect_info has a slight positive performance impact
by avoiding a pointer indirection for the map type lookup, and instead
use the cacheline for bpf_redirect_info.

Since the actual map is not passed via bpf_redirect_info anymore, the
map lookup is only done in the BPF helper. This means that the
bpf_clear_redirect_map() function can be removed. The actual map item
is RCU protected.

The bpf_redirect_info flags member is not used by XDP, and not
read/written any more. The map member is only written to when
required/used, and not unconditionally.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210308112907.559576-3-bjorn.topel@gmail.com
2021-03-10 01:06:34 +01:00
Björn Töpel
e6a4750ffe bpf, xdp: Make bpf_redirect_map() a map operation
Currently the bpf_redirect_map() implementation dispatches to the
correct map-lookup function via a switch-statement. To avoid the
dispatching, this change adds bpf_redirect_map() as a map
operation. Each map provides its bpf_redirect_map() version, and
correct function is automatically selected by the BPF verifier.

A nice side-effect of the code movement is that the map lookup
functions are now local to the map implementation files, which removes
one additional function call.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210308112907.559576-2-bjorn.topel@gmail.com
2021-03-10 01:06:34 +01:00
George McCollister
286a8624d7 net: dsa: xrs700x: check if partner is same as port in hsr join
Don't assign dp to partner if it's the same port that xrs700x_hsr_join
was called with. The partner port is supposed to be the other port in
the HSR/PRP redundant pair not the same port. This fixes an issue
observed in testing where forwarding between redundant HSR ports on this
switch didn't work depending on the order the ports were added to the
hsr device.

Fixes: bd62e6f5e6 ("net: dsa: xrs700x: add HSR offloading support")
Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 16:03:37 -08:00
Andrii Nakryiko
11d39cfeec selftests/bpf: Fix compiler warning in BPF_KPROBE definition in loop6.c
Add missing return type to BPF_KPROBE definition. Without it, compiler
generates the following warning:

progs/loop6.c:68:12: warning: type specifier missing, defaults to 'int' [-Wimplicit-int]
BPF_KPROBE(trace_virtqueue_add_sgs, void *unused, struct scatterlist **sgs,
           ^
1 warning generated.

Fixes: 86a35af628 ("selftests/bpf: Add a verifier scale test with unknown bounded loop")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210309044322.3487636-1-andrii@kernel.org
2021-03-10 00:11:16 +01:00
Linus Torvalds
4b3d9f9cf1 Merge tag 'gpio-fixes-for-v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
 "A bunch of fixes for the GPIO subsystem. We have two regressions in
  the core code spotted right after the merge window, a series of fixes
  for ACPI GPIO and a subsequent fix for a related regression in
  gpio-pca953x + a minor tweak in .gitignore and a rework of handling of
  the gpio-line-names to remedy a regression in stm32mp151.

  Summary:

   - fix two regressions in core GPIO subsystem code: one NULL-pointer
     dereference and one list corruption

   - read GPIO line names from fwnode instead of using the generic
     device properties to fix a regression on stm32mp151

   - fixes to ACPI GPIO and gpio-pca953x to handle a regression in IRQ
     handling on Intel Galileo

   - update .gitignore in GPIO selftests"

* tag 'gpio-fixes-for-v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: Read "gpio-line-names" from a firmware node
  gpio: pca953x: Set IRQ type when handle Intel Galileo Gen 2
  gpiolib: acpi: Allow to find GpioInt() resource by name and index
  gpiolib: acpi: Add ACPI_GPIO_QUIRK_ABSOLUTE_NUMBER quirk
  gpiolib: acpi: Add missing IRQF_ONESHOT
  gpio: fix gpio-device list corruption
  gpio: fix NULL-deref-on-deregistration regression
  selftests: gpio: update .gitignore
2021-03-09 12:02:18 -08:00
Linus Torvalds
9c39198a65 Merge tag 'mips-fixes_5.12_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:

 - fixes for boot breakage because of misaligned FDTs

 - fix for overwritten exception handlers

 - enable MIPS optimized crypto for all MIPS CPUs to improve wireguard
   performance

* tag 'mips-fixes_5.12_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: kernel: Reserve exception base early to prevent corruption
  MIPS: vmlinux.lds.S: align raw appended dtb to 8 bytes
  crypto: mips/poly1305 - enable for all MIPS processors
  MIPS: boot/compressed: Copy DTB to aligned address
2021-03-09 11:56:41 -08:00
Xie He
f7d9d48545 net: lapbether: Remove netif_start_queue / netif_stop_queue
For the devices in this driver, the default qdisc is "noqueue",
because their "tx_queue_len" is 0.

In function "__dev_queue_xmit" in "net/core/dev.c", devices with the
"noqueue" qdisc are specially handled. Packets are transmitted without
being queued after a "dev->flags & IFF_UP" check. However, it's possible
that even if this check succeeds, "ops->ndo_stop" may still have already
been called. This is because in "__dev_close_many", "ops->ndo_stop" is
called before clearing the "IFF_UP" flag.

If we call "netif_stop_queue" in "ops->ndo_stop", then it's possible in
"__dev_queue_xmit", it sees the "IFF_UP" flag is present, and then it
checks "netif_xmit_stopped" and finds that the queue is already stopped.
In this case, it will complain that:
"Virtual device ... asks to queue packet!"

To prevent "__dev_queue_xmit" from generating this complaint, we should
not call "netif_stop_queue" in "ops->ndo_stop".

We also don't need to call "netif_start_queue" in "ops->ndo_open",
because after a netdev is allocated and registered, the
"__QUEUE_STATE_DRV_XOFF" flag is initially not set, so there is no need
to call "netif_start_queue" to clear it.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Acked-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 11:03:09 -08:00
Alexei Starovoitov
34c9a7c5b7 Merge branch 'Add clang-based BTF_KIND_FLOAT tests'
Ilya Leoshkevich says:

====================

The two tests here did not make it into the main BTF_KIND_FLOAT series
because of dependency on LLVM.

v0: https://lore.kernel.org/bpf/20210226202256.116518-10-iii@linux.ibm.com/
v1 -> v0: Per Alexei's suggestion, document the required LLVM commit.

v1: https://lore.kernel.org/bpf/20210305170844.151594-1-iii@linux.ibm.com/
v1 -> v2: Per Andrii's suggestions, use double in
          core_reloc_size___diff_sz and non-pointer member in
          btf_dump_test_case_syntax.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-03-09 10:59:46 -08:00
Ilya Leoshkevich
ccb0e23ca2 selftests/bpf: Add BTF_KIND_FLOAT to btf_dump_test_case_syntax
Check that dumping various floating-point types produces a valid C
code.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210309005649.162480-3-iii@linux.ibm.com
2021-03-09 10:59:46 -08:00
Ilya Leoshkevich
3fcd50d6f9 selftests/bpf: Add BTF_KIND_FLOAT to test_core_reloc_size
Verify that bpf_core_field_size() is working correctly with floats.
Also document the required clang version.

Suggested-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210309005649.162480-2-iii@linux.ibm.com
2021-03-09 10:59:46 -08:00
Thomas Bogendoerfer
bd67b711bf MIPS: kernel: Reserve exception base early to prevent corruption
BMIPS is one of the few platforms that do change the exception base.
After commit 2dcb396454 ("memblock: do not start bottom-up allocations
with kernel_end") we started seeing BMIPS boards fail to boot with the
built-in FDT being corrupted.

Before the cited commit, early allocations would be in the [kernel_end,
RAM_END] range, but after commit they would be within [RAM_START +
PAGE_SIZE, RAM_END].

The custom exception base handler that is installed by
bmips_ebase_setup() done for BMIPS5000 CPUs ends-up trampling on the
memory region allocated by unflatten_and_copy_device_tree() thus
corrupting the FDT used by the kernel.

To fix this, we need to perform an early reservation of the custom
exception space. Additional we reserve the first 4k (1k for R3k) for
either normal exception vector space (legacy CPUs) or special vectors
like cache exceptions.

Huge thanks to Serge for analysing and proposing a solution to this
issue.

Fixes: 2dcb396454 ("memblock: do not start bottom-up allocations with kernel_end")
Reported-by: Kamal Dasu <kdasu.kdev@gmail.com>
Debugged-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-09 11:22:59 +01:00
Linus Torvalds
987a08741d Merge git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc
Pull sparc updates from David Miller:
 "Just some more random bits from Al, including a conversion over to
  generic extables"

* git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc:
  sparc32: take ->thread.flags out
  sparc32: get rid of fake_swapper_regs
  sparc64: get rid of fake_swapper_regs
  sparc32: switch to generic extables
  sparc32: switch copy_user.S away from range exception table entries
  sparc32: get rid of range exception table entries in checksum_32.S
  sparc32: switch __bzero() away from range exception table entries
  sparc32: kill lookup_fault()
  sparc32: don't bother with lookup_fault() in __bzero()
2021-03-08 22:01:58 -08:00
Tong Zhang
4416e98594 atm: idt77252: fix null-ptr-dereference
this one is similar to the phy_data allocation fix in uPD98402, the
driver allocate the idt77105_priv and store to dev_data but later
dereference using dev->dev_data, which will cause null-ptr-dereference.

fix this issue by changing dev_data to phy_data so that PRIV(dev) can
work correctly.

Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-08 15:16:30 -08:00
Tong Zhang
3153724fc0 atm: uPD98402: fix incorrect allocation
dev->dev_data is set in zatm.c, calling zatm_start() will overwrite this
dev->dev_data in uPD98402_start() and a subsequent PRIV(dev)->lock
(i.e dev->phy_data->lock) will result in a null-ptr-dereference.

I believe this is a typo and what it actually want to do is to allocate
phy_data instead of dev_data.

Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-08 15:16:30 -08:00
Tong Zhang
1019d7923d atm: fix a typo in the struct description
phy_data means private PHY data not date

Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-08 15:16:30 -08:00
Jia-Ju Bai
179d0ba0c4 net: qrtr: fix error return code of qrtr_sendmsg()
When sock_alloc_send_skb() returns NULL to skb, no error return code of
qrtr_sendmsg() is assigned.
To fix this bug, rc is assigned with -ENOMEM in this case.

Fixes: 194ccc8829 ("net: qrtr: Support decoding incoming v2 packets")
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-08 15:02:53 -08:00
Davide Caratti
27ab92d999 mptcp: fix length of ADD_ADDR with port sub-option
in current Linux, MPTCP peers advertising endpoints with port numbers use
a sub-option length that wrongly accounts for the trailing TCP NOP. Also,
receivers will only process incoming ADD_ADDR with port having such wrong
sub-option length. Fix this, making ADD_ADDR compliant to RFC8684 §3.4.1.

this can be verified running tcpdump on the kselftests artifacts:

 unpatched kernel:
 [root@bottarga mptcp]# tcpdump -tnnr unpatched.pcap | grep add-addr
 reading from file unpatched.pcap, link-type LINUX_SLL (Linux cooked v1), snapshot length 65535
 IP 10.0.1.1.10000 > 10.0.1.2.53078: Flags [.], ack 101, win 509, options [nop,nop,TS val 214459678 ecr 521312851,mptcp add-addr v1 id 1 a00:201:2774:2d88:7436:85c3:17fd:101], length 0
 IP 10.0.1.2.53078 > 10.0.1.1.10000: Flags [.], ack 101, win 502, options [nop,nop,TS val 521312852 ecr 214459678,mptcp add-addr[bad opt]]

 patched kernel:
 [root@bottarga mptcp]# tcpdump -tnnr patched.pcap | grep add-addr
 reading from file patched.pcap, link-type LINUX_SLL (Linux cooked v1), snapshot length 65535
 IP 10.0.1.1.10000 > 10.0.1.2.38178: Flags [.], ack 101, win 509, options [nop,nop,TS val 3728873902 ecr 2732713192,mptcp add-addr v1 id 1 10.0.2.1:10100 hmac 0xbccdfcbe59292a1f,nop,nop], length 0
 IP 10.0.1.2.38178 > 10.0.1.1.10000: Flags [.], ack 101, win 502, options [nop,nop,TS val 2732713195 ecr 3728873902,mptcp add-addr v1-echo id 1 10.0.2.1:10100,nop,nop], length 0

Fixes: 22fb85ffae ("mptcp: add port support for ADD_ADDR suboption writing")
CC: stable@vger.kernel.org # 5.11+
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Acked-and-tested-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-08 15:02:03 -08:00