Commit Graph

96089 Commits

Author SHA1 Message Date
Song Liu
e8fce23946 tcp: add tracepoint trace_tcp_set_state()
This patch adds tracepoint trace_tcp_set_state. Besides usual fields
(s/d ports, IP addresses), old and new state of the socket is also
printed with TP_printk, with __print_symbolic().

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 01:21:25 +01:00
Song Liu
e1a4aa50f4 tcp: add tracepoint trace_tcp_destroy_sock
This patch adds trace event trace_tcp_destroy_sock.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 01:21:25 +01:00
Song Liu
5941521c05 tcp: add tracepoint trace_tcp_receive_reset
New tracepoint trace_tcp_receive_reset is added and called from
tcp_reset(). This tracepoint is define with a new class tcp_event_sk.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 01:21:25 +01:00
Song Liu
c24b14c46b tcp: add tracepoint trace_tcp_send_reset
New tracepoint trace_tcp_send_reset is added and called from
tcp_v4_send_reset(), tcp_v6_send_reset() and tcp_send_active_reset().

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 01:21:25 +01:00
Song Liu
7344e29f28 tcp: mark trace event arguments sk and skb as const
Some functions that we plan to add trace points require const sk
and/or skb. So we mark these fields as const in the tracepoint.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 01:21:25 +01:00
Song Liu
f6e37b2541 tcp: add trace event class tcp_event_sk_skb
Introduce event class tcp_event_sk_skb for tcp tracepoints that
have arguments sk and skb.

Existing tracepoint trace_tcp_retransmit_skb() falls into this class.
This patch rewrites the definition of trace_tcp_retransmit_skb() with
tcp_event_sk_skb.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 01:21:25 +01:00
David S. Miller
5908064a0b Merge tag 'batadv-next-for-davem-20171023' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:

====================
This documentation/cleanup patchset includes the following patches:

 - Fix parameter kerneldoc which caused kerneldoc warnings, by Sven Eckelmann

 - Remove spurious warnings in B.A.T.M.A.N. V neighbor comparison,
   by Sven Eckelmann

 - Use inline kernel-doc style for UAPI constants, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 01:15:03 +01:00
Sven Eckelmann
40b16b9be5 batman-adv: use inline kernel-doc for uapi constants
The enums of constants for netlink tends to become rather large over time.
Documenting them is easier when the kernel-doc is actually next to constant
and not in a different block above the enum.

Also inline kernel-doc allows multi-paragraph description. This could be
required to better document the netlink command types and the expected
return values.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2017-10-23 14:22:25 +02:00
David S. Miller
f8ddadc4db Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
There were quite a few overlapping sets of changes here.

Daniel's bug fix for off-by-ones in the new BPF branch instructions,
along with the added allowances for "data_end > ptr + x" forms
collided with the metadata additions.

Along with those three changes came veritifer test cases, which in
their final form I tried to group together properly.  If I had just
trimmed GIT's conflict tags as-is, this would have split up the
meta tests unnecessarily.

In the socketmap code, a set of preemption disabling changes
overlapped with the rename of bpf_compute_data_end() to
bpf_compute_data_pointers().

Changes were made to the mv88e6060.c driver set addr method
which got removed in net-next.

The hyperv transport socket layer had a locking change in 'net'
which overlapped with a change of socket state macro usage
in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 13:39:14 +01:00
Linus Torvalds
b5ac3beb5a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "A little more than usual this time around. Been travelling, so that is
  part of it.

  Anyways, here are the highlights:

   1) Deal with memcontrol races wrt. listener dismantle, from Eric
      Dumazet.

   2) Handle page allocation failures properly in nfp driver, from Jaku
      Kicinski.

   3) Fix memory leaks in macsec, from Sabrina Dubroca.

   4) Fix crashes in pppol2tp_session_ioctl(), from Guillaume Nault.

   5) Several fixes in bnxt_en driver, including preventing potential
      NVRAM parameter corruption from Michael Chan.

   6) Fix for KRACK attacks in wireless, from Johannes Berg.

   7) rtnetlink event generation fixes from Xin Long.

   8) Deadlock in mlxsw driver, from Ido Schimmel.

   9) Disallow arithmetic operations on context pointers in bpf, from
      Jakub Kicinski.

  10) Missing sock_owned_by_user() check in sctp_icmp_redirect(), from
      Xin Long.

  11) Only TCP is supported for sockmap, make that explicit with a
      check, from John Fastabend.

  12) Fix IP options state races in DCCP and TCP, from Eric Dumazet.

  13) Fix panic in packet_getsockopt(), also from Eric Dumazet.

  14) Add missing locked in hv_sock layer, from Dexuan Cui.

  15) Various aquantia bug fixes, including several statistics handling
      cures. From Igor Russkikh et al.

  16) Fix arithmetic overflow in devmap code, from John Fastabend.

  17) Fix busted socket memory accounting when we get a fault in the tcp
      zero copy paths. From Willem de Bruijn.

  18) Don't leave opt->tot_len uninitialized in ipv6, from Eric Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
  stmmac: Don't access tx_q->dirty_tx before netif_tx_lock
  ipv6: flowlabel: do not leave opt->tot_len with garbage
  of_mdio: Fix broken PHY IRQ in case of probe deferral
  textsearch: fix typos in library helpers
  rxrpc: Don't release call mutex on error pointer
  net: stmmac: Prevent infinite loop in get_rx_timestamp_status()
  net: stmmac: Fix stmmac_get_rx_hwtstamp()
  net: stmmac: Add missing call to dev_kfree_skb()
  mlxsw: spectrum_router: Configure TIGCR on init
  mlxsw: reg: Add Tunneling IPinIP General Configuration Register
  net: ethtool: remove error check for legacy setting transceiver type
  soreuseport: fix initialization race
  net: bridge: fix returning of vlan range op errors
  sock: correct sk_wmem_queued accounting on efault in tcp zerocopy
  bpf: add test cases to bpf selftests to cover all access tests
  bpf: fix pattern matches for direct packet access
  bpf: fix off by one for range markings with L{T, E} patterns
  bpf: devmap fix arithmetic overflow in bitmap_size calculation
  net: aquantia: Bad udp rate on default interrupt coalescing
  net: aquantia: Enable coalescing management via ethtool interface
  ...
2017-10-21 22:44:48 -04:00
Lawrence Brakmo
cd86d1fd21 bpf: Adding helper function bpf_getsockops
Adding support for helper function bpf_getsockops to socket_ops BPF
programs. This patch only supports TCP_CONGESTION.

Signed-off-by: Vlad Vysotsky <vlad@cs.ucla.edu>
Acked-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 03:12:05 +01:00
Lawrence Brakmo
e6546ef6d8 bpf: add support for BPF_SOCK_OPS_BASE_RTT
A congestion control algorithm can make a call to the BPF socket_ops
program to request the base RTT. The base RTT can be congestion control
dependent and is meant to represent a congestion threshold such that
RTTs above it indicate congestion. This is especially useful for flows
within a DC where the base RTT is easy to obtain.

Being provided a base RTT solves a basic problem in RTT based congestion
avoidance algorithms (such as Vegas, NV and BBR). Although it is easy
to get the base RTT when the network is not congested, it is very
diffcult to do when it is very congested. Newer connections get an
inflated value of the base RTT leading to unfariness (newer flows with a
larger base RTT get more bandwidth). As a result, RTT based congestion
avoidance algorithms tend to update their base RTTs to improve fairness.
In very congested networks this can lead to base RTT inflation, reducing
the ability of these RTT based congestion control algorithms to prevent
congestion.

Note that in my experiments with TCP-NV, the base RTT provided can be
much larger than the actual hardware RTT. For example, experimenting
with hosts within a rack where the hardware RTT is 16-20us, I've used
base RTTs up to 150us. The effect of using a larger base RTT is that the
congestion avoidance algorithm will allow more queueing. When there are
only a few flows the main effect is larger measured RTTs and RPC
latencies due to the increased queueing. When there are a lot of flows,
a larger base RTT can lead to more congestion and more packet drops.
For this case, where the hardware RTT is 20us, a base RTT of 80us
produces good results.

This patch only introduces BPF_SOCK_OPS_BASE_RTT, a later patch in this
set adds support for using it in TCP-NV. Further study and testing is
needed before support can be added to other delay based congestion
avoidance algorithms.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 03:12:05 +01:00
Linus Torvalds
e5f468b3f2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:

 - joydev now implements a blacklist to avoid creating joystick nodes
   for accelerometers found in composite devices such as PlaStation
   controllers

 - assorted driver fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: ims-psu - check if CDC union descriptor is sane
  Input: joydev - blacklist ds3/ds4/udraw motion sensors
  Input: allow matching device IDs on property bits
  Input: factor out and export input_device_id matching code
  Input: goodix - poll the 'buffer status' bit before reading data
  Input: axp20x-pek - fix module not auto-loading for axp221 pek
  Input: tca8418 - enable interrupt after it has been requested
  Input: stmfts - fix setting ABS_MT_POSITION_* maximum size
  Input: ti_am335x_tsc - fix incorrect step config for 5 wire touchscreen
  Input: synaptics - disable kernel tracking on SMBus devices
2017-10-21 21:46:39 -04:00
Elena Reshetova
e65f7ee39b drivers, connector: convert cn_callback_entry.refcnt from atomic_t to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable cn_callback_entry.refcnt is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 02:22:39 +01:00
Elena Reshetova
a4b51a9f83 drivers, net, mlx5: convert mlx5_cq.refcount from atomic_t to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable mlx5_cq.refcount is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 02:22:38 +01:00
Elena Reshetova
17ac99b2b8 drivers, net, mlx4: convert mlx4_srq.refcount from atomic_t to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable mlx4_srq.refcount is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 02:22:38 +01:00
Elena Reshetova
0068895ff8 drivers, net, mlx4: convert mlx4_qp.refcount from atomic_t to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable mlx4_qp.refcount is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 02:22:38 +01:00
Elena Reshetova
ff61b5e3f0 drivers, net, mlx4: convert mlx4_cq.refcount from atomic_t to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable mlx4_cq.refcount is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 02:22:38 +01:00
Jiri Pirko
fa71212e91 net: sched: remove unused is_classid_clsact_ingress/egress helpers
These helpers are no longer in use by drivers, so remove them.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:08 +01:00
Jiri Pirko
d58d31a118 net: sched: remove unused classid field from tc_cls_common_offload
It is no longer used by the drivers, so remove it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:08 +01:00
Jiri Pirko
208c0f4b52 net: sched: use tc_setup_cb_call to call per-block callbacks
Extend the tc_setup_cb_call entrypoint function originally used only for
action egress devices callbacks to call per-block callbacks as well.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:07 +01:00
Jiri Pirko
acb674428c net: sched: introduce per-block callbacks
Introduce infrastructure that allows drivers to register callbacks that
are called whenever tc would offload inserted rule for a specific block.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:06 +01:00
Jiri Pirko
6e40cf2d4d net: sched: use extended variants of block_get/put in ingress and clsact qdiscs
Use previously introduced extended variants of block get and put
functions. This allows to specify a binder types specific to clsact
ingress/egress which is useful for drivers to distinguish who actually
got the block.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:06 +01:00
Jiri Pirko
8c4083b30e net: sched: add block bind/unbind notif. and extended block_get/put
Introduce new type of ndo_setup_tc message to propage binding/unbinding
of a block to driver. Call this ndo whenever qdisc gets/puts a block.
Alongside with this, there's need to propagate binder type from qdisc
code down to the notifier. So introduce extended variants of
block_get/put in order to pass this info.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:06 +01:00
Paolo Abeni
b65f164d37 ipv6: let trace_fib6_table_lookup() dereference the fib table
The perf traces for ipv6 routing code show a relevant cost around
trace_fib6_table_lookup(), even if no trace is enabled. This is
due to the fib6_table de-referencing currently performed by the
caller.

Let's the tracing code pay this overhead, passing to the trace
helper the table pointer. This gives small but measurable
performance improvement under UDP flood.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 02:23:38 +01:00
Eric Dumazet
c92e8c02fe tcp/dccp: fix ireq->opt races
syzkaller found another bug in DCCP/TCP stacks [1]

For the reasons explained in commit ce1050089c ("tcp/dccp: fix
ireq->pktopts race"), we need to make sure we do not access
ireq->opt unless we own the request sock.

Note the opt field is renamed to ireq_opt to ease grep games.

[1]
BUG: KASAN: use-after-free in ip_queue_xmit+0x1687/0x18e0 net/ipv4/ip_output.c:474
Read of size 1 at addr ffff8801c951039c by task syz-executor5/3295

CPU: 1 PID: 3295 Comm: syz-executor5 Not tainted 4.14.0-rc4+ #80
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 print_address_description+0x73/0x250 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x25b/0x340 mm/kasan/report.c:409
 __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:427
 ip_queue_xmit+0x1687/0x18e0 net/ipv4/ip_output.c:474
 tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1135
 tcp_send_ack.part.37+0x3bb/0x650 net/ipv4/tcp_output.c:3587
 tcp_send_ack+0x49/0x60 net/ipv4/tcp_output.c:3557
 __tcp_ack_snd_check+0x2c6/0x4b0 net/ipv4/tcp_input.c:5072
 tcp_ack_snd_check net/ipv4/tcp_input.c:5085 [inline]
 tcp_rcv_state_process+0x2eff/0x4850 net/ipv4/tcp_input.c:6071
 tcp_child_process+0x342/0x990 net/ipv4/tcp_minisocks.c:816
 tcp_v4_rcv+0x1827/0x2f80 net/ipv4/tcp_ipv4.c:1682
 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
 dst_input include/net/dst.h:464 [inline]
 ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
 __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
 netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
 netif_receive_skb+0xae/0x390 net/core/dev.c:4611
 tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
 tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
 tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
 call_write_iter include/linux/fs.h:1770 [inline]
 new_sync_write fs/read_write.c:468 [inline]
 __vfs_write+0x68a/0x970 fs/read_write.c:481
 vfs_write+0x18f/0x510 fs/read_write.c:543
 SYSC_write fs/read_write.c:588 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:580
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x40c341
RSP: 002b:00007f469523ec10 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 000000000040c341
RDX: 0000000000000037 RSI: 0000000020004000 RDI: 0000000000000015
RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000f4240 R11: 0000000000000293 R12: 00000000004b7fd1
R13: 00000000ffffffff R14: 0000000020000000 R15: 0000000000025000

Allocated by task 3295:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
 __do_kmalloc mm/slab.c:3725 [inline]
 __kmalloc+0x162/0x760 mm/slab.c:3734
 kmalloc include/linux/slab.h:498 [inline]
 tcp_v4_save_options include/net/tcp.h:1962 [inline]
 tcp_v4_init_req+0x2d3/0x3e0 net/ipv4/tcp_ipv4.c:1271
 tcp_conn_request+0xf6d/0x3410 net/ipv4/tcp_input.c:6283
 tcp_v4_conn_request+0x157/0x210 net/ipv4/tcp_ipv4.c:1313
 tcp_rcv_state_process+0x8ea/0x4850 net/ipv4/tcp_input.c:5857
 tcp_v4_do_rcv+0x55c/0x7d0 net/ipv4/tcp_ipv4.c:1482
 tcp_v4_rcv+0x2d10/0x2f80 net/ipv4/tcp_ipv4.c:1711
 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
 dst_input include/net/dst.h:464 [inline]
 ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
 __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
 netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
 netif_receive_skb+0xae/0x390 net/core/dev.c:4611
 tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
 tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
 tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
 call_write_iter include/linux/fs.h:1770 [inline]
 new_sync_write fs/read_write.c:468 [inline]
 __vfs_write+0x68a/0x970 fs/read_write.c:481
 vfs_write+0x18f/0x510 fs/read_write.c:543
 SYSC_write fs/read_write.c:588 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:580
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Freed by task 3306:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
 __cache_free mm/slab.c:3503 [inline]
 kfree+0xca/0x250 mm/slab.c:3820
 inet_sock_destruct+0x59d/0x950 net/ipv4/af_inet.c:157
 __sk_destruct+0xfd/0x910 net/core/sock.c:1560
 sk_destruct+0x47/0x80 net/core/sock.c:1595
 __sk_free+0x57/0x230 net/core/sock.c:1603
 sk_free+0x2a/0x40 net/core/sock.c:1614
 sock_put include/net/sock.h:1652 [inline]
 inet_csk_complete_hashdance+0xd5/0xf0 net/ipv4/inet_connection_sock.c:959
 tcp_check_req+0xf4d/0x1620 net/ipv4/tcp_minisocks.c:765
 tcp_v4_rcv+0x17f6/0x2f80 net/ipv4/tcp_ipv4.c:1675
 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
 dst_input include/net/dst.h:464 [inline]
 ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
 __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
 netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
 netif_receive_skb+0xae/0x390 net/core/dev.c:4611
 tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
 tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
 tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
 call_write_iter include/linux/fs.h:1770 [inline]
 new_sync_write fs/read_write.c:468 [inline]
 __vfs_write+0x68a/0x970 fs/read_write.c:481
 vfs_write+0x18f/0x510 fs/read_write.c:543
 SYSC_write fs/read_write.c:588 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:580
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Fixes: e994b2f0fb ("tcp: do not lock listener to process SYN packets")
Fixes: 079096f103 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:33:19 +01:00
Linus Torvalds
545ea16f7c Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
 "Here is another set of bugfixes for ARM SoCs, mostly harmless:

   - a boot regression fix on ux500

   - PCIe interrupts on NXP i.MX7 and on Marvell Armada 7K/8K were wired
     up wrong, in different ways

   - Armada XP support for large memory never worked

   - the socfpga reset controller now builds on 64-bit

   - minor device tree corrections on gemini, mvebu, r-pi 3, rockchip
     and at91"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: ux500: Fix regression while init PM domains
  ARM: dts: fix PCLK name on Gemini and MOXA ART
  arm64: dts: rockchip: fix typo in iommu nodes
  arm64: dts: rockchip: correct vqmmc voltage for rk3399 platforms
  ARM: dts: imx7d: Invert legacy PCI irq mapping
  bus: mbus: fix window size calculation for 4GB windows
  ARM: dts: at91: sama5d2: add ADC hw trigger edge type
  ARM: dts: at91: sama5d2_xplained: enable ADTRG pin
  ARM: dts: at91: at91-sama5d27_som1: fix PHY ID
  ARM: dts: bcm283x: Fix console path on RPi3
  reset: socfpga: fix for 64-bit compilation
  ARM: dts: Fix I2C repeated start issue on Armada-38x
  arm64: dts: marvell: fix interrupt-map property for Armada CP110 PCIe controller
  arm64: dts: salvator-common: add 12V regulator to backlight
  ARM: dts: sun6i: Fix endpoint IDs in second display pipeline
  arm64: allwinner: a64: pine64: Use dcdc1 regulator for mmc0
2017-10-20 18:17:43 -04:00
Chenbo Feng
f66e448cfd selinux: bpf: Add addtional check for bpf object file receive
Introduce a bpf object related check when sending and receiving files
through unix domain socket as well as binder. It checks if the receiving
process have privilege to read/write the bpf map or use the bpf program.
This check is necessary because the bpf maps and programs are using a
anonymous inode as their shared inode so the normal way of checking the
files and sockets when passing between processes cannot work properly on
eBPF object. This check only works when the BPF_SYSCALL is configured.

Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:32:59 +01:00
Chenbo Feng
afdb09c720 security: bpf: Add LSM hooks for bpf object related syscall
Introduce several LSM hooks for the syscalls that will allow the
userspace to access to eBPF object such as eBPF programs and eBPF maps.
The security check is aimed to enforce a per object security protection
for eBPF object so only processes with the right priviliges can
read/write to a specific map or use a specific eBPF program. Besides
that, a general security hook is added before the multiplexer of bpf
syscall to check the cmd and the attribute used for the command. The
actual security module can decide which command need to be checked and
how the cmd should be checked.

Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:32:59 +01:00
Chenbo Feng
6e71b04a82 bpf: Add file mode configuration into bpf maps
Introduce the map read/write flags to the eBPF syscalls that returns the
map fd. The flags is used to set up the file mode when construct a new
file descriptor for bpf maps. To not break the backward capability, the
f_flags is set to O_RDWR if the flag passed by syscall is 0. Otherwise
it should be O_RDONLY or O_WRONLY. When the userspace want to modify or
read the map content, it will check the file mode to see if it is
allowed to make the change.

Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:32:59 +01:00
Yuchung Cheng
1fba70e5b6 tcp: socket option to set TCP fast open key
New socket option TCP_FASTOPEN_KEY to allow different keys per
listener.  The listener by default uses the global key until the
socket option is set.  The key is a 16 bytes long binary data. This
option has no effect on regular non-listener TCP sockets.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:21:36 +01:00
David Ahern
de95e04791 net: Add extack to validator_info structs used for address notifier
Add extack to in_validator_info and in6_validator_info. Update the one
user of each, ipvlan, to return an error message for failures.

Only manual configuration of an address is plumbed in the IPv6 code path.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:15:07 +01:00
David Ahern
890056783c tcp: Remove use of inet6_sk and add IPv6 checks to tracepoint
386fd5da40 ("tcp: Check daddr_cache before use in tracepoint") was the
second version of the tracepoint fixup patch. This patch is the delta
between v2 and v3.  Specifically, remove the use of inet6_sk and check
sk_family as requested by Eric and add IS_ENABLED(CONFIG_IPV6) around
the use of sk_v6_rcv_saddr and sk_v6_daddr as done in sock_common (noted
by Cong).

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:04:58 +01:00
John Fastabend
34f79502bb bpf: avoid preempt enable/disable in sockmap using tcp_skb_cb region
SK_SKB BPF programs are run from the socket/tcp context but early in
the stack before much of the TCP metadata is needed in tcp_skb_cb. So
we can use some unused fields to place BPF metadata needed for SK_SKB
programs when implementing the redirect function.

This allows us to drop the preempt disable logic. It does however
require an API change so sk_redirect_map() has been updated to
additionally provide ctx_ptr to skb. Note, we do however continue to
disable/enable preemption around actual BPF program running to account
for map updates.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:01:29 +01:00
Linus Torvalds
03b652e5c0 Merge branch 'fixes-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull key handling fixes from James Morris:
 "This includes a fix for the capabilities code from Colin King, and a
  set of further fixes for the keys subsystem. From David:

   - Fix a bunch of places where kernel drivers may access revoked
     user-type keys and don't do it correctly.

   - Fix some ecryptfs bits.

   - Fix big_key to require CONFIG_CRYPTO.

   - Fix a couple of bugs in the asymmetric key type.

   - Fix a race between updating and finding negative keys.

   - Prevent add_key() from updating uninstantiated keys.

   - Make loading of key flags and expiry time atomic when not holding
     locks"

* 'fixes-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  commoncap: move assignment of fs_ns to avoid null pointer dereference
  pkcs7: Prevent NULL pointer dereference, since sinfo is not always set.
  KEYS: load key flags and expiry time atomically in proc_keys_show()
  KEYS: Load key expiry time atomically in keyring_search_iterator()
  KEYS: load key flags and expiry time atomically in key_validate()
  KEYS: don't let add_key() update an uninstantiated key
  KEYS: Fix race between updating and finding a negative key
  KEYS: checking the input id parameters before finding asymmetric key
  KEYS: Fix the wrong index when checking the existence of second id
  security/keys: BIG_KEY requires CONFIG_CRYPTO
  ecryptfs: fix dereference of NULL user_key_payload
  fscrypt: fix dereference of NULL user_key_payload
  lib/digsig: fix dereference of NULL user_key_payload
  FS-Cache: fix dereference of NULL user_key_payload
  KEYS: encrypted: fix dereference of NULL user_key_payload
2017-10-20 06:19:38 -04:00
David S. Miller
9854d758f7 Merge tag 'rxrpc-next-20171018' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:

====================
rxrpc: Add bits for kernel services

Here are some patches that add a few things for kernel services to use:

 (1) Allow service upgrade to be requested and allow the resultant actual
     service ID to be obtained.

 (2) Allow the RTT time of a call to be obtained.

 (3) Allow a kernel service to find out if a call is still alive on a
     server between transmitting a request and getting the reply.

 (4) Allow data transmission to ignore signals if transmission progress is
     being made in reasonable time.  This is also usable by userspace by
     passing MSG_WAITALL to sendmsg()[*].

[*] I'm not sure this is the right interface for this or whether a sockopt
    should be used instead.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 08:42:09 +01:00
Paul E. McKenney
27fdb35fe9 doc: Fix various RCU docbook comment-header problems
Because many of RCU's files have not been included into docbook, a
number of errors have accumulated.  This commit fixes them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-19 22:26:11 -04:00
Mathieu Desnoyers
a961e40917 membarrier: Provide register expedited private command
This introduces a "register private expedited" membarrier command which
allows eventual removal of important memory barrier constraints on the
scheduler fast-paths. It changes how the "private expedited" membarrier
command (new to 4.14) is used from user-space.

This new command allows processes to register their intent to use the
private expedited command.  This affects how the expedited private
command introduced in 4.14-rc is meant to be used, and should be merged
before 4.14 final.

Processes are now required to register before using
MEMBARRIER_CMD_PRIVATE_EXPEDITED, otherwise that command returns EPERM.

This fixes a problem that arose when designing requested extensions to
sys_membarrier() to allow JITs to efficiently flush old code from
instruction caches.  Several potential algorithms are much less painful
if the user register intent to use this functionality early on, for
example, before the process spawns the second thread.  Registering at
this time removes the need to interrupt each and every thread in that
process at the first expedited sys_membarrier() system call.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-19 22:13:40 -04:00
Dmitry Torokhov
8724ecb072 Input: allow matching device IDs on property bits
Let's allow matching input devices on their property bits, both in-kernel
and when generating module aliases.

Tested-by: Roderick Colenbrander <roderick.colenbrander@sony.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-10-19 16:54:49 -07:00
Dmitry Torokhov
55dfce873d Input: factor out and export input_device_id matching code
Factor out and export input_match_device_id() so that modules may use it.
It will be needed by joydev to blacklist accelerometers in composite
devices.

Tested-by: Roderick Colenbrander <roderick.colenbrander@sony.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-10-19 16:53:34 -07:00
Linus Torvalds
962556b57c Merge tag 'sound-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "We've got slightly more fixes than wished, but heading to a good
  shape. Most of changes are about HD-audio fixes, one for a buggy code
  that went into 4.13, and another for avoiding a crash due to buggy
  BIOS.

  Apart from HD-audio, a sequencer core change that is only for UP
  config (which must be pretty rare nowadays), and a USB-audio quirk as
  usual"

* tag 'sound-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix incorrect TLV callback check introduced during set_fs() removal
  ALSA: hda: Remove superfluous '-' added by printk conversion
  ALSA: hda: Abort capability probe at invalid register read
  ALSA: seq: Enable 'use' locking in all configurations
  ALSA: usb-audio: Add native DSD support for Pro-Ject Pre Box S2 Digital
2017-10-19 16:15:17 -04:00
Arnd Bergmann
840907f941 Merge tag 'mvebu-fixes-4.14-2' of git://git.infradead.org/linux-mvebu into fixes
Pull "mvebu fixes for 4.14 (part 2)" from Gregory CLEMENT

Two device tree related fixes:

- One on Armada 38x using a other compatible string for I2C in order
  to cover an errata.

- One for Armada 7K/8K fixing a typo on interrupt-map property for
  PCIe leading to fail PME and AER root port service initialization

And the last one for the mbus fixing the window size calculation when
it exceed 32bits

* tag 'mvebu-fixes-4.14-2' of git://git.infradead.org/linux-mvebu:
  bus: mbus: fix window size calculation for 4GB windows
  ARM: dts: Fix I2C repeated start issue on Armada-38x
  arm64: dts: marvell: fix interrupt-map property for Armada CP110 PCIe controller
2017-10-19 17:40:11 +02:00
Stephen Hemminger
7a0947e755 dql: make dql_init return void
dql_init always returned 0, and the only place that uses it
in network core code didn't care about the return value anyway.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Acked-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 13:33:51 +01:00
David S. Miller
8f2e9ca837 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-10-17

This series contains updates to i40e and ethtool.

Alan provides most of the changes in this series which are mainly fixes
and cleanups.  Renamed the ethtool "cmd" variable to "ks", since the new
ethtool API passes us ksettings structs instead of command structs.
Cleaned up an ifdef that was not accomplishing anything.  Added function
header comments to provide better documentation.  Fixed two issues in
i40e_get_link_ksettings(), by calling
ethtool_link_ksettings_zero_link_mode() to ensure the advertising and
link masks are cleared before we start setting bits.  Cleaned up and fixed
code comments which were incorrect.  Separated the setting of autoneg in
i40e_phy_types_to_ethtool() into its own conditional to clarify what PHYs
support and advertise autoneg, and makes it easier to add new PHY types in
the future.  Added ethtool functionality to intersect two link masks
together to find the common ground between them.  Overhauled i40e to
ensure that the new ethtool API macros are being used, instead of the
old ones.  Fixed the usage of unsigned 64-bit division which is not
supported on all architectures.

Sudheer adds support for 25G Active Optical Cables (AOC) and Active Copper
Cables (ACC) PHY types.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 11:44:36 +01:00
James Morris
494b9ae7ab Merge commit 'tags/keys-fixes-20171018' into fixes-v4.14-rc5 2017-10-19 12:28:38 +11:00
Jakub Kicinski
4f9218aaf8 bpf: move knowledge about post-translation offsets out of verifier
Use the fact that verifier ops are now separate from program
ops to define a separate set of callbacks for verification of
already translated programs.

Since we expect the analyzer ops to be defined only for
a small subset of all program types initialize their array
by hand (don't use linux/bpf_types.h).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 14:17:10 +01:00
Jakub Kicinski
00176a34d9 bpf: remove the verifier ops from program structure
Since the verifier ops don't have to be associated with
the program for its entire lifetime we can move it to
verifier's struct bpf_verifier_env.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 14:17:10 +01:00
Jakub Kicinski
7de16e3a35 bpf: split verifier and program ops
struct bpf_verifier_ops contains both verifier ops and operations
used later during program's lifetime (test_run).  Split the runtime
ops into a different structure.

BPF_PROG_TYPE() will now append ## _prog_ops or ## _verifier_ops
to the names.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 14:17:10 +01:00
David Ahern
386fd5da40 tcp: Check daddr_cache before use in tracepoint
Running perf in one window to capture tcp_retransmit_skb tracepoint:
    $ perf record -e tcp:tcp_retransmit_skb -a

And causing a retransmission on an active TCP session (e.g., dropping
packets in the receiver, changing MTU on the interface to 500 and back
to 1500) triggers a panic:

[   58.543144] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[   58.545300] IP: perf_trace_tcp_retransmit_skb+0xd0/0x145
[   58.546770] PGD 0 P4D 0
[   58.547472] Oops: 0000 [#1] SMP
[   58.548328] Modules linked in: vrf
[   58.549262] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc4+ #26
[   58.551004] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[   58.554560] task: ffffffff81a0e540 task.stack: ffffffff81a00000
[   58.555817] RIP: 0010:perf_trace_tcp_retransmit_skb+0xd0/0x145
[   58.557137] RSP: 0018:ffff88003fc03d68 EFLAGS: 00010282
[   58.558292] RAX: 0000000000000000 RBX: ffffe8ffffc0ec80 RCX: ffff880038543098
[   58.559850] RDX: 0400000000000000 RSI: ffff88003fc03d70 RDI: ffff88003fc14b68
[   58.561099] RBP: ffff88003fc03da8 R08: 0000000000000000 R09: ffffea0000d3224a
[   58.562005] R10: ffff88003fc03db8 R11: 0000000000000010 R12: ffff8800385428c0
[   58.562930] R13: ffffe8ffffc0e478 R14: ffffffff81a93a40 R15: ffff88003d4f0c00
[   58.563845] FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[   58.564873] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   58.565613] CR2: 0000000000000008 CR3: 000000003d68f004 CR4: 00000000000606f0
[   58.566538] Call Trace:
[   58.566865]  <IRQ>
[   58.567140]  __tcp_retransmit_skb+0x4ab/0x4c6
[   58.567704]  ? tcp_set_ca_state+0x22/0x3f
[   58.568231]  tcp_retransmit_skb+0x14/0xa3
[   58.568754]  tcp_retransmit_timer+0x472/0x5e3
[   58.569324]  ? tcp_write_timer_handler+0x1e9/0x1e9
[   58.569946]  tcp_write_timer_handler+0x95/0x1e9
[   58.570548]  tcp_write_timer+0x2a/0x58

Check that daddr_cache is non-NULL before de-referencing.

Fixes: e086101b15 ("tcp: add a tracepoint for tcp retransmission")
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 14:15:14 +01:00
David Ahern
fb6ff75e18 tcp: Use pI6c in tcp tracepoint
The compact form for IPv6 addresses is more user friendly than the full
version. For example:
   compact: 2001:db8:1::1
      full: 2001:0db8:0001:0000:0000:0000:0000:0004i

Update the tcp tracepoint to show the compact form.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 14:10:29 +01:00