Commit Graph

101250 Commits

Author SHA1 Message Date
Thadeu Lima de Souza Cascardo
8051ac7643 vlan: use non-archaic spelling of failes
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-03 11:00:52 -04:00
David S. Miller
9c54aeb03a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Filling in the padding slot in the bpf structure as a bug fix in 'ne'
overlapped with actually using that padding area for something in
'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-03 09:31:58 -04:00
Linus Torvalds
918fe1b315 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Infinite loop in _decode_session6(), from Eric Dumazet.

 2) Pass correct argument to nla_strlcpy() in netfilter, also from Eric
    Dumazet.

 3) Out of bounds memory access in ipv6 srh code, from Mathieu Xhonneux.

 4) NULL deref in XDP_REDIRECT handling of tun driver, from Toshiaki
    Makita.

 5) Incorrect idr release in cls_flower, from Paul Blakey.

 6) Probe error handling fix in davinci_emac, from Dan Carpenter.

 7) Memory leak in XPS configuration, from Alexander Duyck.

 8) Use after free with cloned sockets in kcm, from Kirill Tkhai.

 9) MTU handling fixes fo ip_tunnel and ip6_tunnel, from Nicolas
    Dichtel.

10) Fix UAPI hole in bpf data structure for 32-bit compat applications,
    from Daniel Borkmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
  bpf: fix uapi hole for 32 bit compat applications
  net: usb: cdc_mbim: add flag FLAG_SEND_ZLP
  ip6_tunnel: remove magic mtu value 0xFFF8
  ip_tunnel: restore binding to ifaces with a large mtu
  net: dsa: b53: Add BCM5389 support
  kcm: Fix use-after-free caused by clonned sockets
  net-sysfs: Fix memory leak in XPS configuration
  ixgbe: fix parsing of TC actions for HW offload
  net: ethernet: davinci_emac: fix error handling in probe()
  net/ncsi: Fix array size in dumpit handler
  cls_flower: Fix incorrect idr release when failing to modify rule
  net/sonic: Use dma_mapping_error()
  xfrm Fix potential error pointer dereference in xfrm_bundle_create.
  vhost_net: flush batched heads before trying to busy polling
  tun: Fix NULL pointer dereference in XDP redirect
  be2net: Fix error detection logic for BE3
  net: qmi_wwan: Add Netgear Aircard 779S
  mlxsw: spectrum: Forbid creation of VLAN 1 over port/LAG
  atm: zatm: fix memcmp casting
  iwlwifi: pcie: compare with number of IRQs requested for, not number of CPUs
  ...
2018-06-02 17:35:53 -07:00
Florian Westphal
1b2470e59f netfilter: nf_tables: handle chain name lookups via rhltable
If there is a significant amount of chains list search is too slow, so
add an rhlist table for this.

This speeds up ruleset loading: for every new rule we have to check if
the name already exists in current generation.

We need to be able to cope with duplicate chain names in case a transaction
drops the nfnl mutex (for request_module) and the abort of this old
transaction is still pending.

The list is kept -- we need a way to iterate chains even if hash resize is
in progress without missing an entry.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-03 01:18:37 +02:00
Pablo Neira Ayuso
290180e244 netfilter: nf_tables: add connlimit support
This features which allows you to limit the maximum number of
connections per arbitrary key. The connlimit expression is stateful,
therefore it can be used from meters to dynamically populate a set, this
provides a mapping to the iptables' connlimit match. This patch also
comes that allows you define static connlimit policies.

This extension depends on the nf_conncount infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-03 01:18:29 +02:00
Linus Torvalds
ada7339efe Merge tag 'drm-fixes-for-v4.17-rc8' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "A few final fixes:

  i915:
   - fix for potential Spectre vector in the new query uAPI
   - fix NULL pointer deref (FDO #106559)
   - DMI fix to hide LVDS for Radiant P845 (FDO #105468)

  amdgpu:
   - suspend/resume DC regression fix
   - underscan flicker fix on fiji
   - gamma setting fix after dpms

  omap:
   - fix oops regression

  core:
   - fix PSR timing

  dw-hdmi:
   - fix oops regression"

* tag 'drm-fixes-for-v4.17-rc8' of git://people.freedesktop.org/~airlied/linux:
  drm/amd/display: Update color props when modeset is required
  drm/amd/display: Make atomic-check validate underscan changes
  drm/bridge/synopsys: dw-hdmi: fix dw_hdmi_setup_rx_sense
  drm/amd/display: Fix BUG_ON during CRTC atomic check update
  drm/i915/query: nospec expects no more than an unsigned long
  drm/i915/query: Protect tainted function pointer lookup
  drm/i915/lvds: Move acpi lid notification registration to registration phase
  drm/i915: Disable LVDS on Radiant P845
  drm/omap: fix NULL deref crash with SDI displays
  drm/psr: Fix missed entry in PSR setup time table.
2018-06-02 15:24:45 -07:00
Pablo Neira Ayuso
371ebcbb9e netfilter: nf_tables: add destroy_clone expression
Before this patch, cloned expressions are released via ->destroy. This
is a problem for the new connlimit expression since the ->destroy path
drop a reference on the conntrack modules and it unregisters hooks. The
new ->destroy_clone provides context that this expression is being
released from the packet path, so it is mirroring ->clone(), where
neither module reference is dropped nor hooks need to be unregistered -
because this done from the control plane path from the ->init() path.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-03 00:02:11 +02:00
Pablo Neira Ayuso
79b174ade1 netfilter: nf_tables: garbage collection for stateful expressions
Use garbage collector to schedule removal of elements based of feedback
from expression that this element comes with. Therefore, the garbage
collector is not guided by timeout expirations in this new mode.

The new connlimit expression sets on the NFT_EXPR_GC flag to enable this
behaviour, the dynset expression needs to explicitly enable the garbage
collector via set->ops->gc_init call.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-03 00:02:10 +02:00
Pablo Neira Ayuso
3453c92731 netfilter: nf_tables: pass ctx to nf_tables_expr_destroy()
nft_set_elem_destroy() can be called from call_rcu context. Annotate
netns and table in set object so we can populate the context object.
Moreover, pass context object to nf_tables_set_elem_destroy() from the
commit phase, since it is already available from there.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-03 00:02:09 +02:00
Pablo Neira Ayuso
5e5cbc7b23 netfilter: nf_conncount: expose connection list interface
This patch provides an interface to maintain the list of connections and
the lookup function to obtain the number of connections in the list.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-03 00:02:08 +02:00
Pablo Neira Ayuso
00bfb3205e netfilter: nf_tables: pass context to object destroy indirection
The new connlimit object needs this to properly deal with conntrack
dependencies.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-03 00:02:06 +02:00
Máté Eckl
45ca4e0cf2 netfilter: Libify xt_TPROXY
The extracted functions will likely be usefull to implement tproxy
support in nf_tables.

Extrancted functions:
	- nf_tproxy_sk_is_transparent
	- nf_tproxy_laddr4
	- nf_tproxy_handle_time_wait4
	- nf_tproxy_get_sock_v4
	- nf_tproxy_laddr6
	- nf_tproxy_handle_time_wait6
	- nf_tproxy_get_sock_v6

(nf_)tproxy_handle_time_wait6 also needed some refactor as its current
implementation was xtables-specific.

Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-03 00:02:05 +02:00
Máté Eckl
8d6e555773 netfilter: Decrease code duplication regarding transparent socket option
There is a function in include/net/netfilter/nf_socket.h to decide if a
socket has IP(V6)_TRANSPARENT socket option set or not. However this
does the same as inet_sk_transparent() in include/net/tcp.h

include/net/tcp.h:1733
/* This helper checks if socket has IP_TRANSPARENT set */
static inline bool inet_sk_transparent(const struct sock *sk)
{
	switch (sk->sk_state) {
	case TCP_TIME_WAIT:
		return inet_twsk(sk)->tw_transparent;
	case TCP_NEW_SYN_RECV:
		return inet_rsk(inet_reqsk(sk))->no_srccheck;
	}
	return inet_sk(sk)->transparent;
}

tproxy_sk_is_transparent has also been refactored to use this function
instead of reimplementing it.

Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-03 00:02:01 +02:00
Linus Torvalds
34a8e640d1 Merge tag 'staging-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull IIO driver fixes from Greg KH:
 "Here are some old IIO driver fixes that were sitting in my tree for a
  few weeks. Sorry about not getting them to you sooner. They fix a
  number of small IIO driver issues that have been reported.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'staging-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  iio: adc: select buffer for at91-sama5d2_adc
  iio: hid-sensor-trigger: Fix sometimes not powering up the sensor after resume
  iio: adc: at91-sama5d2_adc: fix channel configuration for differential channels
  iio:kfifo_buf: check for uint overflow
  iio:buffer: make length types match kfifo types
  iio: adc: stm32-dfsdm: fix sample rate for div2 spi clock
  iio: adc: stm32-dfsdm: fix successive oversampling settings
  iio: ad7793: implement IIO_CHAN_INFO_SAMP_FREQ
2018-06-02 10:02:14 -07:00
David S. Miller
1ffdd8e164 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for your net-next
tree, the most relevant things in this batch are:

1) Compile masquerade infrastructure into NAT module, from Florian Westphal.
   Same thing with the redirection support.

2) Abort transaction if early initialization of the commit phase fails.
   Also from Florian.

3) Get rid of synchronize_rcu() by using rule array in nf_tables, from
   Florian.

4) Abort nf_tables batch if fatal signal is pending, from Florian.

5) Use .call_rcu nfnetlink from nf_tables to make dumps fully lockless.
   From Florian Westphal.

6) Support to match transparent sockets from nf_tables, from Máté Eckl.

7) Audit support for nf_tables, from Phil Sutter.

8) Validate chain dependencies from commit phase, fall back to fine grain
   validation only in case of errors.

9) Attach dst to skbuff from netfilter flowtable packet path, from
   Jason A. Donenfeld.

10) Use artificial maximum attribute cap to remove VLA from nfnetlink.
    Patch from Kees Cook.

11) Add extension to allow to forward packets through neighbour layer.

12) Add IPv6 conntrack helper support to IPVS, from Julian Anastasov.

13) Add IPv6 FTP conntrack support to IPVS, from Julian Anastasov.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-02 09:04:21 -04:00
Daniel Borkmann
36f9814a49 bpf: fix uapi hole for 32 bit compat applications
In 64 bit, we have a 4 byte hole between ifindex and netns_dev in the
case of struct bpf_map_info but also struct bpf_prog_info. In net-next
commit b85fab0e67 ("bpf: Add gpl_compatible flag to struct bpf_prog_info")
added a bitfield into it to expose some flags related to programs. Thus,
add an unnamed __u32 bitfield for both so that alignment keeps the same
in both 32 and 64 bit cases, and can be naturally extended from there
as in b85fab0e67.

Before:

  # file test.o
  test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped
  # pahole test.o
  struct bpf_map_info {
	__u32                      type;                 /*     0     4 */
	__u32                      id;                   /*     4     4 */
	__u32                      key_size;             /*     8     4 */
	__u32                      value_size;           /*    12     4 */
	__u32                      max_entries;          /*    16     4 */
	__u32                      map_flags;            /*    20     4 */
	char                       name[16];             /*    24    16 */
	__u32                      ifindex;              /*    40     4 */
	__u64                      netns_dev;            /*    44     8 */
	__u64                      netns_ino;            /*    52     8 */

	/* size: 64, cachelines: 1, members: 10 */
	/* padding: 4 */
  };

After (same as on 64 bit):

  # file test.o
  test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped
  # pahole test.o
  struct bpf_map_info {
	__u32                      type;                 /*     0     4 */
	__u32                      id;                   /*     4     4 */
	__u32                      key_size;             /*     8     4 */
	__u32                      value_size;           /*    12     4 */
	__u32                      max_entries;          /*    16     4 */
	__u32                      map_flags;            /*    20     4 */
	char                       name[16];             /*    24    16 */
	__u32                      ifindex;              /*    40     4 */

	/* XXX 4 bytes hole, try to pack */

	__u64                      netns_dev;            /*    48     8 */
	__u64                      netns_ino;            /*    56     8 */
	/* --- cacheline 1 boundary (64 bytes) --- */

	/* size: 64, cachelines: 1, members: 10 */
	/* sum members: 60, holes: 1, sum holes: 4 */
  };

Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Reported-by: Eugene Syromiatnikov <esyr@redhat.com>
Fixes: 52775b33bb ("bpf: offload: report device information about offloaded maps")
Fixes: 675fc275a3 ("bpf: offload: report device information for offloaded programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-01 20:41:35 -07:00
Julian Anastasov
d12e12299a ipvs: add ipv6 support to ftp
Add support for FTP commands with extended format (RFC 2428):

- FTP EPRT: IPv4 and IPv6, active mode, similar to PORT
- FTP EPSV: IPv4 and IPv6, passive mode, similar to PASV.
EPSV response usually contains only port but we allow real
server to provide different address

We restrict control and data connection to be from same
address family.

Allow the "(" and ")" to be optional in PASV response.

Also, add ipvsh argument to the pkt_in/pkt_out handlers to better
access the payload after transport header.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-01 14:01:54 +02:00
Pablo Neira Ayuso
d32de98ea7 netfilter: nft_fwd_netdev: allow to forward packets via neighbour layer
This allows us to forward packets from the netdev family via neighbour
layer, so you don't need an explicit link-layer destination when using
this expression from rules. The ttl/hop_limit field is decremented.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-01 10:35:47 +02:00
Pablo Neira Ayuso
a654de8fdc netfilter: nf_tables: fix chain dependency validation
The following ruleset:

 add table ip filter
 add chain ip filter input { type filter hook input priority 4; }
 add chain ip filter ap
 add rule ip filter input jump ap
 add rule ip filter ap masquerade

results in a panic, because the masquerade extension should be rejected
from the filter chain. The existing validation is missing a chain
dependency check when the rule is added to the non-base chain.

This patch fixes the problem by walking down the rules from the
basechains, searching for either immediate or lookup expressions, then
jumping to non-base chains and again walking down the rules to perform
the expression validation, so we make sure the full ruleset graph is
validated. This is done only once from the commit phase, in case of
problem, we abort the transaction and perform fine grain validation for
error reporting. This patch requires 003087911a ("netfilter:
nfnetlink: allow commit to fail") to achieve this behaviour.

This patch also adds a cleanup callback to nfnl batch interface to reset
the validate state from the exit path.

As a result of this patch, nf_tables_check_loops() doesn't use
->validate to check for loops, instead it just checks for immediate
expressions.

Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-01 09:46:22 +02:00
Phil Sutter
1a893b44de netfilter: nf_tables: Add audit support to log statement
This extends log statement to support the behaviour achieved with
AUDIT target in iptables.

Audit logging is enabled via a pseudo log level 8. In this case any
other settings like log prefix are ignored since audit log format is
fixed.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-01 09:46:21 +02:00
Máté Eckl
554ced0a6e netfilter: nf_tables: add support for native socket matching
Now it can only match the transparent flag of an ip/ipv6 socket.

Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-01 09:46:15 +02:00
Kees Cook
ccf8dbcd06 rtnetlink: Remove VLA usage
In the quest to remove all stack VLA usage from the kernel[1], this
allocates the maximum size expected for all possible types and adds
sanity-checks at both registration and usage to make sure nothing gets
out of sync. This matches the proposed VLA solution for nfnetlink[2]. The
values chosen here were based on finding assignments for .maxtype and
.slave_maxtype and manually counting the enums:

slave_maxtype (max 33):
	IFLA_BRPORT_MAX     33
	IFLA_BOND_SLAVE_MAX  9

maxtype (max 45):
	IFLA_BOND_MAX       28
	IFLA_BR_MAX         45
	__IFLA_CAIF_HSI_MAX  8
	IFLA_CAIF_MAX        4
	IFLA_CAN_MAX        16
	IFLA_GENEVE_MAX     12
	IFLA_GRE_MAX        25
	IFLA_GTP_MAX         5
	IFLA_HSR_MAX         7
	IFLA_IPOIB_MAX       4
	IFLA_IPTUN_MAX      21
	IFLA_IPVLAN_MAX      3
	IFLA_MACSEC_MAX     15
	IFLA_MACVLAN_MAX     7
	IFLA_PPP_MAX         2
	__IFLA_RMNET_MAX     4
	IFLA_VLAN_MAX        6
	IFLA_VRF_MAX         2
	IFLA_VTI_MAX         7
	IFLA_VXLAN_MAX      28
	VETH_INFO_MAX        2
	VXCAN_INFO_MAX       2

This additionally changes maxtype and slave_maxtype fields to unsigned,
since they're only ever using positive values.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
[2] https://patchwork.kernel.org/patch/10439647/

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31 22:48:46 -04:00
Ilan Tayari
1f0cf89b09 net/mlx5: Add FPGA QP error event
The FPGA queue pair (QP) event fires whenever a QP on the FPGA
transitions to the error state.

At this stage, this event is unrecoverable, it may become recoverable
in the future.

Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Adi Nissim <adin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31 15:35:38 -04:00
Ilan Tayari
1865ea9adb net/mlx5: Add temperature warning event to log
Temperature warning event is sent by FW to indicate high temperature
as detected by one of the sensors on the board.
Add handling of this event by writing the numbers of the alert sensors
to the kernel log.

Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Adi Nissim <adin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31 15:35:37 -04:00
Donald Sharp
35aada99b5 rtnetlink: Add more well known protocol values
FRRouting installs routes into the kernel associated with
the originating protocol.  Add these values to the well
known values in rtnetlink.h.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31 15:25:10 -04:00
Sudarsana Reddy Kalluru
32d26a685c qed*: Add link change count value to ethtool statistics display.
This patch adds driver changes for capturing the link change count in
ethtool statistics display.

Please consider applying this to "net-next".

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31 14:02:13 -04:00
Yafang Shao
3d97d88e80 tcp: minor optimization around tcp_hdr() usage in receive path
This is additional to the
commit ea1627c20c ("tcp: minor optimizations around tcp_hdr() usage").
At this point, skb->data is same with tcp_hdr() as tcp header has not
been pulled yet. So use the less expensive one to get the tcp header.

Remove the third parameter of tcp_rcv_established() and put it into
the function body.

Furthermore, the local variables are listed as a reverse christmas tree :)

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31 13:20:47 -04:00
Dave Airlie
0e333751cf Merge tag 'drm-misc-fixes-2018-05-30' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
dw-hdmi: Fix Oops regression from rc1 (Neil)

Cc: Neil Armstrong <narmstrong@baylibre.com>

* tag 'drm-misc-fixes-2018-05-30' of git://anongit.freedesktop.org/drm/drm-misc:
  drm/bridge/synopsys: dw-hdmi: fix dw_hdmi_setup_rx_sense
2018-05-31 08:35:47 +10:00
Neil Armstrong
c32048d9e9 drm/bridge/synopsys: dw-hdmi: fix dw_hdmi_setup_rx_sense
The dw_hdmi_setup_rx_sense exported function should not use struct device
to recover the dw-hdmi context using drvdata, but take struct dw_hdmi
directly like other exported functions.

This caused a regression using Meson DRM on S905X since v4.17-rc1 :

Internal error: Oops: 96000007 [#1] PREEMPT SMP
[...]
CPU: 0 PID: 124 Comm: irq/32-dw_hdmi_ Not tainted 4.17.0-rc7 #2
Hardware name: Libre Technology CC (DT)
[...]
pc : osq_lock+0x54/0x188
lr : __mutex_lock.isra.0+0x74/0x530
[...]
Process irq/32-dw_hdmi_ (pid: 124, stack limit = 0x00000000adf418cb)
Call trace:
  osq_lock+0x54/0x188
  __mutex_lock_slowpath+0x10/0x18
  mutex_lock+0x30/0x38
  __dw_hdmi_setup_rx_sense+0x28/0x98
  dw_hdmi_setup_rx_sense+0x10/0x18
  dw_hdmi_top_thread_irq+0x2c/0x50
  irq_thread_fn+0x28/0x68
  irq_thread+0x10c/0x1a0
  kthread+0x128/0x130
  ret_from_fork+0x10/0x18
 Code: 34000964 d00050a2 51000484 9135c042 (f864d844)
 ---[ end trace 945641e1fbbc07da ]---
 note: irq/32-dw_hdmi_[124] exited with preempt_count 1
 genirq: exiting task "irq/32-dw_hdmi_" (124) is an active IRQ thread (irq 32)

Fixes: eea034af90 ("drm/bridge/synopsys: dw-hdmi: don't clobber drvdata")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Koen Kooi <koen@dominion.thruhere.net>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1527673438-20643-1-git-send-email-narmstrong@baylibre.com
2018-05-30 13:42:39 -04:00
Yafang Shao
2d68c0745a tcp: use data length instead of skb->len in tcp_probe
skb->len is meaningless to user.
data length could be more helpful, with which we can easily filter out
the packet without payload.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 10:15:37 -04:00
David Ahern
8308f3ff17 net/ipv6: Add support for specifying metric of connected routes
Add support for IFA_RT_PRIORITY to ipv6 addresses.

If the metric is changed on an existing address then the new route
is inserted before removing the old one. Since the metric is one
of the route keys, the prefix route can not be atomically replaced.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 10:12:45 -04:00
David Ahern
af4d768ad2 net/ipv4: Add support for specifying metric of connected routes
Add support for IFA_RT_PRIORITY to ipv4 addresses.

If the metric is changed on an existing address then the new route
is inserted before removing the old one. Since the metric is one
of the route keys, the prefix route can not be replaced.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 10:12:45 -04:00
David Ahern
620dee9415 net: Add IFA_RT_PRIORITY address attribute
Currently, if two interfaces have addresses in the same connected route,
then the order of the prefix route entries is determined by the order in
which the addresses are assigned or the links brought up.

Add IFA_RT_PRIORITY to allow user to specify the metric of the prefix
route associated with an address giving interface managers better
control of the order of prefix routes.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 10:12:45 -04:00
David Ahern
e6464b8c63 net/ipv6: Convert ipv6_add_addr to struct ifa6_config
Move config parameters for adding an ipv6 address to a struct. struct
names stem from inet6_rtm_newaddr which is the modern handler for
adding an address.

Start the conversion to ifa6_config with ipv6_add_addr. This is an argument
move only; no functional change intended. Mapping of variable changes:

    addr      -->  cfg->pfx
    peer_addr -->  cfg->peer_pfx
    pfxlen    -->  cfg->plen
    flags     -->  cfg->ifa_flags

scope, valid_lft, prefered_lft have the same names within cfg
(with corrected spelling).

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 10:12:44 -04:00
Jakub Kicinski
47c669a406 net: sched: mq: request stats from offloads
MQ doesn't hold any statistics on its own, however, statistic
from offloads are requested starting from the root, hence MQ
will read the old values for its sums.  Call into the drivers,
because of the additive nature of the stats drivers are aware
of how much "pending updates" they have to children of the MQ.
Since MQ reset its stats on every dump we can simply offset
the stats, predicting how stats of offloaded children will
change.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 09:49:16 -04:00
Jakub Kicinski
f971b13230 net: sched: mq: add simple offload notification
mq offload is trivial, we just need to let the device know
that the root qdisc is mq.  Alternative approach would be
to export qdisc_lookup() and make drivers check the root
type themselves, but notification via ndo_setup_tc is more
in line with other qdiscs.

Note that mq doesn't hold any stats on it's own, it just
adds up stats of its children.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 09:49:16 -04:00
Jakub Kicinski
6172abc1e2 net: sched: add qstats.qlen to qlen
AFAICT struct gnet_stats_queue.qlen is not used in Qdiscs.
It may, however, be useful for offloads to report HW queue
length there.  Add that value to the result of qdisc_qlen_sum().

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 09:49:15 -04:00
David S. Miller
874fcf1de6 Merge tag 'mlx5e-updates-2018-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5e-updates-2018-05-25

This series includes updates for mlx5e netdev driver.

1) Allowr flow based VF vport mirroring under sriov switchdev scheme,
added support for offloading the TC mirred mirror sub-action, from
Chris Mi.

=================
From: Or Gerlitz <ogerlitz@mellanox.com>

The user will typically set the actions order such that the mirror
port (mirror VF) sees packets as the original port (VF under
mirroring) sent them or as it will receive them. In the general case,
it means that packets are potentially sent to the mirror port before
or after some actions were applied on them.

To properly do that, we follow on the exact action order as set for
the flow and make sure this will also be the case when we program the
HW offload.

If all the actions should apply before forwarding to the mirror and dest port,
mirroring is just multicasting to the two vports. Otherwise, we split
the TC flow to two HW rules, where the 1st applies only the actions
needed up to the mirror (if there are such) and the 2nd the rest of
the actions plus the forwarding to the dest vport.
=================

2) Move to order-0 only allocations (using fragmented work queues) for all
work queues used by the driver, RX and TX descriptor rings
(RQs, SQs and Completion Queues (CQs)), from Tariq Toukan.

3) Avoid resetting netdevice statistics on netdevice
state changes, from Eran Ben Elisha.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 09:45:13 -04:00
Florian Westphal
0cbc06b3fa netfilter: nf_tables: remove synchronize_rcu in commit phase
synchronize_rcu() is expensive.

The commit phase currently enforces an unconditional
synchronize_rcu() after incrementing the generation counter.

This is to make sure that a packet always sees a consistent chain, either
nft_do_chain is still using old generation (it will skip the newly added
rules), or the new one (it will skip old ones that might still be linked
into the list).

We could just remove the synchronize_rcu(), it would not cause a crash but
it could cause us to evaluate a rule that was removed and new rule for the
same packet, instead of either-or.

To resolve this, add rule pointer array holding two generations, the
current one and the future generation.

In commit phase, allocate the rule blob and populate it with the rules that
will be active in the new generation.

Then, make this rule blob public, replacing the old generation pointer.

Then the generation counter can be incremented.

nft_do_chain() will either continue to use the current generation
(in case loop was invoked right before increment), or the new one.

Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-29 14:49:59 +02:00
Paolo Abeni
e9be0e993d net: sched: shrink struct Qdisc
The struct Qdisc has a lot of holes, especially after commit
a53851e2c3 ("net: sched: explicit locking in gso_cpu fallback"),
which as a side effect, moved the fields just after 'busylock'
on a new cacheline.

Since both 'padded' and 'refcnt' are not updated frequently, and
there is a hole before 'gso_skb', we can move such fields there,
saving a cacheline without any performance side effect.

Before this commit:

pahole -C Qdisc net/sche/sch_generic.o
	# ...
        /* size: 384, cachelines: 6, members: 25 */
        /* sum members: 236, holes: 3, sum holes: 92 */
        /* padding: 56 */

After this commit:
pahole -C Qdisc net/sche/sch_generic.o
	# ...
	/* size: 320, cachelines: 5, members: 25 */
	/* sum members: 236, holes: 2, sum holes: 28 */
	/* padding: 56 */

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28 23:13:39 -04:00
Yangbo Lu
6c50c1ed72 ptp_qoriq: move some definitions to header file
This patch is to move some definitions in ptp_qoriq.c
to the header file.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28 23:05:11 -04:00
Sridhar Samudrala
9805069d14 virtio_net: Introduce VIRTIO_NET_F_STANDBY feature bit
This feature bit can be used by hypervisor to indicate virtio_net device to
act as a standby for another device with the same MAC address.

VIRTIO_NET_F_STANDBY is defined as bit 62 as it is a device feature bit.

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28 22:59:54 -04:00
Sridhar Samudrala
cfc80d9a11 net: Introduce net_failover driver
The net_failover driver provides an automated failover mechanism via APIs
to create and destroy a failover master netdev and manages a primary and
standby slave netdevs that get registered via the generic failover
infrastructure.

The failover netdev acts a master device and controls 2 slave devices. The
original paravirtual interface gets registered as 'standby' slave netdev and
a passthru/vf device with the same MAC gets registered as 'primary' slave
netdev. Both 'standby' and 'failover' netdevs are associated with the same
'pci' device. The user accesses the network interface via 'failover' netdev.
The 'failover' netdev chooses 'primary' netdev as default for transmits when
it is available with link up and running.

This can be used by paravirtual drivers to enable an alternate low latency
datapath. It also enables hypervisor controlled live migration of a VM with
direct attached VF by failing over to the paravirtual datapath when the VF
is unplugged.

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28 22:59:54 -04:00
Sridhar Samudrala
30c8bd5aa8 net: Introduce generic failover module
The failover module provides a generic interface for paravirtual drivers
to register a netdev and a set of ops with a failover instance. The ops
are used as event handlers that get called to handle netdev register/
unregister/link change/name change events on slave pci ethernet devices
with the same mac address as the failover netdev.

This enables paravirtual drivers to use a VF as an accelerated low latency
datapath. It also allows migration of VMs with direct attached VFs by
failing over to the paravirtual datapath when the VF is unplugged.

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-28 22:59:54 -04:00
Máté Eckl
e5a10bb2ac netfilter: add includes to nf_socket.h
These have to be included always when nf_socket.h is included.

Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-29 00:23:58 +02:00
David S. Miller
5b79c2af66 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Lots of easy overlapping changes in the confict
resolutions here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-26 19:46:15 -04:00
Linus Torvalds
cc71efda82 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
 "Three fixes for scheduler and kthread code:

   - allow calling kthread_park() on an already parked thread

   - restore the sched_pi_setprio() tracepoint behaviour

   - clarify the unclear string for the scheduling domain debug output"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched, tracing: Fix trace_sched_pi_setprio() for deboosting
  kthread: Allow kthread_park() on a parked kthread
  sched/topology: Clarify root domain(s) debug string
2018-05-26 13:10:16 -07:00
Linus Torvalds
bc2dbc5420 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "16 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  kasan: fix memory hotplug during boot
  kasan: free allocated shadow memory on MEM_CANCEL_ONLINE
  checkpatch: fix macro argument precedence test
  init/main.c: include <linux/mem_encrypt.h>
  kernel/sys.c: fix potential Spectre v1 issue
  mm/memory_hotplug: fix leftover use of struct page during hotplug
  proc: fix smaps and meminfo alignment
  mm: do not warn on offline nodes unless the specific node is explicitly requested
  mm, memory_hotplug: make has_unmovable_pages more robust
  mm/kasan: don't vfree() nonexistent vm_area
  MAINTAINERS: change hugetlbfs maintainer and update files
  ipc/shm: fix shmat() nil address after round-down when remapping
  Revert "ipc/shm: Fix shmat mmap nil-page protection"
  idr: fix invalid ptr dereference on item delete
  ocfs2: revert "ocfs2/o2hb: check len for bio_add_page() to avoid getting incorrect bio"
  mm: fix nr_rotate_swap leak in swapon() error case
2018-05-25 20:24:28 -07:00
Linus Torvalds
03250e1028 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Let's begin the holiday weekend with some networking fixes:

   1) Whoops need to restrict cfg80211 wiphy names even more to 64
      bytes. From Eric Biggers.

   2) Fix flags being ignored when using kernel_connect() with SCTP,
      from Xin Long.

   3) Use after free in DCCP, from Alexey Kodanev.

   4) Need to check rhltable_init() return value in ipmr code, from Eric
      Dumazet.

   5) XDP handling fixes in virtio_net from Jason Wang.

   6) Missing RTA_TABLE in rtm_ipv4_policy[], from Roopa Prabhu.

   7) Need to use IRQ disabling spinlocks in mlx4_qp_lookup(), from Jack
      Morgenstein.

   8) Prevent out-of-bounds speculation using indexes in BPF, from
      Daniel Borkmann.

   9) Fix regression added by AF_PACKET link layer cure, from Willem de
      Bruijn.

  10) Correct ENIC dma mask, from Govindarajulu Varadarajan.

  11) Missing config options for PMTU tests, from Stefano Brivio"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits)
  ibmvnic: Fix partial success login retries
  selftests/net: Add missing config options for PMTU tests
  mlx4_core: allocate ICM memory in page size chunks
  enic: set DMA mask to 47 bit
  ppp: remove the PPPIOCDETACH ioctl
  ipv4: remove warning in ip_recv_error
  net : sched: cls_api: deal with egdev path only if needed
  vhost: synchronize IOTLB message with dev cleanup
  packet: fix reserve calculation
  net/mlx5: IPSec, Fix a race between concurrent sandbox QP commands
  net/mlx5e: When RXFCS is set, add FCS data into checksum calculation
  bpf: properly enforce index mask to prevent out-of-bounds speculation
  net/mlx4: Fix irq-unsafe spinlock usage
  net: phy: broadcom: Fix bcm_write_exp()
  net: phy: broadcom: Fix auxiliary control register reads
  net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy
  net/mlx4: fix spelling mistake: "Inrerface" -> "Interface" and rephrase message
  ibmvnic: Only do H_EOI for mobility events
  tuntap: correctly set SOCKWQ_ASYNC_NOSPACE
  virtio-net: fix leaking page for gso packet during mergeable XDP
  ...
2018-05-25 19:54:42 -07:00
Jonathan Cameron
a21558618c mm/memory_hotplug: fix leftover use of struct page during hotplug
The case of a new numa node got missed in avoiding using the node info
from page_struct during hotplug.  In this path we have a call to
register_mem_sect_under_node (which allows us to specify it is hotplug
so don't change the node), via link_mem_sections which unfortunately
does not.

Fix is to pass check_nid through link_mem_sections as well and disable
it in the new numa node path.

Note the bug only 'sometimes' manifests depending on what happens to be
in the struct page structures - there are lots of them and it only needs
to match one of them.

The result of the bug is that (with a new memory only node) we never
successfully call register_mem_sect_under_node so don't get the memory
associated with the node in sysfs and meminfo for the node doesn't
report it.

It came up whilst testing some arm64 hotplug patches, but appears to be
universal.  Whilst I'm triggering it by removing then reinserting memory
to a node with no other elements (thus making the node disappear then
appear again), it appears it would happen on hotplugging memory where
there was none before and it doesn't seem to be related the arm64
patches.

These patches call __add_pages (where most of the issue was fixed by
Pavel's patch).  If there is a node at the time of the __add_pages call
then all is well as it calls register_mem_sect_under_node from there
with check_nid set to false.  Without a node that function returns
having not done the sysfs related stuff as there is no node to use.
This is expected but it is the resulting path that fails...

Exact path to the problem is as follows:

 mm/memory_hotplug.c: add_memory_resource()

   The node is not online so we enter the 'if (new_node)' twice, on the
   second such block there is a call to link_mem_sections which calls
   into

  drivers/node.c: link_mem_sections() which calls

  drivers/node.c: register_mem_sect_under_node() which calls
     get_nid_for_pfn and keeps trying until the output of that matches
     the expected node (passed all the way down from
     add_memory_resource)

It is effectively the same fix as the one referred to in the fixes tag
just in the code path for a new node where the comments point out we
have to rerun the link creation because it will have failed in
register_new_memory (as there was no node at the time).  (actually that
comment is wrong now as we don't have register_new_memory any more it
got renamed to hotplug_memory_register in Pavel's patch).

Link: http://lkml.kernel.org/r/20180504085311.1240-1-Jonathan.Cameron@huawei.com
Fixes: fc44f7f923 ("mm/memory_hotplug: don't read nid from struct page during hotplug")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-25 18:12:11 -07:00