Commit Graph

121580 Commits

Author SHA1 Message Date
Stefano Brivio
4cb47a8644 tunnels: PMTU discovery support for directly bridged IP packets
It's currently possible to bridge Ethernet tunnels carrying IP
packets directly to external interfaces without assigning them
addresses and routes on the bridged network itself: this is the case
for UDP tunnels bridged with a standard bridge or by Open vSwitch.

PMTU discovery is currently broken with those configurations, because
the encapsulation effectively decreases the MTU of the link, and
while we are able to account for this using PMTU discovery on the
lower layer, we don't have a way to relay ICMP or ICMPv6 messages
needed by the sender, because we don't have valid routes to it.

On the other hand, as a tunnel endpoint, we can't fragment packets
as a general approach: this is for instance clearly forbidden for
VXLAN by RFC 7348, section 4.3:

   VTEPs MUST NOT fragment VXLAN packets.  Intermediate routers may
   fragment encapsulated VXLAN packets due to the larger frame size.
   The destination VTEP MAY silently discard such VXLAN fragments.

The same paragraph recommends that the MTU over the physical network
accomodates for encapsulations, but this isn't a practical option for
complex topologies, especially for typical Open vSwitch use cases.

Further, it states that:

   Other techniques like Path MTU discovery (see [RFC1191] and
   [RFC1981]) MAY be used to address this requirement as well.

Now, PMTU discovery already works for routed interfaces, we get
route exceptions created by the encapsulation device as they receive
ICMP Fragmentation Needed and ICMPv6 Packet Too Big messages, and
we already rebuild those messages with the appropriate MTU and route
them back to the sender.

Add the missing bits for bridged cases:

- checks in skb_tunnel_check_pmtu() to understand if it's appropriate
  to trigger a reply according to RFC 1122 section 3.2.2 for ICMP and
  RFC 4443 section 2.4 for ICMPv6. This function is already called by
  UDP tunnels

- a new function generating those ICMP or ICMPv6 replies. We can't
  reuse icmp_send() and icmp6_send() as we don't see the sender as a
  valid destination. This doesn't need to be generic, as we don't
  cover any other type of ICMP errors given that we only provide an
  encapsulation function to the sender

While at it, make the MTU check in skb_tunnel_check_pmtu() accurate:
we might receive GSO buffers here, and the passed headroom already
includes the inner MAC length, so we don't have to account for it
a second time (that would imply three MAC headers on the wire, but
there are just two).

This issue became visible while bridging IPv6 packets with 4500 bytes
of payload over GENEVE using IPv4 with a PMTU of 4000. Given the 50
bytes of encapsulation headroom, we would advertise MTU as 3950, and
we would reject fragmented IPv6 datagrams of 3958 bytes size on the
wire. We're exclusively dealing with network MTU here, though, so we
could get Ethernet frames up to 3964 octets in that case.

v2:
- moved skb_tunnel_check_pmtu() to ip_tunnel_core.c (David Ahern)
- split IPv4/IPv6 functions (David Ahern)

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 13:01:45 -07:00
Stefano Brivio
df23bb18b4 ipv4: route: Ignore output interface in FIB lookup for PMTU route
Currently, processes sending traffic to a local bridge with an
encapsulation device as a port don't get ICMP errors if they exceed
the PMTU of the encapsulated link.

David Ahern suggested this as a hack, but it actually looks like
the correct solution: when we update the PMTU for a given destination
by means of updating or creating a route exception, the encapsulation
might trigger this because of PMTU discovery happening either on the
encapsulation device itself, or its lower layer. This happens on
bridged encapsulations only.

The output interface shouldn't matter, because we already have a
valid destination. Drop the output interface restriction from the
associated route lookup.

For UDP tunnels, we will now have a route exception created for the
encapsulation itself, with a MTU value reflecting its headroom, which
allows a bridge forwarding IP packets originated locally to deliver
errors back to the sending socket.

The behaviour is now consistent with IPv6 and verified with selftests
pmtu_ipv{4,6}_br_{geneve,vxlan}{4,6}_exception introduced later in
this series.

v2:
- reset output interface only for bridge ports (David Ahern)
- add and use netif_is_any_bridge_port() helper (David Ahern)

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 13:01:45 -07:00
David S. Miller
cabf06e5a2 Merge tag 'wireless-drivers-next-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:

====================
wireless-drivers-next patches for v5.9

Second set of patches for v5.9. mt76 has most of patches this time.
Otherwise it's just smaller fixes and cleanups to other drivers.

There was a major conflict in mt76 driver between wireless-drivers and
wireless-drivers-next. I solved that by merging the former to the
latter.

Major changes:

rtw88

* add support for ieee80211_ops::change_interface

* add support for enabling and disabling beacon

* add debugfs file for testing h2c

mt76

* ARP filter offload for 7663

* runtime power management for 7663

* testmode support for mfg calibration

* support for more channels
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 12:57:02 -07:00
David S. Miller
2e7199bd77 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-08-04

The following pull-request contains BPF updates for your *net-next* tree.

We've added 73 non-merge commits during the last 9 day(s) which contain
a total of 135 files changed, 4603 insertions(+), 1013 deletions(-).

The main changes are:

1) Implement bpf_link support for XDP. Also add LINK_DETACH operation for the BPF
   syscall allowing processes with BPF link FD to force-detach, from Andrii Nakryiko.

2) Add BPF iterator for map elements and to iterate all BPF programs for efficient
   in-kernel inspection, from Yonghong Song and Alexei Starovoitov.

3) Separate bpf_get_{stack,stackid}() helpers for perf events in BPF to avoid
   unwinder errors, from Song Liu.

4) Allow cgroup local storage map to be shared between programs on the same
   cgroup. Also extend BPF selftests with coverage, from YiFei Zhu.

5) Add BPF exception tables to ARM64 JIT in order to be able to JIT BPF_PROBE_MEM
   load instructions, from Jean-Philippe Brucker.

6) Follow-up fixes on BPF socket lookup in combination with reuseport group
   handling. Also add related BPF selftests, from Jakub Sitnicki.

7) Allow to use socket storage in BPF_PROG_TYPE_CGROUP_SOCK-typed programs for
   socket create/release as well as bind functions, from Stanislav Fomichev.

8) Fix an info leak in xsk_getsockopt() when retrieving XDP stats via old struct
   xdp_statistics, from Peilin Ye.

9) Fix PT_REGS_RC{,_CORE}() macros in libbpf for MIPS arch, from Jerry Crunchtime.

10) Extend BPF kernel test infra with skb->family and skb->{local,remote}_ip{4,6}
    fields and allow user space to specify skb->dev via ifindex, from Dmitry Yakunin.

11) Fix a bpftool segfault due to missing program type name and make it more robust
    to prevent them in future gaps, from Quentin Monnet.

12) Consolidate cgroup helper functions across selftests and fix a v6 localhost
    resolver issue, from John Fastabend.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:27:40 -07:00
David S. Miller
76769c38b4 Merge tag 'mlx5-updates-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5-updates-2020-08-03

This patchset introduces some updates to mlx5 driver.

1) Jakub converts mlx5 to use the new udp tunnel infrastructure.
   Starting with a hack to allow drivers to request a static configuration
   of the default vxlan port, and then a patch that converts mlx5.

2) Parav implements change_carrier ndo for VF eswitch representors,
   to speedup link state control of representors netdevices.

3) Alex Vesker, makes a simple update to software steering to fix an issue
   with push vlan action sequence

4) Leon removes a redundant dump stack on error flow.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:24:30 -07:00
Florian Fainelli
c99194eded net: dsa: loop: Wire-up MTU callbacks
For now we simply store the port MTU into a per-port member.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:23 -07:00
Florian Fainelli
6c84a58997 net: dsa: loop: Move data structures to header
In preparation for adding support for a mockup data path, move the
driver data structures to include/linux/dsa/loop.h such that we can
share them between net/dsa/ and drivers/net/dsa/ later on.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:23 -07:00
Ido Schimmel
c88e11e047 devlink: Pass extack when setting trap's action and group's parameters
A later patch will refuse to set the action of certain traps in mlxsw
and also to change the policer binding of certain groups. Pass extack so
that failure could be communicated clearly to user space.

Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Amit Cohen
08e335f6ad devlink: Add early_drop trap
Add the packet trap that can report packets that were ECN marked due to RED
AQM.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
David S. Miller
ee494f42a3 Merge tag 'mac80211-next-for-davem-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:

====================
A few more changes, notably:
 * handle new SAE (WPA3 authentication) status codes in the correct way
 * fix a while that should be an if instead, avoiding infinite loops
 * handle beacon filtering changing better
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:00:22 -07:00
Ioana-Ruxandra Stăncioi
88fab21c69 seg6_iptunnel: Refactor seg6_lwt_headroom out of uapi header
Refactor the function seg6_lwt_headroom out of the seg6_iptunnel.h uapi
header, because it is only used in seg6_iptunnel.c. Moreover, it is only
used in the kernel code, as indicated by the "#ifdef __KERNEL__".

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Ioana-Ruxandra Stăncioi <stancioi@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 17:57:40 -07:00
David S. Miller
f2e0b29a9a Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

1) UAF in chain binding support from previous batch, from Dan Carpenter.

2) Queue up delayed work to expire connections with no destination,
   from Andrew Sy Kim.

3) Use fallthrough pseudo-keyword, from Gustavo A. R. Silva.

4) Replace HTTP links with HTTPS, from Alexander A. Klimov.

5) Remove superfluous null header checks in ip6tables, from
   Gaurav Singh.

6) Add extended netlink error reporting for expression.

7) Report EEXIST on overlapping chain, set elements and flowtable
   devices.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 16:03:18 -07:00
Eelco Chaudron
9bf24f594c net: openvswitch: make masks cache size configurable
This patch makes the masks cache size configurable, or with
a size of 0, disable it.

Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:17:48 -07:00
Eelco Chaudron
9d2f627b7e net: openvswitch: add masks cache hit counter
Add a counter that counts the number of masks cache hits, and
export it through the megaflow netlink statistics.

Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:17:48 -07:00
wenxu
038ebb1a71 net/sched: act_ct: fix miss set mru for ovs after defrag in act_ct
When openvswitch conntrack offload with act_ct action. Fragment packets
defrag in the ingress tc act_ct action and miss the next chain. Then the
packet pass to the openvswitch datapath without the mru. The over
mtu packet will be dropped in output action in openvswitch for over mtu.

"kernel: net2: dropped over-mtu packet: 1528 > 1500"

This patch add mru in the tc_skb_ext for adefrag and miss next chain
situation. And also add mru in the qdisc_skb_cb. The act_ct set the mru
to the qdisc_skb_cb when the packet defrag. And When the chain miss,
The mru is set to tc_skb_ext which can be got by ovs datapath.

Fixes: b57dc7c13e ("net/sched: Introduce action ct")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:04:48 -07:00
Bruno Thomsen
bb3831294c net: mdiobus: add reset-post-delay-us handling
Load new "reset-post-delay-us" value from MDIO properties,
and if configured to a greater then zero delay do a
flexible sleeping delay after MDIO bus reset deassert.
This allows devices to exit reset state before start
bus communication.

Signed-off-by: Bruno Thomsen <bruno.thomsen@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 15:01:02 -07:00
Jakub Kicinski
966e505976 udp_tunnel: add the ability to hard-code IANA VXLAN
mlx5 has the IANA VXLAN port (4789) hard coded by the device,
instead of being added dynamically when tunnels are created.

To support this add a workaround flag to struct udp_tunnel_nic_info.
Skipping updates for the port is fairly trivial, dumping the hard
coded port via ethtool requires some code duplication. The port
is not a part of any real table, we dump it in a special table
which has no tunnel types supported and only one entry.

This is the last known workaround / hack needed to convert
all drivers to the new infra.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-08-03 10:13:54 -07:00
Jouni Malinen
4e56cde15f mac80211: Handle special status codes in SAE commit
SAE authentication has been extended with H2E (IEEE 802.11 REVmd) and PK
(WFA) options. Those extensions use special status code values in the
SAE commit messages (Authentication frame with transaction sequence
number 1) to identify which extension is in use. mac80211 was
interpreting those new values as the AP denying authentication and that
resulted in failure to complete SAE authentication in some cases.

Fix this by adding exceptions for the new status code values 126 and
127.

Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Link: https://lore.kernel.org/r/20200731183830.18735-1-jouni@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-08-03 10:54:54 +02:00
Ajay Singh
c83e2a6e2f wilc1000: Move wilc1000 SDIO ID's from driver source to common header file
Moved macros used for Vendor/Device ID from wilc1000 driver to common
header file and changed macro name for consistency with other macros.

Signed-off-by: Ajay Singh <ajay.kathat@microchip.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200717051134.19160-1-ajay.kathat@microchip.com
2020-08-02 18:12:23 +03:00
David S. Miller
bd0b33b248 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Resolved kernel/bpf/btf.c using instructions from merge commit
69138b34a7

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-02 01:02:12 -07:00
Andrii Nakryiko
73b11c2ab0 bpf: Add support for forced LINK_DETACH command
Add LINK_DETACH command to force-detach bpf_link without destroying it. It has
the same behavior as auto-detaching of bpf_link due to cgroup dying for
bpf_cgroup_link or net_device being destroyed for bpf_xdp_link. In such case,
bpf_link is still a valid kernel object, but is defuncts and doesn't hold BPF
program attached to corresponding BPF hook. This functionality allows users
with enough access rights to manually force-detach attached bpf_link without
killing respective owner process.

This patch implements LINK_DETACH for cgroup, xdp, and netns links, mostly
re-using existing link release handling code.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200731182830.286260-2-andriin@fb.com
2020-08-01 20:38:28 -07:00
Linus Torvalds
ac3a0c8472 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Encap offset calculation is incorrect in esp6, from Sabrina Dubroca.

 2) Better parameter validation in pfkey_dump(), from Mark Salyzyn.

 3) Fix several clang issues on powerpc in selftests, from Tanner Love.

 4) cmsghdr_from_user_compat_to_kern() uses the wrong length, from Al
    Viro.

 5) Out of bounds access in mlx5e driver, from Raed Salem.

 6) Fix transfer buffer memleak in lan78xx, from Johan Havold.

 7) RCU fixups in rhashtable, from Herbert Xu.

 8) Fix ipv6 nexthop refcnt leak, from Xiyu Yang.

 9) vxlan FDB dump must be done under RCU, from Ido Schimmel.

10) Fix use after free in mlxsw, from Ido Schimmel.

11) Fix map leak in HASH_OF_MAPS bpf code, from Andrii Nakryiko.

12) Fix bug in mac80211 Tx ack status reporting, from Vasanthakumar
    Thiagarajan.

13) Fix memory leaks in IPV6_ADDRFORM code, from Cong Wang.

14) Fix bpf program reference count leaks in mlx5 during
    mlx5e_alloc_rq(), from Xin Xiong.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits)
  vxlan: fix memleak of fdb
  rds: Prevent kernel-infoleak in rds_notify_queue_get()
  net/sched: The error lable position is corrected in ct_init_module
  net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq
  net/mlx5e: E-Switch, Specify flow_source for rule with no in_port
  net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring
  net/mlx5e: CT: Support restore ipv6 tunnel
  net: gemini: Fix missing clk_disable_unprepare() in error path of gemini_ethernet_port_probe()
  ionic: unlock queue mutex in error path
  atm: fix atm_dev refcnt leaks in atmtcp_remove_persistent
  net: ethernet: mtk_eth_soc: fix MTU warnings
  net: nixge: fix potential memory leak in nixge_probe()
  devlink: ignore -EOPNOTSUPP errors on dumpit
  rxrpc: Fix race between recvmsg and sendmsg on immediate call failure
  MAINTAINERS: Replace Thor Thayer as Altera Triple Speed Ethernet maintainer
  selftests/bpf: fix netdevsim trap_flow_action_cookie read
  ipv6: fix memory leaks on IPV6_ADDRFORM path
  net/bpfilter: Initialize pos in __bpfilter_process_sockopt
  igb: reinit_locked() should be called with rtnl_lock
  e1000e: continue to init PHY even when failed to disable ULP
  ...
2020-08-01 16:47:24 -07:00
David S. Miller
6f3de75cdf Merge tag 'mac80211-next-for-davem-2020-07-31' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:

====================
We have a number of changes
 * code cleanups and fixups as usual
 * AQL & internal TXQ improvements from Felix
 * some mesh 802.1X support bits
 * some injection improvements from Mathy of KRACK
   fame, so we'll see what this results in ;-)
 * some more initial S1G supports bits, this time
   (some of?) the userspace APIs
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 18:51:40 -07:00
Roopa Prabhu
829eb208e8 rtnetlink: add support for protodown reason
netdev protodown is a mechanism that allows protocols to
hold an interface down. It was initially introduced in
the kernel to hold links down by a multihoming protocol.
There was also an attempt to introduce protodown
reason at the time but was rejected. protodown and protodown reason
is supported by almost every switching and routing platform.
It was ok for a while to live without a protodown reason.
But, its become more critical now given more than
one protocol may need to keep a link down on a system
at the same time. eg: vrrp peer node, port security,
multihoming protocol. Its common for Network operators and
protocol developers to look for such a reason on a networking
box (Its also known as errDisable by most networking operators)

This patch adds support for link protodown reason
attribute. There are two ways to maintain protodown
reasons.
(a) enumerate every possible reason code in kernel
    - A protocol developer has to make a request and
      have that appear in a certain kernel version
(b) provide the bits in the kernel, and allow user-space
(sysadmin or NOS distributions) to manage the bit-to-reasonname
map.
	- This makes extending reason codes easier (kind of like
      the iproute2 table to vrf-name map /etc/iproute2/rt_tables.d/)

This patch takes approach (b).

a few things about the patch:
- It treats the protodown reason bits as counter to indicate
active protodown users
- Since protodown attribute is already an exposed UAPI,
the reason is not enforced on a protodown set. Its a no-op
if not used.
the patch follows the below algorithm:
  - presence of reason bits set indicates protodown
    is in use
  - user can set protodown and protodown reason in a
    single or multiple setlink operations
  - setlink operation to clear protodown, will return -EBUSY
    if there are active protodown reason bits
  - reason is not included in link dumps if not used

example with patched iproute2:
$cat /etc/iproute2/protodown_reasons.d/r.conf
0 mlag
1 evpn
2 vrrp
3 psecurity

$ip link set dev vxlan0 protodown on protodown_reason vrrp on
$ip link set dev vxlan0 protodown_reason mlag on
$ip link show
14: vxlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
DEFAULT group default qlen 1000
    link/ether f6:06:be:17:91:e7 brd ff:ff:ff:ff:ff:ff protodown on <mlag,vrrp>

$ip link set dev vxlan0 protodown_reason mlag off
$ip link set dev vxlan0 protodown off protodown_reason vrrp off

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 18:49:16 -07:00
David S. Miller
8d46215a1f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2020-07-31

1) Fix policy matching with mark and mask on userspace interfaces.
   From Xin Long.

2) Several fixes for the new ESP in TCP encapsulation.
   From Sabrina Dubroca.

3) Fix crash when the hold queue is used. The assumption that
   xdst->path and dst->child are not a NULL pointer only if dst->xfrm
   is not a NULL pointer is true with the exception of using the
   hold queue. Fix this by checking for hold queue usage before
   dereferencing xdst->path or dst->child.

4) Validate pfkey_dump parameter before sending them.
   From Mark Salyzyn.

5) Fix the location of the transport header with ESP in UDPv6
   encapsulation. From Sabrina Dubroca.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 17:10:53 -07:00
Yousuk Seung
48040793fa tcp: add earliest departure time to SCM_TIMESTAMPING_OPT_STATS
This change adds TCP_NLA_EDT to SCM_TIMESTAMPING_OPT_STATS that reports
the earliest departure time(EDT) of the timestamped skb. By tracking EDT
values of the skb from different timestamps, we can observe when and how
much the value changed. This allows to measure the precise delay
injected on the sender host e.g. by a bpf-base throttler.

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 17:00:44 -07:00
Florian Westphal
6fc8c827dd tcp: syncookies: create mptcp request socket for ACK cookies with MPTCP option
If SYN packet contains MP_CAPABLE option, keep it enabled.
Syncokie validation and cookie-based socket creation is changed to
instantiate an mptcp request sockets if the ACK contains an MPTCP
connection request.

Rather than extend both cookie_v4/6_check, add a common helper to create
the (mp)tcp request socket.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 16:55:32 -07:00
Florian Westphal
c83a47e50d mptcp: subflow: add mptcp_subflow_init_cookie_req helper
Will be used to initialize the mptcp request socket when a MP_CAPABLE
request was handled in syncookie mode, i.e. when a TCP ACK containing a
MP_CAPABLE option is a valid syncookie value.

Normally (non-cookie case), MPTCP will generate a unique 32 bit connection
ID and stores it in the MPTCP token storage to be able to retrieve the
mptcp socket for subflow joining.

In syncookie case, we do not want to store any state, so just generate the
unique ID and use it in the reply.

This means there is a small window where another connection could generate
the same token.

When Cookie ACK comes back, we check that the token has not been registered
in the mean time.  If it was, the connection needs to fall back to TCP.

Changes in v2:
 - use req->syncookie instead of passing 'want_cookie' arg to ->init_req()
   (Eric Dumazet)

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 16:55:32 -07:00
Florian Westphal
08b8d08098 mptcp: rename and export mptcp_subflow_request_sock_ops
syncookie code path needs to create an mptcp request sock.

Prepare for this and add mptcp prefix plus needed export of ops struct.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 16:55:32 -07:00
Florian Westphal
f8ace8d915 tcp: rename request_sock cookie_ts bit to syncookie
Nowadays output function has a 'synack_type' argument that tells us when
the syn/ack is emitted via syncookies.

The request already tells us when timestamps are supported, so check
both to detect special timestamp for tcp option encoding is needed.

We could remove cookie_ts altogether, but a followup patch would
otherwise need to adjust function signatures to pass 'want_cookie' to
mptcp core.

This way, the 'existing' bit can be used.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 16:55:32 -07:00
David S. Miller
4bb540dbe4 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2020-07-31

Here's the main bluetooth-next pull request for 5.9:

 - Fix firmware filenames for Marvell chipsets
 - Several suspend-related fixes
 - Addedd mgmt commands for runtime configuration
 - Multiple fixes for Qualcomm-based controllers
 - Add new monitoring feature for mgmt
 - Fix handling of legacy cipher (E4) together with security level 4
 - Add support for Realtek 8822CE controller
 - Fix issues with Chinese controllers using fake VID/PID values
 - Multiple other smaller fixes & improvements
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 15:11:52 -07:00
Linus Torvalds
7dc6fd0f3b Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "Some I2C core improvements to prevent NULL pointer usage and a
  MAINTAINERS update"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: slave: add sanity check when unregistering
  i2c: slave: improve sanity check when registering
  MAINTAINERS: Update GENI I2C maintainers list
  i2c: also convert placeholder function to return errno
2020-07-31 12:50:54 -07:00
Linus Torvalds
ae2911de2e Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "Two more merge window regressions, a corruption bug in hfi1 and a few
  other small fixes.

   - Missing user input validation regression in ucma

   - Disallowing a previously allowed user combination regression in
     mlx5

   - ODP prefetch memory leaking triggerable by userspace

   - Memory corruption in hf1 due to faulty ring buffer logic

   - Missed mutex initialization crash in mlx5

   - Two small defects with RDMA DIM"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/core: Free DIM memory in error unwind
  RDMA/core: Stop DIM before destroying CQ
  RDMA/mlx5: Initialize QP mutex for the debug kernels
  IB/rdmavt: Fix RQ counting issues causing use of an invalid RWQE
  RDMA/mlx5: Allow providing extra scatter CQE QP flag
  RDMA/mlx5: Fix prefetch memory leak if get_prefetchable_mr fails
  RDMA/cm: Add min length checks to user structure copies
2020-07-31 09:22:10 -07:00
Chung-Hsien Hsu
f96622749a nl80211: support 4-way handshake offloading for WPA/WPA2-PSK in AP mode
Let drivers advertise support for AP-mode WPA/WPA2-PSK 4-way handshake
offloading with a new NL80211_EXT_FEATURE_4WAY_HANDSHAKE_AP_PSK flag.

Extend use of NL80211_ATTR_PMK attribute indicating it might be passed
as part of NL80211_CMD_START_AP command, and contain the PSK (which is
the PMK, hence the name).

The driver is assumed to handle the 4-way handshake by itself in this
case, instead of relying on userspace.

Signed-off-by: Chung-Hsien Hsu <stanley.hsu@cypress.com>
Signed-off-by: Chi-Hsien Lin <chi-hsien.lin@cypress.com>
Link: https://lore.kernel.org/r/20200623134938.39997-2-chi-hsien.lin@cypress.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:27:02 +02:00
Johannes Berg
75e6b594bb cfg80211: invert HE BSS color 'disabled' to 'enabled'
This is in fact 'disabled' in the spec, but there it's in a
place where that actually makes sense. In our internal data
structures, it doesn't really make sense, and in fact the
previous commit just fixed a bug in that area.

Make this safer by inverting the polarity from 'disabled' to
'enabled'.

Link: https://lore.kernel.org/r/20200730130051.5d8399545bd9.Ie62fdcd1a6cd9c969315bc124084a494ca6c8df3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:27:02 +02:00
Felix Fietkau
c5d1686b31 mac80211: add a function for running rx without passing skbs to the stack
This can be used to run mac80211 rx processing on a batch of frames in NAPI
poll before passing them to the network stack in a large batch.
This can improve icache footprint, or it can be used to pass frames via
netif_receive_skb_list.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20200726110611.46886-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:27:01 +02:00
Mathy Vanhoef
cb17ed29a7 mac80211: parse radiotap header when selecting Tx queue
Already parse the radiotap header in ieee80211_monitor_select_queue.
In a subsequent commit this will allow us to add a radiotap flag that
influences the queue on which injected packets will be sent.

This also fixes the incomplete validation of the injected frame in
ieee80211_monitor_select_queue: currently an out of bounds memory
access may occur in in the called function ieee80211_select_queue_80211
if the 802.11 header is too small.

Note that in ieee80211_monitor_start_xmit the radiotap header is parsed
again, which is necessairy because ieee80211_monitor_select_queue is not
always called beforehand.

Signed-off-by: Mathy Vanhoef <Mathy.Vanhoef@kuleuven.be>
Link: https://lore.kernel.org/r/20200723100153.31631-6-Mathy.Vanhoef@kuleuven.be
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:27:01 +02:00
Mathy Vanhoef
e02281e7a5 mac80211: add radiotap flag to prevent sequence number overwrite
The radiotap specification contains a flag to indicate that the sequence
number of an injected frame should not be overwritten. Parse this flag
and define and set a corresponding Tx control flag.

Signed-off-by: Mathy Vanhoef <Mathy.Vanhoef@kuleuven.be>
Link: https://lore.kernel.org/r/20200723100153.31631-2-Mathy.Vanhoef@kuleuven.be
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:27:00 +02:00
Veerendranath Jakkam
fd17dba1c8 cfg80211: Add support to advertize OCV support
Add a new feature flag that drivers can use to advertize support for
Operating Channel Validation (OCV) when using driver's SME for RSNA
handshakes.

Signed-off-by: Veerendranath Jakkam <vjakkam@codeaurora.org>
Link: https://lore.kernel.org/r/20200720074225.8990-1-vjakkam@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:26:59 +02:00
Felix Fietkau
48a54f6bc4 net/fq_impl: use skb_get_hash instead of skb_get_hash_perturb
This avoids unnecessarily regenerating the skb flow hash

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20200726130947.88145-1-nbd@nbd.name
[small commit message fixup]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:24:24 +02:00
Markus Theil
1303a51c24 cfg80211/mac80211: add connected to auth server to station info
This patch adds the necessary bits to later query the auth server
flag for every peer from iw.

Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Link: https://lore.kernel.org/r/20200611140238.427461-2-markus.theil@tu-ilmenau.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:24:24 +02:00
Markus Theil
184eebe664 cfg80211/mac80211: add connected to auth server to meshconf
Besides information about num of peerings and gate connectivity,
the mesh formation byte also contains a flag for authentication
server connectivity, that currently cannot be set in the mesh conf.
This patch adds this capability, which is necessary to implement
802.1X authentication in mesh mode.

Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Link: https://lore.kernel.org/r/20200611140238.427461-1-markus.theil@tu-ilmenau.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:24:24 +02:00
Linus Lüssing
e3718a6114 cfg80211/mac80211: add mesh_param "mesh_nolearn" to skip path discovery
Currently, before being able to forward a packet between two 802.11s
nodes, both a PLINK handshake is performed upon receiving a beacon and
then later a PREQ/PREP exchange for path discovery is performed on
demand upon receiving a data frame to forward.

When running a mesh protocol on top of an 802.11s interface, like
batman-adv, we do not need the multi-hop mesh routing capabilities of
802.11s and usually set mesh_fwding=0. However, even with mesh_fwding=0
the PREQ/PREP path discovery is still performed on demand. Even though
in this scenario the next hop PREQ/PREP will determine is always the
direct 11s neighbor node.

The new mesh_nolearn parameter allows to skip the PREQ/PREP exchange in
this scenario, leading to a reduced delay, reduced packet buffering and
simplifies HWMP in general.

mesh_nolearn is still rather conservative in that if the packet destination
is not a direct 11s neighbor, it will fall back to PREQ/PREP path
discovery.

For normal, multi-hop 802.11s mesh routing it is usually not advisable
to enable mesh_nolearn as a transmission to a direct but distant neighbor
might be worse than reaching that same node via a more robust /
higher throughput etc. multi-hop path.

Cc: Sven Eckelmann <sven@narfation.org>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <ll@simonwunderlich.de>
Link: https://lore.kernel.org/r/20200617073034.26149-1-linus.luessing@c0d3.blue
[fix nl80211 policy to range 0/1 only]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:24:23 +02:00
Emmanuel Grumbach
2f1805ea20 cfg80211: allow the low level driver to flush the BSS table
The low level driver adds its own opaque information
in the BSS table in the cfg80211_bss structure.

The low level driver may need to signal that this information
is no longer relevant and needs to be recreated.
Add an API to allow the low level driver to do that.

iwlwifi needs this because it keeps there an information about
the firmware's internal clock. This is kept in mac80211's
struct ieee80211_bss::sync_device_ts.
This information is populated while we scan, we add the
internal firmware's clock to each beacon which allows us to
program the firmware correctly after association so that
it'll know when (in terms of its internal clock) the DTIM
and TBTT will happen.

When the firmware is reset this internal clock is reset as
well and ieee80211_bss::sync_device_ts is no longer accurate.

iwlwifi will call this new API any time the firmware is started.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Link: https://lore.kernel.org/r/20200625111524.3992-1-emmanuel.grumbach@intel.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:24:23 +02:00
Randy Dunlap
dec4ca9312 net/wireless: regulatory.h: drop duplicate word in comment
Drop doubled word "of" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: netdev@vger.kernel.org
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: linux-wireless@vger.kernel.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Link: https://lore.kernel.org/r/20200715164325.9109-5-rdunlap@infradead.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:24:23 +02:00
Randy Dunlap
66b239d28c net/wireless: mac80211.h: drop duplicate words in comments
Drop doubled words "are" and "by" in comments.
Change doubled "to to" to "to the".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: netdev@vger.kernel.org
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: linux-wireless@vger.kernel.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Link: https://lore.kernel.org/r/20200715164325.9109-4-rdunlap@infradead.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:24:23 +02:00
Randy Dunlap
085a6c109b net/wireless: cfg80211.h: drop duplicate words in comments
Drop doubled word "by" in a comment.
Change "operate in in" to "operate with in" as is used below.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: netdev@vger.kernel.org
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: linux-wireless@vger.kernel.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Link: https://lore.kernel.org/r/20200715164325.9109-3-rdunlap@infradead.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:24:23 +02:00
Randy Dunlap
0f55c0c500 net/wireless: wireless.h: drop duplicate word in comments
Drop doubled word "threshold" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: netdev@vger.kernel.org
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: linux-wireless@vger.kernel.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Link: https://lore.kernel.org/r/20200715164325.9109-2-rdunlap@infradead.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:24:23 +02:00
Randy Dunlap
987021726f net/wireless: nl80211.h: drop duplicate words in comments
Drop doubled words in several comments.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: netdev@vger.kernel.org
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: linux-wireless@vger.kernel.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Link: https://lore.kernel.org/r/20200715164325.9109-1-rdunlap@infradead.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:24:22 +02:00
Thomas Pedersen
df78a0c0b6 nl80211: S1G band and channel definitions
Gives drivers the definitions needed to advertise support
for S1G bands.

Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com>
Link: https://lore.kernel.org/r/20200602062247.23212-1-thomas@adapt-ip.com
Link: https://lore.kernel.org/r/20200731055636.795173-1-thomas@adapt-ip.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-07-31 09:24:13 +02:00