Commit Graph

126502 Commits

Author SHA1 Message Date
Geliang Tang
0f9f696a50 mptcp: add set_flags command in PM netlink
This patch added a new command MPTCP_PM_CMD_SET_FLAGS in PM netlink:

In mptcp_nl_cmd_set_flags, parse the input address, get the backup value
according to whether the address's FLAG_BACKUP flag is set from the
user-space. Then check whether this address had been added in the local
address list. If it had been, then call mptcp_nl_addr_backup to deal with
this address.

In mptcp_nl_addr_backup, traverse all the existing msk sockets to find
the relevant sockets, and call mptcp_pm_nl_mp_prio_send_ack to send out
a MP_PRIO ACK packet.

Finally in mptcp_nl_cmd_set_flags, set or clear the address's FLAG_BACKUP
flag.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-09 18:18:43 -08:00
Eric Dumazet
1d11fa6967 net-gro: remove GRO_DROP
GRO_DROP can only be returned from napi_gro_frags()
if the skb has not been allocated by a prior napi_get_frags()

Since drivers must use napi_get_frags() and test its result
before populating the skb with metadata, we can safely remove
GRO_DROP since it offers no practical use.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-09 14:24:26 -08:00
Alex Elder
ce2ceb9b1c soc: qcom: mdt_loader: define stubs for COMPILE_TEST
Define stub functions for the exposed MDT functions in case
QCOM_MDT_LOADER is not configured.  This allows users of these
functions to link correctly for COMPILE_TEST builds without
QCOM_SCM enabled.

Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-09 13:51:36 -08:00
Alex Elder
9941222116 remoteproc: qcom: expose types for COMPILE_TEST
Stub functions are defined for SSR notifier registration in case
QCOM_RPROC_COMMON is not configured.  As a result, code that uses
these functions can link successfully even if the common remoteproc
code is not built.

Code that registers an SSR notifier function likely needs the
types defined in "qcom_rproc.h", but those are only exposed if
QCOM_RPROC_COMMON is enabled.

Rearrange the conditional definition so the qcom_ssr_notify_data
structure and qcom_ssr_notify_type enumerated type are defined
whether or not QCOM_RPROC_COMMON is enabled.

Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-09 13:51:36 -08:00
Jakub Kicinski
833d22f2f9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in CAN on file rename.

Conflicts:
	drivers/net/can/m_can/tcan4x5x-core.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-08 13:28:00 -08:00
Linus Torvalds
6279d812ea Merge tag 'net-5.11-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull more networking fixes from Jakub Kicinski:
 "Slightly lighter pull request to get back into the Thursday cadence.

  Current release - always broken:

   - can: mcp251xfd: fix Tx/Rx ring buffer driver race conditions

   - dsa: hellcreek: fix led_classdev build errors

  Previous releases - regressions:

   - ipv6: fib: flush exceptions when purging route to avoid netdev
     reference leak

   - ip_tunnels: fix pmtu check in nopmtudisc mode

   - ip: always refragment ip defragmented packets to avoid MTU issues
     when forwarding through tunnels, correct "packet too big" message
     is prohibitively tricky to generate

   - s390/qeth: fix locking for discipline setup / removal and during
     recovery to prevent both deadlocks and races

   - mlx5: Use port_num 1 instead of 0 when delete a RoCE address

  Previous releases - always broken:

   - cdc_ncm: correct overhead calculation in delayed_ndp_size to
     prevent out of bound accesses with Huawei 909s-120 LTE module

   - fix stmmac dwmac-sun8i suspend/resume:
           - PHY being left powered off
           - MAC syscon configuration being reset
           - reference to the reset controller being improperly dropped

   - qrtr: fix null-ptr-deref in qrtr_ns_remove

   - can: tcan4x5x: fix bittiming const, use common bittiming from m_can
     driver

   - mlx5e: CT: Use per flow counter when CT flow accounting is enabled

   - mlx5e: Fix SWP offsets when vlan inserted by driver

  Misc:

   - bpf: Fix a task_iter bug caused by a bpf -> net merge conflict
     resolution

  And the usual many fixes to various error paths"

* tag 'net-5.11-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
  net: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbE
  s390/qeth: fix L2 header access in qeth_l3_osa_features_check()
  s390/qeth: fix locking for discipline setup / removal
  s390/qeth: fix deadlock during recovery
  selftests: fib_nexthops: Fix wrong mausezahn invocation
  nexthop: Bounce NHA_GATEWAY in FDB nexthop groups
  nexthop: Unlink nexthop group entry in error path
  nexthop: Fix off-by-one error in error path
  octeontx2-af: fix memory leak of lmac and lmac->name
  chtls: Fix chtls resources release sequence
  chtls: Added a check to avoid NULL pointer dereference
  chtls: Replace skb_dequeue with skb_peek
  chtls: Avoid unnecessary freeing of oreq pointer
  chtls: Fix panic when route to peer not configured
  chtls: Remove invalid set_tcb call
  chtls: Fix hardware tid leak
  net: ip: always refragment ip defragmented packets
  net: fix pmtu check in nopmtudisc mode
  selftests: netfilter: add selftest for ipip pmtu discovery with enabled connection tracking
  docs: octeontx2: tune rst markup
  ...
2021-01-08 12:12:30 -08:00
Petr Mladek
a91bd6223e Revert "init/console: Use ttynull as a fallback when there is no console"
This reverts commit 757055ae8d.

The commit caused that ttynull was used as the default console
on several systems[1][2][3]. As a result, the console was
blank even when a better alternative existed.

It happened when there was no console configured
on the command line and ttynull_init() was the first initcall
calling register_console().

Or it happened when /dev/ did not exist when console_on_rootfs()
was called. It was not able to open /dev/console even though
a console driver was registered. It tried to add ttynull console
but it obviously did not help. But ttynull became the preferred
console and was used by /dev/console when it was available later.

The commit tried to fix a historical problem that have been there
for ages. The primary motivation was the commit 3cffa06aee
("printk/console: Allow to disable console output by using console=""
 or console=null"). It provided a clean solution for a workaround
 that was widely used and worked only by chance.

This revert causes that the console="" or console=null command line
options will again work only by chance. These options will cause that
a particular console will be preferred and the default (tty) ones
will not get enabled. There will be no console registered at
all. As a result there won't be stdin, stdout, and stderr for
the init process. But it worked exactly this way even before.

The proper solution has to fulfill many conditions:

  + Register ttynull only when explicitly required or as
    the ultimate fallback.

  + ttynull should get associated with /dev/console but it must
    not become preferred console when used as a fallback.
    Especially, it must still be possible to replace it
    by a better console later.

Such a change requires clean up of the register_console() code.
Otherwise, it would be even harder to follow. Especially, the use
of has_preferred_console and CON_CONSDEV flag is tricky. The clean
up is risky. The ordering of consoles is not well defined. And
any changes tend to break existing user settings.

Do the revert at the least risky solution for now.

[1] https://lore.kernel.org/linux-kselftest/20201221144302.GR4077@smile.fi.intel.com/
[2] https://lore.kernel.org/lkml/d2a3b3c0-e548-7dd1-730f-59bc5c04e191@synopsys.com/
[3] https://patchwork.ozlabs.org/project/linux-um/patch/20210105120128.10854-1-thomas@m3y3r.de/

Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reported-by: Vineet Gupta <vgupta@synopsys.com>
Reported-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-01-08 11:02:18 -08:00
Jonathan Lemon
8e04491724 skbuff: Rename skb_zcopy_{get|put} to net_zcopy_{get|put}
Unlike the rest of the skb_zcopy_ functions, these routines
operate on a 'struct ubuf', not a skb.  Remove the 'skb_'
prefix from the naming to make things clearer.

Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 16:08:37 -08:00
Jonathan Lemon
9ee5e5ade0 tap/tun: add skb_zcopy_init() helper for initialization.
Replace direct assignments with skb_zcopy_init() for zerocopy
cases where a new skb is initialized, without changing the
reference counts.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 16:08:37 -08:00
Jonathan Lemon
04c2d33eab skbuff: add flags to ubuf_info for ubuf setup
Currently, when an ubuf is attached to a new skb, the shared
flags word is initialized to a fixed value.  Instead of doing
this, set the default flags in the ubuf, and have new skbs
inherit from this default.

This is needed when setting up different zerocopy types.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 16:08:37 -08:00
Jonathan Lemon
06b4feb37e net: group skb_shinfo zerocopy related bits together.
In preparation for expanded zerocopy (TX and RX), move
the zerocopy related bits out of tx_flags into their own
flag word.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 16:08:37 -08:00
Jonathan Lemon
8c793822c5 skbuff: rename sock_zerocopy_* to msg_zerocopy_*
At Willem's suggestion, rename the sock_zerocopy_* functions
so that they match the MSG_ZEROCOPY flag, which makes it clear
they are specific to this zerocopy implementation.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 16:08:35 -08:00
Jonathan Lemon
236a6b1cd5 skbuff: Call sock_zerocopy_put_abort from skb_zcopy_put_abort
The sock_zerocopy_put_abort function contains logic which is
specific to the current zerocopy implementation.  Add a wrapper
which checks the callback and dispatches apppropriately.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 16:06:37 -08:00
Jonathan Lemon
36177832f4 skbuff: Add skb parameter to the ubuf zerocopy callback
Add an optional skb parameter to the zerocopy callback parameter,
which is passed down from skb_zcopy_clear().  This gives access
to the original skb, which is needed for upcoming RX zero-copy
error handling.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 16:06:37 -08:00
Jonathan Lemon
e76d46cfff skbuff: replace sock_zerocopy_get with skb_zcopy_get
Rename the get routines for consistency.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 16:06:37 -08:00
Jonathan Lemon
59776362b1 skbuff: replace sock_zerocopy_put() with skb_zcopy_put()
Replace sock_zerocopy_put with the generic skb_zcopy_put()
function.  Pass 'true' as the success argument, as this
is identical to no change.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 16:06:37 -08:00
Jonathan Lemon
75518851a2 skbuff: Push status and refcounts into sock_zerocopy_callback
Before this change, the caller of sock_zerocopy_callback would
need to save the zerocopy status, decrement and check the refcount,
and then call the callback function - the callback was only invoked
when the refcount reached zero.

Now, the caller just passes the status into the callback function,
which saves the status and handles its own refcounts.

This makes the behavior of the sock_zerocopy_callback identical
to the tpacket and vhost callbacks.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 16:06:37 -08:00
Jonathan Lemon
424f481f06 skbuff: remove unused skb_zcopy_abort function
skb_zcopy_abort() has no in-tree consumers, remove it.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 16:06:37 -08:00
Vladimir Oltean
1dbb130281 net: dsa: remove the DSA specific notifiers
This effectively reverts commit 60724d4bae ("net: dsa: Add support for
DSA specific notifiers"). The reason is that since commit 2f1e8ea726
("net: dsa: link interfaces with the DSA master to get rid of lockdep
warnings"), it appears that there is a generic way to achieve the same
purpose. The only user thus far, the Broadcom SYSTEMPORT driver, was
converted to use the generic notifiers.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 15:42:07 -08:00
Vladimir Oltean
a5e3c9ba92 net: dsa: export dsa_slave_dev_check
Using the NETDEV_CHANGEUPPER notifications, drivers can be aware when
they are enslaved to e.g. a bridge by calling netif_is_bridge_master().

Export this helper from DSA to get the equivalent functionality of
determining whether the upper interface of a CHANGEUPPER notifier is a
DSA switch interface or not.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 15:42:07 -08:00
Vladimir Oltean
f46b9b8ee8 net: dsa: move the Broadcom tag information in a separate header file
It is a bit strange to see something as specific as Broadcom SYSTEMPORT
bits in the main DSA include file. Move these away into a separate
header, and have the tagger and the SYSTEMPORT driver include them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 15:42:07 -08:00
Vladimir Oltean
d5f19486ce net: dsa: listen for SWITCHDEV_{FDB,DEL}_ADD_TO_DEVICE on foreign bridge neighbors
Some DSA switches (and not only) cannot learn source MAC addresses from
packets injected from the CPU. They only perform hardware address
learning from inbound traffic.

This can be problematic when we have a bridge spanning some DSA switch
ports and some non-DSA ports (which we'll call "foreign interfaces" from
DSA's perspective).

There are 2 classes of problems created by the lack of learning on
CPU-injected traffic:
- excessive flooding, due to the fact that DSA treats those addresses as
  unknown
- the risk of stale routes, which can lead to temporary packet loss

To illustrate the second class, consider the following situation, which
is common in production equipment (wireless access points, where there
is a WLAN interface and an Ethernet switch, and these form a single
bridging domain).

 AP 1:
 +------------------------------------------------------------------------+
 |                                          br0                           |
 +------------------------------------------------------------------------+
 +------------+ +------------+ +------------+ +------------+ +------------+
 |    swp0    | |    swp1    | |    swp2    | |    swp3    | |    wlan0   |
 +------------+ +------------+ +------------+ +------------+ +------------+
       |                                                       ^        ^
       |                                                       |        |
       |                                                       |        |
       |                                                    Client A  Client B
       |
       |
       |
 +------------+ +------------+ +------------+ +------------+ +------------+
 |    swp0    | |    swp1    | |    swp2    | |    swp3    | |    wlan0   |
 +------------+ +------------+ +------------+ +------------+ +------------+
 +------------------------------------------------------------------------+
 |                                          br0                           |
 +------------------------------------------------------------------------+
 AP 2

- br0 of AP 1 will know that Clients A and B are reachable via wlan0
- the hardware fdb of a DSA switch driver today is not kept in sync with
  the software entries on other bridge ports, so it will not know that
  clients A and B are reachable via the CPU port UNLESS the hardware
  switch itself performs SA learning from traffic injected from the CPU.
  Nonetheless, a substantial number of switches don't.
- the hardware fdb of the DSA switch on AP 2 may autonomously learn that
  Client A and B are reachable through swp0. Therefore, the software br0
  of AP 2 also may or may not learn this. In the example we're
  illustrating, some Ethernet traffic has been going on, and br0 from AP
  2 has indeed learnt that it can reach Client B through swp0.

One of the wireless clients, say Client B, disconnects from AP 1 and
roams to AP 2. The topology now looks like this:

 AP 1:
 +------------------------------------------------------------------------+
 |                                          br0                           |
 +------------------------------------------------------------------------+
 +------------+ +------------+ +------------+ +------------+ +------------+
 |    swp0    | |    swp1    | |    swp2    | |    swp3    | |    wlan0   |
 +------------+ +------------+ +------------+ +------------+ +------------+
       |                                                            ^
       |                                                            |
       |                                                         Client A
       |
       |
       |                                                         Client B
       |                                                            |
       |                                                            v
 +------------+ +------------+ +------------+ +------------+ +------------+
 |    swp0    | |    swp1    | |    swp2    | |    swp3    | |    wlan0   |
 +------------+ +------------+ +------------+ +------------+ +------------+
 +------------------------------------------------------------------------+
 |                                          br0                           |
 +------------------------------------------------------------------------+
 AP 2

- br0 of AP 1 still knows that Client A is reachable via wlan0 (no change)
- br0 of AP 1 will (possibly) know that Client B has left wlan0. There
  are cases where it might never find out though. Either way, DSA today
  does not process that notification in any way.
- the hardware FDB of the DSA switch on AP 1 may learn autonomously that
  Client B can be reached via swp0, if it receives any packet with
  Client 1's source MAC address over Ethernet.
- the hardware FDB of the DSA switch on AP 2 still thinks that Client B
  can be reached via swp0. It does not know that it has roamed to wlan0,
  because it doesn't perform SA learning from the CPU port.

Now Client A contacts Client B.
AP 1 routes the packet fine towards swp0 and delivers it on the Ethernet
segment.
AP 2 sees a frame on swp0 and its fdb says that the destination is swp0.
Hairpinning is disabled => drop.

This problem comes from the fact that these switches have a 'blind spot'
for addresses coming from software bridging. The generic solution is not
to assume that hardware learning can be enabled somehow, but to listen
to more bridge learning events. It turns out that the bridge driver does
learn in software from all inbound frames, in __br_handle_local_finish.
A proper SWITCHDEV_FDB_ADD_TO_DEVICE notification is emitted for the
addresses serviced by the bridge on 'foreign' interfaces. The software
bridge also does the right thing on migration, by notifying that the old
entry is deleted, so that does not need to be special-cased in DSA. When
it is deleted, we just need to delete our static FDB entry towards the
CPU too, and wait.

The problem is that DSA currently only cares about SWITCHDEV_FDB_ADD_TO_DEVICE
events received on its own interfaces, such as static FDB entries.

Luckily we can change that, and DSA can listen to all switchdev FDB
add/del events in the system and figure out if those events were emitted
by a bridge that spans at least one of DSA's own ports. In case that is
true, DSA will also offload that address towards its own CPU port, in
the eventuality that there might be bridge clients attached to the DSA
switch who want to talk to the station connected to the foreign
interface.

In terms of implementation, we need to keep the fdb_info->added_by_user
check for the case where the switchdev event was targeted directly at a
DSA switch port. But we don't need to look at that flag for snooped
events. So the check is currently too late, we need to move it earlier.
This also simplifies the code a bit, since we avoid uselessly allocating
and freeing switchdev_work.

We could probably do some improvements in the future. For example,
multi-bridge support is rudimentary at the moment. If there are two
bridges spanning a DSA switch's ports, and both of them need to service
the same MAC address, then what will happen is that the migration of one
of those stations will trigger the deletion of the FDB entry from the
CPU port while it is still used by other bridge. That could be improved
with reference counting but is left for another time.

This behavior needs to be enabled at driver level by setting
ds->assisted_learning_on_cpu_port = true. This is because we don't want
to inflict a potential performance penalty (accesses through
MDIO/I2C/SPI are expensive) to hardware that really doesn't need it
because address learning on the CPU port works there.

Reported-by: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 15:34:46 -08:00
Florian Fainelli
8b86850bf9 net: phy: bcm7xxx: Add an entry for BCM72116
BCM72116 features a 28nm integrated EPHY, add an entry to match this PHY
OUI.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20210106170944.1253046-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 14:46:09 -08:00
Jakub Kicinski
b9ef3fecd1 udp_tunnel: reshuffle NETIF_F_RX_UDP_TUNNEL_PORT checks
Move the NETIF_F_RX_UDP_TUNNEL_PORT feature check into
udp_tunnel_nic_*_port() helpers, since they're always
done right before the call.

Add similar checks before calling the notifier.
udp_tunnel_nic invokes the notifier without checking
features which could result in some wasted cycles.

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 12:53:29 -08:00
Jakub Kicinski
30bfce1094 net: remove ndo_udp_tunnel_* callbacks
All UDP tunnel port management is now routed via udp_tunnel_nic
infra directly. Remove the old callbacks.

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 12:53:29 -08:00
Aya Levin
9c9be85f6b net/mlx5e: Add missing capability check for uplink follow
Expose firmware indication that it supports setting eswitch uplink state
to follow (follow the physical link). Condition setting the eswitch
uplink admin-state with this capability bit. Older FW may not support
the uplink state setting.

Fixes: 7d0314b11c ("net/mlx5e: Modify uplink state on interface up/down")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-07 12:22:48 -08:00
Jakub Kicinski
cf07206971 net: suggest L2 discards be counted towards rx_dropped
From the existing definitions it's unclear which stat to
use to report filtering based on L2 dst addr in old
broadcast-medium Ethernet.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-01-05 16:23:57 -08:00
Linus Torvalds
aa35e45cd4 Merge tag 'net-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from netfilter, wireless and bpf
  trees.

  Current release - regressions:

   - mt76: fix NULL pointer dereference in mt76u_status_worker and
     mt76s_process_tx_queue

   - net: ipa: fix interconnect enable bug

  Current release - always broken:

   - netfilter: fixes possible oops in mtype_resize in ipset

   - ath11k: fix number of coding issues found by static analysis tools
     and spurious error messages

  Previous releases - regressions:

   - e1000e: re-enable s0ix power saving flows for systems with the
     Intel i219-LM Ethernet controllers to fix power use regression

   - virtio_net: fix recursive call to cpus_read_lock() to avoid a
     deadlock

   - ipv4: ignore ECN bits for fib lookups in fib_compute_spec_dst()

   - sysfs: take the rtnl lock around XPS configuration

   - xsk: fix memory leak for failed bind and rollback reservation at
     NETDEV_TX_BUSY

   - r8169: work around power-saving bug on some chip versions

  Previous releases - always broken:

   - dcb: validate netlink message in DCB handler

   - tun: fix return value when the number of iovs exceeds MAX_SKB_FRAGS
     to prevent unnecessary retries

   - vhost_net: fix ubuf refcount when sendmsg fails

   - bpf: save correct stopping point in file seq iteration

   - ncsi: use real net-device for response handler

   - neighbor: fix div by zero caused by a data race (TOCTOU)

   - bareudp: fix use of incorrect min_headroom size and a false
     positive lockdep splat from the TX lock

   - mvpp2:
      - clear force link UP during port init procedure in case
        bootloader had set it
      - add TCAM entry to drop flow control pause frames
      - fix PPPoE with ipv6 packet parsing
      - fix GoP Networking Complex Control config of port 3
      - fix pkt coalescing IRQ-threshold configuration

   - xsk: fix race in SKB mode transmit with shared cq

   - ionic: account for vlan tag len in rx buffer len

   - stmmac: ignore the second clock input, current clock framework does
     not handle exclusive clock use well, other drivers may reconfigure
     the second clock

  Misc:

   - ppp: change PPPIOCUNBRIDGECHAN ioctl request number to follow
     existing scheme"

* tag 'net-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (99 commits)
  net: dsa: lantiq_gswip: Fix GSWIP_MII_CFG(p) register access
  net: dsa: lantiq_gswip: Enable GSWIP_MII_CFG_EN also for internal PHYs
  net: lapb: Decrease the refcount of "struct lapb_cb" in lapb_device_event
  r8169: work around power-saving bug on some chip versions
  net: usb: qmi_wwan: add Quectel EM160R-GL
  selftests: mlxsw: Set headroom size of correct port
  net: macb: Correct usage of MACB_CAPS_CLK_HW_CHG flag
  ibmvnic: fix: NULL pointer dereference.
  docs: networking: packet_mmap: fix old config reference
  docs: networking: packet_mmap: fix formatting for C macros
  vhost_net: fix ubuf refcount incorrectly when sendmsg fails
  bareudp: Fix use of incorrect min_headroom size
  bareudp: set NETIF_F_LLTX flag
  net: hdlc_ppp: Fix issues when mod_timer is called while timer is running
  atlantic: remove architecture depends
  erspan: fix version 1 check in gre_parse_header()
  net: hns: fix return value check in __lb_other_process()
  net: sched: prevent invalid Scell_log shift count
  net: neighbor: fix a crash caused by mod zero
  ipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst()
  ...
2021-01-05 12:38:56 -08:00
Linus Torvalds
6207214a70 Merge tag 'afs-fixes-04012021' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS fixes from David Howells:
 "Two fixes.

  The first is the fix for the strnlen() array limit check and the
  second fixes the calculation of the number of dirent records used to
  represent any particular filename length"

* tag 'afs-fixes-04012021' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Fix directory entry size calculation
  afs: Work around strnlen() oops with CONFIG_FORTIFIED_SOURCE=y
2021-01-05 11:55:46 -08:00
Jakub Kicinski
a8f33c038f Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Missing sanitization of rateest userspace string, bug has been
   triggered by syzbot, patch from Florian Westphal.

2) Report EOPNOTSUPP on missing set features in nft_dynset, otherwise
   error reporting to userspace via EINVAL is misleading since this is
   reserved for malformed netlink requests.

3) New binaries with old kernels might silently accept several set
   element expressions. New binaries set on the NFT_SET_EXPR and
   NFT_DYNSET_F_EXPR flags to request for several expressions per
   element, hence old kernels which do not support for this bail out
   with EOPNOTSUPP.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
  netfilter: nftables: add set expression flags
  netfilter: nft_dynset: report EOPNOTSUPP on missing set feature
  netfilter: xt_RATEEST: reject non-null terminated string from userspace
====================

Link: https://lore.kernel.org/r/20210103192920.18639-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-04 14:02:02 -08:00
Linus Torvalds
36bbbd0e23 Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU fix from Paul McKenney:
 "This is a fix for a regression in the v5.10 merge window, but it was
  reported quite late in the v5.10 process, plus generating and testing
  the fix took some time.

  The regression is due to commit 36dadef23f ("kprobes: Init kprobes
  in early_initcall") which on powerpc can use RCU Tasks before
  initialization, resulting in boot failures.

  The fix is straightforward, simply moving initialization of RCU Tasks
  before the early_initcall()s. The fix has been exposed to -next and
  kbuild test robot testing, and has been tested by the PowerPC guys"

* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  rcu-tasks: Move RCU-tasks initialization to before early_initcall()
2021-01-04 10:55:19 -08:00
Linus Torvalds
f4f6a2e329 Merge tag 'compiler-attributes-for-linus-v5.11' of git://github.com/ojeda/linux
Pull ENABLE_MUST_CHECK removal from Miguel Ojeda:
 "Remove CONFIG_ENABLE_MUST_CHECK (Masahiro Yamada)"

Note that this removes the config option by making the must-check
unconditional, not by removing must check itself.

* tag 'compiler-attributes-for-linus-v5.11' of git://github.com/ojeda/linux:
  Compiler Attributes: remove CONFIG_ENABLE_MUST_CHECK
2021-01-04 10:47:38 -08:00
David Howells
366911cd76 afs: Fix directory entry size calculation
The number of dirent records used by an AFS directory entry should be
calculated using the assumption that there is a 16-byte name field in the
first block, rather than a 20-byte name field (which is actually the case).
This miscalculation is historic and effectively standard, so we have to use
it.

The calculation we need to use is:

	1 + (((strlen(name) + 1) + 15) >> 5)

where we are adding one to the strlen() result to account for the NUL
termination.

Fix this by the following means:

 (1) Create an inline function to do the calculation for a given name
     length.

 (2) Use the function to calculate the number of records used for a dirent
     in afs_dir_iterate_block().

     Use this to move the over-end check out of the loop since it only
     needs to be done once.

     Further use this to only go through the loop for the 2nd+ records
     composing an entry.  The only test there now is for if the record is
     allocated - and we already checked the first block at the top of the
     outer loop.

 (3) Add a max name length check in afs_dir_iterate_block().

 (4) Make afs_edit_dir_add() and afs_edit_dir_remove() use the function
     from (1) to calculate the number of blocks rather than doing it
     incorrectly themselves.

Fixes: 63a4681ff3 ("afs: Locally edit directory data for mkdir/create/unlink/...")
Fixes: ^1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
2021-01-04 12:25:19 +00:00
Linus Torvalds
eda809aef5 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "This is a load of driver fixes (12 ufs, 1 mpt3sas, 1 cxgbi).

  The big core two fixes are for power management ("block: Do not accept
  any requests while suspended" and "block: Fix a race in the runtime
  power management code") which finally sorts out the resume problems
  we've occasionally been having.

  To make the resume fix, there are seven necessary precursors which
  effectively renames REQ_PREEMPT to REQ_PM, so every "special" request
  in block is automatically a power management exempt one.

  All of the non-PM preempt cases are removed except for the one in the
  SCSI Parallel Interface (spi) domain validation which is a genuine
  case where we have to run requests at high priority to validate the
  bus so this becomes an autopm get/put protected request"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (22 commits)
  scsi: cxgb4i: Fix TLS dependency
  scsi: ufs: Un-inline ufshcd_vops_device_reset function
  scsi: ufs: Re-enable WriteBooster after device reset
  scsi: ufs-mediatek: Use correct path to fix compile error
  scsi: mpt3sas: Signedness bug in _base_get_diag_triggers()
  scsi: block: Do not accept any requests while suspended
  scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT
  scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE
  scsi: scsi_transport_spi: Set RQF_PM for domain validation commands
  scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT
  scsi: ide: Do not set the RQF_PREEMPT flag for sense requests
  scsi: block: Introduce BLK_MQ_REQ_PM
  scsi: block: Fix a race in the runtime power management code
  scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers
  scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers
  scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()
  scsi: ufs-pci: Fix restore from S4 for Intel controllers
  scsi: ufs-mediatek: Keep VCC always-on for specific devices
  scsi: ufs: Allow regulators being always-on
  scsi: ufs: Clear UAC for RPMB after ufshcd resets
  ...
2021-01-01 12:58:07 -08:00
Linus Torvalds
f6e1ea1964 Merge tag 'ceph-for-5.11-rc2' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
 "A fix for an edge case in MClientRequest encoding and a couple of
  trivial fixups for the new msgr2 support"

* tag 'ceph-for-5.11-rc2' of git://github.com/ceph/ceph-client:
  libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE
  libceph: align session_key and con_secret to 16 bytes
  libceph: fix auth_signature buffer allocation in secure mode
  ceph: reencode gid_list when reconnecting
2020-12-30 12:02:12 -08:00
Josh Poimboeuf
aa8c7db494 kdev_t: always inline major/minor helper functions
Silly GCC doesn't always inline these trivial functions.

Fixes the following warning:

  arch/x86/kernel/sys_ia32.o: warning: objtool: cp_stat64()+0xd8: call to new_encode_dev() with UACCESS enabled

Link: https://lkml.kernel.org/r/984353b44a4484d86ba9f73884b7306232e25e30.1608737428.git.jpoimboe@redhat.com
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>	[build-tested]
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29 15:36:49 -08:00
Huang Shijie
8b0fac44bd sizes.h: add SZ_8G/SZ_16G/SZ_32G macros
Add these macros, since we can use them in drivers.

Link: https://lkml.kernel.org/r/20201229072819.11183-1-sjhuang@iluvatar.ai
Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29 15:36:49 -08:00
Randy Dunlap
87dbc209ea local64.h: make <asm/local64.h> mandatory
Make <asm-generic/local64.h> mandatory in include/asm-generic/Kbuild and
remove all arch/*/include/asm/local64.h arch-specific files since they
only #include <asm-generic/local64.h>.

This fixes build errors on arch/c6x/ and arch/nios2/ for
block/blk-iocost.c.

Build-tested on 21 of 25 arch-es.  (tools problems on the others)

Yes, we could even rename <asm-generic/local64.h> to
<linux/local64.h> and change all #includes to use
<linux/local64.h> instead.

Link: https://lkml.kernel.org/r/20201227024446.17018-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29 15:36:49 -08:00
Baoquan He
dc2da7b45f mm: memmap defer init doesn't work as expected
VMware observed a performance regression during memmap init on their
platform, and bisected to commit 73a6e474cb ("mm: memmap_init:
iterate over memblock regions rather that check each PFN") causing it.

Before the commit:

  [0.033176] Normal zone: 1445888 pages used for memmap
  [0.033176] Normal zone: 89391104 pages, LIFO batch:63
  [0.035851] ACPI: PM-Timer IO Port: 0x448

With commit

  [0.026874] Normal zone: 1445888 pages used for memmap
  [0.026875] Normal zone: 89391104 pages, LIFO batch:63
  [2.028450] ACPI: PM-Timer IO Port: 0x448

The root cause is the current memmap defer init doesn't work as expected.

Before, memmap_init_zone() was used to do memmap init of one whole zone,
to initialize all low zones of one numa node, but defer memmap init of
the last zone in that numa node.  However, since commit 73a6e474cb,
function memmap_init() is adapted to iterater over memblock regions
inside one zone, then call memmap_init_zone() to do memmap init for each
region.

E.g, on VMware's system, the memory layout is as below, there are two
memory regions in node 2.  The current code will mistakenly initialize the
whole 1st region [mem 0xab00000000-0xfcffffffff], then do memmap defer to
iniatialize only one memmory section on the 2nd region [mem
0x10000000000-0x1033fffffff].  In fact, we only expect to see that there's
only one memory section's memmap initialized.  That's why more time is
costed at the time.

[    0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
[    0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0xbfffffff]
[    0.008843] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x55ffffffff]
[    0.008844] ACPI: SRAT: Node 1 PXM 1 [mem 0x5600000000-0xaaffffffff]
[    0.008844] ACPI: SRAT: Node 2 PXM 2 [mem 0xab00000000-0xfcffffffff]
[    0.008845] ACPI: SRAT: Node 2 PXM 2 [mem 0x10000000000-0x1033fffffff]

Now, let's add a parameter 'zone_end_pfn' to memmap_init_zone() to pass
down the real zone end pfn so that defer_init() can use it to judge
whether defer need be taken in zone wide.

Link: https://lkml.kernel.org/r/20201223080811.16211-1-bhe@redhat.com
Link: https://lkml.kernel.org/r/20201223080811.16211-2-bhe@redhat.com
Fixes: commit 73a6e474cb ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
Signed-off-by: Baoquan He <bhe@redhat.com>
Reported-by: Rahul Gopakumar <gopakumarr@vmware.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29 15:36:49 -08:00
Souptick Joarder
6d87d0ece5 mm: add prototype for __add_to_page_cache_locked()
Otherwise it causes a gcc warning:

  mm/filemap.c:830:14: warning: no previous prototype for `__add_to_page_cache_locked' [-Wmissing-prototypes]

A previous attempt to make this function static led to compilation
errors when CONFIG_DEBUG_INFO_BTF is enabled because
__add_to_page_cache_locked() is referred to by BPF code.

Adding a prototype will silence the warning.

Link: https://lkml.kernel.org/r/1608693702-4665-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29 15:36:49 -08:00
Masahiro Yamada
3a176b9460 Revert "kbuild: avoid static_assert for genksyms"
This reverts commit 14dc3983b5.

Macro Elver had sent a fix proper fix earlier, and also pointed out
corner cases:
 "I guess what you propose is simpler, but might still have corner cases
  where we still get warnings. In particular, if some file (for whatever
  reason) does not include build_bug.h and uses a raw _Static_assert(),
  then we still get warnings. E.g. I see 1 user of raw _Static_assert()
  (drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h )."

I believe the raw use of _Static_assert() should be allowed, so this
should be fixed in genksyms.

Even after commit 14dc3983b5 ("kbuild: avoid static_assert for
genksyms"), I confirmed the following test code emits the warning.

  ---------------->8----------------
  #include <linux/export.h>

  _Static_assert((1 ?: 0), "");

  void foo(void) { }
  EXPORT_SYMBOL(foo);
  ---------------->8----------------

  WARNING: modpost: EXPORT symbol "foo" [vmlinux] version generation failed, symbol will not be versioned.

Now that commit 869b91992bce ("genksyms: Ignore module scoped
_Static_assert()") fixed this issue properly, the workaround should
be reverted.

Link: https://lkml.org/lkml/2020/12/10/845
Link: https://lkml.kernel.org/r/20201219183911.181442-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29 15:36:49 -08:00
David S. Miller
4bfc471484 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2020-12-28

The following pull-request contains BPF updates for your *net* tree.

There is a small merge conflict between bpf tree commit 69ca310f34
("bpf: Save correct stopping point in file seq iteration") and net tree
commit 66ed594409 ("bpf/task_iter: In task_file_seq_get_next use
task_lookup_next_fd_rcu"). The get_files_struct() does not exist anymore
in net, so take the hunk in HEAD and add the `info->tid = curr_tid` to
the error path:

  [...]
                curr_task = task_seq_get_next(ns, &curr_tid, true);
                if (!curr_task) {
                        info->task = NULL;
                        info->tid = curr_tid;
                        return NULL;
                }

                /* set info->task and info->tid */
  [...]

We've added 10 non-merge commits during the last 9 day(s) which contain
a total of 11 files changed, 75 insertions(+), 20 deletions(-).

The main changes are:

1) Various AF_XDP fixes such as fill/completion ring leak on failed bind and
   fixing a race in skb mode's backpressure mechanism, from Magnus Karlsson.

2) Fix latency spikes on lockdep enabled kernels by adding a rescheduling
   point to BPF hashtab initialization, from Eric Dumazet.

3) Fix a splat in task iterator by saving the correct stopping point in the
   seq file iteration, from Jonathan Lemon.

4) Fix BPF maps selftest by adding retries in case hashtab returns EBUSY
   errors on update/deletes, from Andrii Nakryiko.

5) Fix BPF selftest error reporting to something more user friendly if the
   vmlinux BTF cannot be found, from Kamal Mostafa.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-28 15:26:11 -08:00
Randy Dunlap
bd1248f1dd net: sched: prevent invalid Scell_log shift count
Check Scell_log shift size in red_check_params() and modify all callers
of red_check_params() to pass Scell_log.

This prevents a shift out-of-bounds as detected by UBSAN:
  UBSAN: shift-out-of-bounds in ./include/net/red.h:252:22
  shift exponent 72 is too large for 32-bit type 'int'

Fixes: 8afa10cbe2 ("net_sched: red: Avoid illegal values")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: syzbot+97c5bd9cc81eca63d36e@syzkaller.appspotmail.com
Cc: Nogah Frankel <nogahf@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-28 14:52:54 -08:00
Ilya Dryomov
664f1e259a libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE
Avoid -Wunused-const-variable warnings for "make W=1".

Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-28 20:34:33 +01:00
Pablo Neira Ayuso
b4e70d8dd9 netfilter: nftables: add set expression flags
The set flag NFT_SET_EXPR provides a hint to the kernel that userspace
supports for multiple expressions per set element. In the same
direction, NFT_DYNSET_F_EXPR specifies that dynset expression defines
multiple expressions per set element.

This allows new userspace software with old kernels to bail out with
EOPNOTSUPP. This update is similar to ef516e8625 ("netfilter:
nf_tables: reintroduce the NFT_SET_CONCAT flag"). The NFT_SET_EXPR flag
needs to be set on when the NFTA_SET_EXPRESSIONS attribute is specified.
The NFT_SET_EXPR flag is not set on with NFTA_SET_EXPR to retain
backward compatibility in old userspace binaries.

Fixes: 48b0ae046e ("netfilter: nftables: netlink support for several set element expressions")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-12-28 10:50:26 +01:00
Linus Torvalds
7bb5226c8a Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "Assorted patches from previous cycle(s)..."

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix hostfs_open() use of ->f_path.dentry
  Make sure that make_create_in_sticky() never sees uninitialized value of dir_mode
  fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
  fs: Handle I_DONTCACHE in iput_final() instead of generic_drop_inode()
  fs/namespace.c: WARN if mnt_count has become negative
2020-12-25 10:54:29 -08:00
Linus Torvalds
555a6e8c11 Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
 "Various bug fixes and cleanups for ext4; no new features this cycle"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (29 commits)
  ext4: remove unnecessary wbc parameter from ext4_bio_write_page
  ext4: avoid s_mb_prefetch to be zero in individual scenarios
  ext4: defer saving error info from atomic context
  ext4: simplify ext4 error translation
  ext4: move functions in super.c
  ext4: make ext4_abort() use __ext4_error()
  ext4: standardize error message in ext4_protect_reserved_inode()
  ext4: remove redundant sb checksum recomputation
  ext4: don't remount read-only with errors=continue on reboot
  ext4: fix deadlock with fs freezing and EA inodes
  jbd2: add a helper to find out number of fast commit blocks
  ext4: make fast_commit.h byte identical with e2fsprogs/fast_commit.h
  ext4: fix fall-through warnings for Clang
  ext4: add docs about fast commit idempotence
  ext4: remove the unused EXT4_CURRENT_REV macro
  ext4: fix an IS_ERR() vs NULL check
  ext4: check for invalid block size early when mounting a file system
  ext4: fix a memory leak of ext4_free_data
  ext4: delete nonsensical (commented-out) code inside ext4_xattr_block_set()
  ext4: update ext4_data_block_valid related comments
  ...
2020-12-24 14:16:02 -08:00
Linus Torvalds
3913d00ac5 Merge tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "This is the second attempt after the first one failed miserably and
  got zapped to unblock the rest of the interrupt related patches.

  A treewide cleanup of interrupt descriptor (ab)use with all sorts of
  racy accesses, inefficient and disfunctional code. The goal is to
  remove the export of irq_to_desc() to prevent these things from
  creeping up again"

* tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
  genirq: Restrict export of irq_to_desc()
  xen/events: Implement irq distribution
  xen/events: Reduce irq_info:: Spurious_cnt storage size
  xen/events: Only force affinity mask for percpu interrupts
  xen/events: Use immediate affinity setting
  xen/events: Remove disfunct affinity spreading
  xen/events: Remove unused bind_evtchn_to_irq_lateeoi()
  net/mlx5: Use effective interrupt affinity
  net/mlx5: Replace irq_to_desc() abuse
  net/mlx4: Use effective interrupt affinity
  net/mlx4: Replace irq_to_desc() abuse
  PCI: mobiveil: Use irq_data_get_irq_chip_data()
  PCI: xilinx-nwl: Use irq_data_get_irq_chip_data()
  NTB/msi: Use irq_has_action()
  mfd: ab8500-debugfs: Remove the racy fiddling with irq_desc
  pinctrl: nomadik: Use irq_has_action()
  drm/i915/pmu: Replace open coded kstat_irqs() copy
  drm/i915/lpe_audio: Remove pointless irq_to_desc() usage
  s390/irq: Use irq_desc_kstat_cpu() in show_msi_interrupt()
  parisc/irq: Use irq_desc_kstat_cpu() in show_interrupts()
  ...
2020-12-24 13:50:23 -08:00
Linus Torvalds
4a1106afee Merge tag 'efi_updates_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Borislav Petkov:
 "These got delayed due to a last minute ia64 build issue which got
  fixed in the meantime.

  EFI updates collected by Ard Biesheuvel:

   - Don't move BSS section around pointlessly in the x86 decompressor

   - Refactor helper for discovering the EFI secure boot mode

   - Wire up EFI secure boot to IMA for arm64

   - Some fixes for the capsule loader

   - Expose the RT_PROP table via the EFI test module

   - Relax DT and kernel placement restrictions on ARM

  with a few followup fixes:

   - fix the build breakage on IA64 caused by recent capsule loader
     changes

   - suppress a type mismatch build warning in the expansion of
     EFI_PHYS_ALIGN on ARM"

* tag 'efi_updates_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: arm: force use of unsigned type for EFI_PHYS_ALIGN
  efi: ia64: disable the capsule loader
  efi: stub: get rid of efi_get_max_fdt_addr()
  efi/efi_test: read RuntimeServicesSupported
  efi: arm: reduce minimum alignment of uncompressed kernel
  efi: capsule: clean scatter-gather entries from the D-cache
  efi: capsule: use atomic kmap for transient sglist mappings
  efi: x86/xen: switch to efi_get_secureboot_mode helper
  arm64/ima: add ima_arch support
  ima: generalize x86/EFI arch glue for other EFI architectures
  efi: generalize efi_get_secureboot
  efi/libstub: EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER should not default to yes
  efi/x86: Only copy the compressed kernel image in efi_relocate_kernel()
  efi/libstub/x86: simplify efi_is_native()
2020-12-24 12:40:07 -08:00
Linus Torvalds
771e7e4161 Merge tag 'block-5.11-2020-12-23' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A few stragglers in here, but mostly just straight fixes. In
  particular:

   - Set of rnbd fixes for issues around changes for the merge window
     (Gioh, Jack, Md Haris Iqbal)

   - iocost tracepoint addition (Baolin)

   - Copyright/maintainers update (Christoph)

   - Remove old blk-mq fast path CPU warning (Daniel)

   - loop max_part fix (Josh)

   - Remote IPI threaded IRQ fix (Sebastian)

   - dasd stable fixes (Stefan)

   - bcache merge window fixup and style fixup (Yi, Zheng)"

* tag 'block-5.11-2020-12-23' of git://git.kernel.dk/linux-block:
  md/bcache: convert comma to semicolon
  bcache:remove a superfluous check in register_bcache
  block: update some copyrights
  block: remove a pointless self-reference in block_dev.c
  MAINTAINERS: add fs/block_dev.c to the block section
  blk-mq: Don't complete on a remote CPU in force threaded mode
  s390/dasd: fix list corruption of lcu list
  s390/dasd: fix list corruption of pavgroup group list
  s390/dasd: prevent inconsistent LCU device data
  s390/dasd: fix hanging device offline processing
  blk-iocost: Add iocg idle state tracepoint
  nbd: Respect max_part for all partition scans
  block/rnbd-clt: Does not request pdu to rtrs-clt
  block/rnbd-clt: Dynamically allocate sglist for rnbd_iu
  block/rnbd: Set write-back cache and fua same to the target device
  block/rnbd: Fix typos
  block/rnbd-srv: Protect dev session sysfs removal
  block/rnbd-clt: Fix possible memleak
  block/rnbd-clt: Get rid of warning regarding size argument in strlcpy
  blk-mq: Remove 'running from the wrong CPU' warning
2020-12-24 12:28:35 -08:00