The function pointer ip_vs_protocol->csum_check is only used in protocol
specific code, and never in the generic one.
Remove the function pointer from struct ip_vs_protocol and call the
checksum functions directly.
This reduces the performance impact of the Spectre mitigation, and
should give a small improvement even with RETPOLINES disabled.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
A better way to implement this from userspace has been found without
specific code in the kernel side, revert this.
Fixes: b9ccc07e3f ("netfilter: nft_hash: add map lookups for hashing operations")
Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Not used since 203f2e7820 ("netfilter: nat: remove l4proto->unique_tuple")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In the ip_rcv the skb goes through the PREROUTING hook first, then kicks
in vrf device and go through the same hook again. When conntrack dnat
works with vrf, there will be some conflict with rules because the
packet goes through the hook twice with different nf status.
ip link add user1 type vrf table 1
ip link add user2 type vrf table 2
ip l set dev tun1 master user1
ip l set dev tun2 master user2
nft add table firewall
nft add chain firewall zones { type filter hook prerouting priority - 300 \; }
nft add rule firewall zones counter ct zone set iif map { "tun1" : 1, "tun2" : 2 }
nft add chain firewall rule-1000-ingress
nft add rule firewall rule-1000-ingress ct zone 1 tcp dport 22 ct state new counter accept
nft add rule firewall rule-1000-ingress counter drop
nft add chain firewall rule-1000-egress
nft add rule firewall rule-1000-egress tcp dport 22 ct state new counter drop
nft add rule firewall rule-1000-egress counter accept
nft add chain firewall rules-all { type filter hook prerouting priority - 150 \; }
nft add rule firewall rules-all ip daddr vmap { "2.2.2.11" : jump rule-1000-ingress }
nft add rule firewall rules-all ct zone vmap { 1 : jump rule-1000-egress }
nft add rule firewall dnat-all ct zone vmap { 1 : jump dnat-1000 }
nft add rule firewall dnat-1000 ip daddr 2.2.2.11 counter dnat to 10.0.0.7
For a package with ip daddr 2.2.2.11 and tcp dport 22, first time accept in the
rule-1000-ingress and dnat to 10.0.0.7. Then second time the packet goto the wrong
chain rule-1000-egress which leads the packet drop
With this patch, userspace can add the 'don't re-do entire ruleset for
vrf' policy itself via:
nft add rule firewall rules-all meta iifkind "vrf" counter accept
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Its now same as __nf_ct_l4proto_find(), so rename that to
nf_ct_l4proto_find and use it everywhere.
It never returns NULL and doesn't need locks or reference counts.
Before this series:
302824 net/netfilter/nf_conntrack.ko
21504 net/netfilter/nf_conntrack_proto_gre.ko
text data bss dec hex filename
6281 1732 4 8017 1f51 nf_conntrack_proto_gre.ko
108356 20613 236 129205 1f8b5 nf_conntrack.ko
After:
294864 net/netfilter/nf_conntrack.ko
text data bss dec hex filename
106979 19557 240 126776 1ef38 nf_conntrack.ko
so, even with builtin gre, total size got reduced.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Only one user (gre), add a direct call and remove this facility.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Those were needed we still had modular trackers.
As we don't have those anymore, prefer direct calls and remove all
the (un)register infrastructure associated with this.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
after removal of the packet and invert function pointers, several
places do not need to lookup the l4proto structure anymore.
Remove those lookups.
The function nf_ct_invert_tuplepr becomes redundant, replace
it with nf_ct_invert_tuple everywhere.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Now that all l4trackers are builtin, no need to use a mix of direct and
indirect calls.
This removes the last two users: gre and the generic l4 protocol
tracker.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
No need to get/put module owner reference, none of these can be removed
anymore.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Only used by icmp(v6). Prefer a direct call and remove this
function from the l4proto struct.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
GRE is now builtin, so we can handle it via direct call and
remove the callback.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This makes the last of the modular l4 trackers 'bool'.
After this, all infrastructure to handle dynamic l4 protocol registration
becomes obsolete and can be removed in followup patches.
Old:
302824 net/netfilter/nf_conntrack.ko
21504 net/netfilter/nf_conntrack_proto_gre.ko
New:
313728 net/netfilter/nf_conntrack.ko
Old:
text data bss dec hex filename
6281 1732 4 8017 1f51 nf_conntrack_proto_gre.ko
108356 20613 236 129205 1f8b5 nf_conntrack.ko
New:
112095 21381 240 133716 20a54 nf_conntrack.ko
The size increase is only temporary.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We can use gre. Lock is only needed when a new expectation is added.
In case a single spinlock proves to be problematic we can either add one
per netns or use an array of locks combined with net_hash_mix() or similar
to pick the 'correct' one.
But given this is only needed for an expectation rather than per packet
a single one should be ok.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
rather than handling them via indirect call, use a direct one instead.
This leaves GRE as the last user of this indirect call facility.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The l4 protocol trackers are invoked via indirect call: l4proto->packet().
With one exception (gre), all l4trackers are builtin, so we can make
.packet optional and use a direct call for most protocols.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
To allow for a batch to contain rules in arbitrary ordering, introduce
NFTA_RULE_POSITION_ID attribute which works just like NFTA_RULE_POSITION
but contains the ID of another rule within the same batch. This helps
iptables-nft-restore handling dumps with mixed insert/append commands
correctly.
Note that NFTA_RULE_POSITION takes precedence over
NFTA_RULE_POSITION_ID, so if the former is present, the latter is
ignored.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Following command:
iptables -D FORWARD -m physdev ...
causes connectivity loss in some setups.
Reason is that iptables userspace will probe kernel for the module revision
of the physdev patch, and physdev has an artificial dependency on
br_netfilter (xt_physdev use makes no sense unless a br_netfilter module
is loaded).
This causes the "phydev" module to be loaded, which in turn enables the
"call-iptables" infrastructure.
bridged packets might then get dropped by the iptables ruleset.
The better fix would be to change the "call-iptables" defaults to 0 and
enforce explicit setting to 1, but that breaks backwards compatibility.
This does the next best thing: add a request_module call to checkentry.
This was a stray '-D ... -m physdev' won't activate br_netfilter
anymore.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
With CONFIG_RETPOLINE its faster to add an if (ptr == &foo_func)
check and and use direct calls for all the built-in expressions.
~15% improvement in pathological cases.
checkpatch doesn't like the X macro due to the embedded return statement,
but the macro has a very limited scope so I don't think its a problem.
I would like to avoid bugs of the form
If (e->ops->eval == (unsigned long)nft_foo_eval)
nft_bar_eval();
and open-coded if ()/else if()/else cascade, thus the macro.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Instead of linear search, use rhlist interface to look up the objects.
This fixes rulesets with thousands of named objects (quota, counters and
the like).
We only use a single table for this and consider the address of the
table we're doing the lookup in as a part of the key.
This reduces restore time of a sample ruleset with ~20k named counters
from 37 seconds to 0.8 seconds.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add a 'key' structure for object, so we can look them up by name + table
combination (the name can be the same in each table).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
A follow-up patch will enable vetoing of FDB entries. Make it possible
to communicate details of why an FDB entry is not acceptable back to the
user.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are four sources of VXLAN switchdev notifier calls:
- the changelink() link operation, which already supports extack,
- ndo_fdb_add() which got extack support in a previous patch,
- FDB updates due to packet forwarding,
- and vxlan_fdb_replay().
Extend vxlan_fdb_switchdev_call_notifiers() to include extack in the
switchdev message that it sends, and propagate the argument upwards to
the callers. For the first two cases, pass in the extack gotten through
the operation. For case #3, pass in NULL.
To cover the last case, extend vxlan_fdb_replay() to take extack
argument, which might come from whatever operation necessitated the FDB
replay.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers may not be able to support certain FDB entries, and an error
code is insufficient to give clear hints as to the reasons of rejection.
In order to make it possible to communicate the rejection reason, extend
ndo_fdb_add() with an extack argument. Adapt the existing
implementations of ndo_fdb_add() to take the parameter (and ignore it).
Pass the extack parameter when invoking ndo_fdb_add() from rtnl_fdb_add().
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This introduces a new generic SOL_SOCKET-level socket option called
SO_BINDTOIFINDEX. It behaves similar to SO_BINDTODEVICE, but takes a
network interface index as argument, rather than the network interface
name.
User-space often refers to network-interfaces via their index, but has
to temporarily resolve it to a name for a call into SO_BINDTODEVICE.
This might pose problems when the network-device is renamed
asynchronously by other parts of the system. When this happens, the
SO_BINDTODEVICE might either fail, or worse, it might bind to the wrong
device.
In most cases user-space only ever operates on devices which they
either manage themselves, or otherwise have a guarantee that the device
name will not change (e.g., devices that are UP cannot be renamed).
However, particularly in libraries this guarantee is non-obvious and it
would be nice if that race-condition would simply not exist. It would
make it easier for those libraries to operate even in situations where
the device-name might change under the hood.
A real use-case that we recently hit is trying to start the network
stack early in the initrd but make it survive into the real system.
Existing distributions rename network-interfaces during the transition
from initrd into the real system. This, obviously, cannot affect
devices that are up and running (unless you also consider moving them
between network-namespaces). However, the network manager now has to
make sure its management engine for dormant devices will not run in
parallel to these renames. Particularly, when you offload operations
like DHCP into separate processes, these might setup their sockets
early, and thus have to resolve the device-name possibly running into
this race-condition.
By avoiding a call to resolve the device-name, we no longer depend on
the name and can run network setup of dormant devices in parallel to
the transition off the initrd. The SO_BINDTOIFINDEX ioctl plugs this
race.
Reviewed-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes recvmsg() to be able to peek across multiple tls records.
Without this patch, the tls's selftests test case
'recv_peek_large_buf_mult_recs' fails. Each tls receive context now
maintains a 'rx_list' to retain incoming skb carrying tls records. If a
tls record needs to be retained e.g. for peek case or for the case when
the buffer passed to recvmsg() has a length smaller than decrypted
record length, then it is added to 'rx_list'. Additionally, records are
added in 'rx_list' if the crypto operation runs in async mode. The
records are dequeued from 'rx_list' after the decrypted data is consumed
by copying into the buffer passed to recvmsg(). In case, the MSG_PEEK
flag is used in recvmsg(), then records are not consumed or removed
from the 'rx_list'.
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are already checking in phy_detach() that the PHY driver is of
generic kind (1G or 10G) and we are going to make use of that in the SFP
layer as well for 1000BaseT SFP modules, so expose helper functions to
return that information.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
b53 and mv88e6xxx support passing platform_data, and now that we have
split the platform_data portion from the main net/dsa.h header file,
include only the relevant parts.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of having net/dsa.h contain both the internal switch tree/driver
structures, split the relevant platform_data parts into
include/linux/platform_data/dsa.h and make that header be included by
net/dsa.h in order not to break any setup. A subsequent set of patches
will update code including net/dsa.h to include only the platform_data
header.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is not currently way to infer the port number through sysfs that
is being used as the CPU port number. Overlay a ndo_get_phys_port_name()
operation onto the DSA master network device in order to retrieve that
information.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix regression in multi-SKB responses to RTM_GETADDR, from Arthur
Gautier.
2) Fix ipv6 frag parsing in openvswitch, from Yi-Hung Wei.
3) Unbounded recursion in ipv4 and ipv6 GUE tunnels, from Stefano
Brivio.
4) Use after free in hns driver, from Yonglong Liu.
5) icmp6_send() needs to handle the case of NULL skb, from Eric
Dumazet.
6) Missing rcu read lock in __inet6_bind() when operating on mapped
addresses, from David Ahern.
7) Memory leak in tipc-nl_compat_publ_dump(), from Gustavo A. R. Silva.
8) Fix PHY vs r8169 module loading ordering issues, from Heiner
Kallweit.
9) Fix bridge vlan memory leak, from Ido Schimmel.
10) Dev refcount leak in AF_PACKET, from Jason Gunthorpe.
11) Infoleak in ipv6_local_error(), flow label isn't completely
initialized. From Eric Dumazet.
12) Handle mv88e6390 errata, from Andrew Lunn.
13) Making vhost/vsock CID hashing consistent, from Zha Bin.
14) Fix lack of UMH cleanup when it unexpectedly exits, from Taehee Yoo.
15) Bridge forwarding must clear skb->tstamp, from Paolo Abeni.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
bnxt_en: Fix context memory allocation.
bnxt_en: Fix ring checking logic on 57500 chips.
mISDN: hfcsusb: Use struct_size() in kzalloc()
net: clear skb->tstamp in bridge forwarding path
net: bpfilter: disallow to remove bpfilter module while being used
net: bpfilter: restart bpfilter_umh when error occurred
net: bpfilter: use cleanup callback to release umh_info
umh: add exit routine for UMH process
isdn: i4l: isdn_tty: Fix some concurrency double-free bugs
vhost/vsock: fix vhost vsock cid hashing inconsistent
net: stmmac: Prevent RX starvation in stmmac_napi_poll()
net: stmmac: Fix the logic of checking if RX Watchdog must be enabled
net: stmmac: Check if CBS is supported before configuring
net: stmmac: dwxgmac2: Only clear interrupts that are active
net: stmmac: Fix PCI module removal leak
tools/bpf: fix bpftool map dump with bitfields
tools/bpf: test btf bitfield with >=256 struct member offset
bpf: fix bpffs bitfield pretty print
net: ethernet: mediatek: fix warning in phy_start_aneg
tcp: change txhash on SYN-data timeout
...
Pull MFD updates from Lee Jones:
"New Device Support
- Add support for Power Supply to AXP813
- Add support for GPIO, ADC, AC and Battery Power Supply to AXP803
- Add support for UART to Exynos LPASS
Fix-ups:
- Use supplied MACROS; ti_am335x_tscadc
- Trivial spelling/whitespace/alignment; tmio, axp20x, rave-sp
- Regmap changes; bd9571mwv, wm5110-tables
- Kconfig dependencies; MFD_AT91_USART
- Supply shared data for child-devices; madera-core
- Use new of_node_name_eq() API call; max77620, stmpe
- Use managed resources (devm_*); tps65218
- Comment descriptions; ingenic-tcu
- Coding style; madera-core
Bug Fixes:
- Fix section mismatches; twl-core, db8500-prcmu
- Correct error path related issues; mt6397-core, ab8500-core, mc13xxx-core
- IRQ related fixes; tps6586x
- Ensure proper initialisation sequence; qcom_rpm
- Repair potential memory leak; cros_ec_dev"
* tag 'mfd-next-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (25 commits)
mfd: exynos-lpass: Enable UART module support
mfd: mc13xxx: Fix a missing check of a register-read failure
mfd: cros_ec: Add commands to control codec
mfd: madera: Remove spurious semicolon in while loop
mfd: rave-sp: Fix typo in rave_sp_checksum comment
mfd: ingenic-tcu: Fix bit field description in header
mfd: tps65218: Use devm_regmap_add_irq_chip and clean up error path in probe()
mfd: Use of_node_name_eq() for node name comparisons
mfd: cros_ec_dev: Add missing mfd_remove_devices() call in remove
mfd: axp20x: Add supported cells for AXP803
mfd: axp20x: Re-align MFD cell entries
mfd: axp20x: Add AC power supply cell for AXP813
mfd: wm5110: Add missing ASRC rate register
mfd: qcom_rpm: write fw_version to CTRL_REG
mfd: tps6586x: Handle interrupts on suspend
mfd: madera: Add shared data for accessory detection
mfd: at91-usart: Add platform dependency
mfd: bd9571mwv: Add volatile register to make DVFS work
mfd: ab8500-core: Return zero in get_register_interruptible()
mfd: tmio: Typo s/use use/use/
...
Pull ARM SoC fixes from Olof Johansson:
"A bigger batch than I anticipated this week, for two reasons:
- Some fallout on Davinci from board file -> DTB conversion, that
also includes a few longer-standing fixes (i.e. not recent
regressions).
- drivers/reset material that has been in linux-next for a while, but
didn't get sent to us until now for a variety of reasons
(maintainer out sick, holidays, etc). There's a functional
dependency in there such that one platform (Altera's SoCFPGA) won't
boot without one of the patches; instead of reverting the patch
that got merged, I looked at this set and decided it was small
enough that I'll pick it up anyway. If you disagree I can revisit
with a smaller set.
That being said, there's also a handful of the usual stuff:
- Fix for a crash on Armada 7K/8K when the kernel touches
PSCI-reserved memory
- Fix for PCIe reset on Macchiatobin (Armada 8K development board,
what this email is sent from in fact :)
- Enable a few new-merged modules for Amlogic in arm64 defconfig
- Error path fixes on Integrator
- Build fix for Renesas and Qualcomm
- Initialization fix for Renesas RZ/G2E
.. plus a few more fixlets"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (28 commits)
ARM: integrator: impd1: use struct_size() in devm_kzalloc()
qcom-scm: Include <linux/err.h> header
gpio: pl061: handle failed allocations
ARM: dts: kirkwood: Fix polarity of GPIO fan lines
arm64: dts: marvell: mcbin: fix PCIe reset signal
arm64: dts: marvell: armada-ap806: reserve PSCI area
ARM: dts: da850-lcdk: Correct the sound card name
ARM: dts: da850-lcdk: Correct the audio codec regulators
ARM: dts: da850-evm: Correct the sound card name
ARM: dts: da850-evm: Correct the audio codec regulators
ARM: davinci: omapl138-hawk: fix label names in GPIO lookup entries
ARM: davinci: dm644x-evm: fix label names in GPIO lookup entries
ARM: davinci: dm355-evm: fix label names in GPIO lookup entries
ARM: davinci: da850-evm: fix label names in GPIO lookup entries
ARM: davinci: da830-evm: fix label names in GPIO lookup entries
arm64: defconfig: enable modules for amlogic s400 sound card
reset: uniphier-glue: Add AHCI reset control support in glue layer
dt-bindings: reset: uniphier: Add AHCI core reset description
reset: uniphier-usb3: Rename to reset-uniphier-glue
dt-bindings: reset: uniphier: Replace the expression of USB3 with generic peripherals
...
Late reset controller changes for v5.0
This adds missing deassert functionality to the ARC HSDK reset driver,
fixes some indentation and grammar issues in the kernel docs, adds a
helper to count the number of resets on a device for the non-DT case
as well, adds an early reset driver for SoCFPGA and simple reset driver
support for Stratix10, and generalizes the uniphier USB3 glue layer
reset to also cover AHCI.
* tag 'reset-for-5.0-rc2' of git://git.pengutronix.de/git/pza/linux:
reset: uniphier-glue: Add AHCI reset control support in glue layer
dt-bindings: reset: uniphier: Add AHCI core reset description
reset: uniphier-usb3: Rename to reset-uniphier-glue
dt-bindings: reset: uniphier: Replace the expression of USB3 with generic peripherals
ARM: socfpga: dts: document "altr,stratix10-rst-mgr" binding
reset: socfpga: add an early reset driver for SoCFPGA
reset: fix null pointer dereference on dev by dev_name
reset: Add reset_control_get_count()
reset: Improve reset controller kernel docs
ARC: HSDK: improve reset driver
Signed-off-by: Olof Johansson <olof@lixom.net>
Commit 49e54187ae ("ata: libahci_platform: comply to PHY framework") uses
the PHY_MODE_SATA, but that enum had not yet been added. This caused a
build failure for me, with today's linux.git.
Also, there is a potentially conflicting (mis-named) PHY_MODE_SATA, hiding
in the Marvell Berlin SATA PHY driver.
Fix the build by:
1) Renaming Marvell's defined value to a more scoped name,
in order to avoid any potential conflicts: PHY_BERLIN_MODE_SATA.
2) Adding the missing enum, which was going to be added anyway as part
of [1].
[1] https://lkml.kernel.org/r/20190108163124.6409-3-miquel.raynal@bootlin.com
Fixes: 49e54187ae ("ata: libahci_platform: comply to PHY framework")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Olof Johansson <olof@lixom.net>
Cc: Grzegorz Jaszczyk <jaz@semihalf.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull dma_zalloc_coherent() removal from Christoph Hellwig:
"We've always had a weird situation around dma_zalloc_coherent. To
safely support mapping the allocations to userspace major
architectures like x86 and arm have always zeroed allocations from
dma_alloc_coherent, but a couple other architectures were missing that
zeroing either always or in corner cases.
Then later we grew anothe dma_zalloc_coherent interface to explicitly
request zeroing, but that just added __GFP_ZERO to the allocation
flags, which for some allocators that didn't end up using the page
allocator ended up being a no-op and still not zeroing the
allocations.
So for this merge window I fixed up all remaining architectures to
zero the memory in dma_alloc_coherent, and made dma_zalloc_coherent a
no-op wrapper around dma_alloc_coherent, which fixes all of the above
issues.
dma_zalloc_coherent is now pointless and can go away, and Luis helped
me writing a cocchinelle script and patch series to kill it, which I
think we should apply now just after -rc1 to finally settle these
issue"
* tag 'remove-dma_zalloc_coherent-5.0' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: remove dma_zalloc_coherent()
cross-tree: phase out dma_zalloc_coherent() on headers
cross-tree: phase out dma_zalloc_coherent()
Pull more drm fixes from Daniel Vetter:
"Dave sends out his pull, everybody remembers holidays are over :-)
Since Dave's already in weekend mode and it was quite a few patches I
figured better to apply all the pulls and forward them to you. Hence
here 2nd part of bugfixes for -rc2.
nouveau:
- backlight fix
- falcon register access fix
- fan fix.
i915:
- Disable PSR for Apple panels
- Broxton ERR_PTR error state fix
- Kabylake VECS workaround fix
- Unwind failure on pinning the gen7 ppgtt
- GVT workload request allocation fix
core:
- Fix fb-helper to work correctly with SDL 1.2 bugs
- Fix lockdep warning in the atomic ioctl and setproperty"
* tag 'drm-fixes-2019-01-11-1' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau/falcon: avoid touching registers if engine is off
drm/nouveau: Don't disable polling in fallback mode
drm/nouveau: register backlight on pascal and newer
drm: Fix documentation generation for DP_DPCD_QUIRK_NO_PSR
drm/i915: init per-engine WAs for all engines
drm/i915: Unwind failure on pinning the gen7 ppgtt
drm/i915: Skip the ERR_PTR error state
drm/i915: Disable PSR in Apple panels
gpu/drm: Fix lock held when returning to user space.
drm/fb-helper: Ignore the value of fb_var_screeninfo.pixclock
drm/fb-helper: Partially bring back workaround for bugs of SDL 1.2
drm/i915/gvt: Fix workload request allocation before request add
The bpfilter_umh will be stopped via __stop_umh() when the bpfilter
error occurred.
The bpfilter_umh() couldn't start again because there is no restart
routine.
The section of the bpfilter_umh_{start/end} is no longer .init.rodata
because these area should be reused in the restart routine. hence
the section name is changed to .bpfilter_umh.
The bpfilter_ops->start() is restart callback. it will be called when
bpfilter_umh is stopped.
The stop bit means bpfilter_umh is stopped. this bit is set by both
start and stop routine.
Before this patch,
Test commands:
$ iptables -vnL
$ kill -9 <pid of bpfilter_umh>
$ iptables -vnL
[ 480.045136] bpfilter: write fail -32
$ iptables -vnL
All iptables commands will fail.
After this patch,
Test commands:
$ iptables -vnL
$ kill -9 <pid of bpfilter_umh>
$ iptables -vnL
$ iptables -vnL
Now, all iptables commands will work.
Fixes: d2ba09c17a ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now, UMH process is killed, do_exit() calls the umh_info->cleanup callback
to release members of the umh_info.
This patch makes bpfilter_umh's cleanup routine to use the
umh_info->cleanup callback.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A UMH process which is created by the fork_usermode_blob() such as
bpfilter needs to release members of the umh_info when process is
terminated.
But the do_exit() does not release members of the umh_info. hence module
which uses UMH needs own code to detect whether UMH process is
terminated or not.
But this implementation needs extra code for checking the status of
UMH process. it eventually makes the code more complex.
The new PF_UMH flag is added and it is used to identify UMH processes.
The exit_umh() does not release members of the umh_info.
Hence umh_info->cleanup callback should release both members of the
umh_info and the private data.
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ceph updates from Ilya Dryomov:
"A patch to allow setting abort_on_full and a fix for an old "rbd
unmap" edge case, marked for stable"
* tag 'ceph-for-5.0-rc2' of git://github.com/ceph/ceph-client:
rbd: don't return 0 on unmap if RBD_DEV_FLAG_REMOVING is set
ceph: use vmf_error() in ceph_filemap_fault()
libceph: allow setting abort_on_full for rbd
Pull x86 fixes from Ingo Molnar:
"A 32-bit build fix, CONFIG_RETPOLINE fixes and rename CONFIG_RESCTRL
to CONFIG_X86_RESCTRL"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, modpost: Replace last remnants of RETPOLINE with CONFIG_RETPOLINE
x86/cache: Rename config option to CONFIG_X86_RESCTRL
samples/seccomp: Fix 32-bit build
Pull power management updates from Rafael Wysocki:
"These fix fallout after starting to use hrtimers in the runtime PM
framework, fix a few cpufreq issues, fix a recently broken reference
to cpuidle documentation, update MAINTAINERS entries for cpufreq and
cpuidle and make the recently added system suspend and resume support
in devfreq actually work.
Specifics:
- Prevent integer overflows from occurring on 32-bit when converting
milliseconds to nanoseconds in the runtime PM framework and update
comments that still refer to jiffies in it (Vincent Guittot,
Ladislav Michl).
- Fix the SCMI cpufreq driver to always use the same frequency units
for arch_set_freq_scale() and make the scale-invariant load
tracking acutally work with this driver (Quentin Perret).
- Fix freeing of dynamic OPPs in the SCPI and SCMI cpufreq drivers
broken during the 4.20 defelopment cycle (Viresh Kumar).
- Prevent the cpufreq core from attempting to return the current
frequency of offline CPUs (Sudeep Holla).
- Add devfreq suspend and resume hooks (missed previously) to the PM
core to make the recently added system suspend and resume support
in devfreq actually work (Lukasz Luba).
- Update MAINTAINERS entries for cpufreq and cpuidle, mostly to add
references to new/current documentation to them (Rafael Wysocki).
- Fix a recently broken reference to cpuidle documentation (Otto
Sabart)"
* tag 'pm-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM-runtime: Fix autosuspend_delay on 32bits arch
PM-runtime: Fix 'jiffies' in comments after switch to hrtimers
cpufreq: scmi: Fix frequency invariance in slow path
doc: trace: fix reference to cpuidle documentation file
cpufreq: check if policy is inactive early in __cpufreq_get()
cpufreq: scpi/scmi: Fix freeing of dynamic OPPs
cpuidle / Documentation: Update cpuidle MAINTAINERS entry
cpufreq / Documentation: Update cpufreq MAINTAINERS entry
PM: sleep: call devfreq suspend/resume
Pull drm fixes from Dave Airlie:
"Not a huge amount for rc2, assume the usual quiet period, and rc3 will
be most of it.
amdgpu:
- Powerplay fixes
- Virtual display pinning fixes
- Golden register updates for Vega
- Pitch and gem size validation fixes
- SR-IOV init error fix
- Pagetables in system RAM disable for some Raven system
- DP-MST resume fixes
tc358767 bridge:
- fix to work with displayport connector"
* tag 'drm-fixes-2019-01-11' of git://anongit.freedesktop.org/drm/drm: (26 commits)
drm/amdgpu: disable system memory page tables for now
drm/amdgpu: set WRITE_BURST_LENGTH to 64B to workaround SDMA1 hang
drm/amdgpu: fix CPDMA hang in PRT mode for VEGA20
drm/bridge: tc358767: use DP connector if no panel set
drm/bridge: tc358767: fix output H/V syncs
drm/bridge: tc358767: reject modes which require too much BW
drm/bridge: tc358767: fix initial DP0/1_SRCCTRL value
drm/bridge: tc358767: fix single lane configuration
drm/bridge: tc358767: add defines for DP1_SRCCTRL & PHY_2LANE
drm/bridge: tc358767: add bus flags
drm/dp_mst: Add __must_check to drm_dp_mst_topology_mgr_resume()
drm/amdgpu: Don't fail resume process if resuming atomic state fails
drm/amdgpu: Don't ignore rc from drm_dp_mst_topology_mgr_resume()
drm/amdgpu: validate user GEM object size
drm/amdgpu: validate user pitch alignment
drm/amd/powerplay: drop the unnecessary uclk hard min setting
drm/amd/powerplay: avoid possible buffer overflow
drm/amd/powerplay: create pp_od_clk_voltage device file under OD support
drm/amd/powerplay: update OD support flag for SKU with no OD capabilities
drm/amdgpu: make gfx9 enter into rlc safe mode when set MGCG
...