This PHY has two PHY IDs depending on its mode. Adjust the mask so that
it includes both IDs.
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Output PPS signal on FIPER2 (Fixed Period Interval Pulse) in default
which is more desired by user.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 86de5921a3 ("tcp: defer SACK compression after DupThresh")
I added a TCP_FASTRETRANS_THRESH bias to tp->compressed_ack in order
to enable sack compression only after 3 dupacks.
Since we plan to relax this rule for flows that involve
stacks not requiring this old rule, this patch adds
a distinct tp->dup_ack_counter.
This means the TCP_FASTRETRANS_THRESH value is now used
in a single location that a future patch can adjust:
if (tp->dup_ack_counter < TCP_FASTRETRANS_THRESH) {
tp->dup_ack_counter++;
goto send_now;
}
This patch also introduces tcp_sack_compress_send_ack()
helper to ease following patch comprehension.
This patch refines LINUX_MIB_TCPACKCOMPRESSED to not
count the acks that we had to send if the timer expires
or tcp_sack_compress_send_ack() is sending an ack.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
mlx5-updates-2020-04-30
1) Add release all pages support, From Eran.
to release all FW pages at once on driver unload, when supported by FW.
2) From Maxim and Tariq, Trivial Data path cleanup and code improvements
in preparation for their next features, TLS offload and TX performance
improvements
3) Multiple cleanups.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Not much to be done here:
- add SPDX header;
- adjust titles and chapters, adding proper markups;
- add to networking/index.rst.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds cgroup v2 ID to common inet diag message attributes.
Cgroup v2 ID is kernfs ID (ino or ino+gen). This attribute allows filter
inet diag output by cgroup ID obtained by name_to_handle_at() syscall.
When net_cls or net_prio cgroup is activated this ID is equal to 1 (root
cgroup ID) for newly created sockets.
Some notes about this ID:
1) gets initialized in socket() syscall
2) incoming socket gets ID from listening socket
(not during accept() syscall)
3) not changed when process get moved to another cgroup
4) can point to deleted cgroup (refcounting)
v2:
- use CONFIG_SOCK_CGROUP_DATA instead if CONFIG_CGROUPS
v3:
- fix attr size by using nla_total_size_64bit() (Eric Dumazet)
- more detailed commit message (Konstantin Khlebnikov)
Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-By: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx5 updates for both net-next and rdma-next:
1) HW bits and definitions for TLS and IPsec offlaods
2) Release all pages capability bits
3) New command interface helpers and some code cleanup as a result
4) Move qp.c out of mlx5 core driver into mlx5_ib rdma driver
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add new TX WQE field for Connect-X6DX trailer insertion support,
when set, the HW adds a trailer to the packet, the WQE trailer
association flags are used to set to HW the header which the
trailer belongs.
Signed-off-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add a bit in HCA capabilities layout to indicate if release all pages is
supported.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add needed structure layouts and defines for pci sync for fw update
event. The downstream patches will include event handlers for this event
type.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add needed structure layouts and defines for MFRL (Management Firmware
Reset Level) register. This structure will be used for the firmware
upgrade and reset flow in the downstream patches.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The imm_inval_pkey field can hold four different types of data,
depends on the usage, the data could be one of the below:
- Immediate field of the received message
- Invalidate rkey
- Pkey of the packet
- Flow table metadata
Current implementation doesn't reflect the intended usage of the
field at usage time.
Reflect the different types by replace this field with a union,
modify code where this field is used to reflect its intended
usage.
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The alignment value is part of the input structure, so use it and spare
extra memory allocation when is not needed.
Now, using the new ability when allocating icm for Direct-Rule
insertion.
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add COPY type to modify_header action. IPsec feature is the first
feature that needs COPY steering action.
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
To integrate MRP into the bridge, first the bridge needs to be aware of ports
that are part of an MRP ring and which rings are on the bridge.
Therefore extend bridge interface with the following:
- add new flag(BR_MPP_AWARE) to the net bridge ports, this bit will be
set when the port is added to an MRP instance. In this way it knows if
the frame was received on MRP ring port
- add new flag(BR_MRP_LOST_CONT) to the net bridge ports, this bit will be set
when the port lost the continuity of MRP Test frames.
- add a list of MRP instances
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix memory leak in netfilter flowtable, from Roi Dayan.
2) Ref-count leaks in netrom and tipc, from Xiyu Yang.
3) Fix warning when mptcp socket is never accepted before close, from
Florian Westphal.
4) Missed locking in ovs_ct_exit(), from Tonghao Zhang.
5) Fix large delays during PTP synchornization in cxgb4, from Rahul
Lakkireddy.
6) team_mode_get() can hang, from Taehee Yoo.
7) Need to use kvzalloc() when allocating fw tracer in mlx5 driver,
from Niklas Schnelle.
8) Fix handling of bpf XADD on BTF memory, from Jann Horn.
9) Fix BPF_STX/BPF_B encoding in x86 bpf jit, from Luke Nelson.
10) Missing queue memory release in iwlwifi pcie code, from Johannes
Berg.
11) Fix NULL deref in macvlan device event, from Taehee Yoo.
12) Initialize lan87xx phy correctly, from Yuiko Oshino.
13) Fix looping between VRF and XFRM lookups, from David Ahern.
14) etf packet scheduler assumes all sockets are full sockets, which is
not necessarily true. From Eric Dumazet.
15) Fix mptcp data_fin handling in RX path, from Paolo Abeni.
16) fib_select_default() needs to handle nexthop objects, from David
Ahern.
17) Use GFP_ATOMIC under spinlock in mac80211_hwsim, from Wei Yongjun.
18) vxlan and geneve use wrong nlattr array, from Sabrina Dubroca.
19) Correct rx/tx stats in bcmgenet driver, from Doug Berger.
20) BPF_LDX zero-extension is encoded improperly in x86_32 bpf jit, fix
from Luke Nelson.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (100 commits)
selftests/bpf: Fix a couple of broken test_btf cases
tools/runqslower: Ensure own vmlinux.h is picked up first
bpf: Make bpf_link_fops static
bpftool: Respect the -d option in struct_ops cmd
selftests/bpf: Add test for freplace program with expected_attach_type
bpf: Propagate expected_attach_type when verifying freplace programs
bpf: Fix leak in LINK_UPDATE and enforce empty old_prog_fd
bpf, x86_32: Fix logic error in BPF_LDX zero-extension
bpf, x86_32: Fix clobbering of dst for BPF_JSET
bpf, x86_32: Fix incorrect encoding in BPF_LDX zero-extension
bpf: Fix reStructuredText markup
net: systemport: suppress warnings on failed Rx SKB allocations
net: bcmgenet: suppress warnings on failed Rx SKB allocations
macsec: avoid to set wrong mtu
mac80211: sta_info: Add lockdep condition for RCU list usage
mac80211: populate debugfs only after cfg80211 init
net: bcmgenet: correct per TX/RX ring statistics
net: meth: remove spurious copyright text
net: phy: bcm84881: clear settings on link down
chcr: Fix CPU hard lockup
...
Since 6e2d85ec05 ("net: phy: Stop with excessive soft reset")
we don't need genphy_no_soft_reset() any longer. Not setting
callback soft_reset results in a no-op now.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull PNP cleanup from Rafael Wysocki:
"Make the PNP code use list_for_each_entry() in a few places instead of
open-coding it (Jason Gunthorpe)"
* tag 'pnp-5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
pnp: Use list_for_each_entry() instead of open coding
Pull block fixes from Jens Axboe:
"A few fixes/changes that should go into this release:
- null_blk zoned fixes (Damien)
- blkdev_close() sync improvement (Douglas)
- Fix regression in blk-iocost that impacted (at least) systemtap
(Waiman)
- Comment fix, header removal (Zhiqiang, Jianpeng)"
* tag 'block-5.7-2020-04-24' of git://git.kernel.dk/linux-block:
null_blk: Cleanup zoned device initialization
null_blk: Fix zoned command handling
block: remove unused header
blk-iocost: Fix error on iocost_ioc_vrate_adj
bdev: Reduce time holding bd_mutex in sync in blkdev_close()
buffer: remove useless comment and WB_REASON_FREE_MORE_MEM, reason.
Pull tracing fixes from Steven Rostedt:
"A few tracing fixes:
- Two fixes for memory leaks detected by kmemleak
- Removal of some dead code
- A few local functions turned static"
* tag 'trace-v5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Convert local functions in tracing_map.c to static
tracing: Remove DECLARE_TRACE_NOARGS
ftrace: Fix memory leak caused by not freeing entry in unregister_ftrace_direct()
tracing: Fix memory leaks in trace_events_hist.c
Pull Kbuild fixes from Masahiro Yamada:
- fix scripts/config to properly handle ':' in string type CONFIG
options
- fix unneeded rebuilds of DT schema check rule
- git rid of ordering dependency between <linux/vermagic.h> and
<linux/module.h> to fix build errors in some network drivers
- clean up generated headers of host arch with 'make ARCH=um mrproper'
* tag 'kbuild-fixes-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
h8300: ignore vmlinux.lds
Documentation: kbuild: fix the section title format
um: ensure `make ARCH=um mrproper` removes arch/$(SUBARCH)/include/generated/
arch: split MODULE_ARCH_VERMAGIC definitions out to <asm/vermagic.h>
kbuild: fix DT binding schema rule again to avoid needless rebuilds
scripts/config: allow colons in option strings for sed
Back in commit 3b47d30396 ("net: gro: add a per device gro flush timer")
we added the ability to arm one high resolution timer, that we used
to keep not-complete packets in GRO engine a bit longer, hoping that further
frames might be added to them.
Since then, we added the napi_complete_done() interface, and commit
364b605573 ("net: busy-poll: return busypolling status to drivers")
allowed drivers to avoid re-arming NIC interrupts if we made a promise
that their NAPI poll() handler would be called in the near future.
This infrastructure can be leveraged, thanks to a new device parameter,
which allows to arm the napi hrtimer, instead of re-arming the device
hard IRQ.
We have noticed that on some servers with 32 RX queues or more, the chit-chat
between the NIC and the host caused by IRQ delivery and re-arming could hurt
throughput by ~20% on 100Gbit NIC.
In contrast, hrtimers are using local (percpu) resources and might have lower
cost.
The new tunable, named napi_defer_hard_irqs, is placed in the same hierarchy
than gro_flush_timeout (/sys/class/net/ethX/)
By default, both gro_flush_timeout and napi_defer_hard_irqs are zero.
This patch does not change the prior behavior of gro_flush_timeout
if used alone : NIC hard irqs should be rearmed as before.
One concrete usage can be :
echo 20000 >/sys/class/net/eth1/gro_flush_timeout
echo 10 >/sys/class/net/eth1/napi_defer_hard_irqs
If at least one packet is retired, then we will reset napi counter
to 10 (napi_defer_hard_irqs), ensuring at least 10 periodic scans
of the queue.
On busy queues, this should avoid NIC hard IRQ, while before this patch IRQ
avoidance was only possible if napi->poll() was exhausting its budget
and not call napi_complete_done().
This feature also can be used to work around some non-optimal NIC irq
coalescing strategies.
Having the ability to insert XX usec delays between each napi->poll()
can increase cache efficiency, since we increase batch sizes.
It also keeps serving cpus not idle too long, reducing tail latencies.
Co-developed-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do mass update of cq.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Pull nfsd fixes from Chuck Lever:
"The first set of 5.7-rc fixes for NFS server issues.
These were all unresolved at the time the 5.7 window opened, and
needed some additional time to ensure they were correctly addressed.
They are ready now.
At the moment I know of one more urgent issue regarding the NFS
server. A fix has been tested and is under review. I expect to send
one more pull request, containing this fix (which now consists of 3
patches).
Fixes:
- Address several use-after-free and memory leak bugs
- Prevent a backchannel livelock"
* tag 'nfsd-5.7-rc-1' of git://git.linux-nfs.org/projects/cel/cel-2.6:
svcrdma: Fix leak of svc_rdma_recv_ctxt objects
svcrdma: Fix trace point use-after-free race
SUNRPC: Fix backchannel RPC soft lockups
SUNRPC/cache: Fix unsafe traverse caused double-free in cache_purge
nfsd: memory corruption in nfsd4_lock()
This function will be needed in tja11xx driver for secondary PHY
support.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the bug report [1] pointed out, <linux/vermagic.h> must be included
after <linux/module.h>.
I believe we should not impose any include order restriction. We often
sort include directives alphabetically, but it is just coding style
convention. Technically, we can include header files in any order by
making every header self-contained.
Currently, arch-specific MODULE_ARCH_VERMAGIC is defined in
<asm/module.h>, which is not included from <linux/vermagic.h>.
Hence, the straight-forward fix-up would be as follows:
|--- a/include/linux/vermagic.h
|+++ b/include/linux/vermagic.h
|@@ -1,5 +1,6 @@
| /* SPDX-License-Identifier: GPL-2.0 */
| #include <generated/utsrelease.h>
|+#include <linux/module.h>
|
| /* Simply sanity version stamp for modules. */
| #ifdef CONFIG_SMP
This works enough, but for further cleanups, I split MODULE_ARCH_VERMAGIC
definitions into <asm/vermagic.h>.
With this, <linux/module.h> and <linux/vermagic.h> will be orthogonal,
and the location of MODULE_ARCH_VERMAGIC definitions will be consistent.
For arc and ia64, MODULE_PROC_FAMILY is only used for defining
MODULE_ARCH_VERMAGIC. I squashed it.
For hexagon, nds32, and xtensa, I removed <asm/modules.h> entirely
because they contained nothing but MODULE_ARCH_VERMAGIC definition.
Kbuild will automatically generate <asm/modules.h> at build-time,
wrapping <asm-generic/module.h>.
[1] https://lore.kernel.org/lkml/20200411155623.GA22175@zn.tnic
Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
If there's no special ordering requirement for mdiobus_unregister(),
then driver code can be simplified by using a device-managed version
of mdiobus_register(). Prerequisite is that bus allocation has been
done device-managed too. Else mdiobus_free() may be called whilst
bus is still registered, resulting in a BUG_ON(). Therefore let
devm_mdiobus_register() return -EPERM if bus was allocated
non-managed.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Broadcom BCM54140 is a Quad SGMII/QSGMII Copper/Fiber Gigabit
Ethernet transceiver.
This also adds support for tunables to set and get downshift and
energy detect auto power-down.
The PHY has four ports and each port has its own PHY address.
There are per-port registers as well as global registers.
Unfortunately, the global registers can only be accessed by reading
and writing from/to the PHY address of the first port. Further,
there is no way to find out what port you actually are by just
reading the per-port registers. We therefore, have to scan the
bus on the PHY probe to determine the port and thus what address
we need to access the global registers.
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
RDB (Register Data Base) registers are used on newer Broadcom PHYs. Add
helper to read, write and modify these registers.
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Aside from good practice, this avoids a warning from gcc 10:
./include/linux/kernel.h:997:3: warning: array subscript -31 is outside array bounds of ‘struct list_head[1]’ [-Warray-bounds]
997 | ((type *)(__mptr - offsetof(type, member))); })
| ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/list.h:493:2: note: in expansion of macro ‘container_of’
493 | container_of(ptr, type, member)
| ^~~~~~~~~~~~
./include/linux/pnp.h:275:30: note: in expansion of macro ‘list_entry’
275 | #define global_to_pnp_dev(n) list_entry(n, struct pnp_dev, global_list)
| ^~~~~~~~~~
./include/linux/pnp.h:281:11: note: in expansion of macro ‘global_to_pnp_dev’
281 | (dev) != global_to_pnp_dev(&pnp_global); \
| ^~~~~~~~~~~~~~~~~
arch/x86/kernel/rtc.c:189:2: note: in expansion of macro ‘pnp_for_each_dev’
189 | pnp_for_each_dev(dev) {
Because the common code doesn't cast the starting list_head to the
containing struct.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
[ rjw: Whitespace adjustments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch is to enable Intel SERDES power up/down sequence. The SERDES
converts 8/10 bits data to SGMII signal. Below is an example of
HW configuration for SGMII mode. The SERDES is located in the PHY IF
in the diagram below.
<-----------------GBE Controller---------->|<--External PHY chip-->
+----------+ +----+ +---+ +----------+
| EQoS | <-GMII->| DW | < ------ > |PHY| <-SGMII-> | External |
| MAC | |xPCS| |IF | | PHY |
+----------+ +----+ +---+ +----------+
^ ^ ^ ^
| | | |
+---------------------MDIO-------------------------+
PHY IF configuration and status registers are accessible through
mdio address 0x15 which is defined as mdio_adhoc_addr. During D0,
The driver will need to power up PHY IF by changing the power state
to P0. Likewise, for D3, the driver sets PHY IF power state to P3.
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VERMAGIC* definitions are not supposed to be used by the drivers,
see this [1] bug report, so introduce special define to guard inclusion
of this header file and define it in kernel/modules.h and in internal
script that generates *.mod.c files.
In-tree module build:
➜ kernel git:(vermagic) ✗ make clean
➜ kernel git:(vermagic) ✗ make M=drivers/infiniband/hw/mlx5
➜ kernel git:(vermagic) ✗ modinfo drivers/infiniband/hw/mlx5/mlx5_ib.ko
filename: /images/leonro/src/kernel/drivers/infiniband/hw/mlx5/mlx5_ib.ko
<...>
vermagic: 5.6.0+ SMP mod_unload modversions
Out-of-tree module build:
➜ mlx5 make -C /images/leonro/src/kernel clean M=/tmp/mlx5
➜ mlx5 make -C /images/leonro/src/kernel M=/tmp/mlx5
➜ mlx5 modinfo /tmp/mlx5/mlx5_ib.ko
filename: /tmp/mlx5/mlx5_ib.ko
<...>
vermagic: 5.6.0+ SMP mod_unload modversions
[1] https://lore.kernel.org/lkml/20200411155623.GA22175@zn.tnic
Reported-by: Borislav Petkov <bp@suse.de>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Jessica Yu <jeyu@kernel.org>
Co-developed-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge misc fixes from Andrew Morton:
"15 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
tools/vm: fix cross-compile build
coredump: fix null pointer dereference on coredump
mm: shmem: disable interrupt when acquiring info->lock in userfaultfd_copy path
shmem: fix possible deadlocks on shmlock_user_lock
vmalloc: fix remap_vmalloc_range() bounds checks
mm/shmem: fix build without THP
mm/ksm: fix NULL pointer dereference when KSM zero page is enabled
tools/build: tweak unused value workaround
checkpatch: fix a typo in the regex for $allocFunctions
mm, gup: return EINTR when gup is interrupted by fatal signals
mm/hugetlb: fix a addressing exception caused by huge_pte_offset
MAINTAINERS: add an entry for kfifo
mm/userfaultfd: disable userfaultfd-wp on x86_32
slub: avoid redzone when choosing freepointer location
sh: fix build error in mm/init.c
Pull kvm fixes from Paolo Bonzini:
"Bugfixes, and a few cleanups to the newly-introduced assembly language
vmentry code for AMD"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions
kvm: Disable objtool frame pointer checking for vmenter.S
MAINTAINERS: add a reviewer for KVM/s390
KVM: s390: Fix PV check in deliverable_irqs()
kvm: Handle reads of SandyBridge RAPL PMU MSRs rather than injecting #GP
KVM: Remove CREATE_IRQCHIP/SET_PIT2 race
KVM: SVM: Fix __svm_vcpu_run declaration.
KVM: SVM: Do not setup frame pointer in __svm_vcpu_run
KVM: SVM: Fix build error due to missing release_pages() include
KVM: SVM: Do not mark svm_vcpu_run with STACK_FRAME_NON_STANDARD
kvm: nVMX: match comment with return type for nested_vmx_exit_reflected
kvm: nVMX: reflect MTF VM-exits if injected by L1
KVM: s390: Return last valid slot if approx index is out-of-bounds
KVM: Check validity of resolved slot when searching memslots
KVM: VMX: Enable machine check support for 32bit targets
KVM: SVM: move more vmentry code to assembly
KVM: SVM: fix compilation with modular PSP and non-modular KVM
Pull virtio fixes and cleanups from Michael Tsirkin:
- Some bug fixes
- Cleanup a couple of issues that surfaced meanwhile
- Disable vhost on ARM with OABI for now - to be fixed fully later in
the cycle or in the next release.
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (24 commits)
vhost: disable for OABI
virtio: drop vringh.h dependency
virtio_blk: add a missing include
virtio-balloon: Avoid using the word 'report' when referring to free page hinting
virtio-balloon: make virtballoon_free_page_report() static
vdpa: fix comment of vdpa_register_device()
vdpa: make vhost, virtio depend on menu
vdpa: allow a 32 bit vq alignment
drm/virtio: fix up for include file changes
remoteproc: pull in slab.h
rpmsg: pull in slab.h
virtio_input: pull in slab.h
remoteproc: pull in slab.h
virtio-rng: pull in slab.h
virtgpu: pull in uaccess.h
tools/virtio: make asm/barrier.h self contained
tools/virtio: define aligned attribute
virtio/test: fix up after IOTLB changes
vhost: Create accessors for virtqueues private_data
vdpasim: Return status in vdpasim_get_status
...
remap_vmalloc_range() has had various issues with the bounds checks it
promises to perform ("This function checks that addr is a valid
vmalloc'ed area, and that it is big enough to cover the vma") over time,
e.g.:
- not detecting pgoff<<PAGE_SHIFT overflow
- not detecting (pgoff<<PAGE_SHIFT)+usize overflow
- not checking whether addr and addr+(pgoff<<PAGE_SHIFT) are the same
vmalloc allocation
- comparing a potentially wildly out-of-bounds pointer with the end of
the vmalloc region
In particular, since commit fc9702273e ("bpf: Add mmap() support for
BPF_MAP_TYPE_ARRAY"), unprivileged users can cause kernel null pointer
dereferences by calling mmap() on a BPF map with a size that is bigger
than the distance from the start of the BPF map to the end of the
address space.
This could theoretically be used as a kernel ASLR bypass, by using
whether mmap() with a given offset oopses or returns an error code to
perform a binary search over the possible address range.
To allow remap_vmalloc_range_partial() to verify that addr and
addr+(pgoff<<PAGE_SHIFT) are in the same vmalloc region, pass the offset
to remap_vmalloc_range_partial() instead of adding it to the pointer in
remap_vmalloc_range().
In remap_vmalloc_range_partial(), fix the check against
get_vm_area_size() by using size comparisons instead of pointer
comparisons, and add checks for pgoff.
Fixes: 833423143c ("[PATCH] mm: introduce remap_vmalloc_range()")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Link: http://lkml.kernel.org/r/20200415222312.236431-1-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Saeed Mahameed says:
====================
mlx5-updates-2020-04-20
This series includes misc updates and clean ups to mlx5 driver:
1) improve some comments from Hu Haowen.
2) Handles errors of netif_set_real_num_{tx,rx}_queues, from Maxim
3) IPsec and FPGA related code cleanup to prepare for ASIC devices
IPsec offloads, from Raed
4) Allow partial mask for tunnel options, from Roi.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the FPGA IPsec is the only hw implementation of the IPsec
acceleration api, and so the mlx5_accel_esp_create_hw_context was
wrongly made to suit this HW api, among other in its parameter list
and some of its parameter endianness.
This implementation might not be suitable for different HW.
Refactor by group and pass all function arguments of
mlx5_accel_esp_create_hw_context in common mlx5_accel_esp_xfrm_attrs
struct field of mlx5_accel_esp_xfrm struct and correct the endianness
according to the HW being called.
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
RFC 2863 defines the operational state testing. Add support for this
state, both as a IF_LINK_MODE_ and __LINK_STATE_.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull irq fixes from Thomas Gleixner:
"A set of fixes/updates for the interrupt subsystem:
- Remove setup_irq() and remove_irq(). All users have been converted
so remove them before new users surface.
- A set of bugfixes for various interrupt chip drivers
- Add a few missing static attributes to address sparse warnings"
* tag 'irq-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/irq-bcm7038-l1: Make bcm7038_l1_of_init() static
irqchip/irq-mvebu-icu: Make legacy_bindings static
irqchip/meson-gpio: Fix HARDIRQ-safe -> HARDIRQ-unsafe lock order
irqchip/sifive-plic: Fix maximum priority threshold value
irqchip/ti-sci-inta: Fix processing of masked irqs
irqchip/mbigen: Free msi_desc on device teardown
irqchip/gic-v4.1: Update effective affinity of virtual SGIs
irqchip/gic-v4.1: Add support for VPENDBASER's Dirty+Valid signaling
genirq: Remove setup_irq() and remove_irq()
Pull ext4 fixes from Ted Ts'o:
"Miscellaneous bug fixes and cleanups for ext4, including a fix for
generic/388 in data=journal mode, removing some BUG_ON's, and cleaning
up some compiler warnings"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: convert BUG_ON's to WARN_ON's in mballoc.c
ext4: increase wait time needed before reuse of deleted inode numbers
ext4: remove set but not used variable 'es' in ext4_jbd2.c
ext4: remove set but not used variable 'es'
ext4: do not zeroout extents beyond i_disksize
ext4: fix return-value types in several function comments
ext4: use non-movable memory for superblock readahead
ext4: use matching invalidatepage in ext4_writepage
Pull flexible-array member conversion from Gustavo Silva:
"The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array
member[1][2], introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof
operator may not be applied. As a quirk of the original
implementation of zero-length arrays, sizeof evaluates to zero."[1]
sizeof(flexible-array-member) triggers a warning because flexible
array members have incomplete type[1]. There are some instances of
code in which the sizeof operator is being incorrectly/erroneously
applied to zero-length arrays and the result is zero. Such instances
may be hiding some bugs. So, this work (flexible-array member
convertions) will also help to get completely rid of those sorts of
issues.
Notice that all of these patches have been baking in linux-next for
quite a while now and, 238 more of these patches have already been
merged into 5.7-rc1.
There are a couple hundred more of these issues waiting to be
addressed in the whole codebase"
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
* tag 'flexible-array-member-5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (28 commits)
xattr.h: Replace zero-length array with flexible-array member
uapi: linux: fiemap.h: Replace zero-length array with flexible-array member
uapi: linux: dlm_device.h: Replace zero-length array with flexible-array member
tpm_eventlog.h: Replace zero-length array with flexible-array member
ti_wilink_st.h: Replace zero-length array with flexible-array member
swap.h: Replace zero-length array with flexible-array member
skbuff.h: Replace zero-length array with flexible-array member
sched: topology.h: Replace zero-length array with flexible-array member
rslib.h: Replace zero-length array with flexible-array member
rio.h: Replace zero-length array with flexible-array member
posix_acl.h: Replace zero-length array with flexible-array member
platform_data: wilco-ec.h: Replace zero-length array with flexible-array member
memcontrol.h: Replace zero-length array with flexible-array member
list_lru.h: Replace zero-length array with flexible-array member
lib: cpu_rmap: Replace zero-length array with flexible-array member
irq.h: Replace zero-length array with flexible-array member
ihex.h: Replace zero-length array with flexible-array member
igmp.h: Replace zero-length array with flexible-array member
genalloc.h: Replace zero-length array with flexible-array member
ethtool.h: Replace zero-length array with flexible-array member
...