132603 Commits

Author SHA1 Message Date
Linus Torvalds
ddf21bd8ab Merge tag 'iov_iter.3-5.15-2021-09-17' of git://git.kernel.dk/linux-block
Pull io_uring iov_iter retry fixes from Jens Axboe:
 "This adds a helper to save/restore iov_iter state, and modifies
  io_uring to use it.

  After that is done, we can now kill the iter->truncated addition that
  we added for this release. The io_uring change is being overly
  cautious with the save/restore/advance, but better safe than sorry and
  we can always improve that and reduce the overhead if it proves to be
  of concern. The only case to be worried about in this regard is huge
  IO, where iteration can take a while to iterate segments.

  I spent some time writing test cases, and expanded the coverage quite
  a bit from the last posting of this. liburing carries this regression
  test case now:

      https://git.kernel.dk/cgit/liburing/tree/test/file-verify.c

  which exercises all of this. It now also supports provided buffers,
  and explicitly tests for end-of-file/device truncation as well.

  On top of that, Pavel sanitized the IOPOLL retry path to follow the
  exact same pattern as normal IO"

* tag 'iov_iter.3-5.15-2021-09-17' of git://git.kernel.dk/linux-block:
  io_uring: move iopoll reissue into regular IO path
  Revert "iov_iter: track truncated size"
  io_uring: use iov_iter state save/restore helpers
  iov_iter: add helper to save iov_iter state
2021-09-17 09:23:44 -07:00
Linus Torvalds
0bc7eb03cb Merge tag 'io_uring-5.15-2021-09-17' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "Mostly fixes for regressions in this cycle, but also a few fixes that
  predate this release.

  The odd one out is a tweak to the direct files added in this release,
  where attempting to reuse a slot is allowed instead of needing an
  explicit removal of that slot first. It's a considerable improvement
  in usability to that API, hence I'm sending it for -rc2.

   - io-wq race fix and cleanup (Hao)

   - loop_rw_iter() type fix

   - SQPOLL max worker race fix

   - Allow poll arm for O_NONBLOCK files, fixing a case where it's
     impossible to properly use io_uring if you cannot modify the file
     flags

   - Allow direct open to simply reuse a slot, instead of needing it
     explicitly removed first (Pavel)

   - Fix a case where we missed signal mask restoring in cqring_wait, if
     we hit -EFAULT (Xiaoguang)"

* tag 'io_uring-5.15-2021-09-17' of git://git.kernel.dk/linux-block:
  io_uring: allow retry for O_NONBLOCK if async is supported
  io_uring: auto-removal for direct open/accept
  io_uring: fix missing sigmask restore in io_cqring_wait()
  io_uring: pin SQPOLL data before unlocking ring lock
  io-wq: provide IO_WQ_* constants for IORING_REGISTER_IOWQ_MAX_WORKERS arg items
  io-wq: fix potential race of acct->nr_workers
  io-wq: code clean of io_wqe_create_worker()
  io_uring: ensure symmetry in handling iter types in loop_rw_iter()
2021-09-17 09:19:59 -07:00
Linus Torvalds
fc0c0548c1 Merge tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf.

  Current release - regressions:

   - vhost_net: fix OoB on sendmsg() failure

   - mlx5: bridge, fix uninitialized variable usage

   - bnxt_en: fix error recovery regression

  Current release - new code bugs:

   - bpf, mm: fix lockdep warning triggered by stack_map_get_build_id_offset()

  Previous releases - regressions:

   - r6040: restore MDIO clock frequency after MAC reset

   - tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()

   - dsa: flush switchdev workqueue before tearing down CPU/DSA ports

  Previous releases - always broken:

   - ptp: dp83640: don't define PAGE0, avoid compiler warning

   - igc: fix tunnel segmentation offloads

   - phylink: update SFP selected interface on advertising changes

   - stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume

   - mlx5e: fix mutual exclusion between CQE compression and HW TS

  Misc:

   - bpf, cgroups: fix cgroup v2 fallback on v1/v2 mixed mode

   - sfc: fallback for lack of xdp tx queues

   - hns3: add option to turn off page pool feature"

* tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
  mlxbf_gige: clear valid_polarity upon open
  igc: fix tunnel offloading
  net/{mlx5|nfp|bnxt}: Remove unnecessary RTNL lock assert
  net: wan: wanxl: define CROSS_COMPILE_M68K
  selftests: nci: replace unsigned int with int
  net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports
  Revert "net: phy: Uniform PHY driver access"
  net: dsa: destroy the phylink instance on any error in dsa_slave_phy_setup
  ptp: dp83640: don't define PAGE0
  bnx2x: Fix enabling network interfaces without VFs
  Revert "Revert "ipv4: fix memory leaks in ip_cmsg_send() callers""
  tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
  net-caif: avoid user-triggerable WARN_ON(1)
  bpf, selftests: Add test case for mixed cgroup v1/v2
  bpf, selftests: Add cgroup v1 net_cls classid helpers
  bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode
  bpf: Add oversize check before call kvcalloc()
  net: hns3: fix the timing issue of VF clearing interrupt sources
  net: hns3: fix the exception when query imp info
  net: hns3: disable mac in flr process
  ...
2021-09-16 13:05:42 -07:00
Linus Torvalds
ff1ffd71d5 Merge tag 'hyperv-fixes-signed-20210915' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:

 - Fix kernel crash caused by uio driver (Vitaly Kuznetsov)

 - Remove on-stack cpumask from HV APIC code (Wei Liu)

* tag 'hyperv-fixes-signed-20210915' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/hyperv: remove on-stack cpumask from hv_send_ipi_mask_allbutself
  asm-generic/hyperv: provide cpumask_to_vpset_noself
  Drivers: hv: vmbus: Fix kernel crash upon unbinding a device from uio_hv_generic driver
2021-09-15 17:18:56 -07:00
Vladimir Oltean
a57d8c217a net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports
Sometimes when unbinding the mv88e6xxx driver on Turris MOX, these error
messages appear:

mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete be:79:b4:9e:9e:96 vid 1 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete be:79:b4:9e:9e:96 vid 0 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 100 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 1 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 0 from fdb: -2

(and similarly for other ports)

What happens is that DSA has a policy "even if there are bugs, let's at
least not leak memory" and dsa_port_teardown() clears the dp->fdbs and
dp->mdbs lists, which are supposed to be empty.

But if that cleanup code is deleted, the warnings go away.

=> the FDB and MDB lists (used for refcounting on shared ports, aka CPU
and DSA ports) will eventually be empty, but are not empty by the time
we tear down those ports. Aka we are deleting them too soon.

The addresses that DSA complains about are host-trapped addresses: the
local addresses of the ports, and the MAC address of the bridge device.

The problem is that offloading those entries happens from a deferred
work item scheduled by the SWITCHDEV_FDB_DEL_TO_DEVICE handler, and this
races with the teardown of the CPU and DSA ports where the refcounting
is kept.

In fact, not only does it race, but fundamentally speaking, if we iterate
through the port list linearly, we might end up tearing down the shared
ports even before we delete a DSA user port which has a bridge upper.

So as it turns out, we need to first tear down the user ports (and the
unused ones, for no better place of doing that), then the shared ports
(the CPU and DSA ports). In between, we need to ensure that all work
items scheduled by our switchdev handlers (which only run for user
ports, hence the reason why we tear them down first) have finished.

Fixes: 161ca59d39 ("net: dsa: reference count the MDB entries at the cross-chip notifier level")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210914134726.2305133-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-15 15:09:46 -07:00
Linus Torvalds
d6efd3f187 Merge branch 'absolute-pointer' (patches from Guenter)
Merge absolute_pointer macro series from Guenter Roeck:
 "Kernel test builds currently fail for several architectures with error
  messages such as the following.

  drivers/net/ethernet/i825xx/82596.c: In function 'i82596_probe':
  arch/m68k/include/asm/string.h:72:25: error:
        '__builtin_memcpy' reading 6 bytes from a region of size 0
                [-Werror=stringop-overread]

  Such warnings may be reported by gcc 11.x for string and memory
  operations on fixed addresses if gcc's builtin functions are used for
  those operations.

  This series introduces absolute_pointer() to fix the problem.
  absolute_pointer() disassociates a pointer from its originating symbol
  type and context, and thus prevents gcc from making assumptions about
  pointers passed to memory operations"

* emailed patches from Guenter Roeck <linux@roeck-us.net>:
  alpha: Use absolute_pointer to define COMMAND_LINE
  alpha: Move setup.h out of uapi
  net: i825xx: Use absolute_pointer for memcpy from fixed memory location
  compiler.h: Introduce absolute_pointer macro
2021-09-15 12:11:48 -07:00
Guenter Roeck
f6b5f1a569 compiler.h: Introduce absolute_pointer macro
absolute_pointer() disassociates a pointer from its originating symbol
type and context. Use it to prevent compiler warnings/errors such as

  drivers/net/ethernet/i825xx/82596.c: In function 'i82596_probe':
  arch/m68k/include/asm/string.h:72:25: error:
	'__builtin_memcpy' reading 6 bytes from a region of size 0 [-Werror=stringop-overread]

Such warnings may be reported by gcc 11.x for string and memory
operations on fixed addresses.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-15 12:04:28 -07:00
Jens Axboe
7dedd3e180 Revert "iov_iter: track truncated size"
This reverts commit 2112ff5ce0.

We no longer need to track the truncation count, the one user that did
need it has been converted to using iov_iter_restore() instead.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-15 09:22:35 -06:00
Linus Torvalds
77e02cf57b memblock: introduce saner 'memblock_free_ptr()' interface
The boot-time allocation interface for memblock is a mess, with
'memblock_alloc()' returning a virtual pointer, but then you are
supposed to free it with 'memblock_free()' that takes a _physical_
address.

Not only is that all kinds of strange and illogical, but it actually
causes bugs, when people then use it like a normal allocation function,
and it fails spectacularly on a NULL pointer:

   https://lore.kernel.org/all/20210912140820.GD25450@xsang-OptiPlex-9020/

or just random memory corruption if the debug checks don't catch it:

   https://lore.kernel.org/all/61ab2d0c-3313-aaab-514c-e15b7aa054a0@suse.cz/

I really don't want to apply patches that treat the symptoms, when the
fundamental cause is this horribly confusing interface.

I started out looking at just automating a sane replacement sequence,
but because of this mix of virtual and physical addresses, and because
people have used the "__pa()" macro that can take either a regular
kernel pointer, or just the raw "unsigned long" address, it's all quite
messy.

So this just introduces a new saner interface for freeing a virtual
address that was allocated using 'memblock_alloc()', and that was kept
as a regular kernel pointer.  And then it converts a couple of users
that are obvious and easy to test, including the 'xbc_nodes' case in
lib/bootconfig.c that caused problems.

Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: 40caa127f3 ("init: bootconfig: Remove all bootconfig data when the init memory is removed")
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-14 13:23:22 -07:00
Jens Axboe
8fb0f47a9d iov_iter: add helper to save iov_iter state
In an ideal world, when someone is passed an iov_iter and returns X bytes,
then X bytes would have been consumed/advanced from the iov_iter. But we
have use cases that always consume the entire iterator, a few examples
of that are iomap and bdev O_DIRECT. This means we cannot rely on the
state of the iov_iter once we've called ->read_iter() or ->write_iter().

This would be easier if we didn't always have to deal with truncate of
the iov_iter, as rewinding would be trivial without that. We recently
added a commit to track the truncate state, but that grew the iov_iter
by 8 bytes and wasn't the best solution.

Implement a helper to save enough of the iov_iter state to sanely restore
it after we've called the read/write iterator helpers. This currently
only works for IOVEC/BVEC/KVEC as that's all we need, support for other
iterator types is left as an exercise for the reader.

Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wiacKV4Gh-MYjteU0LwNBSGpWrK-Ov25HdqB1ewinrFPg@mail.gmail.com/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-14 08:12:18 -06:00
David S. Miller
2865ba8247 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-09-14

The following pull-request contains BPF updates for your *net* tree.

We've added 7 non-merge commits during the last 13 day(s) which contain
a total of 18 files changed, 334 insertions(+), 193 deletions(-).

The main changes are:

1) Fix mmap_lock lockdep splat in BPF stack map's build_id lookup, from Yonghong Song.

2) Fix BPF cgroup v2 program bypass upon net_cls/prio activation, from Daniel Borkmann.

3) Fix kvcalloc() BTF line info splat on oversized allocation attempts, from Bixuan Cui.

4) Fix BPF selftest build of task_pt_regs test for arm64/s390, from Jean-Philippe Brucker.

5) Fix BPF's disasm.{c,h} to dual-license so that it is aligned with bpftool given the former
   is a build dependency for the latter, from Daniel Borkmann with ACKs from contributors.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 13:09:54 +01:00
Daniel Borkmann
8520e224f5 bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode
Fix cgroup v1 interference when non-root cgroup v2 BPF programs are used.
Back in the days, commit bd1060a1d6 ("sock, cgroup: add sock->sk_cgroup")
embedded per-socket cgroup information into sock->sk_cgrp_data and in order
to save 8 bytes in struct sock made both mutually exclusive, that is, when
cgroup v1 socket tagging (e.g. net_cls/net_prio) is used, then cgroup v2
falls back to the root cgroup in sock_cgroup_ptr() (&cgrp_dfl_root.cgrp).

The assumption made was "there is no reason to mix the two and this is in line
with how legacy and v2 compatibility is handled" as stated in bd1060a1d6.
However, with Kubernetes more widely supporting cgroups v2 as well nowadays,
this assumption no longer holds, and the possibility of the v1/v2 mixed mode
with the v2 root fallback being hit becomes a real security issue.

Many of the cgroup v2 BPF programs are also used for policy enforcement, just
to pick _one_ example, that is, to programmatically deny socket related system
calls like connect(2) or bind(2). A v2 root fallback would implicitly cause
a policy bypass for the affected Pods.

In production environments, we have recently seen this case due to various
circumstances: i) a different 3rd party agent and/or ii) a container runtime
such as [0] in the user's environment configuring legacy cgroup v1 net_cls
tags, which implicitly triggered the mentioned root fallback. Another case is
Kubernetes projects like kind [1] which create Kubernetes nodes in a container
and also add cgroup namespaces to the mix, meaning programs which are attached
to the cgroup v2 root of the cgroup namespace get attached to a non-root
cgroup v2 path from init namespace point of view. And the latter's root is
out of reach for agents on a kind Kubernetes node to configure. Meaning, any
entity on the node setting cgroup v1 net_cls tag will trigger the bypass
despite cgroup v2 BPF programs attached to the namespace root.

Generally, this mutual exclusiveness does not hold anymore in today's user
environments and makes cgroup v2 usage from BPF side fragile and unreliable.
This fix adds proper struct cgroup pointer for the cgroup v2 case to struct
sock_cgroup_data in order to address these issues; this implicitly also fixes
the tradeoffs being made back then with regards to races and refcount leaks
as stated in bd1060a1d6, and removes the fallback, so that cgroup v2 BPF
programs always operate as expected.

  [0] https://github.com/nestybox/sysbox/
  [1] https://kind.sigs.k8s.io/

Fixes: bd1060a1d6 ("sock, cgroup: add sock->sk_cgroup")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/bpf/20210913230759.2313-1-daniel@iogearbox.net
2021-09-13 16:35:58 -07:00
Linus Torvalds
316346243b Merge branch 'gcc-min-version-5.1' (make gcc-5.1 the minimum version)
Merge patch series from Nick Desaulniers to update the minimum gcc
version to 5.1.

This is some of the left-overs from the merge window that I didn't want
to deal with yesterday, so it comes in after -rc1 but was sent before.

Gcc-4.9 support has been an annoyance for some time, and with -Werror I
had the choice of applying a fairly big patch from Kees Cook to remove a
fair number of initializer warnings (still leaving some), or this patch
series from Nick that just removes the source of the problem.

The initializer cleanups might still be worth it regardless, but
honestly, I preferred just tackling the problem with gcc-4.9 head-on.
We've been more aggressive about no longer having to care about
compilers that were released a long time ago, and I think it's been a
good thing.

I added a couple of patches on top to sort out a few left-overs now that
we no longer support gcc-4.x.

As noted by Arnd, as a result of this minimum compiler version upgrade
we can probably change our use of '--std=gnu89' to '--std=gnu11', and
finally start using local loop declarations etc.  But this series does
_not_ yet do that.

Link: https://lore.kernel.org/all/20210909182525.372ee687@canb.auug.org.au/
Link: https://lore.kernel.org/lkml/CAK7LNASs6dvU6D3jL2GG3jW58fXfaj6VNOe55NJnTB8UPuk2pA@mail.gmail.com/
Link: https://github.com/ClangBuiltLinux/linux/issues/1438

* emailed patches from Nick Desaulniers <ndesaulniers@google.com>:
  Drop some straggling mentions of gcc-4.9 as being stale
  compiler_attributes.h: drop __has_attribute() support for gcc4
  vmlinux.lds.h: remove old check for GCC 4.9
  compiler-gcc.h: drop checks for older GCC versions
  Makefile: drop GCC < 5 -fno-var-tracking-assignments workaround
  arm64: remove GCC version check for ARCH_SUPPORTS_INT128
  powerpc: remove GCC version check for UPD_CONSTR
  riscv: remove Kconfig check for GCC version for ARCH_RV64I
  Kconfig.debug: drop GCC 5+ version check for DWARF5
  mm/ksm: remove old GCC 4.9+ check
  compiler.h: drop fallback overflow checkers
  Documentation: raise minimum supported version of GCC to 5.1
2021-09-13 10:43:04 -07:00
Linus Torvalds
df26327ea0 Drop some straggling mentions of gcc-4.9 as being stale
Fix up the admin-guide README file to the new gcc-5.1 requirement, and
remove a stale comment about gcc support for the __assume_aligned__
attribute.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 10:29:44 -07:00
Linus Torvalds
6d2ef226f2 compiler_attributes.h: drop __has_attribute() support for gcc4
Now that GCC 5.1 is the minimally supported default, the manual
workarounds for older gcc versions not having __has_attribute() are no
longer relevant and can be removed.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 10:20:01 -07:00
Nick Desaulniers
6f20fa2dfa vmlinux.lds.h: remove old check for GCC 4.9
Now that GCC 5.1 is the minimally supported version of GCC, we can
effectively revert commit 85c2ce9104 ("sched, vmlinux.lds: Increase
STRUCT_ALIGNMENT to 64 bytes for GCC-4.9")

Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 10:18:29 -07:00
Nick Desaulniers
4e59869aa6 compiler-gcc.h: drop checks for older GCC versions
Now that GCC 5.1 is the minimally supported default, drop the values we
don't use.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 10:18:29 -07:00
Nick Desaulniers
4eb6bd55cf compiler.h: drop fallback overflow checkers
Once upgrading the minimum supported version of GCC to 5.1, we can drop
the fallback code for !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW.

This is effectively a revert of commit f0907827a8 ("compiler.h: enable
builtin overflow checkers and add fallback code")

Link: https://github.com/ClangBuiltLinux/linux/issues/1438#issuecomment-916745801
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 10:18:28 -07:00
Eugene Syromiatnikov
dd47c10453 io-wq: provide IO_WQ_* constants for IORING_REGISTER_IOWQ_MAX_WORKERS arg items
The items passed in the array pointed to by the arg parameter
of IORING_REGISTER_IOWQ_MAX_WORKERS io_uring_register operation
carry certain semantics: they refer to different io-wq worker categories;
provide IO_WQ_* constants in the UAPI, so these categories can be referenced
in the user space code.

Suggested-by: Jens Axboe <axboe@kernel.dk>
Complements: 2e480058dd ("io-wq: provide a way to limit max number of workers")
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Link: https://lore.kernel.org/r/20210913154415.GA12890@asgard.redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-13 10:38:13 -06:00
Linus Torvalds
c3e46874df Merge tag 'compiler-attributes-for-linus-v5.15-rc1-v2' of git://github.com/ojeda/linux
Pull compiler attributes updates from Miguel Ojeda:

 - Fix __has_attribute(__no_sanitize_coverage__) for GCC 4 (Marco Elver)

 - Add Nick as Reviewer for compiler_attributes.h (Nick Desaulniers)

 - Move __compiletime_{error|warning} (Nick Desaulniers)

* tag 'compiler-attributes-for-linus-v5.15-rc1-v2' of git://github.com/ojeda/linux:
  compiler_attributes.h: move __compiletime_{error|warning}
  MAINTAINERS: add Nick as Reviewer for compiler_attributes.h
  Compiler Attributes: fix __has_attribute(__no_sanitize_coverage__) for GCC 4
2021-09-12 16:09:26 -07:00
Linus Torvalds
f306b90c69 Merge tag 'smp-urgent-2021-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug updates from Thomas Gleixner:
 "Updates for the SMP and CPU hotplug:

   - Remove DEFINE_SMP_CALL_CACHE_FUNCTION() which is a left over of the
     original hotplug code and now causing trouble with the ARM64 cache
     topology setup due to the pointless SMP function call.

     It's no longer required as the hotplug callbacks are guaranteed to
     be invoked on the upcoming CPU.

   - Remove the deprecated and now unused CPU hotplug functions

   - Rewrite the CPU hotplug API documentation"

* tag 'smp-urgent-2021-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation: core-api/cpuhotplug: Rewrite the API section
  cpu/hotplug: Remove deprecated CPU-hotplug functions.
  thermal: Replace deprecated CPU-hotplug functions.
  drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()
2021-09-12 12:42:51 -07:00
Linus Torvalds
165d05d88c Merge tag 'locking_urgent_for_v5.15_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Borislav Petkov:

 - Fix the futex PI requeue machinery to not return to userspace in
   inconsistent state

 - Avoid a potential null pointer dereference in the ww_mutex deadlock
   check

 - Other smaller cleanups and optimizations

* tag 'locking_urgent_for_v5.15_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rtmutex: Fix ww_mutex deadlock check
  futex: Remove unused variable 'vpid' in futex_proxy_trylock_atomic()
  futex: Avoid redundant task lookup
  futex: Clarify comment for requeue_pi_wake_futex()
  futex: Prevent inconsistent state and exit race
  futex: Return error code instead of assigning it without effect
  locking/rwsem: Add missing __init_rwsem() for PREEMPT_RT
2021-09-12 11:27:05 -07:00
Linus Torvalds
7bf3142625 Merge tag 'timers_urgent_for_v5.15_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Borislav Petkov:

 - Handle negative second values properly when converting a timespec64
   to nanoseconds.

* tag 'timers_urgent_for_v5.15_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time: Handle negative seconds correctly in timespec64_to_ns()
2021-09-12 11:10:31 -07:00
Linus Torvalds
78e709522d Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:

 - vduse driver ("vDPA Device in Userspace") supporting emulated virtio
   block devices

 - virtio-vsock support for end of record with SEQPACKET

 - vdpa: mac and mq support for ifcvf and mlx5

 - vdpa: management netlink for ifcvf

 - virtio-i2c, gpio dt bindings

 - misc fixes and cleanups

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (39 commits)
  Documentation: Add documentation for VDUSE
  vduse: Introduce VDUSE - vDPA Device in Userspace
  vduse: Implement an MMU-based software IOTLB
  vdpa: Support transferring virtual addressing during DMA mapping
  vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap()
  vdpa: Add an opaque pointer for vdpa_config_ops.dma_map()
  vhost-iotlb: Add an opaque pointer for vhost IOTLB
  vhost-vdpa: Handle the failure of vdpa_reset()
  vdpa: Add reset callback in vdpa_config_ops
  vdpa: Fix some coding style issues
  file: Export receive_fd() to modules
  eventfd: Export eventfd_wake_count to modules
  iova: Export alloc_iova_fast() and free_iova_fast()
  virtio-blk: remove unneeded "likely" statements
  virtio-balloon: Use virtio_find_vqs() helper
  vdpa: Make use of PFN_PHYS/PFN_UP/PFN_DOWN helper macro
  vsock_test: update message bounds test for MSG_EOR
  af_vsock: rename variables in receive loop
  virtio/vsock: support MSG_EOR bit processing
  vhost/vsock: support MSG_EOR bit processing
  ...
2021-09-11 14:48:42 -07:00
Linus Torvalds
ce4c8f8820 Merge tag 'trace-v5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
 "Minor fixes to the processing of the bootconfig tree"

* tag 'trace-v5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  bootconfig: Rename xbc_node_find_child() to xbc_node_find_subkey()
  tracing/boot: Fix to check the histogram control param is a leaf node
  tracing/boot: Fix trace_boot_hist_add_array() to check array is value
2021-09-11 10:16:30 -07:00
Linus Torvalds
6701e7e7d8 Merge tag 'pwm/for-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
Pull pwm updates from Thierry Reding:
 "The changes this time around are mostly janitorial in nature. A lot of
  this is simplifications of drivers using device-managed functions and
  improving compilation coverage.

  The Mediatek display PWM driver now supports the atomic API.

  Cleanups and minor fixes make up the remainder of this set"

* tag 'pwm/for-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (54 commits)
  pwm: mtk-disp: Implement atomic API .get_state()
  pwm: mtk-disp: Fix overflow in period and duty calculation
  pwm: mtk-disp: Implement atomic API .apply()
  pwm: mtk-disp: Adjust the clocks to avoid them mismatch
  dt-bindings: pwm: rockchip: Add description for rk3568
  pwm: Make pwmchip_remove() return void
  pwm: sun4i: Don't check the return code of pwmchip_remove()
  pwm: sifive: Don't check the return code of pwmchip_remove()
  pwm: samsung: Don't check the return code of pwmchip_remove()
  pwm: renesas-tpu: Don't check the return code of pwmchip_remove()
  pwm: rcar: Don't check the return code of pwmchip_remove()
  pwm: pca9685: Don't check the return code of pwmchip_remove()
  pwm: omap-dmtimer: Don't check the return code of pwmchip_remove()
  pwm: mtk-disp: Don't check the return code of pwmchip_remove()
  pwm: imx-tpm: Don't check the return code of pwmchip_remove()
  pwm: img: Don't check the return code of pwmchip_remove()
  pwm: cros-ec: Don't check the return code of pwmchip_remove()
  pwm: brcmstb: Don't check the return code of pwmchip_remove()
  pwm: atmel-tcb: Don't check the return code of pwmchip_remove()
  pwm: atmel-hlcdc: Don't check the return code of pwmchip_remove()
  ...
2021-09-11 09:26:00 -07:00
Linus Torvalds
dd4703876e Merge tag 'thermal-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal updates from Daniel Lezcano:

 - Add the tegra3 thermal sensor and fix the compilation testing on
   tegra by adding a dependency on ARCH_TEGRA along with COMPILE_TEST
   (Dmitry Osipenko)

 - Fix the error code for the exynos when devm_get_clk() fails (Dan
   Carpenter)

 - Add the TCC cooling support for AlderLake platform (Sumeet Pawnikar)

 - Add support for hardware trip points for the rcar gen3 thermal driver
   and store TSC id as unsigned int (Niklas Söderlund)

 - Replace the deprecated CPU-hotplug functions get_online_cpus() and
   put_online_cpus() (Sebastian Andrzej Siewior)

 - Add the thermal tools directory in the MAINTAINERS file (Daniel
   Lezcano)

 - Fix the Makefile and the cross compilation flags for the userspace
   'tmon' tool (Rolf Eike Beer)

 - Allow to use the IMOK independently from the GDDV on Int340x (Sumeet
   Pawnikar)

 - Fix the stub thermal_cooling_device_register() function prototype
   which does not match the real function (Arnd Bergmann)

 - Make the thermal trip point optional in the DT bindings (Maxime
   Ripard)

 - Fix a typo in a comment in the core code (Geert Uytterhoeven)

 - Reduce the verbosity of the trace in the SoC thermal tegra driver
   (Dmitry Osipenko)

 - Add the support for the LMh (Limit Management hardware) driver on the
   QCom platforms (Thara Gopinath)

 - Allow processing of HWP interrupt by adding a weak function in the
   Intel driver (Srinivas Pandruvada)

 - Prevent an abort of the sensor probe if a channel is not used
   (Matthias Kaehlcke)

* tag 'thermal-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/drivers/qcom/spmi-adc-tm5: Don't abort probing if a sensor is not used
  thermal/drivers/intel: Allow processing of HWP interrupt
  dt-bindings: thermal: Add dt binding for QCOM LMh
  thermal/drivers/qcom: Add support for LMh driver
  firmware: qcom_scm: Introduce SCM calls to access LMh
  thermal/drivers/tegra-soctherm: Silence message about clamped temperature
  thermal: Spelling s/scallbacks/callbacks/
  dt-bindings: thermal: Make trips node optional
  thermal/core: Fix thermal_cooling_device_register() prototype
  thermal/drivers/int340x: Use IMOK independently
  tools/thermal/tmon: Add cross compiling support
  thermal/tools/tmon: Improve the Makefile
  MAINTAINERS: Add missing userspace thermal tools to the thermal section
  thermal/drivers/intel_powerclamp: Replace deprecated CPU-hotplug functions.
  thermal/drivers/rcar_gen3_thermal: Store TSC id as unsigned int
  thermal/drivers/rcar_gen3_thermal: Add support for hardware trip points
  drivers/thermal/intel: Add TCC cooling support for AlderLake platform
  thermal/drivers/exynos: Fix an error code in exynos_tmu_probe()
  thermal/drivers/tegra: Correct compile-testing of drivers
  thermal/drivers/tegra: Add driver for Tegra30 thermal sensor
2021-09-11 09:20:57 -07:00
Wei Liu
7ad9bb9d0f asm-generic/hyperv: provide cpumask_to_vpset_noself
This is a new variant which removes the `self' CPU from the vpset. It will
be used in Hyper-V enlightened IPI code.
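
The effect of such a variant can be sketched in plain C, outside the kernel.
This is a simplified model and not the Hyper-V code: `toy_vpset`,
`toy_cpumask_to_vpset()`, the 128-CPU limit, and the one-to-one mapping from
CPU number to VP number are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_CPUS  128                 /* hypothetical limit for this sketch */
#define TOY_BANK_BITS 64                  /* one 64-bit word per bank */
#define TOY_BANKS     (TOY_MAX_CPUS / TOY_BANK_BITS)

/* Hypothetical, simplified stand-in for a sparse VP set. */
struct toy_vpset {
    uint64_t valid_bank_mask;             /* bit n set => bank n is non-empty */
    uint64_t bank_contents[TOY_BANKS];    /* bit m of bank n => VP n*64+m */
};

/*
 * Fill vpset from the CPUs marked true in cpus[].
 * When exclude_self is true, the CPU numbered 'self' is left out even if
 * it is marked in cpus[]. Returns the number of VPs placed in the set.
 */
static int toy_cpumask_to_vpset(struct toy_vpset *vpset,
                                const bool cpus[TOY_MAX_CPUS],
                                int self, bool exclude_self)
{
    int count = 0;

    assert(self >= 0 && self < TOY_MAX_CPUS);

    vpset->valid_bank_mask = 0;
    for (int bank = 0; bank < TOY_BANKS; bank++)
        vpset->bank_contents[bank] = 0;

    for (int cpu = 0; cpu < TOY_MAX_CPUS; cpu++) {
        if (!cpus[cpu])
            continue;
        if (exclude_self && cpu == self)
            continue;                     /* the "noself" behaviour */

        int bank = cpu / TOY_BANK_BITS;
        int bit = cpu % TOY_BANK_BITS;

        vpset->bank_contents[bank] |= UINT64_C(1) << bit;
        vpset->valid_bank_mask |= UINT64_C(1) << bank;
        count++;
    }
    return count;
}
```

A caller sending an IPI to "everyone but me" would pass its own CPU number as
`self` with `exclude_self` set, so it does not have to clear its own bit from
the mask first.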

Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210910185714.299411-2-wei.liu@kernel.org
2021-09-11 15:40:09 +00:00
Thomas Gleixner
c9871c800f Documentation: core-api/cpuhotplug: Rewrite the API section
Dave stumbled over the incomplete and confusing documentation of the CPU
hotplug API.

Rewrite it, add the missing function documentations and correct the
existing ones.

Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210909123212.489059409@linutronix.de
2021-09-11 00:41:21 +02:00
Sebastian Andrzej Siewior
8c854303ce cpu/hotplug: Remove deprecated CPU-hotplug functions.
No users in tree use the deprecated CPU-hotplug functions anymore.

Remove them.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210803141621.780504-39-bigeasy@linutronix.de
2021-09-11 00:41:21 +02:00
Thomas Gleixner
c2f4954c2d Merge branch 'linus' into smp/urgent
Ensure that all usage sites of get/put_online_cpus() except for the
straggler in drivers/thermal are gone. So the last user and the deprecated
inlines can be removed.
2021-09-11 00:38:47 +02:00
Yonghong Song
2f1aaf3ea6 bpf, mm: Fix lockdep warning triggered by stack_map_get_build_id_offset()
Currently the bpf selftest "get_stack_raw_tp" triggered the warning:

  [ 1411.304463] WARNING: CPU: 3 PID: 140 at include/linux/mmap_lock.h:164 find_vma+0x47/0xa0
  [ 1411.304469] Modules linked in: bpf_testmod(O) [last unloaded: bpf_testmod]
  [ 1411.304476] CPU: 3 PID: 140 Comm: systemd-journal Tainted: G        W  O      5.14.0+ #53
  [ 1411.304479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
  [ 1411.304481] RIP: 0010:find_vma+0x47/0xa0
  [ 1411.304484] Code: de 48 89 ef e8 ba f5 fe ff 48 85 c0 74 2e 48 83 c4 08 5b 5d c3 48 8d bf 28 01 00 00 be ff ff ff ff e8 2d 9f d8 00 85 c0 75 d4 <0f> 0b 48 89 de 48 8
  [ 1411.304487] RSP: 0018:ffffabd440403db8 EFLAGS: 00010246
  [ 1411.304490] RAX: 0000000000000000 RBX: 00007f00ad80a0e0 RCX: 0000000000000000
  [ 1411.304492] RDX: 0000000000000001 RSI: ffffffff9776b144 RDI: ffffffff977e1b0e
  [ 1411.304494] RBP: ffff9cf5c2f50000 R08: ffff9cf5c3eb25d8 R09: 00000000fffffffe
  [ 1411.304496] R10: 0000000000000001 R11: 00000000ef974e19 R12: ffff9cf5c39ae0e0
  [ 1411.304498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9cf5c39ae0e0
  [ 1411.304501] FS:  00007f00ae754780(0000) GS:ffff9cf5fba00000(0000) knlGS:0000000000000000
  [ 1411.304504] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 1411.304506] CR2: 000000003e34343c CR3: 0000000103a98005 CR4: 0000000000370ee0
  [ 1411.304508] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [ 1411.304510] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [ 1411.304512] Call Trace:
  [ 1411.304517]  stack_map_get_build_id_offset+0x17c/0x260
  [ 1411.304528]  __bpf_get_stack+0x18f/0x230
  [ 1411.304541]  bpf_get_stack_raw_tp+0x5a/0x70
  [ 1411.305752] RAX: 0000000000000000 RBX: 5541f689495641d7 RCX: 0000000000000000
  [ 1411.305756] RDX: 0000000000000001 RSI: ffffffff9776b144 RDI: ffffffff977e1b0e
  [ 1411.305758] RBP: ffff9cf5c02b2f40 R08: ffff9cf5ca7606c0 R09: ffffcbd43ee02c04
  [ 1411.306978]  bpf_prog_32007c34f7726d29_bpf_prog1+0xaf/0xd9c
  [ 1411.307861] R10: 0000000000000001 R11: 0000000000000044 R12: ffff9cf5c2ef60e0
  [ 1411.307865] R13: 0000000000000005 R14: 0000000000000000 R15: ffff9cf5c2ef6108
  [ 1411.309074]  bpf_trace_run2+0x8f/0x1a0
  [ 1411.309891] FS:  00007ff485141700(0000) GS:ffff9cf5fae00000(0000) knlGS:0000000000000000
  [ 1411.309896] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 1411.311221]  syscall_trace_enter.isra.20+0x161/0x1f0
  [ 1411.311600] CR2: 00007ff48514d90e CR3: 0000000107114001 CR4: 0000000000370ef0
  [ 1411.312291]  do_syscall_64+0x15/0x80
  [ 1411.312941] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [ 1411.313803]  entry_SYSCALL_64_after_hwframe+0x44/0xae
  [ 1411.314223] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [ 1411.315082] RIP: 0033:0x7f00ad80a0e0
  [ 1411.315626] Call Trace:
  [ 1411.315632]  stack_map_get_build_id_offset+0x17c/0x260

To reproduce, first build `test_progs` binary:

  make -C tools/testing/selftests/bpf -j60

and then run the binary at tools/testing/selftests/bpf directory:

  ./test_progs -t get_stack_raw_tp

The warning is due to commit 5b78ed24e8 ("mm/pagemap: add mmap_assert_locked()
annotations to find_vma*()") which added mmap_assert_locked() in find_vma()
function. The mmap_assert_locked() function asserts that mm->mmap_lock is
held. But this is not the case for the bpf_get_stack() or bpf_get_stackid()
helpers (kernel/bpf/stackmap.c), which use mmap_read_trylock_non_owner()
instead. Since mm->mmap_lock is not held in the bpf_get_stack[id]() use case,
the above warning is emitted during the test run.

This patch fixes the issue by (1) using mmap_read_trylock() instead of
mmap_read_trylock_non_owner() to satisfy lockdep checking in find_vma(), and
(2) dropping lockdep for mmap_lock right before the irq_work_queue(). The
function mmap_read_trylock_non_owner() is also removed since after this
patch nobody calls it any more.

Fixes: 5b78ed24e8 ("mm/pagemap: add mmap_assert_locked() annotations to find_vma*()")
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Luigi Rizzo <lrizzo@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-mm@kvack.org
Link: https://lore.kernel.org/bpf/20210909155000.1610299-1-yhs@fb.com
2021-09-10 22:24:23 +02:00
Linus Torvalds
d6498af58f Merge tag 'pm-5.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
 "These improve hybrid processors support in intel_pstate, fix an issue
  in the core devices PM code, clean up the handling of dedicated wake
  IRQs, update the Energy Model documentation and update MAINTAINERS.

  Specifics:

   - Make the HWP performance levels calibration on hybrid processors in
     intel_pstate more straightforward (Rafael Wysocki).

   - Prevent the PM core from leaving devices in suspend after a failing
     system-wide suspend transition in some cases when driver PM flags
     are used (Prasad Sodagudi).

   - Drop unused function argument from the dedicated wake IRQs handling
     code (Sergey Shtylyov).

   - Fix up Energy Model kerneldoc comments and include them in the
     Energy Model documentation (Lukasz Luba).

   - Use my kernel.org address in MAINTAINERS instead of the personal one
     (Rafael Wysocki)"

* tag 'pm-5.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  MAINTAINERS: Change Rafael's e-mail address
  PM: sleep: core: Avoid setting power.must_resume to false
  Documentation: power: include kernel-doc in Energy Model doc
  PM: EM: fix kernel-doc comments
  cpufreq: intel_pstate: hybrid: Rework HWP calibration
  ACPI: CPPC: Introduce cppc_get_nominal_perf()
  PM: sleep: wakeirq: drop useless parameter from dev_pm_attach_wake_irq()
2021-09-10 13:20:47 -07:00
Linus Torvalds
5ffc06ebea Merge tag 'char-misc-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull habanalabs updates from Greg KH:
 "Here is another round of misc driver patches for 5.15-rc1.

  In here is only updates for the Habanalabs driver. This request is
  late because the previously-objected-to dma-buf patches are all
  removed and some fixes that you and others found are now included in
  here as well.

  All of these have been in linux-next for well over a week with no
  reports of problems, and they are all self-contained to only this one
  driver. Full details are in the shortlog"

* tag 'char-misc-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (61 commits)
  habanalabs/gaudi: hwmon default card name
  habanalabs: add support for f/w reset
  habanalabs/gaudi: block ICACHE_BASE_ADDERESS_HIGH in TPC
  habanalabs: cannot sleep while holding spinlock
  habanalabs: never copy_from_user inside spinlock
  habanalabs: remove unnecessary device status check
  habanalabs: disable IRQ in user interrupts spinlock
  habanalabs: add "in device creation" status
  habanalabs/gaudi: invalidate PMMU mem cache on init
  habanalabs/gaudi: size should be printed in decimal
  habanalabs/gaudi: define DC POWER for secured PMC
  habanalabs/gaudi: unmask out of bounds SLM access interrupt
  habanalabs: add userptr_lookup node in debugfs
  habanalabs/gaudi: fetch TPC/MME ECC errors from F/W
  habanalabs: modify multi-CS to wait on stream masters
  habanalabs/gaudi: add monitored SOBs to state dump
  habanalabs/gaudi: restore user registers when context opens
  habanalabs/gaudi: increase boot fit timeout
  habanalabs: update to latest firmware headers
  habanalabs/gaudi: minimize number of register reads
  ...
2021-09-10 11:31:47 -07:00
Rafael J. Wysocki
be2d24336f Merge branches 'pm-cpufreq', 'pm-sleep' and 'pm-em'
* pm-cpufreq:
  cpufreq: intel_pstate: hybrid: Rework HWP calibration
  ACPI: CPPC: Introduce cppc_get_nominal_perf()

* pm-sleep:
  PM: sleep: core: Avoid setting power.must_resume to false
  PM: sleep: wakeirq: drop useless parameter from dev_pm_attach_wake_irq()

* pm-em:
  Documentation: power: include kernel-doc in Energy Model doc
  PM: EM: fix kernel-doc comments
2021-09-10 20:26:08 +02:00
Linus Torvalds
a668acb8f0 Merge tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Just an initial bunch of fixes for the merge window. Most of them are
  for amdgpu, with a few ttm fixes and an fbdev multiplication overflow fix.

  core:
   - Make some dma-buf config options depend on DMA_SHARED_BUFFER
   - Handle multiplication overflow of fbdev xres/yres in the core

  ttm:
   - Fix ttm_bo_move_memcpy() when ttm_resource is subclassed
   - Fix ttm deadlock if target BO isn't idle
   - ttm build fix
   - ttm docs fix

  dma-buf:
   - config option fixes

  fbdev:
   - limit resolutions to avoid int overflow

  i915:
   - stddef change.

  amdgpu:
   - Misc cleanups, typo fixes
   - EEPROM fix
   - Add some new PCI IDs
   - Scatter/Gather display support for Yellow Carp
   - PCIe DPM fix for RKL platforms
   - RAS fix

  amdkfd:
   - SVM fix

  vc4:
   - static function fix

  mgag200:
   - fix uninit var

  panfrost:
   - lock_region fixes"

* tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drm: (36 commits)
  drm/ttm: Fix a deadlock if the target BO is not idle during swap
  fbmem: don't allow too huge resolutions
  dma-buf: DMABUF_SYSFS_STATS should depend on DMA_SHARED_BUFFER
  dma-buf: DMABUF_DEBUG should depend on DMA_SHARED_BUFFER
  drm/i915: use linux/stddef.h due to "isystem: trim/fixup stdarg.h and other headers"
  dma-buf: DMABUF_MOVE_NOTIFY should depend on DMA_SHARED_BUFFER
  drm/amdkfd: drop process ref count when xnack disable
  drm/amdgpu: enable more pm sysfs under SRIOV 1-VF mode
  drm/amdgpu: fix fdinfo race with process exit
  drm/amdgpu: Fix a deadlock if previous GEM object allocation fails
  drm/amdgpu: stop scheduler when calling hw_fini (v2)
  drm/amdgpu: Clear RAS interrupt status on aldebaran
  drm/amd/display: Initialize lt_settings on instantiation
  drm/amd/display: cleanup idents after a revert
  drm/amd/display: Fix memory leak reported by coverity
  drm/ttm: Fix ttm_bo_move_memcpy() for subclassed struct ttm_resource
  drm/amdgpu/swsmu: fix spelling mistake "minimun" -> "minimum"
  drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform
  drm/amdgpu: show both cmd id and name when psp cmd failed
  drm/amd/display: setup system context for APUs
  ...
2021-09-10 11:22:23 -07:00
Masami Hiramatsu
5dfe50b055 bootconfig: Rename xbc_node_find_child() to xbc_node_find_subkey()
Rename xbc_node_find_child() to xbc_node_find_subkey() to clarify
that the function returns a key node (not a value node).
Since there are xbc_node_for_each_child() (loop over all child
nodes) and xbc_node_for_each_subkey() (loop over only subkey
nodes), this name distinction is necessary to avoid confusing
users.

Link: https://lkml.kernel.org/r/163119459826.161018.11200274779483115300.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-09 19:14:33 -04:00
Linus Torvalds
d6c338a741 Merge tag 'for-linus-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:

 - Support for VMAP_STACK

 - Support for splice_write in hostfs

 - Fixes for virt-pci

 - Fixes for virtio_uml

 - Various fixes

* tag 'for-linus-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: fix stub location calculation
  um: virt-pci: fix uapi documentation
  um: enable VMAP_STACK
  um: virt-pci: don't do DMA from stack
  hostfs: support splice_write
  um: virtio_uml: fix memory leak on init failures
  um: virtio_uml: include linux/virtio-uml.h
  lib/logic_iomem: fix sparse warnings
  um: make PCI emulation driver init/exit static
2021-09-09 13:45:26 -07:00
Linus Torvalds
35776f1051 Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM development updates from Russell King:

 - Rename "mod_init" and "mod_exit" so that initcall debug output is
   actually useful (Randy Dunlap)

 - Update maintainers entries for linux-arm-kernel to indicate it is
   moderated for non-subscribers (Randy Dunlap)

 - Move install rules to arch/arm/Makefile (Masahiro Yamada)

 - Drop unnecessary ARCH_NR_GPIOS definition (Linus Walleij)

 - Don't warn about atags_to_fdt() stack size (David Heidelberg)

 - Speed up unaligned copy_{from,to}_kernel_nofault (Arnd Bergmann)

 - Get rid of set_fs() usage (Arnd Bergmann)

 - Remove checks for GCC prior to v4.6 (Geert Uytterhoeven)

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9118/1: div64: Remove always-true __div64_const32_is_OK() duplicate
  ARM: 9117/1: asm-generic: div64: Remove always-true __div64_const32_is_OK()
  ARM: 9116/1: unified: Remove check for gcc < 4
  ARM: 9110/1: oabi-compat: fix oabi epoll sparse warning
  ARM: 9113/1: uaccess: remove set_fs() implementation
  ARM: 9112/1: uaccess: add __{get,put}_kernel_nofault
  ARM: 9111/1: oabi-compat: rework fcntl64() emulation
  ARM: 9114/1: oabi-compat: rework sys_semtimedop emulation
  ARM: 9108/1: oabi-compat: rework epoll_wait/epoll_pwait emulation
  ARM: 9107/1: syscall: always store thread_info->abi_syscall
  ARM: 9109/1: oabi-compat: add epoll_pwait handler
  ARM: 9106/1: traps: use get_kernel_nofault instead of set_fs()
  ARM: 9115/1: mm/maccess: fix unaligned copy_{from,to}_kernel_nofault
  ARM: 9105/1: atags_to_fdt: don't warn about stack size
  ARM: 9103/1: Drop ARCH_NR_GPIOS definition
  ARM: 9102/1: move the install rules to arch/arm/Makefile
  ARM: 9100/1: MAINTAINERS: mark all linux-arm-kernel@infradead list as moderated
  ARM: 9099/1: crypto: rename 'mod_init' & 'mod_exit' functions to be module-specific
2021-09-09 13:25:49 -07:00
Linus Torvalds
7b871c7713 Merge branch 'work.gfs2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull gfs2 setattr updates from Al Viro:
 "Make it possible for filesystems to use a generic 'may_setattr()' and
  switch gfs2 to using it"

* 'work.gfs2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  gfs2: Switch to may_setattr in gfs2_setattr
  fs: Move notify_change permission checks into may_setattr
2021-09-09 12:45:26 -07:00
Linus Torvalds
e2e694b9e6 Merge branch 'work.init' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull root filesystem type handling updates from Al Viro:
 "Teach init/do_mounts.c to handle non-block filesystems, hopefully
  preventing even more special-cased kludges (such as root=/dev/nfs,
  etc)"

* 'work.init' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: simplify get_filesystem_list / get_all_fs_names
  init: allow mounting arbitrary non-blockdevice filesystems as root
  init: split get_fs_names
2021-09-09 12:38:18 -07:00
Linus Torvalds
7b7699c09f Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull iov_iter fixes from Al Viro:
 "Fixes for io_uring handling of iov_iter reexpands"

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  io_uring: reexpand under-reexpanded iters
  iov_iter: track truncated size
2021-09-09 12:13:46 -07:00
Linus Torvalds
70868a1805 Merge tag 'cxl-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL (Compute Express Link) updates from Dan Williams:

 - Fix detection of CXL host bridges to filter out disabled ACPI0016
   devices in the ACPI DSDT.

 - Fix kernel lockdown integration to disable raw commands when raw PCI
   access is disabled.

 - Fix a broken debug message.

 - Add support for "Get Partition Info". I.e. enumerate the split
   between volatile and persistent capacity on bi-modal CXL memory
   expanders.

 - Re-factor the core by subject area. This is a work in progress.

 - Prepare libnvdimm to understand CXL labels in addition to EFI labels.
   This is a work in progress.

* tag 'cxl-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (25 commits)
  cxl/registers: Fix Documentation warning
  cxl/pmem: Fix Documentation warning
  cxl/uapi: Fix defined but not used warnings
  cxl/pci: Fix debug message in cxl_probe_regs()
  cxl/pci: Fix lockdown level
  cxl/acpi: Do not add DSDT disabled ACPI0016 host bridge ports
  libnvdimm/labels: Add claim class helpers
  libnvdimm/labels: Add type-guid helpers
  libnvdimm/labels: Add blk special cases for nlabel and position helpers
  libnvdimm/labels: Add blk isetcookie set / validation helpers
  libnvdimm/labels: Add a checksum calculation helper
  libnvdimm/labels: Introduce label setter helpers
  libnvdimm/labels: Add isetcookie validation helper
  libnvdimm/labels: Introduce getters for namespace label fields
  cxl/mem: Adjust ram/pmem range to represent DPA ranges
  cxl/mem: Account for partitionable space in ram/pmem ranges
  cxl/pci: Store memory capacity values
  cxl/pci: Simplify register setup
  cxl/pci: Ignore unknown register block types
  cxl/core: Move memdev management to core
  ...
2021-09-09 11:48:27 -07:00
Linus Torvalds
2e5fd489a4 Merge tag 'libnvdimm-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:

 - Fix a race condition in the teardown path of raw mode pmem
   namespaces.

 - Clean up the code that filesystems use to detect filesystem-dax
   capabilities of their underlying block device.

* tag 'libnvdimm-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dax: remove bdev_dax_supported
  xfs: factor out a xfs_buftarg_is_dax helper
  dax: stub out dax_supported for !CONFIG_FS_DAX
  dax: remove __generic_fsdax_supported
  dax: move the dax_read_lock() locking into dax_supported
  dax: mark dax_get_by_host static
  dm: use fs_dax_get_by_bdev instead of dax_get_by_host
  dax: stop using bdevname
  fsdax: improve the FS_DAX Kconfig description and help text
  libnvdimm/pmem: Fix crash triggered when I/O in-flight during unbind
2021-09-09 11:39:57 -07:00
Linus Torvalds
0aa2516017 Merge tag 'dmaengine-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
 "New drivers/devices
   - Support for Renesas RZ/G2L dma controller
   - New driver for AMD PTDMA controller

  Updates:
   - Big pile of idxd updates
   - Updates for Altera driver, stm32-dma, dw etc"

* tag 'dmaengine-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (83 commits)
  dmaengine: sh: fix some NULL dereferences
  dmaengine: sh: Fix unused initialization of pointer lmdesc
  MAINTAINERS: Fix AMD PTDMA DRIVER entry
  dmaengine: ptdma: remove PT_OFFSET to avoid redefinition
  dmaengine: ptdma: Add debugfs entries for PTDMA
  dmaengine: ptdma: register PTDMA controller as a DMA resource
  dmaengine: ptdma: Initial driver for the AMD PTDMA
  dmaengine: fsl-dpaa2-qdma: Fix spelling mistake "faile" -> "failed"
  dmaengine: idxd: remove interrupt disable for dev_lock
  dmaengine: idxd: remove interrupt disable for cmd_lock
  dmaengine: idxd: fix setting up priv mode for dwq
  dmaengine: xilinx_dma: Set DMA mask for coherent APIs
  dmaengine: ti: k3-psil-j721e: Add entry for CSI2RX
  dmaengine: sh: Add DMAC driver for RZ/G2L SoC
  dmaengine: Extend the dma_slave_width for 128 bytes
  dt-bindings: dma: Document RZ/G2L bindings
  dmaengine: ioat: depends on !UML
  dmaengine: idxd: set descriptor allocation size to threshold for swq
  dmaengine: idxd: make submit failure path consistent on desc freeing
  dmaengine: idxd: remove interrupt flag for completion list spinlock
  ...
2021-09-09 11:07:47 -07:00
Eric Dumazet
04f08eb44b net/af_unix: fix a data-race in unix_dgram_poll
syzbot reported another data-race in af_unix [1]

Let's change __skb_insert() to use WRITE_ONCE() when changing
skb head qlen.

Also, change unix_dgram_poll() to use the lockless version
of unix_recvq_full().

It is very possible we can switch all/most unix_recvq_full() callers
to the lockless version; this will be done in a future kernel version.
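
The pattern described above (a length written under a lock but read without
one) can be sketched in userspace C. This is only an analogue: relaxed C11
atomics stand in for WRITE_ONCE()/READ_ONCE(), and `toy_queue`,
`toy_queue_insert()`, `toy_queue_remove()` and `toy_recvq_full_lockless()` are
hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified receive queue: only the length matters here. */
struct toy_queue {
    atomic_uint qlen;          /* written with the queue lock held, read without it */
    unsigned int max_backlog;  /* queue counts as full once qlen exceeds this */
};

/* Writer side: the caller is assumed to hold the queue lock. */
static void toy_queue_insert(struct toy_queue *q)
{
    unsigned int len = atomic_load_explicit(&q->qlen, memory_order_relaxed);

    /* Single, untorn store so a concurrent lockless reader sees old or new. */
    atomic_store_explicit(&q->qlen, len + 1, memory_order_relaxed);
}

/* Writer side: the caller is assumed to hold the queue lock. */
static void toy_queue_remove(struct toy_queue *q)
{
    unsigned int len = atomic_load_explicit(&q->qlen, memory_order_relaxed);

    assert(len > 0);
    atomic_store_explicit(&q->qlen, len - 1, memory_order_relaxed);
}

/* Reader side (e.g. a poll path): takes no lock, tolerates a stale value. */
static bool toy_recvq_full_lockless(struct toy_queue *q)
{
    return atomic_load_explicit(&q->qlen, memory_order_relaxed) > q->max_backlog;
}
```

The reader may observe a length that is already out of date, which is
acceptable for a poll-style hint; what the annotated accesses rule out is a
torn or compiler-duplicated access to the counter.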

[1] HEAD commit: 8596e589b7

BUG: KCSAN: data-race in skb_queue_tail / unix_dgram_poll

write to 0xffff88814eeb24e0 of 4 bytes by task 25815 on cpu 0:
 __skb_insert include/linux/skbuff.h:1938 [inline]
 __skb_queue_before include/linux/skbuff.h:2043 [inline]
 __skb_queue_tail include/linux/skbuff.h:2076 [inline]
 skb_queue_tail+0x80/0xa0 net/core/skbuff.c:3264
 unix_dgram_sendmsg+0xff2/0x1600 net/unix/af_unix.c:1850
 sock_sendmsg_nosec net/socket.c:703 [inline]
 sock_sendmsg net/socket.c:723 [inline]
 ____sys_sendmsg+0x360/0x4d0 net/socket.c:2392
 ___sys_sendmsg net/socket.c:2446 [inline]
 __sys_sendmmsg+0x315/0x4b0 net/socket.c:2532
 __do_sys_sendmmsg net/socket.c:2561 [inline]
 __se_sys_sendmmsg net/socket.c:2558 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2558
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff88814eeb24e0 of 4 bytes by task 25834 on cpu 1:
 skb_queue_len include/linux/skbuff.h:1869 [inline]
 unix_recvq_full net/unix/af_unix.c:194 [inline]
 unix_dgram_poll+0x2bc/0x3e0 net/unix/af_unix.c:2777
 sock_poll+0x23e/0x260 net/socket.c:1288
 vfs_poll include/linux/poll.h:90 [inline]
 ep_item_poll fs/eventpoll.c:846 [inline]
 ep_send_events fs/eventpoll.c:1683 [inline]
 ep_poll fs/eventpoll.c:1798 [inline]
 do_epoll_wait+0x6ad/0xf00 fs/eventpoll.c:2226
 __do_sys_epoll_wait fs/eventpoll.c:2238 [inline]
 __se_sys_epoll_wait fs/eventpoll.c:2233 [inline]
 __x64_sys_epoll_wait+0xf6/0x120 fs/eventpoll.c:2233
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x0000001b -> 0x00000001

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 25834 Comm: syz-executor.1 Tainted: G        W         5.14.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 86b18aaa2b ("skbuff: fix a data race in skb_queue_len()")
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09 10:57:52 +01:00
Dave Airlie
de04744d65 Merge tag 'drm-misc-next-fixes-2021-09-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next-fixes for v5.15:
- Fix ttm_bo_move_memcpy() when ttm_resource is subclassed.
- Small fixes to panfrost, mgag200, vc4.
- Small ttm compilation fixes.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/41ff5e54-0837-2226-a182-97ffd11ef01e@linux.intel.com
2021-09-09 13:35:54 +10:00
Linus Torvalds
a3fa7a101d Merge branches 'akpm' and 'akpm-hotfixes' (patches from Andrew)
Merge yet more updates and hotfixes from Andrew Morton:
 "Post-linux-next material, based upon latest upstream to catch the
  now-merged dependencies:

   - 10 patches.

     Subsystems affected by this patch series: mm (vmstat and migration)
     and compat.

  And a bunch of hotfixes, mostly cc:stable:

   - 8 patches.

     Subsystems affected by this patch series: mm (hmm, hugetlb, vmscan,
     pagealloc, pagemap, kmemleak, mempolicy, and memblock)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  arch: remove compat_alloc_user_space
  compat: remove some compat entry points
  mm: simplify compat numa syscalls
  mm: simplify compat_sys_move_pages
  kexec: avoid compat_alloc_user_space
  kexec: move locking into do_kexec_load
  mm: migrate: change to use bool type for 'page_was_mapped'
  mm: migrate: fix the incorrect function name in comments
  mm: migrate: introduce a local variable to get the number of pages
  mm/vmstat: protect per cpu variables with preempt disable on RT

* emailed hotfixes from Andrew Morton <akpm@linux-foundation.org>:
  nds32/setup: remove unused memblock_region variable in setup_memory()
  mm/mempolicy: fix a race between offset_il_node and mpol_rebind_task
  mm/kmemleak: allow __GFP_NOLOCKDEP passed to kmemleak's gfp
  mmap_lock: change trace and locking order
  mm/page_alloc.c: avoid accessing uninitialized pcp page migratetype
  mm,vmscan: fix divide by zero in get_scan_count
  mm/hugetlb: initialize hugetlb_usage in mm_init
  mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled
2021-09-08 18:52:05 -07:00
Liam Howlett
1099431608 mmap_lock: change trace and locking order
Print to the trace log before releasing the lock to avoid racing with
other trace log printers of the same lock type.

Link: https://lkml.kernel.org/r/20210903022041.1843024-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken.cr@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 18:45:53 -07:00
Liu Zixian
13db8c5047 mm/hugetlb: initialize hugetlb_usage in mm_init
After fork, the child process will get incorrect (2x) hugetlb_usage.  If
a process uses 5 2MB hugetlb pages in an anonymous mapping,

	HugetlbPages:	   10240 kB

and then forks, the child will show,

	HugetlbPages:	   20480 kB

The amount is doubled because hugetlb_usage is copied from the parent
and then increased again when we copy page tables from parent to child.
The child ends up with 2x the actual usage.

Fix this by adding hugetlb_count_init in mm_init.
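
The arithmetic can be reproduced with a small userspace model. It is a sketch
under stated assumptions rather than the kernel code: `toy_mm`,
`toy_map_hpages()`, `toy_fork()` and `toy_hugetlb_kb()` are hypothetical names,
and the `reset_counter` flag stands in for the hugetlb_count_init call that
the fix adds to mm_init.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_HPAGE_KB 2048L   /* one 2MB huge page, expressed in kB */

/* Hypothetical stand-in for the per-mm accounting. */
struct toy_mm {
    long hugetlb_usage;      /* huge pages charged to this mm */
    long mapped_hpages;      /* huge pages actually mapped */
};

/* Map n huge pages and charge them to the mm. */
static void toy_map_hpages(struct toy_mm *mm, long n)
{
    assert(n >= 0);
    mm->mapped_hpages += n;
    mm->hugetlb_usage += n;
}

/*
 * Model of fork: the child starts as a copy of the parent, so it inherits
 * the parent's counter; copying the page tables then charges every huge
 * page again. With reset_counter set, the counter is zeroed first.
 */
static struct toy_mm toy_fork(const struct toy_mm *parent, bool reset_counter)
{
    struct toy_mm child = *parent;      /* counter inherited by the copy */

    child.mapped_hpages = 0;
    if (reset_counter)
        child.hugetlb_usage = 0;        /* the step the fix introduces */

    toy_map_hpages(&child, parent->mapped_hpages);  /* page-table copy */
    return child;
}

/* Value that would be shown in the HugetlbPages field, in kB. */
static long toy_hugetlb_kb(const struct toy_mm *mm)
{
    return mm->hugetlb_usage * TOY_HPAGE_KB;
}
```

With 5 huge pages mapped in the parent, `toy_fork(&parent, false)` yields a
child reporting 20480 kB, while `toy_fork(&parent, true)` yields 10240 kB,
matching the figures quoted above.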

Link: https://lkml.kernel.org/r/20210826071742.877-1-liuzixian4@huawei.com
Fixes: 5d317b2b65 ("mm: hugetlb: proc: add HugetlbPages field to /proc/PID/status")
Signed-off-by: Liu Zixian <liuzixian4@huawei.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 18:45:53 -07:00