Commit Graph

119132 Commits

Author SHA1 Message Date
David Howells
998f50407f security: Add hooks to rule on setting a watch
Add security hooks that will allow an LSM to rule on whether or not a watch
may be set.  More than one hook is required as the watches watch different
types of object.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
cc: Casey Schaufler <casey@schaufler-ca.com>
cc: Stephen Smalley <sds@tycho.nsa.gov>
cc: linux-security-module@vger.kernel.org
2020-05-19 15:16:08 +01:00
David Howells
c73be61ced pipe: Add general notification queue support
Make it possible to have a general notification queue built on top of a
standard pipe.  Notifications are 'spliced' into the pipe and then read
out.  splice(), vmsplice() and sendfile() are forbidden on pipes used for
notifications as post_one_notification() cannot take pipe->mutex.  This
means that notifications could be posted in between individual pipe
buffers, making iov_iter_revert() difficult to effect.

The way the notification queue is used is:

 (1) An application opens a pipe with a special flag and indicates the
     number of messages it wishes to be able to queue at once (this can
     only be set once):

	pipe2(fds, O_NOTIFICATION_PIPE);
	ioctl(fds[0], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);

 (2) The application then uses poll() and read() as normal to extract data
     from the pipe.  read() will return multiple notifications if the
     buffer is big enough, but it will not split a notification across
     buffers - rather it will return a short read or EMSGSIZE.

     Notification messages include a length in the header so that the
     caller can split them up.

Each message has a header that describes it:

	struct watch_notification {
		__u32	type:24;
		__u32	subtype:8;
		__u32	info;
	};

The type indicates the source (eg. mount tree changes, superblock events,
keyring changes, block layer events) and the subtype indicates the event
type (eg. mount, unmount; EIO, EDQUOT; link, unlink).  The info field
indicates a number of things, including the entry length, an ID assigned to
a watchpoint contributing to this buffer and type-specific flags.

Supplementary data, such as the key ID that generated an event, can be
attached in additional slots.  The maximum message size is 127 bytes.
Messages may not be padded or aligned, so there is no guarantee, for
example, that the notification type will be on a 4-byte bounary.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-19 15:08:24 +01:00
David Howells
b580b93664 pipe: Add O_NOTIFICATION_PIPE
Add an O_NOTIFICATION_PIPE flag that can be passed to pipe2() to indicate
that the pipe being created is going to be used for notifications.  This
suppresses the use of splice(), vmsplice(), tee() and sendfile() on the
pipe as calling iov_iter_revert() on a pipe when a kernel notification
message has been inserted into the middle of a multi-buffer splice will be
messy.

The flag is given the same value as O_EXCL as it seems unlikely that
this flag will ever be applicable to pipes and I don't want to use up
another O_* bit unnecessarily.  An alternative could be to add a pipe3()
system call.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-19 15:08:23 +01:00
David Howells
344fa64ef8 security: Add a hook for the point of notification insertion
Add a security hook that allows an LSM to rule on whether a notification
message is allowed to be inserted into a particular watch queue.

The hook is given the following information:

 (1) The credentials of the triggerer (which may be init_cred for a system
     notification, eg. a hardware error).

 (2) The credentials of the whoever set the watch.

 (3) The notification message.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
cc: Casey Schaufler <casey@schaufler-ca.com>
cc: Stephen Smalley <sds@tycho.nsa.gov>
cc: linux-security-module@vger.kernel.org
2020-05-19 15:08:23 +01:00
David Howells
0858caa419 uapi: General notification queue definitions
Add UAPI definitions for the general notification queue, including the
following pieces:

 (*) struct watch_notification.

     This is the metadata header for notification messages.  It includes a
     type and subtype that indicate the source of the message
     (eg. WATCH_TYPE_MOUNT_NOTIFY) and the kind of the message
     (eg. NOTIFY_MOUNT_NEW_MOUNT).

     The header also contains an information field that conveys the
     following information:

	- WATCH_INFO_LENGTH.  The size of the entry (entries are variable
          length).

	- WATCH_INFO_ID.  The watch ID specified when the watchpoint was
          set.

	- WATCH_INFO_TYPE_INFO.  (Sub)type-specific information.

	- WATCH_INFO_FLAG_*.  Flag bits overlain on the type-specific
          information.  For use by the type.

     All the information in the header can be used in filtering messages at
     the point of writing into the buffer.

 (*) struct watch_notification_removal

     This is an extended watch-removal notification record that includes an
     'id' field that can indicate the identifier of the object being
     removed if available (for instance, a keyring serial number).

Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-19 15:08:23 +01:00
Linus Torvalds
fb27bc034d Merge tag 'usb-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are a number of USB fixes for 5.7-rc6

  The "largest" in here is a bunch of raw-gadget fixes and api changes
  as the driver just showed up in -rc1 and work has been done to fix up
  some uapi issues found with the original submission, before it shows
  up in a -final release.

  Other than that, a bunch of other small USB gadget fixes, xhci fixes,
  some quirks, andother tiny fixes for reported issues.

  All of these have been in linux-next with no reported issues"

* tag 'usb-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (26 commits)
  USB: gadget: fix illegal array access in binding with UDC
  usb: core: hub: limit HUB_QUIRK_DISABLE_AUTOSUSPEND to USB5534B
  USB: usbfs: fix mmap dma mismatch
  usb: host: xhci-plat: keep runtime active when removing host
  usb: xhci: Fix NULL pointer dereference when enqueuing trbs from urb sg list
  usb: cdns3: gadget: make a bunch of functions static
  usb: mtu3: constify struct debugfs_reg32
  usb: gadget: udc: atmel: Make some symbols static
  usb: raw-gadget: fix null-ptr-deref when reenabling endpoints
  usb: raw-gadget: documentation updates
  usb: raw-gadget: support stalling/halting/wedging endpoints
  usb: raw-gadget: fix gadget endpoint selection
  usb: raw-gadget: improve uapi headers comments
  usb: typec: mux: intel: Fix DP_HPD_LVL bit field
  usb: raw-gadget: fix return value of ep read ioctls
  usb: dwc3: select USB_ROLE_SWITCH
  usb: gadget: legacy: fix error return code in gncm_bind()
  usb: gadget: legacy: fix error return code in cdc_bind()
  usb: gadget: legacy: fix redundant initialization warnings
  usb: gadget: tegra-xudc: Fix idle suspend/resume
  ...
2020-05-17 12:31:22 -07:00
Linus Torvalds
43567139f5 Merge tag 'x86_urgent_for_v5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
 "A single fix for early boot crashes of kernels built with gcc10 and
  stack protector enabled"

* tag 'x86_urgent_for_v5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Fix early boot crash on gcc-10, third try
2020-05-17 11:08:29 -07:00
Linus Torvalds
5d438e071f Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "A new testcase for guest debugging (gdbstub) that exposed a bunch of
  bugs, mostly for AMD processors. And a few other x86 fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Fix off-by-one error in kvm_vcpu_ioctl_x86_setup_mce
  KVM: x86: Fix pkru save/restore when guest CR4.PKE=0, move it to x86.c
  KVM: SVM: Disable AVIC before setting V_IRQ
  KVM: Introduce kvm_make_all_cpus_request_except()
  KVM: VMX: pass correct DR6 for GD userspace exit
  KVM: x86, SVM: isolate vcpu->arch.dr6 from vmcb->save.dr6
  KVM: SVM: keep DR6 synchronized with vcpu->arch.dr6
  KVM: nSVM: trap #DB and #BP to userspace if guest debugging is on
  KVM: selftests: Add KVM_SET_GUEST_DEBUG test
  KVM: X86: Fix single-step with KVM_SET_GUEST_DEBUG
  KVM: X86: Set RTM for DB_VECTOR too for KVM_EXIT_DEBUG
  KVM: x86: fix DR6 delivery for various cases of #DB injection
  KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly
2020-05-16 13:39:22 -07:00
Linus Torvalds
f85c1598dd Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix sk_psock reference count leak on receive, from Xiyu Yang.

 2) CONFIG_HNS should be invisible, from Geert Uytterhoeven.

 3) Don't allow locking route MTUs in ipv6, RFCs actually forbid this,
    from Maciej Żenczykowski.

 4) ipv4 route redirect backoff wasn't actually enforced, from Paolo
    Abeni.

 5) Fix netprio cgroup v2 leak, from Zefan Li.

 6) Fix infinite loop on rmmod in conntrack, from Florian Westphal.

 7) Fix tcp SO_RCVLOWAT hangs, from Eric Dumazet.

 8) Various bpf probe handling fixes, from Daniel Borkmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (68 commits)
  selftests: mptcp: pm: rm the right tmp file
  dpaa2-eth: properly handle buffer size restrictions
  bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifier
  bpf: Add bpf_probe_read_{user, kernel}_str() to do_refine_retval_range
  bpf: Restrict bpf_probe_read{, str}() only to archs where they work
  MAINTAINERS: Mark networking drivers as Maintained.
  ipmr: Add lockdep expression to ipmr_for_each_table macro
  ipmr: Fix RCU list debugging warning
  drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c
  net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810
  tcp: fix error recovery in tcp_zerocopy_receive()
  MAINTAINERS: Add Jakub to networking drivers.
  MAINTAINERS: another add of Karsten Graul for S390 networking
  drivers: ipa: fix typos for ipa_smp2p structure doc
  pppoe: only process PADT targeted at local interfaces
  selftests/bpf: Enforce returning 0 for fentry/fexit programs
  bpf: Enforce returning 0 for fentry/fexit progs
  net: stmmac: fix num_por initialization
  security: Fix the default value of secid_to_secctx hook
  libbpf: Fix register naming in PT_REGS s390 macros
  ...
2020-05-15 13:10:06 -07:00
David S. Miller
8e1381049e Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2020-05-15

The following pull-request contains BPF updates for your *net* tree.

We've added 9 non-merge commits during the last 2 day(s) which contain
a total of 14 files changed, 137 insertions(+), 43 deletions(-).

The main changes are:

1) Fix secid_to_secctx LSM hook default value, from Anders.

2) Fix bug in mmap of bpf array, from Andrii.

3) Restrict bpf_probe_read to archs where they work, from Daniel.

4) Enforce returning 0 for fentry/fexit progs, from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:57:21 -07:00
Linus Torvalds
1742bcd0cb Merge tag 'sound-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "Things look good and calming down; the only change to ALSA core is the
  fix for racy rawmidi buffer accesses spotted by syzkaller, and the
  rest are all small device-specific quirks for HD-audio and USB-audio
  devices"

* tag 'sound-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek - Limit int mic boost for Thinkpad T530
  ALSA: hda/realtek - Add COEF workaround for ASUS ZenBook UX431DA
  ALSA: hda/realtek: Enable headset mic of ASUS UX581LV with ALC295
  ALSA: hda/realtek - Enable headset mic of ASUS UX550GE with ALC295
  ALSA: hda/realtek - Enable headset mic of ASUS GL503VM with ALC295
  ALSA: hda/realtek: Add quirk for Samsung Notebook
  ALSA: rawmidi: Fix racy buffer resize under concurrent accesses
  ALSA: usb-audio: add mapping for ASRock TRX40 Creator
  ALSA: hda/realtek - Fix S3 pop noise on Dell Wyse
  Revert "ALSA: hda/realtek: Fix pop noise on ALC225"
  ALSA: firewire-lib: fix 'function sizeof not defined' error of tracepoints format
  ALSA: usb-audio: Add control message quirk delay for Kingston HyperX headset
2020-05-15 10:06:49 -07:00
Linus Torvalds
e7cea79058 Merge tag 'drm-fixes-2020-05-15' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "As mentioned last week an i915 PR came in late, but I left it, so the
  i915 bits of this cover 2 weeks, which is why it's likely a bit larger
  than usual.

  Otherwise it's mostly amdgpu fixes, one tegra fix, one meson fix.

  i915:
   - Handle idling during i915_gem_evict_something busy loops (Chris)
   - Mark current submissions with a weak-dependency (Chris)
   - Propagate error from completed fences (Chris)
   - Fixes on execlist to avoid GPU hang situation (Chris)
   - Fixes couple deadlocks (Chris)
   - Timeslice preemption fixes (Chris)
   - Fix Display Port interrupt handling on Tiger Lake (Imre)
   - Reduce debug noise around Frame Buffer Compression (Peter)
   - Fix logic around IPC W/a for Coffee Lake and Kaby Lake (Sultan)
   - Avoid dereferencing a dead context (Chris)

  tegra:
   - tegra120/4 smmu fixes

  amdgpu:
   - Clockgating fixes
   - Fix fbdev with scatter/gather display
   - S4 fix for navi
   - Soft recovery for gfx10
   - Freesync fixes
   - Atomic check cursor fix
   - Add a gfxoff quirk
   - MST fix

  amdkfd:
   - Fix GEM reference counting

  meson:
   - error code propogation fix"

* tag 'drm-fixes-2020-05-15' of git://anongit.freedesktop.org/drm/drm: (29 commits)
  drm/i915: Handle idling during i915_gem_evict_something busy loops
  drm/meson: pm resume add return errno branch
  drm/amd/amdgpu: Update update_config() logic
  drm/amd/amdgpu: add raven1 part to the gfxoff quirk list
  drm/i915: Mark concurrent submissions with a weak-dependency
  drm/i915: Propagate error from completed fences
  drm/i915/gvt: Fix kernel oops for 3-level ppgtt guest
  drm/i915/gvt: Init DPLL/DDI vreg for virtual display instead of inheritance.
  drm/amd/display: add basic atomic check for cursor plane
  drm/amd/display: Fix vblank and pageflip event handling for FreeSync
  drm/amdgpu: implement soft_recovery for gfx10
  drm/amdgpu: enable hibernate support on Navi1X
  drm/amdgpu: Use GEM obj reference for KFD BOs
  drm/amdgpu: force fbdev into vram
  drm/amd/powerplay: perform PG ungate prior to CG ungate
  drm/amdgpu: drop unnecessary cancel_delayed_work_sync on PG ungate
  drm/amdgpu: disable MGCG/MGLS also on gfx CG ungate
  drm/i915/execlists: Track inflight CCID
  drm/i915/execlists: Avoid reusing the same logical CCID
  drm/i915/gem: Remove object_is_locked assertion from unpin_from_display_plane
  ...
2020-05-15 09:59:49 -07:00
Borislav Petkov
a9a3ed1eff x86: Fix early boot crash on gcc-10, third try
... or the odyssey of trying to disable the stack protector for the
function which generates the stack canary value.

The whole story started with Sergei reporting a boot crash with a kernel
built with gcc-10:

  Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
  Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
  Call Trace:
    dump_stack
    panic
    ? start_secondary
    __stack_chk_fail
    start_secondary
    secondary_startup_64
  -—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary

This happens because gcc-10 tail-call optimizes the last function call
in start_secondary() - cpu_startup_entry() - and thus emits a stack
canary check which fails because the canary value changes after the
boot_init_stack_canary() call.

To fix that, the initial attempt was to mark the one function which
generates the stack canary with:

  __attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)

however, using the optimize attribute doesn't work cumulatively
as the attribute does not add to but rather replaces previously
supplied optimization options - roughly all -fxxx options.

The key one among them being -fno-omit-frame-pointer and thus leading to
not present frame pointer - frame pointer which the kernel needs.

The next attempt to prevent compilers from tail-call optimizing
the last function call cpu_startup_entry(), shy of carving out
start_secondary() into a separate compilation unit and building it with
-fno-stack-protector, was to add an empty asm("").

This current solution was short and sweet, and reportedly, is supported
by both compilers but we didn't get very far this time: future (LTO?)
optimization passes could potentially eliminate this, which leads us
to the third attempt: having an actual memory barrier there which the
compiler cannot ignore or move around etc.

That should hold for a long time, but hey we said that about the other
two solutions too so...

Reported-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Kalle Valo <kvalo@codeaurora.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200314164451.346497-1-slyfox@gentoo.org
2020-05-15 11:48:01 +02:00
Kevin Lo
cc8a677a76 net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810
Set the correct bit when checking for PHY_BRCM_DIS_TXCRXC_NOENRGY on the
BCM54810 PHY.

Fixes: 0ececcfc92 ("net: phy: broadcom: Allow BCM54810 to use bcm54xx_adjust_rxrefclk()")
Signed-off-by: Kevin Lo <kevlo@kevlo.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 17:40:06 -07:00
David S. Miller
1b54f4fa4d Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Fix gcc-10 compilation warning in nf_conntrack, from Arnd Bergmann.

2) Add NF_FLOW_HW_PENDING to avoid races between stats and deletion
   commands, from Paul Blakey.

3) Remove WQ_MEM_RECLAIM from the offload workqueue, from Roi Dayan.

4) Infinite loop when removing nf_conntrack module, from Florian Westphal.

5) Set NF_FLOW_TEARDOWN bit on expiration to avoid races when refreshing
   the timeout from the software path.

6) Missing nft_set_elem_expired() check in the rbtree, from Phil Sutter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14 13:15:02 -07:00
Anders Roxell
625236ba38 security: Fix the default value of secid_to_secctx hook
security_secid_to_secctx is called by the bpf_lsm hook and a successful
return value (i.e 0) implies that the parameter will be consumed by the
LSM framework. The current behaviour return success when the pointer
isn't initialized when CONFIG_BPF_LSM is enabled, with the default
return from kernel/bpf/bpf_lsm.c.

This is the internal error:

[ 1229.341488][ T2659] usercopy: Kernel memory exposure attempt detected from null address (offset 0, size 280)!
[ 1229.374977][ T2659] ------------[ cut here ]------------
[ 1229.376813][ T2659] kernel BUG at mm/usercopy.c:99!
[ 1229.378398][ T2659] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 1229.380348][ T2659] Modules linked in:
[ 1229.381654][ T2659] CPU: 0 PID: 2659 Comm: systemd-journal Tainted: G    B   W         5.7.0-rc5-next-20200511-00019-g864e0c6319b8-dirty #13
[ 1229.385429][ T2659] Hardware name: linux,dummy-virt (DT)
[ 1229.387143][ T2659] pstate: 80400005 (Nzcv daif +PAN -UAO BTYPE=--)
[ 1229.389165][ T2659] pc : usercopy_abort+0xc8/0xcc
[ 1229.390705][ T2659] lr : usercopy_abort+0xc8/0xcc
[ 1229.392225][ T2659] sp : ffff000064247450
[ 1229.393533][ T2659] x29: ffff000064247460 x28: 0000000000000000
[ 1229.395449][ T2659] x27: 0000000000000118 x26: 0000000000000000
[ 1229.397384][ T2659] x25: ffffa000127049e0 x24: ffffa000127049e0
[ 1229.399306][ T2659] x23: ffffa000127048e0 x22: ffffa000127048a0
[ 1229.401241][ T2659] x21: ffffa00012704b80 x20: ffffa000127049e0
[ 1229.403163][ T2659] x19: ffffa00012704820 x18: 0000000000000000
[ 1229.405094][ T2659] x17: 0000000000000000 x16: 0000000000000000
[ 1229.407008][ T2659] x15: 0000000000000000 x14: 003d090000000000
[ 1229.408942][ T2659] x13: ffff80000d5b25b2 x12: 1fffe0000d5b25b1
[ 1229.410859][ T2659] x11: 1fffe0000d5b25b1 x10: ffff80000d5b25b1
[ 1229.412791][ T2659] x9 : ffffa0001034bee0 x8 : ffff00006ad92d8f
[ 1229.414707][ T2659] x7 : 0000000000000000 x6 : ffffa00015eacb20
[ 1229.416642][ T2659] x5 : ffff0000693c8040 x4 : 0000000000000000
[ 1229.418558][ T2659] x3 : ffffa0001034befc x2 : d57a7483a01c6300
[ 1229.420610][ T2659] x1 : 0000000000000000 x0 : 0000000000000059
[ 1229.422526][ T2659] Call trace:
[ 1229.423631][ T2659]  usercopy_abort+0xc8/0xcc
[ 1229.425091][ T2659]  __check_object_size+0xdc/0x7d4
[ 1229.426729][ T2659]  put_cmsg+0xa30/0xa90
[ 1229.428132][ T2659]  unix_dgram_recvmsg+0x80c/0x930
[ 1229.429731][ T2659]  sock_recvmsg+0x9c/0xc0
[ 1229.431123][ T2659]  ____sys_recvmsg+0x1cc/0x5f8
[ 1229.432663][ T2659]  ___sys_recvmsg+0x100/0x160
[ 1229.434151][ T2659]  __sys_recvmsg+0x110/0x1a8
[ 1229.435623][ T2659]  __arm64_sys_recvmsg+0x58/0x70
[ 1229.437218][ T2659]  el0_svc_common.constprop.1+0x29c/0x340
[ 1229.438994][ T2659]  do_el0_svc+0xe8/0x108
[ 1229.440587][ T2659]  el0_svc+0x74/0x88
[ 1229.441917][ T2659]  el0_sync_handler+0xe4/0x8b4
[ 1229.443464][ T2659]  el0_sync+0x17c/0x180
[ 1229.444920][ T2659] Code: aa1703e2 aa1603e1 910a8260 97ecc860 (d4210000)
[ 1229.447070][ T2659] ---[ end trace 400497d91baeaf51 ]---
[ 1229.448791][ T2659] Kernel panic - not syncing: Fatal exception
[ 1229.450692][ T2659] Kernel Offset: disabled
[ 1229.452061][ T2659] CPU features: 0x240002,20002004
[ 1229.453647][ T2659] Memory Limit: none
[ 1229.455015][ T2659] ---[ end Kernel panic - not syncing: Fatal exception ]---

Rework the so the default return value is -EOPNOTSUPP.

There are likely other callbacks such as security_inode_getsecctx() that
may have the same problem, and that someone that understand the code
better needs to audit them.

Thank you Arnd for helping me figure out what went wrong.

Fixes: 98e828a065 ("security: Refactor declaration of LSM hooks")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/bpf/20200512174607.9630-1-anders.roxell@linaro.org
2020-05-14 12:46:34 -07:00
Linus Torvalds
decd6167bf Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "7 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  kasan: add missing functions declarations to kasan.h
  kasan: consistently disable debugging features
  ipc/util.c: sysvipc_find_ipc() incorrectly updates position index
  userfaultfd: fix remap event with MREMAP_DONTUNMAP
  mm/gup: fix fixup_user_fault() on multiple retries
  epoll: call final ep_events_available() check under the lock
  mm, memcg: fix inconsistent oom event behavior
2020-05-14 12:26:55 -07:00
Linus Torvalds
f44d5c4890 Merge tag 'trace-v5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull more tracing fixes from Steven Rostedt:
 "Various tracing fixes:

   - Fix a crash when having function tracing and function stack tracing
     on the command line.

     The ftrace trampolines are created as executable and read only. But
     the stack tracer tries to modify them with text_poke() which
     expects all kernel text to still be writable at boot. Keep the
     trampolines writable at boot, and convert them to read-only with
     the rest of the kernel.

   - A selftest was triggering in the ring buffer iterator code, that is
     no longer valid with the update of keeping the ring buffer writable
     while a iterator is reading.

     Just bail after three failed attempts to get an event and remove
     the warning and disabling of the ring buffer.

   - While modifying the ring buffer code, decided to remove all the
     unnecessary BUG() calls"

* tag 'trace-v5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ring-buffer: Remove all BUG() calls
  ring-buffer: Don't deactivate the ring buffer on failed iterator reads
  x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up
2020-05-14 11:46:52 -07:00
Yafang Shao
04fd61a4e0 mm, memcg: fix inconsistent oom event behavior
A recent commit 9852ae3fe5 ("mm, memcg: consider subtrees in
memory.events") changed the behavior of memcg events, which will now
consider subtrees in memory.events.

But oom_kill event is a special one as it is used in both cgroup1 and
cgroup2.  In cgroup1, it is displayed in memory.oom_control.  The file
memory.oom_control is in both root memcg and non root memcg, that is
different with memory.event as it only in non-root memcg.  That commit
is okay for cgroup2, but it is not okay for cgroup1 as it will cause
inconsistent behavior between root memcg and non-root memcg.

Here's an example on why this behavior is inconsistent in cgroup1.

       root memcg
       /
    memcg foo
     /
  memcg bar

Suppose there's an oom_kill in memcg bar, then the oon_kill will be

       root memcg : memory.oom_control(oom_kill)  0
       /
    memcg foo : memory.oom_control(oom_kill)  1
     /
  memcg bar : memory.oom_control(oom_kill)  1

For the non-root memcg, its memory.oom_control(oom_kill) includes its
descendants' oom_kill, but for root memcg, it doesn't include its
descendants' oom_kill.  That means, memory.oom_control(oom_kill) has
different meanings in different memcgs.  That is inconsistent.  Then the
user has to know whether the memcg is root or not.

If we can't fully support it in cgroup1, for example by adding
memory.events.local into cgroup1 as well, then let's don't touch its
original behavior.

Fixes: 9852ae3fe5 ("mm, memcg: consider subtrees in memory.events")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Chris Down <chris@chrisdown.name>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200502141055.7378-1-laoar.shao@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-14 10:00:35 -07:00
Andrey Konovalov
c61769bd47 usb: raw-gadget: support stalling/halting/wedging endpoints
Raw Gadget is currently unable to stall/halt/wedge gadget endpoints,
which is required for proper emulation of certain USB classes.

This patch adds a few more ioctls:

- USB_RAW_IOCTL_EP0_STALL allows to stall control endpoint #0 when
  there's a pending setup request for it.
- USB_RAW_IOCTL_SET/CLEAR_HALT/WEDGE allow to set/clear halt/wedge status
  on non-control non-isochronous endpoints.

Fixes: f2c2e71764 ("usb: gadget: add raw-gadget interface")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-05-14 12:30:18 +03:00
Andrey Konovalov
97df5e5758 usb: raw-gadget: fix gadget endpoint selection
Currently automatic gadget endpoint selection based on required features
doesn't work. Raw Gadget tries iterating over the list of available
endpoints and finding one that has the right direction and transfer type.
Unfortunately selecting arbitrary gadget endpoints (even if they satisfy
feature requirements) doesn't work, as (depending on the UDC driver) they
might have fixed addresses, and one also needs to provide matching
endpoint addresses in the descriptors sent to the host.

The composite framework deals with this by assigning endpoint addresses
in usb_ep_autoconfig() before enumeration starts. This approach won't work
with Raw Gadget as the endpoints are supposed to be enabled after a
set_configuration/set_interface request from the host, so it's too late to
patch the endpoint descriptors that had already been sent to the host.

For Raw Gadget we take another approach. Similarly to GadgetFS, we allow
the user to make the decision as to which gadget endpoints to use.

This patch adds another Raw Gadget ioctl USB_RAW_IOCTL_EPS_INFO that
exposes information about all non-control endpoints that a currently
connected UDC has. This information includes endpoints addresses, as well
as their capabilities and limits to allow the user to choose the most
fitting gadget endpoint.

The USB_RAW_IOCTL_EP_ENABLE ioctl is updated to use the proper endpoint
validation routine usb_gadget_ep_match_desc().

These changes affect the portability of the gadgets that use Raw Gadget
when running on different UDCs. Nevertheless, as long as the user relies
on the information provided by USB_RAW_IOCTL_EPS_INFO to dynamically
choose endpoint addresses, UDC-agnostic gadgets can still be written with
Raw Gadget.

Fixes: f2c2e71764 ("usb: gadget: add raw-gadget interface")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-05-14 12:30:17 +03:00
Andrey Konovalov
17ff3b72e7 usb: raw-gadget: improve uapi headers comments
Fix typo "trasferred" => "transferred".

Don't call USB requests URBs.

Fix comment style.

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-05-14 12:30:17 +03:00
Dave Airlie
6da9b046af Merge tag 'drm/tegra/for-5.7-fixes' of git://anongit.freedesktop.org/tegra/linux into drm-fixes
drm/tegra: Fixes for v5.7

This contains a pair of patches which fix SMMU support on Tegra124 and
Tegra210 for host1x and the Tegra DRM driver.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200508101355.3031268-1-thierry.reding@gmail.com
2020-05-14 12:29:50 +10:00
Steven Rostedt (VMware)
59566b0b62 x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up
Booting one of my machines, it triggered the following crash:

 Kernel/User page tables isolation: enabled
 ftrace: allocating 36577 entries in 143 pages
 Starting tracer 'function'
 BUG: unable to handle page fault for address: ffffffffa000005c
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0003) - permissions violation
 PGD 2014067 P4D 2014067 PUD 2015063 PMD 7b253067 PTE 7b252061
 Oops: 0003 [#1] PREEMPT SMP PTI
 CPU: 0 PID: 0 Comm: swapper Not tainted 5.4.0-test+ #24
 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
 RIP: 0010:text_poke_early+0x4a/0x58
 Code: 34 24 48 89 54 24 08 e8 bf 72 0b 00 48 8b 34 24 48 8b 4c 24 08 84 c0 74 0b 48 89 df f3 a4 48 83 c4 10 5b c3 9c 58 fa 48 89 df <f3> a4 50 9d 48 83 c4 10 5b e9 d6 f9 ff ff
0 41 57 49
 RSP: 0000:ffffffff82003d38 EFLAGS: 00010046
 RAX: 0000000000000046 RBX: ffffffffa000005c RCX: 0000000000000005
 RDX: 0000000000000005 RSI: ffffffff825b9a90 RDI: ffffffffa000005c
 RBP: ffffffffa000005c R08: 0000000000000000 R09: ffffffff8206e6e0
 R10: ffff88807b01f4c0 R11: ffffffff8176c106 R12: ffffffff8206e6e0
 R13: ffffffff824f2440 R14: 0000000000000000 R15: ffffffff8206eac0
 FS:  0000000000000000(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffffffa000005c CR3: 0000000002012000 CR4: 00000000000006b0
 Call Trace:
  text_poke_bp+0x27/0x64
  ? mutex_lock+0x36/0x5d
  arch_ftrace_update_trampoline+0x287/0x2d5
  ? ftrace_replace_code+0x14b/0x160
  ? ftrace_update_ftrace_func+0x65/0x6c
  __register_ftrace_function+0x6d/0x81
  ftrace_startup+0x23/0xc1
  register_ftrace_function+0x20/0x37
  func_set_flag+0x59/0x77
  __set_tracer_option.isra.19+0x20/0x3e
  trace_set_options+0xd6/0x13e
  apply_trace_boot_options+0x44/0x6d
  register_tracer+0x19e/0x1ac
  early_trace_init+0x21b/0x2c9
  start_kernel+0x241/0x518
  ? load_ucode_intel_bsp+0x21/0x52
  secondary_startup_64+0xa4/0xb0

I was able to trigger it on other machines, when I added to the kernel
command line of both "ftrace=function" and "trace_options=func_stack_trace".

The cause is the "ftrace=function" would register the function tracer
and create a trampoline, and it will set it as executable and
read-only. Then the "trace_options=func_stack_trace" would then update
the same trampoline to include the stack tracer version of the function
tracer. But since the trampoline already exists, it updates it with
text_poke_bp(). The problem is that text_poke_bp() called while
system_state == SYSTEM_BOOTING, it will simply do a memcpy() and not
the page mapping, as it would think that the text is still read-write.
But in this case it is not, and we take a fault and crash.

Instead, lets keep the ftrace trampolines read-write during boot up,
and then when the kernel executable text is set to read-only, the
ftrace trampolines get set to read-only as well.

Link: https://lkml.kernel.org/r/20200430202147.4dc6e2de@oasis.local.home

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: stable@vger.kernel.org
Fixes: 768ae4406a ("x86/ftrace: Use text_poke()")
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-12 18:24:34 -04:00
Eric Dumazet
24adbc1676 tcp: fix SO_RCVLOWAT hangs with fat skbs
We autotune rcvbuf whenever SO_RCVLOWAT is set to account for 100%
overhead in tcp_set_rcvlowat()

This works well when skb->len/skb->truesize ratio is bigger than 0.5

But if we receive packets with small MSS, we can end up in a situation
where not enough bytes are available in the receive queue to satisfy
RCVLOWAT setting.
As our sk_rcvbuf limit is hit, we send zero windows in ACK packets,
preventing remote peer from sending more data.

Even autotuning does not help, because it only triggers at the time
user process drains the queue. If no EPOLLIN is generated, this
can not happen.

Note poll() has a similar issue, after commit
c7004482e8 ("tcp: Respect SO_RCVLOWAT in tcp_poll().")

Fixes: 03f45c883c ("tcp: avoid extra wakeups for SO_RCVLOWAT users")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:49:47 -07:00
Jacob Keller
2c864c78c2 ptp: fix struct member comment for do_aux_work
The do_aux_work callback had documentation in the structure comment
which referred to it as "do_work".

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 12:10:17 -07:00
Linus Torvalds
152036d137 Merge tag 'nfsd-5.7-rc-2' of git://git.linux-nfs.org/projects/cel/cel-2.6
Pull nfsd fixes from Chuck Lever:
 "Resolve a data integrity problem with NFSD that I inadvertently
  introduced last year.

  The change I made makes the NFS server's duplicate reply cache
  ineffective when krb5i or krb5p are in use, thus allowing the replay
  of non-idempotent NFS requests such as RENAME, SETATTR, or even
  WRITEs"

* tag 'nfsd-5.7-rc-2' of git://git.linux-nfs.org/projects/cel/cel-2.6:
  SUNRPC: Revert 241b1f419f ("SUNRPC: Remove xdr_buf_trim()")
  SUNRPC: Fix GSS privacy computation of auth->au_ralign
  SUNRPC: Add "@len" parameter to gss_unwrap()
2020-05-11 12:04:52 -07:00
Linus Torvalds
995b819f29 drm: fix trivial field description cut-and-paste error
As reported by Amarnath Baliyase, the drm_mode_status enumeration
documentation describes MODE_V_ILLEGAL as "mode has illegal horizontal
timings".  But that's just a cut-and-paste error from the previous line.
The "V" stands for vertical, of course.

I'm just fixing this directly rather than bothering with going through
the proper channels.  Less work for everybody.

Reported-by: Amarnath Baliyase <baliyaseamarnath@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-11 10:48:53 -07:00
Paul Blakey
2c8897953f netfilter: flowtable: Add pending bit for offload work
Gc step can queue offloaded flow del work or stats work.
Those work items can race each other and a flow could be freed
before the stats work is executed and querying it.
To avoid that, add a pending bit that if a work exists for a flow
don't queue another work for it.
This will also avoid adding multiple stats works in case stats work
didn't complete but gc step started again.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-11 16:26:33 +02:00
Arnd Bergmann
2c407aca64 netfilter: conntrack: avoid gcc-10 zero-length-bounds warning
gcc-10 warns around a suspicious access to an empty struct member:

net/netfilter/nf_conntrack_core.c: In function '__nf_conntrack_alloc':
net/netfilter/nf_conntrack_core.c:1522:9: warning: array subscript 0 is outside the bounds of an interior zero-length array 'u8[0]' {aka 'unsigned char[0]'} [-Wzero-length-bounds]
 1522 |  memset(&ct->__nfct_init_offset[0], 0,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from net/netfilter/nf_conntrack_core.c:37:
include/net/netfilter/nf_conntrack.h:90:5: note: while referencing '__nfct_init_offset'
   90 |  u8 __nfct_init_offset[0];
      |     ^~~~~~~~~~~~~~~~~~

The code is correct but a bit unusual. Rework it slightly in a way that
does not trigger the warning, using an empty struct instead of an empty
array. There are probably more elegant ways to do this, but this is the
smallest change.

Fixes: c41884ce05 ("netfilter: conntrack: avoid zeroing timer")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-10 23:20:24 +02:00
Linus Torvalds
0a85ed6e7f Merge tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - a small series fixing a use-after-free of bdi name (Christoph,Yufen)

 - NVMe fix for a regression with the smaller CQ update (Alexey)

 - NVMe fix for a hang at namespace scanning error recovery (Sagi)

 - fix race with blk-iocost iocg->abs_vdebt updates (Tejun)

* tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-block:
  nvme: fix possible hang when ns scanning fails during error recovery
  nvme-pci: fix "slimmer CQ head update"
  bdi: add a ->dev_name field to struct backing_dev_info
  bdi: use bdi_dev_name() to get device name
  bdi: move bdi_dev_name out of line
  vboxsf: don't use the source name in the bdi name
  iocost: protect iocg->abs_vdebt with iocg->waitq.lock
2020-05-10 11:16:07 -07:00
Christoph Hellwig
6bd87eec23 bdi: add a ->dev_name field to struct backing_dev_info
Cache a copy of the name for the life time of the backing_dev_info
structure so that we can reference it even after unregistering.

Fixes: 68f23b8906 ("memcg: fix a crash in wb_workfn when a device disappears")
Reported-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:57 -06:00
Yufen Yu
d51cfc53ad bdi: use bdi_dev_name() to get device name
Use the common interface bdi_dev_name() to get device name.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>

Add missing <linux/backing-dev.h> include BFQ

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:39 -06:00
Jakub Kicinski
14d8f7486a Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2020-05-09

The following pull-request contains BPF updates for your *net* tree.

We've added 4 non-merge commits during the last 9 day(s) which contain
a total of 4 files changed, 11 insertions(+), 6 deletions(-).

The main changes are:

1) Fix msg_pop_data() helper incorrectly setting an sge length in some
   cases as well as fixing bpf_tcp_ingress() wrongly accounting bytes
   in sg.size, from John Fastabend.

2) Fix to return an -EFAULT error when copy_to_user() of the value
   fails in map_lookup_and_delete_elem(), from Wei Yongjun.

3) Fix sk_psock refcnt leak in tcp_bpf_recvmsg(), from Xiyu Yang.
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-08 18:58:39 -07:00
Linus Torvalds
4334f30ebf Merge tag 'char-misc-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
 "Here are some small driver fixes for 5.7-rc5 that resolve a number of
  minor reported issues:

   - mhi bus driver fixes found as people actually use the code

   - phy driver fixes and compat string additions

   - most driver fix due to link order changing when the core moved out
     of staging

   - mei driver fix

   - interconnect build warning fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  bus: mhi: core: Fix channel device name conflict
  bus: mhi: core: Fix typo in comment
  bus: mhi: core: Offload register accesses to the controller
  bus: mhi: core: Remove link_status() callback
  bus: mhi: core: Make sure to powerdown if mhi_sync_power_up fails
  bus: mhi: Fix parsing of mhi_flags
  mei: me: disable mei interface on LBG servers.
  phy: qualcomm: usb-hs-28nm: Prepare clocks in init
  MAINTAINERS: Add Vinod Koul as Generic PHY co-maintainer
  interconnect: qcom: Move the static keyword to the front of declaration
  most: core: use function subsys_initcall()
  bus: mhi: core: Fix a NULL vs IS_ERR check in mhi_create_devices()
  phy: qcom-qusb2: Re add "qcom,sdm845-qusb2-phy" compat string
  phy: tegra: Select USB_COMMON for usb_get_maximum_speed()
2020-05-08 09:11:53 -07:00
Linus Torvalds
c61529f6f5 Merge tag 'driver-core-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
 "Here are a number of small driver core fixes for 5.7-rc5 to resolve a
  bunch of reported issues with the current tree.

  Biggest here are the reverts and patches from John Stultz to resolve a
  bunch of deferred probe regressions we have been seeing in 5.7-rc
  right now.

  Along with those are some other smaller fixes:

   - coredump crash fix

   - devlink fix for when permissive mode was enabled

   - amba and platform device dma_parms fixes

   - component error silenced for when deferred probe happens

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  regulator: Revert "Use driver_deferred_probe_timeout for regulator_init_complete_work"
  driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires
  driver core: Use dev_warn() instead of dev_WARN() for deferred_probe_timeout warnings
  driver core: Revert default driver_deferred_probe_timeout value to 0
  component: Silence bind error on -EPROBE_DEFER
  driver core: Fix handling of fw_devlink=permissive
  coredump: fix crash when umh is disabled
  amba: Initialize dma_parms for amba devices
  driver core: platform: Initialize dma_parms for platform devices
2020-05-08 09:06:34 -07:00
Suravee Suthikulpanit
54163a346d KVM: Introduce kvm_make_all_cpus_request_except()
This allows making request to all other vcpus except the one
specified in the parameter.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <1588771076-73790-2-git-send-email-suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-08 07:44:32 -04:00
Linus Torvalds
79dede78c0 Merge branch 'for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem fix from James Morris:
 "Fix the default value of fs_context_parse_param hook"

* 'for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  security: Fix the default value of fs_context_parse_param hook
2020-05-07 19:43:13 -07:00
Arnd Bergmann
ee28755668 net: bareudp: avoid uninitialized variable warning
clang points out that building without IPv6 would lead to returning
an uninitialized variable if a packet with family!=AF_INET is
passed into bareudp_udp_encap_recv():

drivers/net/bareudp.c:139:6: error: variable 'err' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
        if (family == AF_INET)
            ^~~~~~~~~~~~~~~~~
drivers/net/bareudp.c:146:15: note: uninitialized use occurs here
        if (unlikely(err)) {
                     ^~~
include/linux/compiler.h:78:42: note: expanded from macro 'unlikely'
 # define unlikely(x)    __builtin_expect(!!(x), 0)
                                            ^
drivers/net/bareudp.c:139:2: note: remove the 'if' if its condition is always true
        if (family == AF_INET)
        ^~~~~~~~~~~~~~~~~~~~~~

This cannot happen in practice, so change the condition in a way that
gcc sees the IPv4 case as unconditionally true here.
For consistency, change all the similar constructs in this file the
same way, using "if(IS_ENABLED())" instead of #if IS_ENABLED()".

Fixes: 571912c69f ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-07 17:28:18 -07:00
Linus Torvalds
192ffb7515 Merge tag 'trace-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:

 - Fix bootconfig causing kernels to fail with CONFIG_BLK_DEV_RAM
   enabled

 - Fix allocation leaks in bootconfig tool

 - Fix a double initialization of a variable

 - Fix API bootconfig usage from kprobe boot time events

 - Reject NULL location for kprobes

 - Fix crash caused by preempt delay module not cleaning up kthread
   correctly

 - Add vmalloc_sync_mappings() to prevent x86_64 page faults from
   recursively faulting from tracing page faults

 - Fix comment in gpu/trace kerneldoc header

 - Fix documentation of how to create a trace event class

 - Make the local tracing_snapshot_instance_cond() function static

* tag 'trace-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tools/bootconfig: Fix resource leak in apply_xbc()
  tracing: Make tracing_snapshot_instance_cond() static
  tracing: Fix doc mistakes in trace sample
  gpu/trace: Minor comment updates for gpu_mem_total tracepoint
  tracing: Add a vmalloc_sync_mappings() for safe measure
  tracing: Wait for preempt irq delay thread to finish
  tracing/kprobes: Reject new event if loc is NULL
  tracing/boottime: Fix kprobe event API usage
  tracing/kprobes: Fix a double initialization typo
  bootconfig: Fix to remove bootconfig data from initrd while boot
2020-05-07 15:27:11 -07:00
Takashi Iwai
c1f6e3c818 ALSA: rawmidi: Fix racy buffer resize under concurrent accesses
The rawmidi core allows user to resize the runtime buffer via ioctl,
and this may lead to UAF when performed during concurrent reads or
writes: the read/write functions unlock the runtime lock temporarily
during copying form/to user-space, and that's the race window.

This patch fixes the hole by introducing a reference counter for the
runtime buffer read/write access and returns -EBUSY error when the
resize is performed concurrently against read/write.

Note that the ref count field is a simple integer instead of
refcount_t here, since the all contexts accessing the buffer is
basically protected with a spinlock, hence we need no expensive atomic
ops.  Also, note that this busy check is needed only against read /
write functions, and not in receive/transmit callbacks; the race can
happen only at the spinlock hole mentioned in the above, while the
whole function is protected for receive / transmit callbacks.

Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/CAFcO6XMWpUVK_yzzCpp8_XP7+=oUpQvuBeCbMffEDkpe8jWrfg@mail.gmail.com
Link: https://lore.kernel.org/r/s5heerw3r5z.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-05-07 22:29:14 +02:00
Yiwei Zhang
386c82a703 gpu/trace: Minor comment updates for gpu_mem_total tracepoint
This change updates the improper comment for the 'size' attribute in the
tracepoint definition. Most gfx drivers pre-fault in physical pages
instead of making virtual allocations. So we drop the 'Virtual' keyword
here and leave this to the implementations.

Link: http://lkml.kernel.org/r/20200428220825.169606-1-zzyiwei@google.com

Signed-off-by: Yiwei Zhang <zzyiwei@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-07 13:32:57 -04:00
Maciej Żenczykowski
64082b67ba net: remove spurious declaration of tcp_default_init_rwnd()
it doesn't actually exist...

Test: builds and 'git grep tcp_default_init_rwnd' comes up empty
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-07 07:58:55 -07:00
Christoph Hellwig
eb7ae5e06b bdi: move bdi_dev_name out of line
bdi_dev_name is not a fast path function, move it out of line.  This
prepares for using it from modular callers without having to export
an implementation detail like bdi_unknown_name.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-07 08:45:47 -06:00
Linus Torvalds
a811c1fa0a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix reference count leaks in various parts of batman-adv, from Xiyu
    Yang.

 2) Update NAT checksum even when it is zero, from Guillaume Nault.

 3) sk_psock reference count leak in tls code, also from Xiyu Yang.

 4) Sanity check TCA_FQ_CODEL_DROP_BATCH_SIZE netlink attribute in
    fq_codel, from Eric Dumazet.

 5) Fix panic in choke_reset(), also from Eric Dumazet.

 6) Fix VLAN accel handling in bnxt_fix_features(), from Michael Chan.

 7) Disallow out of range quantum values in sch_sfq, from Eric Dumazet.

 8) Fix crash in x25_disconnect(), from Yue Haibing.

 9) Don't pass pointer to local variable back to the caller in
    nf_osf_hdr_ctx_init(), from Arnd Bergmann.

10) Wireguard should use the ECN decap helper functions, from Toke
    Høiland-Jørgensen.

11) Fix command entry leak in mlx5 driver, from Moshe Shemesh.

12) Fix uninitialized variable access in mptcp's
    subflow_syn_recv_sock(), from Paolo Abeni.

13) Fix unnecessary out-of-order ingress frame ordering in macsec, from
    Scott Dial.

14) IPv6 needs to use a global serial number for dst validation just
    like ipv4, from David Ahern.

15) Fix up PTP_1588_CLOCK deps, from Clay McClure.

16) Missing NLM_F_MULTI flag in gtp driver netlink messages, from
    Yoshiyuki Kurauchi.

17) Fix a regression in that dsa user port errors should not be fatal,
    from Florian Fainelli.

18) Fix iomap leak in enetc driver, from Dejin Zheng.

19) Fix use after free in lec_arp_clear_vccs(), from Cong Wang.

20) Initialize protocol value earlier in neigh code paths when
    generating events, from Roman Mashak.

21) netdev_update_features() must be called with RTNL mutex in macsec
    driver, from Antoine Tenart.

22) Validate untrusted GSO packets even more strictly, from Willem de
    Bruijn.

23) Wireguard decrypt worker needs a cond_resched(), from Jason
    Donenfeld.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
  net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE
  MAINTAINERS: put DYNAMIC INTERRUPT MODERATION in proper order
  wireguard: send/receive: use explicit unlikely branch instead of implicit coalescing
  wireguard: selftests: initalize ipv6 members to NULL to squelch clang warning
  wireguard: send/receive: cond_resched() when processing worker ringbuffers
  wireguard: socket: remove errant restriction on looping to self
  wireguard: selftests: use normal kernel stack size on ppc64
  net: ethernet: ti: am65-cpsw-nuss: fix irqs type
  ionic: Use debugfs_create_bool() to export bool
  net: dsa: Do not leave DSA master with NULL netdev_ops
  net: dsa: remove duplicate assignment in dsa_slave_add_cls_matchall_mirred
  net: stricter validation of untrusted gso packets
  seg6: fix SRH processing to comply with RFC8754
  net: mscc: ocelot: ANA_AUTOAGE_AGE_PERIOD holds a value in seconds, not ms
  net: dsa: ocelot: the MAC table on Felix is twice as large
  net: dsa: sja1105: the PTP_CLK extts input reacts on both edges
  selftests: net: tcp_mmap: fix SO_RCVLOWAT setting
  net: hsr: fix incorrect type usage for protocol variable
  net: macsec: fix rtnl locking issue
  net: mvpp2: cls: Prevent buffer overflow in mvpp2_ethtool_cls_rule_del()
  ...
2020-05-06 20:53:22 -07:00
Pablo Neira Ayuso
16f8036086 net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE
This patch adds FLOW_ACTION_HW_STATS_DONT_CARE which tells the driver
that the frontend does not need counters, this hw stats type request
never fails. The FLOW_ACTION_HW_STATS_DISABLED type explicitly requests
the driver to disable the stats, however, if the driver cannot disable
counters, it bails out.

TCA_ACT_HW_STATS_* maintains the 1:1 mapping with FLOW_ACTION_HW_STATS_*
except by disabled which is mapped to FLOW_ACTION_HW_STATS_DISABLED
(this is 0 in tc). Add tc_act_hw_stats() to perform the mapping between
TCA_ACT_HW_STATS_* and FLOW_ACTION_HW_STATS_*.

Fixes: 319a1d1947 ("flow_offload: check for basic action hw stats type")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06 20:13:10 -07:00
Willem de Bruijn
9274124f02 net: stricter validation of untrusted gso packets
Syzkaller again found a path to a kernel crash through bad gso input:
a packet with transport header extending beyond skb_headlen(skb).

Tighten validation at kernel entry:

- Verify that the transport header lies within the linear section.

    To avoid pulling linux/tcp.h, verify just sizeof tcphdr.
    tcp_gso_segment will call pskb_may_pull (th->doff * 4) before use.

- Match the gso_type against the ip_proto found by the flow dissector.

Fixes: bfd5f4a3d6 ("packet: Add GSO/csum offload support.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06 17:23:06 -07:00
Vladimir Oltean
21ce7f3e16 net: dsa: ocelot: the MAC table on Felix is twice as large
When running 'bridge fdb dump' on Felix, sometimes learnt and static MAC
addresses would appear, sometimes they wouldn't.

Turns out, the MAC table has 4096 entries on VSC7514 (Ocelot) and 8192
entries on VSC9959 (Felix), so the existing code from the Ocelot common
library only dumped half of Felix's MAC table. They are both organized
as a 4-way set-associative TCAM, so we just need a single variable
indicating the correct number of rows.

Fixes: 5605194877 ("net: dsa: ocelot: add driver for Felix switch family")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06 17:15:24 -07:00
Linus Torvalds
b9388959ba Merge tag 'tag-chrome-platform-fixes-for-v5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
Pull chrome platform fix from Benson Leung:
 "Fix a resource allocation issue in cros_ec_sensorhub.c"

* tag 'tag-chrome-platform-fixes-for-v5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
  platform/chrome: cros_ec_sensorhub: Allocate sensorhub resource before claiming sensors
2020-05-06 16:40:14 -07:00
John Fastabend
81aabbb9fb bpf, sockmap: bpf_tcp_ingress needs to subtract bytes from sg.size
In bpf_tcp_ingress we used apply_bytes to subtract bytes from sg.size
which is used to track total bytes in a message. But this is not
correct because apply_bytes is itself modified in the main loop doing
the mem_charge.

Then at the end of this we have sg.size incorrectly set and out of
sync with actual sk values. Then we can get a splat if we try to
cork the data later and again try to redirect the msg to ingress. To
fix instead of trying to track msg.size do the easy thing and include
it as part of the sk_msg_xfer logic so that when the msg is moved the
sg.size is always correct.

To reproduce the below users will need ingress + cork and hit an
error path that will then try to 'free' the skmsg.

[  173.699981] BUG: KASAN: null-ptr-deref in sk_msg_free_elem+0xdd/0x120
[  173.699987] Read of size 8 at addr 0000000000000008 by task test_sockmap/5317

[  173.700000] CPU: 2 PID: 5317 Comm: test_sockmap Tainted: G          I       5.7.0-rc1+ #43
[  173.700005] Hardware name: Dell Inc. Precision 5820 Tower/002KVM, BIOS 1.9.2 01/24/2019
[  173.700009] Call Trace:
[  173.700021]  dump_stack+0x8e/0xcb
[  173.700029]  ? sk_msg_free_elem+0xdd/0x120
[  173.700034]  ? sk_msg_free_elem+0xdd/0x120
[  173.700042]  __kasan_report+0x102/0x15f
[  173.700052]  ? sk_msg_free_elem+0xdd/0x120
[  173.700060]  kasan_report+0x32/0x50
[  173.700070]  sk_msg_free_elem+0xdd/0x120
[  173.700080]  __sk_msg_free+0x87/0x150
[  173.700094]  tcp_bpf_send_verdict+0x179/0x4f0
[  173.700109]  tcp_bpf_sendpage+0x3ce/0x5d0

Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/158861290407.14306.5327773422227552482.stgit@john-Precision-5820-Tower
2020-05-06 00:22:22 +02:00