Commit Graph

110968 Commits

Author SHA1 Message Date
Felix Kuehling
d8f4981e2e drm/amdgpu: Add flag to wipe VRAM on release
This memory allocation flag will be used to indicate BOs containing
sensitive data that should not be leaked to other processes.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:18:23 -05:00
Felix Kuehling
274840e544 drm/ttm: Add release_notify callback to ttm_bo_driver
This notifies the driver that a BO is about to be released.

Releasing a BO also invokes the move_notify callback from
ttm_bo_cleanup_memtype_use, but that happens too late for anything
that would add fences to the BO and require a delayed delete.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:18:04 -05:00
Le Ma
d6c3b24ea2 drm/amdgpu: add Arcturus asic type
Add asic type for Arcturus.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:01 -05:00
Xiaojie Yuan
87dbad02d2 drm/amdgpu: add navi14 asic type
Add CHIP_NAVI14 to the list of asic types.

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:17:58 -05:00
Emil Velikov
cf03447732 drm/amdgpu: extend AMDGPU_CTX_PRIORITY_NORMAL comment
Currently the AMDGPU_CTX_PRIORITY_* defines are used in both
drm_amdgpu_ctx_in::priority and drm_amdgpu_sched_in::priority.

Extend the comment to mention the CAP_SYS_NICE or DRM_MASTER requirement
is only applicable with the former.

Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-16 13:08:30 -05:00
Alex Deucher
d7929c1e13 Merge branch 'drm-next' into drm-next-5.3
Backmerge drm-next and fix up conflicts due to drmP.h removal.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-25 08:42:25 -05:00
Andres Rodriguez
e28ad544f4 drm/edid: parse CEA blocks embedded in DisplayID
DisplayID blocks allow embedding of CEA blocks. The payloads are
identical to traditional top level CEA extension blocks, but the header
is slightly different.

This change allows the CEA parser to find a CEA block inside a DisplayID
block. Additionally, it adds support for parsing the embedded CTA
header. No further changes are necessary due to payload parity.

This change fixes audio support for the Valve Index HMD.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.15
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190619180901.17901-1-andresx7@gmail.com
2019-06-25 14:32:26 +10:00
Dave Airlie
dfd03396d7 Merge tag 'drm/tegra/for-5.3-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
drm/tegra: Changes for v5.3-rc1

This contains a couple of small improvements and cleanups for the Tegra
DRM driver.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621150753.19550-1-thierry.reding@gmail.com
2019-06-25 12:59:43 +10:00
Nikola Cornij
f446489adc drm/amd/display: Add support for extended DSC DPCD caps
[why]
A few of the new DSC DPCD caps were introduced by a DP 1.4a SCR in order
to give DSC branch decoders a chance to expose their maximum throughput
and maximum line width limitations.

Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-22 09:34:11 -05:00
Nikola Cornij
d7cd0e053b drm/amd/display: Add 170Mpix/sec DSC throughput support
[why]
It was missing, although defined in DP spec

[how]
- Add handling of this value to DSC code
- Also remove unused file dsc_helpers.c

Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Joshua Aberback <Joshua.Aberback@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-22 09:34:11 -05:00
Hawking Zhang
22e96fa62e drm/amdgpu: add pa_sc_tile_steering_override to drm_amdgpu_info_device
the initial/default value of pa_sc_tile_steering_override need to
be exposed to user mode driver

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-21 18:58:21 -05:00
Dave Airlie
39a207d0cf Merge tag 'drm-misc-next-2019-06-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.3:

UAPI Changes:
- Give each dma-buf their own inode, add DMA_BUF_SET_NAME ioctl and a show_fdinfo handler.

Cross-subsystem Changes:
- Pull in the topic/remove-fbcon-notifiers branch:
  * remove fbdev notifier usage for fbcon, as prep work to clean up the fbcon locking
  * assorted locking checks in vt/console code
  * assorted notifier and cleanups in fbdev and backlight code

Core Changes:
- Make drm_debugfs_create_files() never fail.
- add debug print to update_vblank_count.
- Add DP_DPCD_QUIRK_NO_SINK_COUNT quirk.
- Add todo item for drm_gem_objects.
- Unexport drm_gem_(un)pin/v(un)map.
- Document struct drm_cmdline_mode.
- Rewrite the command handler for mode names, and add support to specify
  rotation, reflection and overscan. With a new selftest! :)
- Fixes to drm/client for improving rotation support, and fixing variable scope.
- Small fixes to self refresh helper.

Driver Changes:
- Add rockchip RK3328 support.
- Assorted driver fixes to rockchip, vc4, rcar-du, vkms.
- Expose panfrost performance counters through unstable ioctl's, hidden
  behind a module parameter.
- Enumerate CRC sources list in vkms.
- Add a basic kms driver for the Ingenic JZ47xx SoC, which will be expanded
  soon with more advanced features.
- Suspend/resume fix for stm.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/18e22ec1-adf3-3a75-34a3-9fe09a91eef5@linux.intel.com
2019-06-21 13:54:59 +10:00
Dave Airlie
031e610a6a Merge branch 'vmwgfx-next' of git://people.freedesktop.org/~thomash/linux into drm-next
- The coherent memory changes including mm changes.
- Some vmwgfx debug fixes.
- Removal of vmwgfx legacy security checks.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <VMware> <thomas@shipmail.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190619072531.4026-1-thomas@shipmail.org
2019-06-21 12:18:16 +10:00
Huang Rui
d67383e6b7 drm/amdgpu: add GDDR6 vram type
New vram type.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-20 21:14:42 -05:00
Huang Rui
107c34bcbf drm/amdgpu: add NV series gpu family id
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-20 15:54:58 -05:00
Huang Rui
852a6626d5 drm/amdgpu: add navi10 asic type
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-20 15:54:56 -05:00
Maarten Lankhorst
d609f60add Merge branch 'topic/remove-fbcon-notifiers' into drm-misc-next
topic/remove-fbcon-notifiers:
- remove fbdev notifier usage for fbcon, as prep work to clean up the fbcon locking
- assorted locking checks in vt/console code
- assorted notifier and cleanups in fbdev and backlight code

This is the pull request that was sent out, plus the compile fix for
sh4 reported by kbuild.

Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-06-19 12:33:05 +02:00
Maarten Lankhorst
bcb7416e34 Merge remote-tracking branch 'drm/drm-next' into drm-misc-next
remove-fbcon-notifiers topic branch is based on rc4, so we need a fresh
backmerge of drm-next to pull it in.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-06-19 12:32:13 +02:00
Maxime Ripard
731514b446 drm/atomic: Add a function to reset connector TV properties
During the connector reset, if that connector has a TV property, it needs
to be reset to the value provided on the command line.

Provide a helper to do that.

Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/84a7b657f09303a2850e1cc79e68f623547f3fdd.1560783090.git-series.maxime.ripard@bootlin.com
2019-06-19 12:17:52 +02:00
Maxime Ripard
3d46a3007c drm/modes: Parse overscan properties
Properly configuring the overscan properties might be needed for the
initial setup of the framebuffer for display that still have overscan.
Let's allow for more properties on the kernel command line to setup each
margin.

Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e481f1628e3768ca49226ec2115cfa4dfcbd5e4c.1560783090.git-series.maxime.ripard@bootlin.com
2019-06-19 12:17:51 +02:00
Maxime Ripard
22045e8e52 drm/connector: Introduce a TV margins structure
The TV margins has been defined as a structure inside the
drm_connector_state structure so far. However, we will need it in other
structures as well, so let's move that structure definition so that it can
be reused.

Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/38b773b03f15ec7a135cdf8f7db669e5ada20cf2.1560783090.git-series.maxime.ripard@bootlin.com
2019-06-19 12:17:51 +02:00
Maxime Ripard
1bf4e09227 drm/modes: Allow to specify rotation and reflection on the commandline
Rotations and reflections setup are needed in some scenarios to initialise
properly the initial framebuffer. Some drivers already had a bunch of
quirks to deal with this, such as either a private kernel command line
parameter (omapdss) or on the device tree (various panels).

In order to accomodate this, let's create a video mode parameter to deal
with the rotation and reflexion.

Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/777da16e42db757c1f5b414b5ca34507097fed5c.1560783090.git-series.maxime.ripard@bootlin.com
2019-06-19 12:17:51 +02:00
Maxime Ripard
3aeeb13d89 drm/modes: Support modes names on the command line
The drm subsystem also uses the video= kernel parameter, and in the
documentation refers to the fbdev documentation for that parameter.

However, that documentation also says that instead of giving the mode using
its resolution we can also give a name. However, DRM doesn't handle that
case at the moment. Even though in most case it shouldn't make any
difference, it might be useful for analog modes, where different standards
might have the same resolution, but still have a few different parameters
that are not encoded in the modes (NTSC vs NTSC-J vs PAL-M for example).

Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/18443e0c3bdbbd16cea4ec63bc7f2079b820b43b.1560783090.git-series.maxime.ripard@bootlin.com
2019-06-19 12:17:50 +02:00
Maxime Ripard
a99076e87e drm/client: Change drm_client_panel_rotation name
The drm_client_panel_rotation function has been used so far to set the
default rotation based on the panel orientation.

However, we can have more sources of information to make that decision,
starting with the command line that we will introduce later in this series.

Change the name to remove the panel mention.

Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8cb0f0d9569d41685bbf30a1538da6578cd2769b.1560783090.git-series.maxime.ripard@bootlin.com
2019-06-19 12:17:49 +02:00
Maxime Ripard
772cd52c55 drm/connector: Add documentation for drm_cmdline_mode
The struct drm_cmdline_mode holds the result of the command line parsers.
However, it wasn't documented so far, so let's do that.

Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/963c893c16c6a25fc469b53c726f493d99bdc578.1560783090.git-series.maxime.ripard@bootlin.com
2019-06-19 12:17:48 +02:00
Daniel Vetter
52d2d44eee Merge v5.2-rc5 into drm-next
Maarten needs -rc4 backmerged so he can pull in the fbcon notifier
removal topic branch into drm-misc-next.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2019-06-19 12:07:29 +02:00
Boris Brezillon
7786fd1087 drm/panfrost: Expose performance counters through unstable ioctls
Expose performance counters through 2 driver specific ioctls: one to
enable/disable the perfcnt block, and one to dump the counter values.

There are discussions to expose global performance monitors (those
counters that can't be retrieved on a per-job basis) in a consistent
way, but this is likely to take time to settle on something that works
for various HW/users.
The ioctls are marked unstable so we can get rid of them when the time
comes. We initally went for a debugfs-based interface, but this was
making the transition to per-FD address space more complicated (we need
to specify the namespace the GPU has to use when dumping the perf
counters), hence the decision to switch back to driver specific ioctls
which are passed the FD they operate on and thus will have a dedicated
address space attached to them.

Other than that, the implementation is pretty simple: it basically dumps
all counters and copy the values to a userspace buffer. The parsing is
left to userspace which has to know the specific layout that's used
by the GPU (layout differs on a per-revision basis).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190618081648.17297-5-boris.brezillon@collabora.com
2019-06-18 09:23:48 -06:00
Thomas Hellstrom
4ba3976712 drm/vmwgfx: Add surface dirty-tracking callbacks
Add the callbacks necessary to implement emulated coherent memory for
surfaces. Add a flag to the gb_surface_create ioctl to indicate that
surface memory should be coherent.
Also bump the drm minor version to signal the availability of coherent
surfaces.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2019-06-18 15:19:35 +02:00
Thomas Hellstrom
7a39f35ce4 drm/ttm: TTM fault handler helpers
With the vmwgfx dirty tracking, the default TTM fault handler is not
completely sufficient (vmwgfx need to modify the vma->vm_flags member,
and also needs to restrict the number of prefaults).

We also want to replicate the new ttm_bo_vm_reserve() functionality

So start turning the TTM vm code into helpers: ttm_bo_vm_fault_reserved()
and ttm_bo_vm_reserve(), and provide a default TTM fault handler for other
drivers to use.

Cc: "Christian König" <christian.koenig@amd.com>

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: "Christian König" <christian.koenig@amd.com> #v1
2019-06-18 15:19:34 +02:00
Thomas Hellstrom
32d1f6985c drm/ttm: Allow the driver to provide the ttm struct vm_operations_struct
Add a pointer to the struct vm_operations_struct in the bo_device, and
assign that pointer to the default value currently used.

The driver can then optionally modify that pointer and the new value
can be used for each new vma created.

Cc: "Christian König" <christian.koenig@amd.com>

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2019-06-18 15:19:34 +02:00
Thomas Hellstrom
4fe51e9e79 mm: Add write-protect and clean utilities for address space ranges
Add two utilities to a) write-protect and b) clean all ptes pointing into
a range of an address space.
The utilities are intended to aid in tracking dirty pages (either
driver-allocated system memory or pci device memory).
The write-protect utility should be used in conjunction with
page_mkwrite() and pfn_mkwrite() to trigger write page-faults on page
accesses. Typically one would want to use this on sparse accesses into
large memory regions. The clean utility should be used to utilize
hardware dirtying functionality and avoid the overhead of page-faults,
typically on large accesses into small memory regions.

The added file "as_dirty_helpers.c" is initially listed as maintained by
VMware under our DRM driver. If somebody would like it elsewhere,
that's of course no problem.

Notable changes since RFC:
- Added comments to help avoid the usage of these function for VMAs
  it's not intended for. We also do advisory checks on the vm_flags and
  warn on illegal usage.
- Perform the pte modifications the same way softdirty does.
- Add mmu_notifier range invalidation calls.
- Add a config option so that this code is not unconditionally included.
- Tell the mmu_gather code about pending tlb flushes.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> #v1
2019-06-18 15:19:34 +02:00
Thomas Hellstrom
29875a5291 mm: Add an apply_to_pfn_range interface
This is basically apply_to_page_range with added functionality:
Allocating missing parts of the page table becomes optional, which
means that the function can be guaranteed not to error if allocation
is disabled. Also passing of the closure struct and callback function
becomes different and more in line with how things are done elsewhere.

Finally we keep apply_to_page_range as a wrapper around apply_to_pfn_range

The reason for not using the page-walk code is that we want to perform
the page-walk on vmas pointing to an address space without requiring the
mmap_sem to be held rather than on vmas belonging to a process with the
mmap_sem held.

Notable changes since RFC:
Don't export apply_to_pfn range.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> #v1
2019-06-18 15:19:33 +02:00
Daniel Vetter
eb69c8a4bf drm/gem: Unexport drm_gem_(un)pin/v(un)map
They're purely for internal use, not for drivers.

Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614203615.12639-3-daniel.vetter@ffwll.ch
2019-06-17 17:37:01 +02:00
Maarten Lankhorst
f5500f385b Merge remote-tracking branch 'drm/drm-next' into drm-misc-next
Pick up rc3 and rc4 and the merges from the other branches,
we're a bit out of date.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-06-17 10:17:38 +02:00
Linus Torvalds
963172d9c7 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "The accumulated fixes from this and last week:

   - Fix vmalloc TLB flush and map range calculations which lead to
     stale TLBs, spurious faults and other hard to diagnose issues.

   - Use fault_in_pages_writable() for prefaulting the user stack in the
     FPU code as it's less fragile than the current solution

   - Use the PF_KTHREAD flag when checking for a kernel thread instead
     of current->mm as the latter can give the wrong answer due to
     use_mm()

   - Compute the vmemmap size correctly for KASLR and 5-Level paging.
     Otherwise this can end up with a way too small vmemmap area.

   - Make KASAN and 5-level paging work again by making sure that all
     invalid bits are masked out when computing the P4D offset. This
     worked before but got broken recently when the LDT remap area was
     moved.

   - Prevent a NULL pointer dereference in the resource control code
     which can be triggered with certain mount options when the
     requested resource is not available.

   - Enforce ordering of microcode loading vs. perf initialization on
     secondary CPUs. Otherwise perf tries to access a non-existing MSR
     as the boot CPU marked it as available.

   - Don't stop the resource control group walk early otherwise the
     control bitmaps are not updated correctly and become inconsistent.

   - Unbreak kgdb by returning 0 on success from
     kgdb_arch_set_breakpoint() instead of an error code.

   - Add more Icelake CPU model defines so depending changes can be
     queued in other trees"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback
  x86/kasan: Fix boot with 5-level paging and KASAN
  x86/fpu: Don't use current->mm to check for a kthread
  x86/kgdb: Return 0 from kgdb_arch_set_breakpoint()
  x86/resctrl: Prevent NULL pointer dereference when local MBM is disabled
  x86/resctrl: Don't stop walking closids when a locksetup group is found
  x86/fpu: Update kernel's FPU state before using for the fsave header
  x86/mm/KASLR: Compute the size of the vmemmap section properly
  x86/fpu: Use fault_in_pages_writeable() for pre-faulting
  x86/CPU: Add more Icelake model numbers
  mm/vmalloc: Avoid rare case of flushing TLB with weird arguments
  mm/vmalloc: Fix calculation of direct map addr range
2019-06-16 07:28:14 -10:00
Borislav Petkov
78f4e932f7 x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback
Adric Blake reported the following warning during suspend-resume:

  Enabling non-boot CPUs ...
  x86: Booting SMP configuration:
  smpboot: Booting Node 0 Processor 1 APIC 0x2
  unchecked MSR access error: WRMSR to 0x10f (tried to write 0x0000000000000000) \
   at rIP: 0xffffffff8d267924 (native_write_msr+0x4/0x20)
  Call Trace:
   intel_set_tfa
   intel_pmu_cpu_starting
   ? x86_pmu_dead_cpu
   x86_pmu_starting_cpu
   cpuhp_invoke_callback
   ? _raw_spin_lock_irqsave
   notify_cpu_starting
   start_secondary
   secondary_startup_64
  microcode: sig=0x806ea, pf=0x80, revision=0x96
  microcode: updated to revision 0xb4, date = 2019-04-01
  CPU1 is up

The MSR in question is MSR_TFA_RTM_FORCE_ABORT and that MSR is emulated
by microcode. The log above shows that the microcode loader callback
happens after the PMU restoration, leading to the conjecture that
because the microcode hasn't been updated yet, that MSR is not present
yet, leading to the #GP.

Add a microcode loader-specific hotplug vector which comes before
the PERF vectors and thus executes earlier and makes sure the MSR is
present.

Fixes: 400816f60c ("perf/x86/intel: Implement support for TSX Force Abort")
Reported-by: Adric Blake <promarbler14@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: x86@kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203637
2019-06-15 10:00:29 +02:00
Linus Torvalds
0011572c88 Merge branch 'for-5.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "This has an unusually high density of tricky fixes:

   - task_get_css() could deadlock when it races against a dying cgroup.

   - cgroup.procs didn't list thread group leaders with live threads.

     This could mislead readers to think that a cgroup is empty when
     it's not. Fixed by making PROCS iterator include dead tasks. I made
     a couple mistakes making this change and this pull request contains
     a couple follow-up patches.

   - When cpusets run out of online cpus, it updates cpusmasks of member
     tasks in bizarre ways. Joel improved the behavior significantly"

* 'for-5.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: restore sanity to cpuset_cpus_allowed_fallback()
  cgroup: Fix css_task_iter_advance_css_set() cset skip condition
  cgroup: css_task_iter_skip()'d iterators must be advanced before accessed
  cgroup: Include dying leaders with live threads in PROCS iterations
  cgroup: Implement css_task_iter_skip()
  cgroup: Call cgroup_release() before __exit_signal()
  docs cgroups: add another example size for hugetlb
  cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()
2019-06-14 17:46:14 -10:00
Linus Torvalds
6aa7a22b97 Merge tag 'drm-fixes-2019-06-14' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Daniel Vetter:
 "Nothing unsettling here, also not aware of anything serious still
  pending.

  The edid override regression fix took a bit longer since this seems to
  be an area with an overabundance of bad options. But the fix we have
  now seems like a good path forward.

  Next week it should be back to Dave.

  Summary:

   - fix regression on amdgpu on SI

   - fix edid override regression

   - driver fixes: amdgpu, i915, mediatek, meson, panfrost

   - fix writecombine for vmap in gem-shmem helper (used by panfrost)

   - add more panel quirks"

* tag 'drm-fixes-2019-06-14' of git://anongit.freedesktop.org/drm/drm: (25 commits)
  drm/amdgpu: return 0 by default in amdgpu_pm_load_smu_firmware
  drm/amdgpu: Fix bounds checking in amdgpu_ras_is_supported()
  drm: add fallback override/firmware EDID modes workaround
  drm/edid: abstract override/firmware EDID retrieval
  drm/i915/perf: fix whitelist on Gen10+
  drm/i915/sdvo: Implement proper HDMI audio support for SDVO
  drm/i915: Fix per-pixel alpha with CCS
  drm/i915/dmc: protect against reading random memory
  drm/i915/dsi: Use a fuzzy check for burst mode clock check
  drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc
  drm/panfrost: Require the simple_ondemand governor
  drm/panfrost: make devfreq optional again
  drm/gem_shmem: Use a writecombine mapping for ->vaddr
  drm: panel-orientation-quirks: Add quirk for GPD MicroPC
  drm: panel-orientation-quirks: Add quirk for GPD pocket2
  drm/meson: fix G12A primary plane disabling
  drm/meson: fix primary plane disabling
  drm/meson: fix G12A HDMI PLL settings for 4K60 1000/1001 variations
  drm/mediatek: call mtk_dsi_stop() after mtk_drm_crtc_atomic_disable()
  drm/mediatek: clear num_pipes when unbind driver
  ...
2019-06-14 17:34:45 -10:00
Ville Syrjälä
7974033e52 drm/dp: Add DP_DPCD_QUIRK_NO_SINK_COUNT
CH7511 eDP->LVDS bridge doesn't seem to set SINK_COUNT properly
causing i915 to detect it as disconnected. Add a quirk to ignore
SINK_COUNT on these devices.

Cc: David S. <david@majinbuu.com>
Cc: Peteris Rudzusiks <peteris.rudzusiks@gmail.com>
Tested-by: Peteris Rudzusiks <peteris.rudzusiks@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105406
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528140650.19230-1-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com> #irc
2019-06-14 19:11:10 +03:00
Linus Torvalds
fd6b99fa41 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "16 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/devm_memremap_pages: fix final page put race
  PCI/P2PDMA: track pgmap references per resource, not globally
  lib/genalloc: introduce chunk owners
  PCI/P2PDMA: fix the gen_pool_add_virt() failure path
  mm/devm_memremap_pages: introduce devm_memunmap_pages
  drivers/base/devres: introduce devm_release_action()
  mm/vmscan.c: fix trying to reclaim unevictable LRU page
  coredump: fix race condition between collapse_huge_page() and core dumping
  mm/mlock.c: change count_mm_mlocked_page_nr return type
  mm: mmu_gather: remove __tlb_reset_range() for force flush
  fs/ocfs2: fix race in ocfs2_dentry_attach_lock()
  mm/vmscan.c: fix recent_rotated history
  mm/mlock.c: mlockall error for flag MCL_ONFAULT
  scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE
  mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node
  mm: memcontrol: don't batch updates of local VM stats and events
2019-06-14 06:08:46 -10:00
Linus Torvalds
bcb46a0e0e Merge tag 'sound-5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "It might feel like deja vu to receive a bulk of changes at rc5, and it
  happens again; we've got a collection of fixes for ASoC. Most of fixes
  are targeted for the newly merged SOF (Sound Open Firmware) stuff and
  the relevant fixes for Intel platforms.

  Other than that, there are a few regression fixes for the recent ASoC
  core changes and HD-audio quirk, as well as a couple of FireWire fixes
  and for other ASoC codecs"

* tag 'sound-5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (54 commits)
  Revert "ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops"
  ALSA: ice1712: Check correct return value to snd_i2c_sendbytes (EWS/DMX 6Fire)
  ALSA: oxfw: allow PCM capture for Stanton SCS.1m
  ALSA: firewire-motu: fix destruction of data for isochronous resources
  ASoC: Intel: sst: fix kmalloc call with wrong flags
  ASoC: core: Fix deadlock in snd_soc_instantiate_card()
  SoC: rt274: Fix internal jack assignment in set_jack callback
  ALSA: hdac: fix memory release for SST and SOF drivers
  ASoC: SOF: Intel: hda: use the defined ppcap functions
  ASoC: core: move DAI pre-links initiation to snd_soc_instantiate_card
  ASoC: Intel: cht_bsw_rt5672: fix kernel oops with platform_name override
  ASoC: Intel: cht_bsw_nau8824: fix kernel oops with platform_name override
  ASoC: Intel: bytcht_es8316: fix kernel oops with platform_name override
  ASoC: Intel: cht_bsw_max98090: fix kernel oops with platform_name override
  ASoC: sun4i-i2s: Add offset to RX channel select
  ASoC: sun4i-i2s: Fix sun8i tx channel offset mask
  ASoC: max98090: remove 24-bit format support if RJ is 0
  ASoC: da7219: Fix build error without CONFIG_I2C
  ASoC: SOF: Intel: hda: Fix COMPILE_TEST build error
  ASoC: SOF: fix DSP oops definitions in FW ABI
  ...
2019-06-14 05:37:06 -10:00
Daniel Vetter
2454fcea33 Merge tag 'drm-misc-next-2019-06-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.3:

UAPI Changes:

Cross-subsystem Changes:
- Add code to signal all dma-fences when freed with pending signals.
- Annotate reservation object access in CONFIG_DEBUG_MUTEXES

Core Changes:
- Assorted documentation fixes.
- Use irqsave/restore spinlock to add crc entry.
- Move code around to drm_client, for internal modeset clients.
- Make drm_crtc.h and drm_debugfs.h self-contained.
- Remove drm_fb_helper_connector.
- Add bootsplash to todo.
- Fix lock ordering in pan_display_legacy.
- Support pinning buffers to current location in gem-vram.
- Remove the now unused locking functions from gem-vram.
- Remove the now unused kmap-object argument from vram helpers.
- Stop checking return value of debugfs_create.
- Add atomic encoder enable/disable helpers.
- pass drm_atomic_state to atomic connector check.
- Add atomic support for bridge enable/disable.
- Add self refresh helpers to core.

Driver Changes:
- Add extra delay to make MTP SDM845 work.
- Small fixes to virtio, vkms, sii902x, sii9234, ast, mcde, analogix, rockchip.
- Add zpos and ?BGR8888 support to meson.
- More removals of drm_os_linux and drmP headers for amd, radeon, sti, r128, r128, savage, sis.
- Allow synopsis to unwedge the i2c hdmi bus.
- Add orientation quirks for GPD panels.
- Edid cleanups and fixing handling for edid < 1.2.
- Add runtime pm to stm.
- Handle s/r in dw-hdmi.
- Add hooks for power on/off to dsi for stm.
- Remove virtio dirty tracking code, done in drm core.
- Rework BO handling in ast and mgag200.

Tiny conflict in drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c,
needed #include <linux/slab.h> to make it compile.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/0e01de30-9797-853c-732f-4a5bd6e61445@linux.intel.com
2019-06-14 11:44:24 +02:00
Daniel Vetter
744ed8cb8a Merge tag 'drm-misc-fixes-2019-06-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Sean writes:

meson: A few G12A fixes across the driver (Neil)
quirks: A couple quirks for GPD devices (Hans)
gem_shmem: Use writecombine when vmapping non-dmabuf BOs (Boris)
panfrost: A couple tweaks to requiring devfreq (Neil & Ezequiel)
edid: Ensure we return the override mode when ddc probe fails (Jani)

Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613143946.GA24233@art_vandelay
2019-06-14 11:36:30 +02:00
Greg Hackmann
bb2bb90304 dma-buf: add DMA_BUF_SET_NAME ioctls
This patch adds complimentary DMA_BUF_SET_NAME  ioctls, which lets
userspace processes attach a free-form name to each buffer.

This information can be extremely helpful for tracking and accounting
shared buffers.  For example, on Android, we know what each buffer will
be used for at allocation time: GL, multimedia, camera, etc.  The
userspace allocator can use DMA_BUF_SET_NAME to associate that
information with the buffer, so we can later give developers a
breakdown of how much memory they're allocating for graphics, camera,
etc.

Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613223408.139221-3-fengc@google.com
2019-06-14 15:00:51 +05:30
Greg Hackmann
ed63bb1d1f dma-buf: give each buffer a full-fledged inode
By traversing /proc/*/fd and /proc/*/map_files, processes with CAP_ADMIN
can get a lot of fine-grained data about how shmem buffers are shared
among processes.  stat(2) on each entry gives the caller a unique
ID (st_ino), the buffer's size (st_size), and even the number of pages
currently charged to the buffer (st_blocks / 512).

In contrast, all dma-bufs share the same anonymous inode.  So while we
can count how many dma-buf fds or mappings a process has, we can't get
the size of the backing buffers or tell if two entries point to the same
dma-buf.  On systems with debugfs, we can get a per-buffer breakdown of
size and reference count, but can't tell which processes are actually
holding the references to each buffer.

Replace the singleton inode with full-fledged inodes allocated by
alloc_anon_inode().  This involves creating and mounting a
mini-pseudo-filesystem for dma-buf, following the example in fs/aio.c.

Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613223408.139221-2-fengc@google.com
2019-06-14 15:00:50 +05:30
Dan Williams
50f44ee724 mm/devm_memremap_pages: fix final page put race
Logan noticed that devm_memremap_pages_release() kills the percpu_ref
drops all the page references that were acquired at init and then
immediately proceeds to unplug, arch_remove_memory(), the backing pages
for the pagemap.  If for some reason device shutdown actually collides
with a busy / elevated-ref-count page then arch_remove_memory() should
be deferred until after that reference is dropped.

As it stands the "wait for last page ref drop" happens *after*
devm_memremap_pages_release() returns, which is obviously too late and
can lead to crashes.

Fix this situation by assigning the responsibility to wait for the
percpu_ref to go idle to devm_memremap_pages() with a new ->cleanup()
callback.  Implement the new cleanup callback for all
devm_memremap_pages() users: pmem, devdax, hmm, and p2pdma.

Link: http://lkml.kernel.org/r/155727339156.292046.5432007428235387859.stgit@dwillia2-desk3.amr.corp.intel.com
Fixes: 41e94a8513 ("add devm_memremap_pages")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Dan Williams
795ee30648 lib/genalloc: introduce chunk owners
The p2pdma facility enables a provider to publish a pool of dma
addresses for a consumer to allocate.  A genpool is used internally by
p2pdma to collect dma resources, 'chunks', to be handed out to
consumers.  Whenever a consumer allocates a resource it needs to pin the
'struct dev_pagemap' instance that backs the chunk selected by
pci_alloc_p2pmem().

Currently that reference is taken globally on the entire provider
device.  That sets up a lifetime mismatch whereby the p2pdma core needs
to maintain hacks to make sure the percpu_ref is not released twice.

This lifetime mismatch also stands in the way of a fix to
devm_memremap_pages() whereby devm_memremap_pages_release() must wait for
the percpu_ref ->release() callback to complete before it can proceed to
teardown pages.

So, towards fixing this situation, introduce the ability to store a 'chunk
owner' at gen_pool_add() time, and a facility to retrieve the owner at
gen_pool_{alloc,free}() time.  For p2pdma this will be used to store and
recall individual dev_pagemap reference counter instances per-chunk.

Link: http://lkml.kernel.org/r/155727338118.292046.13407378933221579644.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Dan Williams
2e3f139e8e mm/devm_memremap_pages: introduce devm_memunmap_pages
Use the new devm_release_action() facility to allow
devm_memremap_pages_release() to be manually triggered.

Link: http://lkml.kernel.org/r/155727337088.292046.5774214552136776763.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Dan Williams
2374b68225 drivers/base/devres: introduce devm_release_action()
Patch series "mm/devm_memremap_pages: Fix page release race", v2.

Logan audited the devm_memremap_pages() shutdown path and noticed that
it was possible to proceed to arch_remove_memory() before all potential
page references have been reaped.

Introduce a new ->cleanup() callback to do the work of waiting for any
straggling page references and then perform the percpu_ref_exit() in
devm_memremap_pages_release() context.

For p2pdma this involves some deeper reworks to reference count
resources on a per-instance basis rather than a per pci-device basis.  A
modified genalloc api is introduced to convey a driver-private pointer
through gen_pool_{alloc,free}() interfaces.  Also, a
devm_memunmap_pages() api is introduced since p2pdma does not
auto-release resources on a setup failure.

The dax and pmem changes pass the nvdimm unit tests, and the p2pdma
changes should now pass testing with the pci_p2pdma_release() fix.
Jrme, how does this look for HMM?

This patch (of 6):

The devm_add_action() facility allows a resource allocation routine to
add custom devm semantics.  One such user is devm_memremap_pages().

There is now a need to manually trigger
devm_memremap_pages_release().  Introduce devm_release_action() so the
release action can be triggered via a new devm_memunmap_pages() api in a
follow-on change.

Link: http://lkml.kernel.org/r/155727336530.292046.2926860263201336366.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Andrea Arcangeli
59ea6d06cf coredump: fix race condition between collapse_huge_page() and core dumping
When fixing the race conditions between the coredump and the mmap_sem
holders outside the context of the process, we focused on
mmget_not_zero()/get_task_mm() callers in 04f5866e41 ("coredump: fix
race condition between mmget_not_zero()/get_task_mm() and core
dumping"), but those aren't the only cases where the mmap_sem can be
taken outside of the context of the process as Michal Hocko noticed
while backporting that commit to older -stable kernels.

If mmgrab() is called in the context of the process, but then the
mm_count reference is transferred outside the context of the process,
that can also be a problem if the mmap_sem has to be taken for writing
through that mm_count reference.

khugepaged registration calls mmgrab() in the context of the process,
but the mmap_sem for writing is taken later in the context of the
khugepaged kernel thread.

collapse_huge_page() after taking the mmap_sem for writing doesn't
modify any vma, so it's not obvious that it could cause a problem to the
coredump, but it happens to modify the pmd in a way that breaks an
invariant that pmd_trans_huge_lock() relies upon.  collapse_huge_page()
needs the mmap_sem for writing just to block concurrent page faults that
call pmd_trans_huge_lock().

Specifically the invariant that "!pmd_trans_huge()" cannot become a
"pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.

The coredump will call __get_user_pages() without mmap_sem for reading,
which eventually can invoke a lockless page fault which will need a
functional pmd_trans_huge_lock().

So collapse_huge_page() needs to use mmget_still_valid() to check it's
not running concurrently with the coredump...  as long as the coredump
can invoke page faults without holding the mmap_sem for reading.

This has "Fixes: khugepaged" to facilitate backporting, but in my view
it's more a bug in the coredump code that will eventually have to be
rewritten to stop invoking page faults without the mmap_sem for reading.
So the long term plan is still to drop all mmget_still_valid().

Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com
Fixes: ba76149f47 ("thp: khugepaged")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00