Commit Graph

840956 Commits

Author SHA1 Message Date
Dan Williams
1570175abd PCI/P2PDMA: track pgmap references per resource, not globally
In preparation for fixing a race between devm_memremap_pages_release()
and the final put of a page from the device-page-map, allocate a
percpu-ref per p2pdma resource mapping.

Link: http://lkml.kernel.org/r/155727338646.292046.9922678317501435597.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Dan Williams
795ee30648 lib/genalloc: introduce chunk owners
The p2pdma facility enables a provider to publish a pool of dma
addresses for a consumer to allocate.  A genpool is used internally by
p2pdma to collect dma resources, 'chunks', to be handed out to
consumers.  Whenever a consumer allocates a resource it needs to pin the
'struct dev_pagemap' instance that backs the chunk selected by
pci_alloc_p2pmem().

Currently that reference is taken globally on the entire provider
device.  That sets up a lifetime mismatch whereby the p2pdma core needs
to maintain hacks to make sure the percpu_ref is not released twice.

This lifetime mismatch also stands in the way of a fix to
devm_memremap_pages() whereby devm_memremap_pages_release() must wait for
the percpu_ref ->release() callback to complete before it can proceed to
teardown pages.

So, towards fixing this situation, introduce the ability to store a 'chunk
owner' at gen_pool_add() time, and a facility to retrieve the owner at
gen_pool_{alloc,free}() time.  For p2pdma this will be used to store and
recall individual dev_pagemap reference counter instances per-chunk.

Link: http://lkml.kernel.org/r/155727338118.292046.13407378933221579644.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Dan Williams
e615a19121 PCI/P2PDMA: fix the gen_pool_add_virt() failure path
The pci_p2pdma_add_resource() implementation immediately frees the pgmap
if gen_pool_add_virt() fails.  However, that means that when @dev
triggers a devres release devm_memremap_pages_release() will crash
trying to access the freed @pgmap.

Use the new devm_memunmap_pages() to manually free the mapping in the
error path.

Link: http://lkml.kernel.org/r/155727337603.292046.13101332703665246702.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fixes: 52916982af ("PCI/P2PDMA: Support peer-to-peer memory")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Dan Williams
2e3f139e8e mm/devm_memremap_pages: introduce devm_memunmap_pages
Use the new devm_release_action() facility to allow
devm_memremap_pages_release() to be manually triggered.

Link: http://lkml.kernel.org/r/155727337088.292046.5774214552136776763.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Dan Williams
2374b68225 drivers/base/devres: introduce devm_release_action()
Patch series "mm/devm_memremap_pages: Fix page release race", v2.

Logan audited the devm_memremap_pages() shutdown path and noticed that
it was possible to proceed to arch_remove_memory() before all potential
page references have been reaped.

Introduce a new ->cleanup() callback to do the work of waiting for any
straggling page references and then perform the percpu_ref_exit() in
devm_memremap_pages_release() context.

For p2pdma this involves some deeper reworks to reference count
resources on a per-instance basis rather than a per pci-device basis.  A
modified genalloc api is introduced to convey a driver-private pointer
through gen_pool_{alloc,free}() interfaces.  Also, a
devm_memunmap_pages() api is introduced since p2pdma does not
auto-release resources on a setup failure.

The dax and pmem changes pass the nvdimm unit tests, and the p2pdma
changes should now pass testing with the pci_p2pdma_release() fix.
Jrme, how does this look for HMM?

This patch (of 6):

The devm_add_action() facility allows a resource allocation routine to
add custom devm semantics.  One such user is devm_memremap_pages().

There is now a need to manually trigger
devm_memremap_pages_release().  Introduce devm_release_action() so the
release action can be triggered via a new devm_memunmap_pages() api in a
follow-on change.

Link: http://lkml.kernel.org/r/155727336530.292046.2926860263201336366.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Minchan Kim
a58f2cef26 mm/vmscan.c: fix trying to reclaim unevictable LRU page
There was the below bug report from Wu Fangsuo.

On the CMA allocation path, isolate_migratepages_range() could isolate
unevictable LRU pages and reclaim_clean_page_from_list() can try to
reclaim them if they are clean file-backed pages.

  page:ffffffbf02f33b40 count:86 mapcount:84 mapping:ffffffc08fa7a810 index:0x24
  flags: 0x19040c(referenced|uptodate|arch_1|mappedtodisk|unevictable|mlocked)
  raw: 000000000019040c ffffffc08fa7a810 0000000000000024 0000005600000053
  raw: ffffffc009b05b20 ffffffc009b05b20 0000000000000000 ffffffc09bf3ee80
  page dumped because: VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page))
  page->mem_cgroup:ffffffc09bf3ee80
  ------------[ cut here ]------------
  kernel BUG at /home/build/farmland/adroid9.0/kernel/linux/mm/vmscan.c:1350!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 7125 Comm: syz-executor Tainted: G S              4.14.81 #3
  Hardware name: ASR AQUILAC EVB (DT)
  task: ffffffc00a54cd00 task.stack: ffffffc009b00000
  PC is at shrink_page_list+0x1998/0x3240
  LR is at shrink_page_list+0x1998/0x3240
  pc : [<ffffff90083a2158>] lr : [<ffffff90083a2158>] pstate: 60400045
  sp : ffffffc009b05940
  ..
     shrink_page_list+0x1998/0x3240
     reclaim_clean_pages_from_list+0x3c0/0x4f0
     alloc_contig_range+0x3bc/0x650
     cma_alloc+0x214/0x668
     ion_cma_allocate+0x98/0x1d8
     ion_alloc+0x200/0x7e0
     ion_ioctl+0x18c/0x378
     do_vfs_ioctl+0x17c/0x1780
     SyS_ioctl+0xac/0xc0

Wu found it's due to commit ad6b67041a ("mm: remove SWAP_MLOCK in
ttu").  Before that, unevictable pages go to cull_mlocked so that we
can't reach the VM_BUG_ON_PAGE line.

To fix the issue, this patch filters out unevictable LRU pages from the
reclaim_clean_pages_from_list in CMA.

Link: http://lkml.kernel.org/r/20190524071114.74202-1-minchan@kernel.org
Fixes: ad6b67041a ("mm: remove SWAP_MLOCK in ttu")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Wu Fangsuo <fangsuowu@asrmicro.com>
Debugged-by: Wu Fangsuo <fangsuowu@asrmicro.com>
Tested-by: Wu Fangsuo <fangsuowu@asrmicro.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Pankaj Suryawanshi <pankaj.suryawanshi@einfochips.com>
Cc: <stable@vger.kernel.org>	[4.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Andrea Arcangeli
59ea6d06cf coredump: fix race condition between collapse_huge_page() and core dumping
When fixing the race conditions between the coredump and the mmap_sem
holders outside the context of the process, we focused on
mmget_not_zero()/get_task_mm() callers in 04f5866e41 ("coredump: fix
race condition between mmget_not_zero()/get_task_mm() and core
dumping"), but those aren't the only cases where the mmap_sem can be
taken outside of the context of the process as Michal Hocko noticed
while backporting that commit to older -stable kernels.

If mmgrab() is called in the context of the process, but then the
mm_count reference is transferred outside the context of the process,
that can also be a problem if the mmap_sem has to be taken for writing
through that mm_count reference.

khugepaged registration calls mmgrab() in the context of the process,
but the mmap_sem for writing is taken later in the context of the
khugepaged kernel thread.

collapse_huge_page() after taking the mmap_sem for writing doesn't
modify any vma, so it's not obvious that it could cause a problem to the
coredump, but it happens to modify the pmd in a way that breaks an
invariant that pmd_trans_huge_lock() relies upon.  collapse_huge_page()
needs the mmap_sem for writing just to block concurrent page faults that
call pmd_trans_huge_lock().

Specifically the invariant that "!pmd_trans_huge()" cannot become a
"pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.

The coredump will call __get_user_pages() without mmap_sem for reading,
which eventually can invoke a lockless page fault which will need a
functional pmd_trans_huge_lock().

So collapse_huge_page() needs to use mmget_still_valid() to check it's
not running concurrently with the coredump...  as long as the coredump
can invoke page faults without holding the mmap_sem for reading.

This has "Fixes: khugepaged" to facilitate backporting, but in my view
it's more a bug in the coredump code that will eventually have to be
rewritten to stop invoking page faults without the mmap_sem for reading.
So the long term plan is still to drop all mmget_still_valid().

Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com
Fixes: ba76149f47 ("thp: khugepaged")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
swkhack
0874bb49bb mm/mlock.c: change count_mm_mlocked_page_nr return type
On a 64-bit machine the value of "vma->vm_end - vma->vm_start" may be
negative when using 32 bit ints and the "count >> PAGE_SHIFT"'s result
will be wrong.  So change the local variable and return value to
unsigned long to fix the problem.

Link: http://lkml.kernel.org/r/20190513023701.83056-1-swkhack@gmail.com
Fixes: 0cf2f6f6dc ("mm: mlock: check against vma for actual mlock() size")
Signed-off-by: swkhack <swkhack@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Yang Shi
7a30df49f6 mm: mmu_gather: remove __tlb_reset_range() for force flush
A few new fields were added to mmu_gather to make TLB flush smarter for
huge page by telling what level of page table is changed.

__tlb_reset_range() is used to reset all these page table state to
unchanged, which is called by TLB flush for parallel mapping changes for
the same range under non-exclusive lock (i.e.  read mmap_sem).

Before commit dd2283f260 ("mm: mmap: zap pages with read mmap_sem in
munmap"), the syscalls (e.g.  MADV_DONTNEED, MADV_FREE) which may update
PTEs in parallel don't remove page tables.  But, the forementioned
commit may do munmap() under read mmap_sem and free page tables.  This
may result in program hang on aarch64 reported by Jan Stancek.  The
problem could be reproduced by his test program with slightly modified
below.

---8<---

static int map_size = 4096;
static int num_iter = 500;
static long threads_total;

static void *distant_area;

void *map_write_unmap(void *ptr)
{
	int *fd = ptr;
	unsigned char *map_address;
	int i, j = 0;

	for (i = 0; i < num_iter; i++) {
		map_address = mmap(distant_area, (size_t) map_size, PROT_WRITE | PROT_READ,
			MAP_SHARED | MAP_ANONYMOUS, -1, 0);
		if (map_address == MAP_FAILED) {
			perror("mmap");
			exit(1);
		}

		for (j = 0; j < map_size; j++)
			map_address[j] = 'b';

		if (munmap(map_address, map_size) == -1) {
			perror("munmap");
			exit(1);
		}
	}

	return NULL;
}

void *dummy(void *ptr)
{
	return NULL;
}

int main(void)
{
	pthread_t thid[2];

	/* hint for mmap in map_write_unmap() */
	distant_area = mmap(0, DISTANT_MMAP_SIZE, PROT_WRITE | PROT_READ,
			MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	munmap(distant_area, (size_t)DISTANT_MMAP_SIZE);
	distant_area += DISTANT_MMAP_SIZE / 2;

	while (1) {
		pthread_create(&thid[0], NULL, map_write_unmap, NULL);
		pthread_create(&thid[1], NULL, dummy, NULL);

		pthread_join(thid[0], NULL);
		pthread_join(thid[1], NULL);
	}
}
---8<---

The program may bring in parallel execution like below:

        t1                                        t2
munmap(map_address)
  downgrade_write(&mm->mmap_sem);
  unmap_region()
  tlb_gather_mmu()
    inc_tlb_flush_pending(tlb->mm);
  free_pgtables()
    tlb->freed_tables = 1
    tlb->cleared_pmds = 1

                                        pthread_exit()
                                        madvise(thread_stack, 8M, MADV_DONTNEED)
                                          zap_page_range()
                                            tlb_gather_mmu()
                                              inc_tlb_flush_pending(tlb->mm);

  tlb_finish_mmu()
    if (mm_tlb_flush_nested(tlb->mm))
      __tlb_reset_range()

__tlb_reset_range() would reset freed_tables and cleared_* bits, but this
may cause inconsistency for munmap() which do free page tables.  Then it
may result in some architectures, e.g.  aarch64, may not flush TLB
completely as expected to have stale TLB entries remained.

Use fullmm flush since it yields much better performance on aarch64 and
non-fullmm doesn't yields significant difference on x86.

The original proposed fix came from Jan Stancek who mainly debugged this
issue, I just wrapped up everything together.

Jan's testing results:

v5.2-rc2-24-gbec7550cca10
--------------------------
         mean     stddev
real    37.382   2.780
user     1.420   0.078
sys     54.658   1.855

v5.2-rc2-24-gbec7550cca10 + "mm: mmu_gather: remove __tlb_reset_range() for force flush"
---------------------------------------------------------------------------------------_
         mean     stddev
real    37.119   2.105
user     1.548   0.087
sys     55.698   1.357

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/1558322252-113575-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Suggested-by: Will Deacon <will.deacon@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org>	[4.20+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Wengang Wang
be99ca2716 fs/ocfs2: fix race in ocfs2_dentry_attach_lock()
ocfs2_dentry_attach_lock() can be executed in parallel threads against the
same dentry.  Make that race safe.  The race is like this:

            thread A                               thread B

(A1) enter ocfs2_dentry_attach_lock,
seeing dentry->d_fsdata is NULL,
and no alias found by
ocfs2_find_local_alias, so kmalloc
a new ocfs2_dentry_lock structure
to local variable "dl", dl1

               .....

                                    (B1) enter ocfs2_dentry_attach_lock,
                                    seeing dentry->d_fsdata is NULL,
                                    and no alias found by
                                    ocfs2_find_local_alias so kmalloc
                                    a new ocfs2_dentry_lock structure
                                    to local variable "dl", dl2.

                                                   ......

(A2) set dentry->d_fsdata with dl1,
call ocfs2_dentry_lock() and increase
dl1->dl_lockres.l_ro_holders to 1 on
success.
              ......

                                    (B2) set dentry->d_fsdata with dl2
                                    call ocfs2_dentry_lock() and increase
				    dl2->dl_lockres.l_ro_holders to 1 on
				    success.

                                                  ......

(A3) call ocfs2_dentry_unlock()
and decrease
dl2->dl_lockres.l_ro_holders to 0
on success.
             ....

                                    (B3) call ocfs2_dentry_unlock(),
                                    decreasing
				    dl2->dl_lockres.l_ro_holders, but
				    see it's zero now, panic

Link: http://lkml.kernel.org/r/20190529174636.22364-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reported-by: Daniel Sobe <daniel.sobe@nxp.com>
Tested-by: Daniel Sobe <daniel.sobe@nxp.com>
Reviewed-by: Changwei Ge <gechangwei@live.cn>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Kirill Tkhai
b17f18aff2 mm/vmscan.c: fix recent_rotated history
Johannes pointed out that after commit 886cf1901d ("mm: move
recent_rotated pages calculation to shrink_inactive_list()") we lost all
zone_reclaim_stat::recent_rotated history.

This fixes it.

Link: http://lkml.kernel.org/r/155905972210.26456.11178359431724024112.stgit@localhost.localdomain
Fixes: 886cf1901d ("mm: move recent_rotated pages calculation to shrink_inactive_list()")
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Potyra, Stefan
dedca63504 mm/mlock.c: mlockall error for flag MCL_ONFAULT
If mlockall() is called with only MCL_ONFAULT as flag, it removes any
previously applied lockings and does nothing else.

This behavior is counter-intuitive and doesn't match the Linux man page.

  For mlockall():

  EINVAL Unknown flags were specified or MCL_ONFAULT was specified
  without either MCL_FUTURE or MCL_CURRENT.

Consequently, return the error EINVAL, if only MCL_ONFAULT is passed.
That way, applications will at least detect that they are calling
mlockall() incorrectly.

Link: http://lkml.kernel.org/r/20190527075333.GA6339@er01809n.ebgroup.elektrobit.com
Fixes: b0f205c2a3 ("mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage")
Signed-off-by: Stefan Potyra <Stefan.Potyra@elektrobit.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Manuel Traut
c04e32e911 scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE
At least for ARM64 kernels compiled with the crosstoolchain from
Debian/stretch or with the toolchain from kernel.org the line number is
not decoded correctly by 'decode_stacktrace.sh':

  $ echo "[  136.513051]  f1+0x0/0xc [kcrash]" | \
    CROSS_COMPILE=/opt/gcc-8.1.0-nolibc/aarch64-linux/bin/aarch64-linux- \
   ./scripts/decode_stacktrace.sh /scratch/linux-arm64/vmlinux \
                                  /scratch/linux-arm64 \
                                  /nfs/debian/lib/modules/4.20.0-devel
  [  136.513051] f1 (/linux/drivers/staging/kcrash/kcrash.c:68) kcrash

If addr2line from the toolchain is used the decoded line number is correct:

  [  136.513051] f1 (/linux/drivers/staging/kcrash/kcrash.c:57) kcrash

Link: http://lkml.kernel.org/r/20190527083425.3763-1-manut@linutronix.de
Signed-off-by: Manuel Traut <manut@linutronix.de>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Shakeel Butt
3510955b32 mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node
Syzbot reported following memory leak:

ffffffffda RBX: 0000000000000003 RCX: 0000000000441f79
BUG: memory leak
unreferenced object 0xffff888114f26040 (size 32):
  comm "syz-executor626", pid 7056, jiffies 4294948701 (age 39.410s)
  hex dump (first 32 bytes):
    40 60 f2 14 81 88 ff ff 40 60 f2 14 81 88 ff ff  @`......@`......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
     slab_post_alloc_hook mm/slab.h:439 [inline]
     slab_alloc mm/slab.c:3326 [inline]
     kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
     kmalloc include/linux/slab.h:547 [inline]
     __memcg_init_list_lru_node+0x58/0xf0 mm/list_lru.c:352
     memcg_init_list_lru_node mm/list_lru.c:375 [inline]
     memcg_init_list_lru mm/list_lru.c:459 [inline]
     __list_lru_init+0x193/0x2a0 mm/list_lru.c:626
     alloc_super+0x2e0/0x310 fs/super.c:269
     sget_userns+0x94/0x2a0 fs/super.c:609
     sget+0x8d/0xb0 fs/super.c:660
     mount_nodev+0x31/0xb0 fs/super.c:1387
     fuse_mount+0x2d/0x40 fs/fuse/inode.c:1236
     legacy_get_tree+0x27/0x80 fs/fs_context.c:661
     vfs_get_tree+0x2e/0x120 fs/super.c:1476
     do_new_mount fs/namespace.c:2790 [inline]
     do_mount+0x932/0xc50 fs/namespace.c:3110
     ksys_mount+0xab/0x120 fs/namespace.c:3319
     __do_sys_mount fs/namespace.c:3333 [inline]
     __se_sys_mount fs/namespace.c:3330 [inline]
     __x64_sys_mount+0x26/0x30 fs/namespace.c:3330
     do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

This is a simple off by one bug on the error path.

Link: http://lkml.kernel.org/r/20190528043202.99980-1-shakeelb@google.com
Fixes: 60d3fd32a7 ("list_lru: introduce per-memcg lists")
Reported-by: syzbot+f90a420dfe2b1b03cb2c@syzkaller.appspotmail.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Johannes Weiner
815744d751 mm: memcontrol: don't batch updates of local VM stats and events
The kernel test robot noticed a 26% will-it-scale pagefault regression
from commit 42a3003535 ("mm: memcontrol: fix recursive statistics
correctness & scalabilty").  This appears to be caused by bouncing the
additional cachelines from the new hierarchical statistics counters.

We can fix this by getting rid of the batched local counters instead.

Originally, there were *only* group-local counters, and they were fully
maintained per cpu.  A reader of a stats file high up in the cgroup tree
would have to walk the entire subtree and collect each level's per-cpu
counters to get the recursive view.  This was prohibitively expensive,
and so we switched to per-cpu batched updates of the local counters
during a983b5ebee ("mm: memcontrol: fix excessive complexity in
memory.stat reporting"), reducing the complexity from nr_subgroups *
nr_cpus to nr_subgroups.

With growing machines and cgroup trees, the tree walk itself became too
expensive for monitoring top-level groups, and this is when the culprit
patch added hierarchy counters on each cgroup level.  When the per-cpu
batch size would be reached, both the local and the hierarchy counters
would get batch-updated from the per-cpu delta simultaneously.

This makes local and hierarchical counter reads blazingly fast, but it
unfortunately makes the write-side too cache line intense.

Since local counter reads were never a problem - we only centralized
them to accelerate the hierarchy walk - and use of the local counters
are becoming rarer due to replacement with hierarchical views (ongoing
rework in the page reclaim and workingset code), we can make those local
counters unbatched per-cpu counters again.

The scheme will then be as such:

   when a memcg statistic changes, the writer will:
   - update the local counter (per-cpu)
   - update the batch counter (per-cpu). If the batch is full:
   - spill the batch into the group's atomic_t
   - spill the batch into all ancestors' atomic_ts
   - empty out the batch counter (per-cpu)

   when a local memcg counter is read, the reader will:
   - collect the local counter from all cpus

   when a hiearchy memcg counter is read, the reader will:
   - read the atomic_t

We might be able to simplify this further and make the recursive
counters unbatched per-cpu counters as well (batch upward propagation,
but leave per-cpu collection to the readers), but that will require a
more in-depth analysis and testing of all the callsites.  Deal with the
immediate regression for now.

Link: http://lkml.kernel.org/r/20190521151647.GB2870@cmpxchg.org
Fixes: 42a3003535 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: kernel test robot <rong.a.chen@intel.com>
Tested-by: kernel test robot <rong.a.chen@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Linus Torvalds
c11fb13a11 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:

 - regression fixes (reverts) for module loading changes that turned out
   to be incompatible with some userspace, from Benjamin Tissoires

 - regression fix for special Logitech unifiying receiver 0xc52f, from
   Hans de Goede

 - a few device ID additions to logitech driver, from Hans de Goede

 - fix for Bluetooth support on 2nd-gen Wacom Intuos Pro, from Jason
   Gerecke

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: logitech-dj: Fix 064d:c52f receiver support
  Revert "HID: core: Call request_module before doing device_add"
  Revert "HID: core: Do not call request_module() in async context"
  Revert "HID: Increase maximum report size allowed by hid_field_extract()"
  HID: a4tech: fix horizontal scrolling
  HID: hyperv: Add a module description line
  HID: logitech-hidpp: Add support for the S510 remote control
  HID: multitouch: handle faulty Elo touch device
  HID: wacom: Sync INTUOSP2_BT touch state after each frame if necessary
  HID: wacom: Correct button numbering 2nd-gen Intuos Pro over Bluetooth
  HID: wacom: Send BTN_TOUCH in response to INTUOSP2_BT eraser contact
  HID: wacom: Don't report anything prior to the tool entering range
  HID: wacom: Don't set tool type until we're in range
  HID: rmi: Use SET_REPORT request on control endpoint for Acer Switch 3 and 5
  HID: logitech-hidpp: add support for the MX5500 keyboard
  HID: logitech-dj: add support for the Logitech MX5500's Bluetooth Mini-Receiver
  HID: i2c-hid: add iBall Aer3 to descriptor override
2019-06-13 05:59:05 -10:00
Linus Torvalds
b076173a30 Merge tag 'selinux-pr-20190612' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fixes from Paul Moore:
 "Three patches for v5.2.

  One fixes a problem where we weren't correctly logging raw SELinux
  labels, the other two fix problems where we weren't properly checking
  calls to kmemdup()"

* tag 'selinux-pr-20190612' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix a missing-check bug in selinux_sb_eat_lsm_opts()
  selinux: fix a missing-check bug in selinux_add_mnt_opt( )
  selinux: log raw contexts as untrusted strings
2019-06-12 16:10:57 -10:00
Gen Zhang
fec6375320 selinux: fix a missing-check bug in selinux_sb_eat_lsm_opts()
In selinux_sb_eat_lsm_opts(), 'arg' is allocated by kmemdup_nul(). It
returns NULL when fails. So 'arg' should be checked. And 'mnt_opts'
should be freed when error.

Signed-off-by: Gen Zhang <blackgod016574@gmail.com>
Fixes: 99dbbb593f ("selinux: rewrite selinux_sb_eat_lsm_opts()")
Cc: <stable@vger.kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-06-12 12:27:26 -04:00
Linus Torvalds
35110e38e6 Merge tag 'media/v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:

 - a debug warning for satellite tuning at dvb core was producing too
   much noise

 - a regression at hfi_parser on Venus driver

* tag 'media/v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: venus: hfi_parser: fix a regression in parser
  media: dvb: warning about dvb frequency limits produces too much noise
2019-06-12 05:57:05 -10:00
Gen Zhang
e2e0e09758 selinux: fix a missing-check bug in selinux_add_mnt_opt( )
In selinux_add_mnt_opt(), 'val' is allocated by kmemdup_nul(). It returns
NULL when fails. So 'val' should be checked. And 'mnt_opts' should be
freed when error.

Signed-off-by: Gen Zhang <blackgod016574@gmail.com>
Fixes: 757cbe597f ("LSM: new method: ->sb_add_mnt_opt()")
Cc: <stable@vger.kernel.org>
[PM: fixed some indenting problems]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-06-12 11:39:38 -04:00
Linus Torvalds
aa7235483a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull ptrace fixes from Eric Biederman:
 "This is just two very minor fixes:

   - prevent ptrace from reading unitialized kernel memory found twice
     by syzkaller

   - restore a missing smp_rmb in ptrace_may_access and add comment tp
     it so it is not removed by accident again.

  Apologies for being a little slow about getting this to you, I am
  still figuring out how to develop with a little baby in the house"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ptrace: restore smp_rmb() in __ptrace_may_access()
  signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO
2019-06-11 15:44:45 -10:00
Linus Torvalds
4d8f5f91b8 Merge branch 'stable/for-linus-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb fix from Konrad Rzeszutek Wilk:
 "One tiny fix for ARM64 where we could allocate the SWIOTLB twice"

* 'stable/for-linus-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  xen/swiotlb: don't initialize swiotlb twice on arm64
2019-06-11 15:38:34 -10:00
Linus Torvalds
c23b07125f Merge tag 'vfio-v5.2-rc5' of git://github.com/awilliam/linux-vfio
Pull VFIO fixes from Alex Williamson:
 "Fix mdev device create/remove paths to provide initialized device for
  parent driver create callback and correct ordering of device removal
  from bus prior to initiating removal by parent.

  Also resolve races between parent removal and device create/remove
  paths (all from Parav Pandit)"

* tag 'vfio-v5.2-rc5' of git://github.com/awilliam/linux-vfio:
  vfio/mdev: Synchronize device create/remove with parent removal
  vfio/mdev: Avoid creating sysfs remove file on stale device removal
  vfio/mdev: Improve the create/remove sequence
2019-06-11 15:27:57 -10:00
Linus Torvalds
6fa425a265 Merge tag 'for-5.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
 "One regression fix to TRIM ioctl.

  The range cannot be used as its meaning can be confusing regarding
  physical and logical addresses. This confusion in code led to
  potential corruptions when the range overlapped data.

  The original patch made it to several stable kernels and was promptly
  reverted, the version for master branch is different due to additional
  changes but the change is effectively the same"

* tag 'for-5.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: Always trim all unallocated space in btrfs_trim_free_extents
2019-06-11 15:10:15 -10:00
Ondrej Mosnacek
aff7ed4851 selinux: log raw contexts as untrusted strings
These strings may come from untrusted sources (e.g. file xattrs) so they
need to be properly escaped.

Reproducer:
    # setenforce 0
    # touch /tmp/test
    # setfattr -n security.selinux -v 'kuřecí řízek' /tmp/test
    # runcon system_u:system_r:sshd_t:s0 cat /tmp/test
    (look at the generated AVCs)

Actual result:
    type=AVC [...] trawcon=kuřecí řízek

Expected result:
    type=AVC [...] trawcon=6B75C5996563C3AD20C599C3AD7A656B

Fixes: fede148324 ("selinux: log invalid contexts in AVCs")
Cc: stable@vger.kernel.org # v5.1+
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-06-11 18:35:51 -04:00
Jann Horn
f6581f5b55 ptrace: restore smp_rmb() in __ptrace_may_access()
Restore the read memory barrier in __ptrace_may_access() that was deleted
a couple years ago. Also add comments on this barrier and the one it pairs
with to explain why they're there (as far as I understand).

Fixes: bfedb58925 ("mm: Add a user_ns owner to mm_struct and fix ptrace permission checks")
Cc: stable@vger.kernel.org
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2019-06-11 15:08:28 -05:00
Linus Torvalds
01ccc3ad44 Merge tag 'for-linus-20190610' of git://git.kernel.dk/linux-block
Pull block cgroup symlink revert from Jens Axboe:
 "I talked to Tejun about this offline, and he's not a huge fan of the
  symlink.

  So let's revert this for now, and Paolo can do this properly for 5.3
  instead"

* tag 'for-linus-20190610' of git://git.kernel.dk/linux-block:
  cgroup/bfq: revert bfq.weight symlink change
2019-06-10 07:43:30 -10:00
Linus Torvalds
5e3b6b8ecc Merge tag 'regulator-fix-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
 "Just one driver specific fix here, for a boot regression introduced
  during some modernization work on the tps6507x driver"

* tag 'regulator-fix-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: tps6507x: Fix boot regression due to testing wrong init_data pointer
2019-06-10 07:35:55 -10:00
Linus Torvalds
e59bf4282c Merge tag 'spi-fix-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A small set of fixes here.

  One core fix for error handling when we fail to set up the hardware
  before initiating a transfer and another one reverting a change in the
  core which broke Raspberry Pi in common use cases as part of some
  optimization work.

  There's also a couple of driver specific fixes"

* tag 'spi-fix-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: abort spi_sync if failed to prepare_transfer_hardware
  spi: spi-fsl-spi: call spi_finalize_current_message() at the end
  spi: bitbang: Fix NULL pointer dereference in spi_unregister_master
  spi: Fix Raspberry Pi breakage
2019-06-10 07:19:56 -10:00
Jens Axboe
cf8929885d cgroup/bfq: revert bfq.weight symlink change
There's some discussion on how to do this the best, and Tejun prefers
that BFQ just create the file itself instead of having cgroups support
a symlink feature.

Hence revert commit 54b7b868e8 and 19e9da9e86 for 5.2, and this
can be done properly for 5.3.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-10 03:35:41 -06:00
Linus Torvalds
d1fdb6d8f6 Linux 5.2-rc4 v5.2-rc4 2019-06-08 20:24:46 -07:00
Linus Torvalds
2759e05cdb Merge tag 'ceph-for-5.2-rc4' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
 "A change to call iput() asynchronously to avoid a possible deadlock
  when iput_final() needs to wait for in-flight I/O (e.g. readahead) and
  a fixup for a cleanup that went into -rc1"

* tag 'ceph-for-5.2-rc4' of git://github.com/ceph/ceph-client:
  ceph: fix error handling in ceph_get_caps()
  ceph: avoid iput_final() while holding mutex or in dispatch thread
  ceph: single workqueue for inode related works
2019-06-08 15:57:35 -07:00
Linus Torvalds
8e61f6f7c3 Merge tag 'for-linus-5.2b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
 "Just one fix for the Xen block frontend driver avoiding allocations
  with order > 0"

* tag 'for-linus-5.2b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen-blkfront: switch kcalloc to kvcalloc for large array allocation
2019-06-08 13:16:05 -07:00
Linus Torvalds
3d4645bf7a Merge tag 's390-5.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:

 - fix stack unwinder: the stack unwinder rework has on off-by-one bug
   which prevents following stack backchains over more than one context
   (e.g. irq -> process).

 - fix address space detection in exception handler: if user space
   switches to access register mode, which is not supported anymore, the
   exception handler may resolve to the wrong address space.

* tag 's390-5.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/unwind: correct stack switching during unwind
  s390/mm: fix address space detection in exception handling
2019-06-08 13:12:54 -07:00
Linus Torvalds
d0cc617aff Merge tag 'mips_fixes_5.2_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:

 - Declare ginvt() __always_inline due to its use of an argument as an
   inline asm immediate.

 - A VDSO build fix following Kbuild changes made this cycle.

 - A fix for boot failures on txx9 systems following memory
   initialization changes made this cycle.

 - Bounds check virt_addr_valid() to prevent it spuriously indicating
   that bogus addresses are valid, in turn fixing hardened usercopy
   failures that have been present since v4.12.

 - Build uImage.gz for pistachio systems by default, since this is the
   image we need in order to actually boot on a board.

 - Remove an unused variable in our uprobes code.

* tag 'mips_fixes_5.2_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: uprobes: remove set but not used variable 'epc'
  MIPS: pistachio: Build uImage.gz by default
  MIPS: Make virt_addr_valid() return bool
  MIPS: Bounds check virt_addr_valid
  MIPS: TXx9: Fix boot crash in free_initmem()
  MIPS: remove a space after -I to cope with header search paths for VDSO
  MIPS: mark ginvt() as __always_inline
2019-06-08 13:09:31 -07:00
Linus Torvalds
9331b6740f Merge tag 'spdx-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull yet more SPDX updates from Greg KH:
 "Another round of SPDX header file fixes for 5.2-rc4

  These are all more "GPL-2.0-or-later" or "GPL-2.0-only" tags being
  added, based on the text in the files. We are slowly chipping away at
  the 700+ different ways people tried to write the license text. All of
  these were reviewed on the spdx mailing list by a number of different
  people.

  We now have over 60% of the kernel files covered with SPDX tags:
	$ ./scripts/spdxcheck.py -v 2>&1 | grep Files
	Files checked:            64533
	Files with SPDX:          40392
	Files with errors:            0

  I think the majority of the "easy" fixups are now done, it's now the
  start of the longer-tail of crazy variants to wade through"

* tag 'spdx-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (159 commits)
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 450
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 449
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 448
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 446
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 445
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 444
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 443
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 442
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 440
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 438
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 437
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 436
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 435
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 434
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 433
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 432
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 431
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 430
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 429
  ...
2019-06-08 12:52:42 -07:00
Linus Torvalds
1ce2c85137 Merge tag 'char-misc-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
 "Here are some small char and misc driver fixes for 5.2-rc4 to resolve
  a number of reported issues.

  The most "notable" one here is the kernel headers in proc^Wsysfs
  fixes. Those changes move the header file info into sysfs and fixes
  the build issues that you reported.

  Other than that, a bunch of small habanalabs driver fixes, some fpga
  driver fixes, and a few other tiny driver fixes.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  habanalabs: Read upper bits of trace buffer from RWPHI
  habanalabs: Fix virtual address access via debugfs for 2MB pages
  fpga: zynqmp-fpga: Correctly handle error pointer
  habanalabs: fix bug in checking huge page optimization
  habanalabs: Avoid using a non-initialized MMU cache mutex
  habanalabs: fix debugfs code
  uapi/habanalabs: add opcode for enable/disable device debug mode
  habanalabs: halt debug engines on user process close
  test_firmware: Use correct snprintf() limit
  genwqe: Prevent an integer overflow in the ioctl
  parport: Fix mem leak in parport_register_dev_model
  fpga: dfl: expand minor range when registering chrdev region
  fpga: dfl: Add lockdep classes for pdata->lock
  fpga: dfl: afu: Pass the correct device to dma_mapping_error()
  fpga: stratix10-soc: fix use-after-free on s10_init()
  w1: ds2408: Fix typo after 49695ac468 (reset on output_write retry with readback)
  kheaders: Do not regenerate archive if config is not changed
  kheaders: Move from proc to sysfs
  lkdtm/bugs: Adjust recursion test to avoid elision
  lkdtm/usercopy: Moves the KERNEL_DS test to non-canonical
2019-06-08 12:50:36 -07:00
Linus Torvalds
902b2edfca Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "I2C has a driver bugfix and a MAINTAINERS fix"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  MAINTAINERS: Karthikeyan Ramasubramanian is MIA
  i2c: xiic: Add max_read_len quirk
2019-06-08 12:48:49 -07:00
Linus Torvalds
66b59f2b5e Merge tag 'dmaengine-fix-5.2-rc4' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:

 - jz4780 transfer fix for acking descriptors early

 - fsl-qdma: clean registers on error

 - dw-axi-dmac: null pointer dereference fix

 - mediatek-cqdma: fix sleeping in atomic context

 - tegra210-adma: fix bunch os issues like crashing in driver probe,
   channel FIFO configuration etc.

 - sprd: Fixes for possible crash on descriptor status, block length
   overflow. For 2-stage transfer fix incorrect start, configuration and
   interrupt handling.

* tag 'dmaengine-fix-5.2-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: sprd: Add interrupt support for 2-stage transfer
  dmaengine: sprd: Fix the right place to configure 2-stage transfer
  dmaengine: sprd: Fix block length overflow
  dmaengine: sprd: Fix the incorrect start for 2-stage destination channels
  dmaengine: sprd: Add validation of current descriptor in irq handler
  dmaengine: sprd: Fix the possible crash when getting descriptor status
  dmaengine: tegra210-adma: Fix spelling
  dmaengine: tegra210-adma: Fix channel FIFO configuration
  dmaengine: tegra210-adma: Fix crash during probe
  dmaengine: mediatek-cqdma: sleeping in atomic context
  dmaengine: dw-axi-dmac: fix null dereference when pointer first is null
  dmaengine: fsl-qdma: Add improvement
  dmaengine: jz4780: Fix transfers being ACKed too soon
2019-06-08 12:46:31 -07:00
Linus Torvalds
8d72e5bd86 Merge tag 'for-linus-20190608' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Allow symlink from the bfq.weight cgroup parameter to the general
   weight (Angelo)

 - Damien is new skd maintainer (Bart)

 - NVMe pull request from Sagi, with a few small fixes.

 - Ensure we set DMA segment size properly, dma-debug is now tripping on
   these (Christoph)

 - Remove useless debugfs_create() return check (Greg)

 - Remove redundant unlikely() check on IS_ERR() (Kefeng)

 - Fixup request freeing on exit (Ming)

* tag 'for-linus-20190608' of git://git.kernel.dk/linux-block:
  block, bfq: add weight symlink to the bfq.weight cgroup parameter
  cgroup: let a symlink too be created with a cftype file
  block: free sched's request pool in blk_cleanup_queue
  nvme-rdma: use dynamic dma mapping per command
  nvme: Fix u32 overflow in the number of namespace list calculation
  mmc: also set max_segment_size in the device
  mtip32xx: also set max_segment_size in the device
  rsxx: don't call dma_set_max_seg_size
  nvme-pci: don't limit DMA segement size
  block: Drop unlikely before IS_ERR(_OR_NULL)
  block: aoe: no need to check return value of debugfs_create functions
  nvmet: fix data_len to 0 for bdev-backed write_zeroes
  MAINTAINERS: Hand over skd maintainership
  nvme-tcp: fix queue mapping when queue count is limited
  nvme-rdma: fix queue mapping when queue count is limited
2019-06-08 12:12:11 -07:00
Linus Torvalds
1b02caa319 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Two bug fixes, both for fairly serious problems; the UFS one looks
  like it could be used to exfiltrate data from the kernel, although
  probably only a privileged user has access to the command management
  interface and the missing unlock in smartpqi is long standing and
  probably a little used error path"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: smartpqi: unlock on error in pqi_submit_raid_request_synchronous()
  scsi: ufs: Check that space was properly alloced in copy_query_response
2019-06-08 11:54:17 -07:00
Linus Torvalds
0ad43e29b6 Merge tag 'linux-kselftest-5.2-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fix from Shuah Khan:
 "This consists of a single fix for a vm test build failure regression
  when it is built by itself"

* tag 'linux-kselftest-5.2-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: vm: Fix test build failure when built by itself
2019-06-08 10:57:32 -07:00
Linus Torvalds
79c3ba3206 Merge tag 'drm-fixes-2019-06-07-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "A small bit more lively this week but not majorly so. I'm away in
  Japan next week for family holiday, so I'll be pretty disconnected,
  I've asked Daniel to do fixes for the week while I'm out.

  The nouveau firmware changes are a bit large, but they address a big
  problem where a whole set of boards don't load with the driver, and
  the new firmware fixes that, so I think it's worth trying to land it
  now.

  core:
   - Allow fb changes in async commits (drivers as well)

  udmabuf:
   - Unmap scatterlist when unmapping udmabuf

  nouveau:
   - firmware loading fixes for secboot firmware on new GPU revision.

  komeda:
   - oops, dma mapping and warning fixes

  arm-hdlcd:
   - clock fixes
   - mode validation fix

  i915:
   - Add a missing Icelake workaround
   - GVT - DMA map fault fix and enforcement fixes

  amdgpu:
   - DCE resume fix
   - New raven variation updates"

* tag 'drm-fixes-2019-06-07-1' of git://anongit.freedesktop.org/drm/drm: (33 commits)
  drm/nouveau/secboot/gp10[2467]: support newer FW to fix SEC2 failures on some boards
  drm/nouveau/secboot: enable loading of versioned LS PMU/SEC2 ACR msgqueue FW
  drm/nouveau/secboot: split out FW version-specific LS function pointers
  drm/nouveau/secboot: pass max supported FW version to LS load funcs
  drm/nouveau/core: support versioned firmware loading
  drm/nouveau/core: pass subdev into nvkm_firmware_get, rather than device
  drm/komeda: Potential error pointer dereference
  drm/komeda: remove set but not used variable 'kcrtc'
  drm/amd/amdgpu: add RLC firmware to support raven1 refresh
  drm/amd/powerplay: add set_power_profile_mode for raven1_refresh
  drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2)
  udmabuf: actually unmap the scatterlist
  drm/arm/hdlcd: Allow a bit of clock tolerance
  drm/arm/hdlcd: Actually validate CRTC modes
  drm/arm/mali-dp: Add a loop around the second set CVAL and try 5 times
  drm/komeda: fixing of DMA mapping sg segment warning
  drm: don't block fb changes for async plane updates
  drm/vc4: fix fb references in async update
  drm/msm: fix fb references in async update
  drm/amd: fix fb references in async update
  ...
2019-06-07 17:39:31 -07:00
Wolfram Sang
8f77293cca MAINTAINERS: Karthikeyan Ramasubramanian is MIA
A mail just bounced back with "user unknown":

550 5.1.1 <kramasub@codeaurora.org> User doesn't exist

I also couldn't find a more recent address in git history. So, remove
this stale entry.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-06-08 00:32:50 +02:00
Robert Hancock
49b8095867 i2c: xiic: Add max_read_len quirk
This driver does not support reading more than 255 bytes at once because
the register for storing the number of bytes to read is only 8 bits. Add
a max_read_len quirk to enforce this.

This was found when using this driver with the SFP driver, which was
previously reading all 256 bytes in the SFP EEPROM in one transaction.
This caused a bunch of hard-to-debug errors in the xiic driver since the
driver/logic was treating the number of bytes to read as zero.
Rejecting transactions that aren't supported at least allows the problem
to be diagnosed more easily.

Signed-off-by: Robert Hancock <hancock@sedsystems.ca>
Reviewed-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
2019-06-08 00:24:07 +02:00
Linus Torvalds
d4425649c6 Merge tag 'hwmon-for-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:

 - Fix a couple of inconsistencies and locking problems in pmbus driver

 - Register with thermal subsystem only on systems supporting devicetree

* tag 'hwmon-for-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus/core) Treat parameters as paged if on multiple pages
  hwmon: (pmbus/core) mutex_lock write in pmbus_set_samples
  hwmon: (core) add thermal sensors only if dev->of_node is present
2019-06-07 13:38:53 -07:00
Jan Glauber
893a7d32e8 lockref: Limit number of cmpxchg loop retries
The lockref cmpxchg loop is unbound as long as the spinlock is not
taken. Depending on the hardware implementation of compare-and-swap
a high number of loop retries might happen.

Add an upper bound to the loop to force the fallback to spinlocks
after some time. A retry value of 100 should not impact any hardware
that does not have this issue.

With the retry limit the performance of an open-close testcase
improved between 60-70% on ThunderX2.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jan Glauber <jglauber@marvell.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-07 13:15:06 -07:00
Andrey Konovalov
d93445225c uaccess: add noop untagged_addr definition
Architectures that support memory tagging have a need to perform untagging
(stripping the tag) in various parts of the kernel. This patch adds an
untagged_addr() macro, which is defined as noop for architectures that do
not support memory tagging. The oncoming patch series will define it at
least for sparc64 and arm64.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-07 13:09:06 -07:00
Linus Torvalds
d18c7e9d6e Merge tag 'xtensa-20190607' of git://github.com/jcmvbkbc/linux-xtensa
Pull xtensa fix from Max Filippov:
 "Fix a section mismatch between memblock_reserve and mem_reserve.

  This fixes tinyconfig xtensa builds"

* tag 'xtensa-20190607' of git://github.com/jcmvbkbc/linux-xtensa:
  xtensa: Fix section mismatch between memblock_reserve and mem_reserve
2019-06-07 13:06:00 -07:00
Jens Axboe
6c70f899b8 Merge branch 'nvme-5.2-rc-next' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Sagi.

* 'nvme-5.2-rc-next' of git://git.infradead.org/nvme:
  nvme-rdma: use dynamic dma mapping per command
  nvme: Fix u32 overflow in the number of namespace list calculation
  nvmet: fix data_len to 0 for bdev-backed write_zeroes
  nvme-tcp: fix queue mapping when queue count is limited
  nvme-rdma: fix queue mapping when queue count is limited
2019-06-07 14:04:28 -06:00