linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-29 09:22:53 -04:00

Author	SHA1	Message	Date
Jérôme Glisse	133ff0eac9	mm/hmm: heterogeneous memory management (HMM for short) HMM provides 3 separate types of functionality: - Mirroring: synchronize CPU page table and device page table - Device memory: allocating struct page for device memory - Migration: migrating regular memory to device memory This patch introduces some common helpers and definitions to all of those 3 functionality. Link: http://lkml.kernel.org/r/20170817000548.32038-3-jglisse@redhat.com Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Signed-off-by: Evgeny Baskakov <ebaskakov@nvidia.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com> Signed-off-by: Sherry Cheung <SCheung@nvidia.com> Signed-off-by: Subhash Gutti <sgutti@nvidia.com> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Bob Liu <liubo95@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Jérôme Glisse	bffc33ec53	hmm: heterogeneous memory management documentation Patch series "HMM (Heterogeneous Memory Management)", v25. Heterogeneous Memory Management (HMM) (description and justification) Today device driver expose dedicated memory allocation API through their device file, often relying on a combination of IOCTL and mmap calls. The device can only access and use memory allocated through this API. This effectively split the program address space into object allocated for the device and useable by the device and other regular memory (malloc, mmap of a file, share memory, Ã¢) only accessible by CPU (or in a very limited way by a device by pinning memory). Allowing different isolated component of a program to use a device thus require duplication of the input data structure using device memory allocator. This is reasonable for simple data structure (array, grid, image, Ã¢) but this get extremely complex with advance data structure (list, tree, graph, Ã¢) that rely on a web of memory pointers. This is becoming a serious limitation on the kind of work load that can be offloaded to device like GPU. New industry standard like C++, OpenCL or CUDA are pushing to remove this barrier. This require a shared address space between GPU device and CPU so that GPU can access any memory of a process (while still obeying memory protection like read only). This kind of feature is also appearing in various other operating systems. HMM is a set of helpers to facilitate several aspects of address space sharing and device memory management. Unlike existing sharing mechanism that rely on pining pages use by a device, HMM relies on mmu_notifier to propagate CPU page table update to device page table. Duplicating CPU page table is only one aspect necessary for efficiently using device like GPU. GPU local memory have bandwidth in the TeraBytes/ second range but they are connected to main memory through a system bus like PCIE that is limited to 32GigaBytes/second (PCIE 4.0 16x). Thus it is necessary to allow migration of process memory from main system memory to device memory. Issue is that on platform that only have PCIE the device memory is not accessible by the CPU with the same properties as main memory (cache coherency, atomic operations, ...). To allow migration from main memory to device memory HMM provides a set of helper to hotplug device memory as a new type of ZONE_DEVICE memory which is un-addressable by CPU but still has struct page representing it. This allow most of the core kernel logic that deals with a process memory to stay oblivious of the peculiarity of device memory. When page backing an address of a process is migrated to device memory the CPU page table entry is set to a new specific swap entry. CPU access to such address triggers a migration back to system memory, just like if the page was swap on disk. HMM also blocks any one from pinning a ZONE_DEVICE page so that it can always be migrated back to system memory if CPU access it. Conversely HMM does not migrate to device memory any page that is pin in system memory. To allow efficient migration between device memory and main memory a new migrate_vma() helpers is added with this patchset. It allows to leverage device DMA engine to perform the copy operation. This feature will be use by upstream driver like nouveau mlx5 and probably other in the future (amdgpu is next suspect in line). We are actively working on nouveau and mlx5 support. To test this patchset we also worked with NVidia close source driver team, they have more resources than us to test this kind of infrastructure and also a bigger and better userspace eco-system with various real industry workload they can be use to test and profile HMM. The expected workload is a program builds a data set on the CPU (from disk, from network, from sensors, Ã¢). Program uses GPU API (OpenCL, CUDA, ...) to give hint on memory placement for the input data and also for the output buffer. Program call GPU API to schedule a GPU job, this happens using device driver specific ioctl. All this is hidden from programmer point of view in case of C++ compiler that transparently offload some part of a program to GPU. Program can keep doing other stuff on the CPU while the GPU is crunching numbers. It is expected that CPU will not access the same data set as the GPU while GPU is working on it, but this is not mandatory. In fact we expect some small memory object to be actively access by both GPU and CPU concurrently as synchronization channel and/or for monitoring purposes. Such object will stay in system memory and should not be bottlenecked by system bus bandwidth (rare write and read access from both CPU and GPU). As we are relying on device driver API, HMM does not introduce any new syscall nor does it modify any existing ones. It does not change any POSIX semantics or behaviors. For instance the child after a fork of a process that is using HMM will not be impacted in anyway, nor is there any data hazard between child COW or parent COW of memory that was migrated to device prior to fork. HMM assume a numbers of hardware features. Device must allow device page table to be updated at any time (ie device job must be preemptable). Device page table must provides memory protection such as read only. Device must track write access (dirty bit). Device must have a minimum granularity that match PAGE_SIZE (ie 4k). Reviewer (just hint): Patch 1 HMM documentation Patch 2 introduce core infrastructure and definition of HMM, pretty small patch and easy to review Patch 3 introduce the mirror functionality of HMM, it relies on mmu_notifier and thus someone familiar with that part would be in better position to review Patch 4 is an helper to snapshot CPU page table while synchronizing with concurrent page table update. Understanding mmu_notifier makes review easier. Patch 5 is mostly a wrapper around handle_mm_fault() Patch 6 add new add_pages() helper to avoid modifying each arch memory hot plug function Patch 7 add a new memory type for ZONE_DEVICE and also add all the logic in various core mm to support this new type. Dan Williams and any core mm contributor are best people to review each half of this patchset Patch 8 special case HMM ZONE_DEVICE pages inside put_page() Kirill and Dan Williams are best person to review this Patch 9 allow to uncharge a page from memory group without using the lru list field of struct page (best reviewer: Johannes Weiner or Vladimir Davydov or Michal Hocko) Patch 10 Add support to uncharge ZONE_DEVICE page from a memory cgroup (best reviewer: Johannes Weiner or Vladimir Davydov or Michal Hocko) Patch 11 add helper to hotplug un-addressable device memory as new type of ZONE_DEVICE memory (new type introducted in patch 3 of this serie). This is boiler plate code around memory hotplug and it also pick a free range of physical address for the device memory. Note that the physical address do not point to anything (at least as far as the kernel knows). Patch 12 introduce a new hmm_device class as an helper for device driver that want to expose multiple device memory under a common fake device driver. This is usefull for multi-gpu configuration. Anyone familiar with device driver infrastructure can review this. Boiler plate code really. Patch 13 add a new migrate mode. Any one familiar with page migration is welcome to review. Patch 14 introduce a new migration helper (migrate_vma()) that allow to migrate a range of virtual address of a process using device DMA engine to perform the copy. It is not limited to do copy from and to device but can also do copy between any kind of source and destination memory. Again anyone familiar with migration code should be able to verify the logic. Patch 15 optimize the new migrate_vma() by unmapping pages while we are collecting them. This can be review by any mm folks. Patch 16 add unaddressable memory migration to helper introduced in patch 7, this can be review by anyone familiar with migration code Patch 17 add a feature that allow device to allocate non-present page on the GPU when migrating a range of address to device memory. This is an helper for device driver to avoid having to first allocate system memory before migration to device memory Patch 18 add a new kind of ZONE_DEVICE memory for cache coherent device memory (CDM) Patch 19 add an helper to hotplug CDM memory Previous patchset posting : v1 http://lwn.net/Articles/597289/ v2 https://lkml.org/lkml/2014/6/12/559 v3 https://lkml.org/lkml/2014/6/13/633 v4 https://lkml.org/lkml/2014/8/29/423 v5 https://lkml.org/lkml/2014/11/3/759 v6 http://lwn.net/Articles/619737/ v7 http://lwn.net/Articles/627316/ v8 https://lwn.net/Articles/645515/ v9 https://lwn.net/Articles/651553/ v10 https://lwn.net/Articles/654430/ v11 http://www.gossamer-threads.com/lists/linux/kernel/2286424 v12 http://www.kernelhub.org/?msg=972982&p=2 v13 https://lwn.net/Articles/706856/ v14 https://lkml.org/lkml/2016/12/8/344 v15 http://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg1304107.html v16 http://www.spinics.net/lists/linux-mm/msg119814.html v17 https://lkml.org/lkml/2017/1/27/847 v18 https://lkml.org/lkml/2017/3/16/596 v19 https://lkml.org/lkml/2017/4/5/831 v20 https://lwn.net/Articles/720715/ v21 https://lkml.org/lkml/2017/4/24/747 v22 http://lkml.iu.edu/hypermail/linux/kernel/1705.2/05176.html v23 https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1404788.html v24 https://lwn.net/Articles/726691/ This patch (of 19): This adds documentation for HMM (Heterogeneous Memory Management). It presents the motivation behind it, the features necessary for it to be useful and and gives an overview of how this is implemented. Link: http://lkml.kernel.org/r/20170817000548.32038-2-jglisse@redhat.com Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Evgeny Baskakov <ebaskakov@nvidia.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mark Hairgrove <mhairgrove@nvidia.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Sherry Cheung <SCheung@nvidia.com> Cc: Subhash Gutti <sgutti@nvidia.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Bob Liu <liubo95@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Naoya Horiguchi	8135d8926c	mm: memory_hotplug: memory hotremove supports thp migration This patch enables thp migration for memory hotremove. Link: http://lkml.kernel.org/r/20170717193955.20207-11-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Naoya Horiguchi	e8db67eb0d	mm: migrate: move_pages() supports thp migration This patch enables thp migration for move_pages(2). Link: http://lkml.kernel.org/r/20170717193955.20207-10-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Naoya Horiguchi	c863379849	mm: mempolicy: mbind and migrate_pages support thp migration This patch enables thp migration for mbind(2) and migrate_pages(2). Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Naoya Horiguchi	ab6e3d0939	mm: soft-dirty: keep soft-dirty bits over thp migration Soft dirty bit is designed to keep tracked over page migration. This patch makes it work in the same manner for thp migration too. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Zi Yan	84c3fc4e9c	mm: thp: check pmd migration entry in common path When THP migration is being used, memory management code needs to handle pmd migration entries properly. This patch uses !pmd_present() or is_swap_pmd() (depending on whether pmd_none() needs separate code or not) to check pmd migration entries at the places where a pmd entry is present. Since pmd-related code uses split_huge_page(), split_huge_pmd(), pmd_trans_huge(), pmd_trans_unstable(), or pmd_none_or_trans_huge_or_clear_bad(), this patch: 1. adds pmd migration entry split code in split_huge_pmd(), 2. takes care of pmd migration entries whenever pmd_trans_huge() is present, 3. makes pmd_none_or_trans_huge_or_clear_bad() pmd migration entry aware. Since split_huge_page() uses split_huge_pmd() and pmd_trans_unstable() is equivalent to pmd_none_or_trans_huge_or_clear_bad(), we do not change them. Until this commit, a pmd entry should be: 1. pointing to a pte page, 2. is_swap_pmd(), 3. pmd_trans_huge(), 4. pmd_devmap(), or 5. pmd_none(). Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Zi Yan	616b837153	mm: thp: enable thp migration in generic path Add thp migration's core code, including conversions between a PMD entry and a swap entry, setting PMD migration entry, removing PMD migration entry, and waiting on PMD migration entries. This patch makes it possible to support thp migration. If you fail to allocate a destination page as a thp, you just split the source thp as we do now, and then enter the normal page migration. If you succeed to allocate destination thp, you enter thp migration. Subsequent patches actually enable thp migration for each caller of page migration by allowing its get_new_page() callback to allocate thps. [zi.yan@cs.rutgers.edu: fix gcc-4.9.0 -Wmissing-braces warning] Link: http://lkml.kernel.org/r/A0ABA698-7486-46C3-B209-E95A9048B22C@cs.rutgers.edu [akpm@linux-foundation.org: fix x86_64 allnoconfig warning] Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Naoya Horiguchi	9c670ea379	mm: thp: introduce CONFIG_ARCH_ENABLE_THP_MIGRATION Introduce CONFIG_ARCH_ENABLE_THP_MIGRATION to limit thp migration functionality to x86_64, which should be safer at the first step. Link: http://lkml.kernel.org/r/20170717193955.20207-5-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Naoya Horiguchi	b5ff8161e3	mm: thp: introduce separate TTU flag for thp freezing TTU_MIGRATION is used to convert pte into migration entry until thp split completes. This behavior conflicts with thp migration added later patches, so let's introduce a new TTU flag specifically for freezing. try_to_unmap() is used both for thp split (via freeze_page()) and page migration (via __unmap_and_move()). In freeze_page(), ttu_flag given for head page is like below (assuming anonymous thp): (TTU_IGNORE_MLOCK \| TTU_IGNORE_ACCESS \| TTU_RMAP_LOCKED \| \ TTU_MIGRATION \| TTU_SPLIT_HUGE_PMD) and ttu_flag given for tail pages is: (TTU_IGNORE_MLOCK \| TTU_IGNORE_ACCESS \| TTU_RMAP_LOCKED \| \ TTU_MIGRATION) __unmap_and_move() calls try_to_unmap() with ttu_flag: (TTU_MIGRATION \| TTU_IGNORE_MLOCK \| TTU_IGNORE_ACCESS) Now I'm trying to insert a branch for thp migration at the top of try_to_unmap_one() like below static int try_to_unmap_one(struct page page, struct vm_area_struct vma, unsigned long address, void arg) { ... / PMD-mapped THP migration entry / if (!pvmw.pte && (flags & TTU_MIGRATION)) { if (!PageAnon(page)) continue; set_pmd_migration_entry(&pvmw, page); continue; } ... } so try_to_unmap() for tail pages called by thp split can go into thp migration code path (which converts pmd* into migration entry), while the expectation is to freeze thp (which converts pte into migration entry.) I detected this failure as a "bad page state" error in a testcase where split_huge_page() is called from queue_pages_pte_range(). Link: http://lkml.kernel.org/r/20170717193955.20207-4-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Naoya Horiguchi	eee4818baa	mm: x86: move _PAGE_SWP_SOFT_DIRTY from bit 7 to bit 1 _PAGE_PSE is used to distinguish between a truly non-present (_PAGE_PRESENT=0) PMD, and a PMD which is undergoing a THP split and should be treated as present. But _PAGE_SWP_SOFT_DIRTY currently uses the _PAGE_PSE bit, which would cause confusion between one of those PMDs undergoing a THP split, and a soft-dirty PMD. Dropping _PAGE_PSE check in pmd_present() does not work well, because it can hurt optimization of tlb handling in thp split. Thus, we need to move the bit. In the current kernel, bits 1-4 are not used in non-present format since commit `00839ee3b2` ("x86/mm: Move swap offset/type up in PTE to work around erratum"). So let's move _PAGE_SWP_SOFT_DIRTY to bit 1. Bit 7 is used as reserved (always clear), so please don't use it for other purpose. Link: http://lkml.kernel.org/r/20170717193955.20207-3-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Naoya Horiguchi	88aaa2a1d7	mm: mempolicy: add queue_pages_required() Patch series "mm: page migration enhancement for thp", v9. Motivations: 1. THP migration becomes important in the upcoming heterogeneous memory systems. As David Nellans from NVIDIA pointed out from other threads (http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1349227.html), future GPUs or other accelerators will have their memory managed by operating systems. Moving data into and out of these memory nodes efficiently is critical to applications that use GPUs or other accelerators. Existing page migration only supports base pages, which has a very low memory bandwidth utilization. My experiments (see below) show THP migration can migrate pages more efficiently. 2. Base page migration vs THP migration throughput. Here are cross-socket page migration results from calling move_pages() syscall: In x86_64, a Intel two-socket E5-2640v3 box, - single 4KB base page migration takes 62.47 us, using 0.06 GB/s BW, - single 2MB THP migration takes 658.54 us, using 2.97 GB/s BW, - 512 4KB base page migration takes 1987.38 us, using 0.98 GB/s BW. In ppc64, a two-socket Power8 box, - single 64KB base page migration takes 49.3 us, using 1.24 GB/s BW, - single 16MB THP migration takes 2202.17 us, using 7.10 GB/s BW, - 256 64KB base page migration takes 2543.65 us, using 6.14 GB/s BW. THP migration can give us 3x and 1.15x throughput over base page migration in x86_64 and ppc64 respectivley. You can test it out by using the code here: https://github.com/x-y-z/thp-migration-bench 3. Existing page migration splits THP before migration and cannot guarantee the migrated pages are still contiguous. Contiguity is always what GPUs and accelerators look for. Without THP migration, khugepaged needs to do extra work to reassemble the migrated pages back to THPs. This patch (of 10): Introduce a separate check routine related to MPOL_MF_INVERT flag. This patch just does cleanup, no behavioral change. Link: http://lkml.kernel.org/r/20170717193955.20207-2-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:45 -07:00
Linus Torvalds	015a9e66b9	RDMA/netlink: clean up message validity array initializer The fix in the parent made me look at that function, and react to how illogical and illegible the array initializer was. Use named array indexes to make it clearer what is going on, and make the initializer not depend silently on the exact index numbers. [ The initializer now also shows an odd inconsistency in the naming: note the IWCM vs IWPM.. - Linus ] Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Doug Ledford <dledford@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 10:17:20 -07:00
Leon Romanovsky	8b2c7e7a3c	RDAM/netlink: Fix out-of-bound access while checking message validity The netlink message sent with type == 0, which doesn't have any client behind it, caused to the overflow in max_num_ops array. Fix it by declaring zero number of ops for the first client. Fixes: `c9901724a2` ("RDMA/netlink: Remove netlink clients infrastructure") Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 10:01:03 -07:00
Linus Torvalds	5969d1bb30	Merge branch 'gperf-removal' Remove our use of 'gperf' for generating perfect hashes from some of our build tools. This removal was prompted by Masahiro Yamada sending out a patch that removes all our pre-generated files, and when I tested it, I noticed that the gperf version I have (3.1) apparently generates code that no longer works with out code-base because the function interfaces generated by gperf have changed. We really don't care that much, and the gperf people changed their interfaces in ways that makes it annoying to work with them. Tools that make it hard to use them should not be used, and the kernel is not at all interested in some autoconf mess. So remove the gperf dependency entirely. It turns out that if you ignore the pre-generated files, the use of gperf apparently saved us a whopping fifteen lines of code. It obviously wasn't worth it, considering that the pre-generated files are about 500 lines. I sent this out as a patch about three weeks ago, and got absolutely zero responses. So let's see if anybody notices now that I merge it. Because there might be serious bugs here, but it WorksForMe(tm). * gperf-removal: Remove gperf usage from toolchain	2017-09-07 21:39:15 -07:00
Linus Torvalds	572c01ba19	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas, megaraid_sas, zfcp and a host of minor updates. The major driver change here is the elimination of the block based cciss driver in favour of the SCSI based hpsa driver (which now drives all the legacy cases cciss used to be required for). Plus a reset handler clean up and the redo of the SAS SMP handler to use bsg lib" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits) scsi: scsi-mq: Always unprepare before requeuing a request scsi: Show .retries and .jiffies_at_alloc in debugfs scsi: Improve requeuing behavior scsi: Call scsi_initialize_rq() for filesystem requests scsi: qla2xxx: Reset the logo flag, after target re-login. scsi: qla2xxx: Fix slow mem alloc behind lock scsi: qla2xxx: Clear fc4f_nvme flag scsi: qla2xxx: add missing includes for qla_isr scsi: qla2xxx: Fix an integer overflow in sysfs code scsi: aacraid: report -ENOMEM to upper layer from aac_convert_sgraw2() scsi: aacraid: get rid of one level of indentation scsi: aacraid: fix indentation errors scsi: storvsc: fix memory leak on ring buffer busy scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough scsi: smartpqi: remove the smp_handler stub scsi: hpsa: remove the smp_handler stub scsi: bsg-lib: pass the release callback through bsg_setup_queue scsi: Rework handling of scsi_device.vpd_pg8[03] scsi: Rework the code for caching Vital Product Data (VPD) scsi: rcu: Introduce rcu_swap_protected() ...	2017-09-07 21:11:05 -07:00
Linus Torvalds	cef5d0f952	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk updates from Petr Mladek: - Do not allow use of freed init data and code even when boot consoles are forced to stay. Also check for the init memory more precisely. - Some code clean up by starting contributors. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: printk: Clean up do_syslog() error handling printk/console: Enhance the check for consoles using init memory printk/console: Always disable boot consoles that use init memory before it is freed printk: Modify operators of printed_len and text_len	2017-09-07 21:00:52 -07:00
Linus Torvalds	0fb02e718f	Merge tag 'audit-pr-20170907' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "A small pull request for audit this time, only four patches and only two with any real code changes. Those two changes are the removal of a pointless SELinux AVC initialization audit event and a fix to improve the audit timestamp overhead. The other two patches are comment cleanup and administrative updates, nothing very exciting. Everything passes our tests" * tag 'audit-pr-20170907' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: update the function comments selinux: remove AVC init audit log message audit: update the audit info in MAINTAINERS audit: Reduce overhead using a coarse clock	2017-09-07 20:48:25 -07:00
Linus Torvalds	828f4257d1	Merge tag 'secureexec-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull secureexec update from Kees Cook: "This series has the ultimate goal of providing a sane stack rlimit when running setid processes. To do this, the bprm_secureexec LSM hook is collapsed into the bprm_set_creds hook so the secureexec-ness of an exec can be determined early enough to make decisions about rlimits and the resulting memory layouts. Other logic acting on the secureexec-ness of an exec is similarly consolidated. Capabilities needed some special handling, but the refactoring removed other special handling, so that was a wash" tag 'secureexec-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: exec: Consolidate pdeath_signal clearing exec: Use sane stack rlimit under secureexec exec: Consolidate dumpability logic smack: Remove redundant pdeath_signal clearing exec: Use secureexec for clearing pdeath_signal exec: Use secureexec for setting dumpability LSM: drop bprm_secureexec hook commoncap: Move cap_elevated calculation into bprm_set_creds commoncap: Refactor to remove bprm_secureexec hook smack: Refactor to remove bprm_secureexec hook selinux: Refactor to remove bprm_secureexec hook apparmor: Refactor to remove bprm_secureexec hook binfmt: Introduce secureexec flag exec: Correct comments about "point of no return" exec: Rename bprm->cred_prepared to called_set_creds	2017-09-07 20:35:29 -07:00
Linus Torvalds	44ccba3f7b	Merge tag 'gcc-plugins-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull gcc plugins update from Kees Cook: "This finishes the porting work on randstruct, and introduces a new option to structleak, both noted below: - For the randstruct plugin, enable automatic randomization of structures that are entirely function pointers (along with a couple designated initializer fixes). - For the structleak plugin, provide an option to perform zeroing initialization of all otherwise uninitialized stack variables that are passed by reference (Ard Biesheuvel)" * tag 'gcc-plugins-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: gcc-plugins: structleak: add option to init all vars used as byref args randstruct: Enable function pointer struct detection drivers/net/wan/z85230.c: Use designated initializers drm/amd/powerplay: rv: Use designated initializers	2017-09-07 20:30:19 -07:00
Linus Torvalds	21d236bf2b	Merge tag 'pstore-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore update from Kees Cook: "Make pstore permissions more versatile by removing CAP_SYSLOG requirement and defining more restrictive root directory DAC permissions default (0750, which can be adjust after boot unlike the CAP_SYSLOG check). Suggested by Nick Kralevich" * tag 'pstore-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: Revert "pstore: Honor dmesg_restrict sysctl on dmesg dumps" pstore: Make default pstorefs root dir perms 0750	2017-09-07 19:58:56 -07:00
Linus Torvalds	8dc5b3a6cb	Merge tag '4.14-smb3-xattr-enable' of git://git.samba.org/sfrench/cifs-2.6 Pull cifs update from Steve French: "Enable xattr support for smb3 and also a bugfix" * tag '4.14-smb3-xattr-enable' of git://git.samba.org/sfrench/cifs-2.6: cifs: Check for timeout on Negotiate stage cifs: Add support for writing attributes on SMB2+ cifs: Add support for reading attributes on SMB2+	2017-09-07 16:06:14 -07:00
Linus Torvalds	2500e287bc	Merge git://git.kvack.org/~bcrl/aio-next Pull aio fix from Ben LaHaise: "Improve aio-nr counting on large SMP systems. It has been in linux-next for quite some time" * git://git.kvack.org/~bcrl/aio-next: fs: aio: fix the increment of aio-nr and counting against aio-max-nr	2017-09-07 15:51:11 -07:00
Linus Torvalds	ae8ac6b7db	Merge branch 'quota_scaling' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota scaling updates from Jan Kara: "This contains changes to make the quota subsystem more scalable. Reportedly it improves number of files created per second on ext4 filesystem on fast storage by about a factor of 2x" * 'quota_scaling' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (28 commits) quota: Add lock annotations to struct members quota: Reduce contention on dq_data_lock fs: Provide __inode_get_bytes() quota: Inline dquot_[re]claim_reserved_space() into callsite quota: Inline inode_{incr,decr}_space() into callsites quota: Inline functions into their callsites ext4: Disable dirty list tracking of dquots when journalling quotas quota: Allow disabling tracking of dirty dquots in a list quota: Remove dq_wait_unused from dquot quota: Move locking into clear_dquot_dirty() quota: Do not dirty bad dquots quota: Fix possible corruption of dqi_flags quota: Propagate ->quota_read errors from v2_read_file_info() quota: Fix error codes in v2_read_file_info() quota: Push dqio_sem down to ->read_file_info() quota: Push dqio_sem down to ->write_file_info() quota: Push dqio_sem down to ->get_next_id() quota: Push dqio_sem down to ->release_dqblk() quota: Remove locking for writing to the old quota format quota: Do not acquire dqio_sem for dquot overwrites in v2 format ...	2017-09-07 15:19:35 -07:00
Linus Torvalds	460352c2f1	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull UDF, reiserfs, quota, fsnotify cleanups from Jan Kara: "Several UDF, reiserfs, quota and fsnotify cleanups. Note that there is also a patch updating MAINTAINERS entry for notification subsystem to point to me as a maintainer since current entries are stale" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: make dnotify_fsnotify_ops const isofs: Delete an unnecessary variable initialisation in isofs_read_inode() isofs: Adjust four checks for null pointers isofs: Delete an error message for a failed memory allocation in isofs_read_inode() quota_v2: Delete an error message for a failed memory allocation in v2_read_file_info() fs-udf: Delete an error message for a failed memory allocation in two functions fs-udf: Improve six size determinations fs-udf: Adjust two checks for null pointers reiserfs: fix spelling mistake: "tranasction" -> "transaction" MAINTAINERS: Update entries for notification subsystem uapi/linux/quota.h: Do not include linux/errno.h	2017-09-07 14:53:17 -07:00
Linus Torvalds	74fee4e88f	Merge tag 'devicetree-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull DeviceTree updates from Rob Herring: "There's a few orphans in the conversion to %pOF printf specifiers included here that no one else picked up. Summary: - Convert more DT code to use of_property_read_* API. - Improve DT overlay support when adding multiple overlays - Convert printk's to %pOF format specifiers. Most went via subsystem trees, but picked up the remaining orphans - Correct unittests to use preferred "okay" for "status" property value - Add a KASLR seed property - Vendor prefixes for Mellanox, Theobroma System, Adaptrum, Moxa - Fix modalias buffer handling - Clean-up of include paths for building dtbs - Add bindings for amc6821, isl1208, tsl2x7x, srf02, and srf10 devices - Add nvmem bindings for MediaTek MT7623 and MT7622 SoC - Add compatible string for Allwinner H5 Mali-450 GPU - Fix links to old OpenFirmware docs with new mirror on devicetree.org - Remove status property from binding doc examples" * tag 'devicetree-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (45 commits) devicetree: Adjust status "ok" -> "okay" under drivers/of/ dt-bindings: Remove "status" from examples dt-bindings: pinctrl: sh-pfc: Use generic node name dt-bindings: Add vendor Mellanox dt-binding: net/phy: fix interrupts description virt: Convert to using %pOF instead of full_name macintosh: Convert to using %pOF instead of full_name ide: pmac: Convert to using %pOF instead of full_name microblaze: Convert to using %pOF instead of full_name dt-bindings: usb: musb: Grammar s/the/to/, s/is/are/ of: Use PLATFORM_DEVID_NONE definition of/device: Fix of_device_get_modalias() buffer handling of/device: Prevent buffer overflow in of_device_modalias() dt-bindings: add amc6821, isl1208 trivial bindings dt-bindings: add vendor prefix for Theobroma Systems of: search scripts/dtc/include-prefixes path for both CPP and DTC of: remove arch/$(SRCARCH)/boot/dts from include search path for CPP of: remove drivers/of/testcase-data from include search path for CPP of: return of_get_cpu_node from of_cpu_device_node_get if CPUs are not registered iio: srf08: add device tree binding for srf02 and srf10 ...	2017-09-07 14:43:33 -07:00
Linus Torvalds	7d95565612	Merge tag 'drm-intel-next-fixes-2017-09-07' of git://anongit.freedesktop.org/git/drm-intel Pull i916 drm fixes from Rodrigo Vivi: "Since Dave is on paternity leave we are sending drm/i915 fixes for v4.14-rc1 directly to you as he had asked us to do. The most critical ones are the GPU reset fix for gen2-4 and GVT fix for a regression that is blocking gvt init to work on your tree. The rest is general fixes for patches coming from drm-next" Acked-by: Dave Airlie <airlied@redhat.com> * tag 'drm-intel-next-fixes-2017-09-07' of git://anongit.freedesktop.org/git/drm-intel: drm/i915: Re-enable GTT following a device reset drm/i915: Annotate user relocs with __user drm/i915: Silence sparse by using gfp_t drm/i915: Add __rcu to radix tree slot pointer drm/i915: Fix the missing PPAT cache attributes on CNL drm/i915/gvt: Remove one duplicated MMIO drm/i915: Fix enum pipe vs. enum transcoder for the PCH transcoder drm/i915: Make i2c lock ops static drm/i915: Make i9xx_load_ycbcr_conversion_matrix() static drm/i915/edp: Increase T12 panel delay to 900 ms to fix DP AUX CH timeouts drm/i915: Ignore duplicate VMA stored within the per-object handle LUT drm/i915: Skip fence alignemnt check for the CCS plane drm/i915: Treat fb->offsets[] as a raw byte offset instead of a linear offset drm/i915: Always wake the device to flush the GTT drm/i915: Recreate vmapping even when the object is pinned drm/i915: Quietly cancel FBC activation if CRTC is turned off before worker	2017-09-07 14:37:25 -07:00
Linus Torvalds	5f9cc57036	Merge tag 'leds_for_4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED updates from Jacek Anaszewski: "LED class drivers improvements: leds-pca955x: - add Device Tree support and bindings - use devm_led_classdev_register() - add GPIO support - prevent crippled LED class device name - check for I2C errors leds-gpio: - add optional retain-state-shutdown DT property - allow LED to retain state at shutdown leds-tlc591xx: - merge conditional tests - add missing of_node_put leds-powernv: - delete an error message for a failed memory allocation in powernv_led_create() leds-is31fl32xx.c - convert to using custom %pOF printf format specifier Constify attribute_group structures in: - leds-blinkm - leds-lm3533 Make several arrays static const in: - leds-aat1290 - leds-lp5521 - leds-lp5562 - leds-lp8501" * tag 'leds_for_4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: leds: pca955x: check for I2C errors leds: gpio: Allow LED to retain state at shutdown dt-bindings: leds: gpio: Add optional retain-state-shutdown property leds: powernv: Delete an error message for a failed memory allocation in powernv_led_create() leds: lp8501: make several arrays static const leds: lp5562: make several arrays static const leds: lp5521: make several arrays static const leds: aat1290: make array max_mm_current_percent static const leds: pca955x: Prevent crippled LED device name leds: lm3533: constify attribute_group structure dt-bindings: leds: add pca955x leds: pca955x: add GPIO support leds: pca955x: use devm_led_classdev_register leds: pca955x: add device tree support leds: Convert to using %pOF instead of full_name leds: blinkm: constify attribute_group structures. leds: tlc591xx: add missing of_node_put leds: tlc591xx: merge conditional tests	2017-09-07 14:33:13 -07:00
Linus Torvalds	cd7b34fe1c	Merge tag 'dmaengine-4.14-rc1' of git://git.infradead.org/users/vkoul/slave-dma Pull dmaengine updates from Vinod Koul: "This one features the usual updates to the drivers and one good part of removing DA_SG from core as it has no users. Summary: - Remove DMA_SG support as we have no users for this feature - New driver for Altera / Intel mSGDMA IP core - Support for memset in dmatest and qcom_hidma driver - Update for non cyclic mode in k3dma, bunch of update in bam_dma, bcm sba-raid - Constify device ids across drivers" * tag 'dmaengine-4.14-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (52 commits) dmaengine: sun6i: support V3s SoC variant dmaengine: sun6i: make gate bit in sun8i's DMA engines a common quirk dmaengine: rcar-dmac: document R8A77970 bindings dmaengine: xilinx_dma: Fix error code format specifier dmaengine: altera: Use macros instead of structs to describe the registers dmaengine: ti-dma-crossbar: Fix dra7 reserve function dmaengine: pl330: constify amba_id dmaengine: pl08x: constify amba_id dmaengine: bcm-sba-raid: Remove redundant SBA_REQUEST_STATE_COMPLETED dmaengine: bcm-sba-raid: Explicitly ACK mailbox message after sending dmaengine: bcm-sba-raid: Add debugfs support dmaengine: bcm-sba-raid: Remove redundant SBA_REQUEST_STATE_RECEIVED dmaengine: bcm-sba-raid: Re-factor sba_process_deferred_requests() dmaengine: bcm-sba-raid: Pre-ack async tx descriptor dmaengine: bcm-sba-raid: Peek mbox when we have no free requests dmaengine: bcm-sba-raid: Alloc resources before registering DMA device dmaengine: bcm-sba-raid: Improve sba_issue_pending() run duration dmaengine: bcm-sba-raid: Increase number of free sba_request dmaengine: bcm-sba-raid: Allow arbitrary number free sba_request dmaengine: bcm-sba-raid: Remove reqs_free_count from sba_device ...	2017-09-07 14:03:05 -07:00
Linus Torvalds	75c727155c	Merge tag 'backlight-next-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight Pull backlight updates from Lee Jones: "Fix-ups: - Constification; pwm_bl - Use new GPIO API; gpio_backlight - Remove unused functionality; gpio_backlight Bug Fixes: - Fix artificial MAXREG limit; lm3630a_bl" * tag 'backlight-next-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight: backlight: gpio_backlight: Delete pdata inversion backlight: gpio_backlight: Convert to use GPIO descriptor backlight: pwm_bl: Make of_device_ids const backlight: lm3630a: Bump REG_MAX value to 0x50 instead of 0x1F	2017-09-07 13:55:16 -07:00
Linus Torvalds	968c61f7da	Merge tag 'mfd-next-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD updates from Lee Jones: "New Drivers - RK805 Power Management IC (PMIC) - ROHM BD9571MWV-M MFD Power Management IC (PMIC) - Texas Instruments TPS68470 Power Management IC (PMIC) & LEDs New Device Support: - Add support for HiSilicon Hi6421v530 to hi6421-pmic-core - Add support for X-Powers AXP806 to axp20x - Add support for X-Powers AXP813 to axp20x - Add support for Intel Sunrise Point LPSS to intel-lpss-pci New Functionality: - Amend API to provide register layout; atmel-smc Fix-ups: - DT re-work; omap, nokia - Header file location change {I2C => MFD}; dm355evm_msp, tps65010 - Fix chip ID formatting issue(s); rk808 - Optionally register touchscreen devices; da9052-core - Documentation improvements; twl-core - Constification; rtsx_pcr, ab8500-core, da9055-i2c, da9052-spi - Drop unnecessary static declaration; max8925-i2c - Kconfig changes (missing deps and remove module support) - Slim down oversized licence statement; hi6421-pmic-core - Use managed resources (devm_); lp87565 - Supply proper error checking/handling; t7l66xb Bug Fixes: - Fix counter duplication issue; da9052-core - Fix potential NULL deference issue; max8998 - Leave SPI-NOR write-protection bit alone; lpc_ich - Ensure device is put into reset during suspend; intel-lpss - Correct register offset variable size; omap-usb-tll" tag 'mfd-next-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (61 commits) mfd: intel_soc_pmic: Differentiate between Bay and Cherry Trail CRC variants mfd: intel_soc_pmic: Export separate mfd-cell configs for BYT and CHT dt-bindings: mfd: Add bindings for ZII RAVE devices mfd: omap-usb-tll: Fix register offsets mfd: da9052: Constify spi_device_id mfd: intel-lpss: Put I2C and SPI controllers into reset state on suspend mfd: da9055: Constify i2c_device_id mfd: intel-lpss: Add missing PCI ID for Intel Sunrise Point LPSS devices mfd: t7l66xb: Handle return value of clk_prepare_enable mfd: Add ROHM BD9571MWV-M PMIC DT bindings mfd: intel_soc_pmic_chtwc: Turn Kconfig option into a bool mfd: lp87565: Convert to use devm_mfd_add_devices() mfd: Add support for TPS68470 device mfd: lpc_ich: Do not touch SPI-NOR write protection bit on Haswell/Broadwell mfd: syscon: atmel-smc: Add helper to retrieve register layout mfd: axp20x: Use correct platform device ID for many PEK dt-bindings: mfd: axp20x: Introduce bindings for AXP813 mfd: axp20x: Add support for AXP813 PMIC dt-bindings: mfd: axp20x: Add AXP806 to supported list of chips mfd: Add ROHM BD9571MWV-M MFD PMIC driver ...	2017-09-07 13:51:13 -07:00
Linus Torvalds	9d71941d39	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - a new GPIO bit-banging driver implementing PS/2 protocol - a new power key driver for Rockchip RK805 PMIC - bunch of patches constifying various device ID structures - Elan I2C touchpad driver now supports devices with 2 buttons - other assorted fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (76 commits) Input: byd - make array seq static, reduces object code size Input: xilinx_ps2 - fix multiline comment style Input: pxa27x_keypad - handle return value of clk_prepare_enable Input: tegra-kbc - handle return value of clk_prepare_enable Input: PS/2 gpio bit banging driver for serio bus Input: xen-kbdfront - enable auto repeat for xen keyboard frontend driver Input: ambakmi - constify amba_id Input: atmel_mxt_ts - add support for reset line Input: atmel_mxt_ts - use more managed resources Input: wacom_w8001 - constify serio_device_id Input: tsc40 - constify serio_device_id Input: touchwin - constify serio_device_id Input: touchright - constify serio_device_id Input: touchit213 - constify serio_device_id Input: penmount - constify serio_device_id Input: mtouch - constify serio_device_id Input: inexio - constify serio_device_id Input: hampshire - constify serio_device_id Input: gunze - constify serio_device_id Input: fujitsu_ts - constify serio_device_id ...	2017-09-07 13:39:21 -07:00
Linus Torvalds	dfd9e6d231	Merge tag 'mailbox-v4.14' of git://git.linaro.org/landing-teams/working/fujitsu/integration Pull mailbox updates from Jassi Brar: "Just behavorial changes to a controller driver: the Broadcom's Flexrm mailbox driver has been modifified to support debugfs and TX-Done mechanism by ACK. Nothing for the core mailbox stack" * tag 'mailbox-v4.14' of git://git.linaro.org/landing-teams/working/fujitsu/integration: mailbox: bcm-flexrm-mailbox: Use txdone_ack instead of txdone_poll mailbox: bcm-flexrm-mailbox: Use bitmap instead of IDA mailbox: bcm-flexrm-mailbox: Fix mask used in CMPL_START_ADDR_VALUE() mailbox: bcm-flexrm-mailbox: Add debugfs support mailbox: bcm-flexrm-mailbox: Set IRQ affinity hint for FlexRM ring IRQs	2017-09-07 13:23:37 -07:00
Linus Torvalds	c0da4fa0d1	Merge tag 'media/v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: "Brazil's Independence Day pull request :-) This is one of the biggest media pull requests, with 625 patches affecting almost all parts of media (RC, DVB, V4L2, CEC, docs). This contains: - A lot of new drivers: * DVB frontends: mxl5xx, stv0910, stv6111; * camera flash: as3645a led driver; * HDMI receiver: adv748X; * camera sensor: Omnivision 6650 5M driver (ov6650); * HDMI CEC: ao-cec meson driver; * V4L2: Qualcom camss driver; * Remote controller: gpio-ir-tx, pwm-ir-tx and zx-irdec drivers. - The DDbridge DVB driver got a massive update, with makes it in sync with modern hardware from that vendor; - There's an important milestone on this series: the DVB documentation was written in 2003, but only started to be updated in 2007. It also used to contain several gaps from the time it was kept out of tree, mentioning error codes and device nodes that never existed upstream. On this series, it received a massive update: all non-deprecated digital TV APIs are now in sync with the current implementation; - Some DVB APIs that aren't used by any upstream driver got removed; - Other parts of the media documentation algo got updated, fixing some bugs on its PDF output and making it compatible with Sphinx version 1.6. As the number of hacks required to build PDF output reduced, I hope we'll have less troubles as newer versions of our documentation toolchain are released (famous last words); - As usual, lots of driver cleanups and improvements" * tag 'media/v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (624 commits) media: leds: as3645a: add V4L2_FLASH_LED_CLASS dependency media: get rid of removed DMX_GET_CAPS and DMX_SET_SOURCE leftovers media: Revert "[media] v4l: async: make v4l2 coexist with devicetree nodes in a dt overlay" media: staging: atomisp: sh_css_calloc shall return a pointer to the allocated space media: Revert "[media] lirc_dev: remove superfluous get/put_device() calls" media: add qcom_camss.rst to v4l-drivers rst file media: dvb headers: make checkpatch happier media: dvb uapi: move frontend legacy API to another part of the book media: pixfmt-srggb12p.rst: better format the table for PDF output media: docs-rst: media: Don't use \small for V4L2_PIX_FMT_SRGGB10 documentation media: index.rst: don't write "Contents:" on PDF output media: pixfmt*.rst: replace a two dots by a comma media: vidioc-g-fmt.rst: adjust table format media: vivid.rst: add a blank line to correct ReST format media: v4l2 uapi book: get rid of driver programming's chapter media: format.rst: use the right markup for important notes media: docs-rst: cardlists: change their format to flat-tables media: em28xx-cardlist.rst: update to reflect last changes media: v4l2-event.rst: adjust table to fit on PDF output media: docs: don't show ToC for each part on PDF output ...	2017-09-07 12:53:14 -07:00
Linus Torvalds	d969443064	Merge tag 'sound-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "We have touched quite a lot of files but with fewer changes at this cycle; as you can see, most of changes are trivial fixes, especially constification patches. Among the massive attacks by constification gangs, we had a few core changes (mostly for ASoC core), as well the fixes and the updates by major vendors. Some highlights: ALSA core: - Fix possible races in control API user-TLV codes - Small cleanup of PCM core ASoC: - Continued work for componentization; still half-baked, but we're certainly progressing - Use of devres for jack detection GPIOs, rather as a cleanup - Jack detection support for Qualcomm MSM8916 - Support for Allwinner H3, Cirrus Logic CS43130, Intel Kabylake systems with RT5663, Realtek RT274, TI TLV320AIC32x6 and Wolfson WM8523" * tag 'sound-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (512 commits) ALSA: hda/ca0132 - Fix memory leak at error path ALSA: hda: Fix forget to free resource in error handling code path in hda_codec_driver_probe ASoC: cs43130: Fix unused compiler warnings for PM runtime ASoC: cs43130: Fix possible Oops with invalid dev_id ASoC: cs43130: fix spelling mistake: "irq_occurrance" -> "irq_occurrence" ALSA: atmel: Remove leftovers of AVR32 removal ALSA: atmel: convert AC97c driver to GPIO descriptor API ALSA: hda/realtek - Enable jack detection function for Intel ALC700 ALSA: hda: Fix regression of hdmi eld control created based on invalid pcm ASoC: Intel: Skylake: Add IPC to configure the copier secondary pins ASoC: add missing compile rule for max98371 ASoC: add missing compile rule for sirf-audio-codec ASoC: add missing compile rule for max98371 ASoC: cs43130: Add devicetree bindings for CS43130 ASoC: cs43130: Add support for CS43130 codec ASoC: make clock direction configurable in asoc-simple ALSA: ctxfi: Remove null check before kfree ASoC: max98927: Changed device property read function ASoC: max98927: Modified DAPM widget and map to enable/disable VI sense path ASoC: max98927: Added PM suspend and resume function ...	2017-09-07 12:44:53 -07:00
Linus Torvalds	3645e6d0dc	Merge tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md Pull MD updates from Shaohua Li: "This update mainly fixes bugs: - Make raid5 ppl support several ppl from Pawel - Several raid5-cache bug fixes from Song - Bitmap fixes from Neil and Me - One raid1/10 regression fix since 4.12 from Me - Other small fixes and cleanup" * tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: md/bitmap: disable bitmap_resize for file-backed bitmaps. raid5-ppl: Recovery support for multiple partial parity logs md: Runtime support for multiple ppls md/raid0: attach correct cgroup info in bio lib/raid6: align AVX512 constants to 512 bits, not bytes raid5: remove raid5_build_block md/r5cache: call mddev_lock/unlock() in r5c_journal_mode_show md: replace seq_release_private with seq_release md: notify about new spare disk in the container md/raid1/10: reset bio allocated from mempool md/raid5: release/flush io in raid5_do_work() md/bitmap: copy correct data for bitmap super	2017-09-07 12:41:48 -07:00
Linus Torvalds	15d8ffc964	Merge tag 'mmc-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC updates from Ulf Hansson: "MMC core: - Continue to refactor the mmc block code to prepare for blkmq - Move mmc block debugfs into block module - Next step for eMMC CMDQ by adding a new mmc host interface for it - Move Kconfig option MMC_DEBUG from core to host - Some additional minor improvements MMC host: - Declare structs as const when applicable - Explicitly request exclusive reset control when applicable - Improve some error paths and other various cleanups - sdhci: Preparations to support SDHCI OMAP - sdhci: Improve some PM related code - sdhci: Re-factoring and modernizations - sdhci-xenon: Add runtime PM and system sleep support - sdhci-xenon: Add support for eMMC HS400 Enhanced Strobe - sdhci-cadence: Add system sleep support - sdhci-of-at91: Improve system sleep support - dw_mmc: Add support for Hisilicon hi3660 - sunxi: Add support for A83T eMMC - sunxi: Add support for DDR52 mode - meson-gx: Add support for UHS-I SD-cards - meson-gx: Cleanups and improvements - tmio: Fix CMD12 (STOP) handling - tmio: Cleanups and improvements - renesas_sdhi: Add r8a7743/5 support - renesas-sdhi: Add support for R-Car Gen3 SDHI DMAC - renesas_sdhi: Cleanups and improvements" * tag 'mmc-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (145 commits) mmc: renesas_sdhi: Add r8a7743/5 support mmc: meson-gx: fix __ffsdi2 undefined on arm32 mmc: sdhci-xenon: add runtime pm support and reimplement standby mmc: core: Move mmc_start_areq() declaration mmc: mmci: stop building qcom dml as module mmc: sunxi: Reset the device at probe time clk: sunxi-ng: Provide a default reset hook mmc: meson-gx: rework tuning function mmc: meson-gx: change default tx phase mmc: meson-gx: implement voltage switch callback mmc: meson-gx: use CCF to handle the clock phases mmc: meson-gx: implement card_busy callback mmc: meson-gx: simplify interrupt handler mmc: meson-gx: work around clk-stop issue mmc: meson-gx: fix dual data rate mode frequencies mmc: meson-gx: rework clock init function mmc: meson-gx: rework clk_set function mmc: meson-gx: rework set_ios function mmc: meson-gx: cfg init overwrite values mmc: meson-gx: initialize sane clk default before clock register ...	2017-09-07 12:24:50 -07:00
James Bottomley	2441500a41	Merge branch 'fixes' into misc	2017-09-07 12:12:43 -07:00
Linus Torvalds	a0725ab0c7	Merge branch 'for-4.14/block' of git://git.kernel.dk/linux-block Pull block layer updates from Jens Axboe: "This is the first pull request for 4.14, containing most of the code changes. It's a quiet series this round, which I think we needed after the churn of the last few series. This contains: - Fix for a registration race in loop, from Anton Volkov. - Overflow complaint fix from Arnd for DAC960. - Series of drbd changes from the usual suspects. - Conversion of the stec/skd driver to blk-mq. From Bart. - A few BFQ improvements/fixes from Paolo. - CFQ improvement from Ritesh, allowing idling for group idle. - A few fixes found by Dan's smatch, courtesy of Dan. - A warning fixup for a race between changing the IO scheduler and device remova. From David Jeffery. - A few nbd fixes from Josef. - Support for cgroup info in blktrace, from Shaohua. - Also from Shaohua, new features in the null_blk driver to allow it to actually hold data, among other things. - Various corner cases and error handling fixes from Weiping Zhang. - Improvements to the IO stats tracking for blk-mq from me. Can drastically improve performance for fast devices and/or big machines. - Series from Christoph removing bi_bdev as being needed for IO submission, in preparation for nvme multipathing code. - Series from Bart, including various cleanups and fixes for switch fall through case complaints" * 'for-4.14/block' of git://git.kernel.dk/linux-block: (162 commits) kernfs: checking for IS_ERR() instead of NULL drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_set drbd: Fix allyesconfig build, fix recent commit drbd: switch from kmalloc() to kmalloc_array() drbd: abort drbd_start_resync if there is no connection drbd: move global variables to drbd namespace and make some static drbd: rename "usermode_helper" to "drbd_usermode_helper" drbd: fix race between handshake and admin disconnect/down drbd: fix potential deadlock when trying to detach during handshake drbd: A single dot should be put into a sequence. drbd: fix rmmod cleanup, remove _all_ debugfs entries drbd: Use setup_timer() instead of init_timer() to simplify the code. drbd: fix potential get_ldev/put_ldev refcount imbalance during attach drbd: new disk-option disable-write-same drbd: Fix resource role for newly created resources in events2 drbd: mark symbols static where possible drbd: Send P_NEG_ACK upon write error in protocol != C drbd: add explicit plugging when submitting batches drbd: change list_for_each_safe to while(list_first_entry_or_null) drbd: introduce drbd_recv_header_maybe_unplug ...	2017-09-07 11:59:42 -07:00
Linus Torvalds	3ee31b89d9	Merge tag 'for-linus-4.14b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - the new pvcalls backend for routing socket calls from a guest to dom0 - some cleanups of Xen code - a fix for wrong usage of {get,put}_cpu() * tag 'for-linus-4.14b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (27 commits) xen/mmu: set MMU_NORMAL_PT_UPDATE in remap_area_mfn_pte_fn xen: Don't try to call xen_alloc_p2m_entry() on autotranslating guests xen/events: events_fifo: Don't use {get,put}_cpu() in xen_evtchn_fifo_init() xen/pvcalls: use WARN_ON(1) instead of __WARN() xen: remove not used trace functions xen: remove unused function xen_set_domain_pte() xen: remove tests for pvh mode in pure pv paths xen-platform: constify pci_device_id. xen: cleanup xen.h xen: introduce a Kconfig option to enable the pvcalls backend xen/pvcalls: implement write xen/pvcalls: implement read xen/pvcalls: implement the ioworker functions xen/pvcalls: disconnect and module_exit xen/pvcalls: implement release command xen/pvcalls: implement poll command xen/pvcalls: implement accept command xen/pvcalls: implement listen command xen/pvcalls: implement bind command xen/pvcalls: implement connect command ...	2017-09-07 10:24:21 -07:00
Linus Torvalds	bac65d9d87	Merge tag 'powerpc-4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Nothing really major this release, despite quite a lot of activity. Just lots of things all over the place. Some things of note include: - Access via perf to a new type of PMU (IMC) on Power9, which can count both core events as well as nest unit events (Memory controller etc). - Optimisations to the radix MMU TLB flushing, mostly to avoid unnecessary Page Walk Cache (PWC) flushes when the structure of the tree is not changing. - Reworks/cleanups of do_page_fault() to modernise it and bring it closer to other architectures where possible. - Rework of our page table walking so that THP updates only need to send IPIs to CPUs where the affected mm has run, rather than all CPUs. - The size of our vmalloc area is increased to 56T on 64-bit hash MMU systems. This avoids problems with the percpu allocator on systems with very sparse NUMA layouts. - STRICT_KERNEL_RWX support on PPC32. - A new sched domain topology for Power9, to capture the fact that pairs of cores may share an L2 cache. - Power9 support for VAS, which is a new mechanism for accessing coprocessors, and initial support for using it with the NX compression accelerator. - Major work on the instruction emulation support, adding support for many new instructions, and reworking it so it can be used to implement the emulation needed to fixup alignment faults. - Support for guests under PowerVM to use the Power9 XIVE interrupt controller. And probably that many things again that are almost as interesting, but I had to keep the list short. Plus the usual fixes and cleanups as always. Thanks to: Alexey Kardashevskiy, Alistair Popple, Andreas Schwab, Aneesh Kumar K.V, Anju T Sudhakar, Arvind Yadav, Balbir Singh, Benjamin Herrenschmidt, Bhumika Goyal, Breno Leitao, Bryant G. Ly, Christophe Leroy, Cédric Le Goater, Dan Carpenter, Dou Liyang, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff Levand, Hannes Reinecke, Haren Myneni, Ivan Mikhaylov, John Allen, Julia Lawall, LABBE Corentin, Laurentiu Tudor, Madhavan Srinivasan, Markus Elfring, Masahiro Yamada, Matt Brown, Michael Neuling, Murilo Opsfelder Araujo, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Rashmica Gupta, Rob Herring, Rui Teng, Sam Bobroff, Santosh Sivaraj, Scott Wood, Shilpasri G Bhat, Sukadev Bhattiprolu, Suraj Jitindar Singh, Tobin C. Harding, Victor Aoqui" * tag 'powerpc-4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (321 commits) powerpc/xive: Fix section __init warning powerpc: Fix kernel crash in emulation of vector loads and stores powerpc/xive: improve debugging macros powerpc/xive: add XIVE Exploitation Mode to CAS powerpc/xive: introduce H_INT_ESB hcall powerpc/xive: add the HW IRQ number under xive_irq_data powerpc/xive: introduce xive_esb_write() powerpc/xive: rename xive_poke_esb() in xive_esb_read() powerpc/xive: guest exploitation of the XIVE interrupt controller powerpc/xive: introduce a common routine xive_queue_page_alloc() powerpc/sstep: Avoid used uninitialized error axonram: Return directly after a failed kzalloc() in axon_ram_probe() axonram: Improve a size determination in axon_ram_probe() axonram: Delete an error message for a failed memory allocation in axon_ram_probe() powerpc/powernv/npu: Move tlb flush before launching ATSD powerpc/macintosh: constify wf_sensor_ops structures powerpc/iommu: Use permission-specific DEVICE_ATTR variants powerpc/eeh: Delete an error out of memory message at init time powerpc/mm: Use seq_putc() in two functions macintosh: Convert to using %pOF instead of full_name ...	2017-09-07 10:15:40 -07:00
Linus Torvalds	f92e3da18b	Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The main changes in this cycle were: - Transparently fall back to other poweroff method(s) if EFI poweroff fails (and returns) - Use separate PE/COFF section headers for the RX and RW parts of the ARM stub loader so that the firmware can use strict mapping permissions - Add support for requesting the firmware to wipe RAM at warm reboot - Increase the size of the random seed obtained from UEFI so CRNG fast init can complete earlier - Update the EFI framebuffer address if it points to a BAR that gets moved by the PCI resource allocation code - Enable "reset attack mitigation" of TPM environments: this is enabled if the kernel is configured with CONFIG_RESET_ATTACK_MITIGATION=y. - Clang related fixes - Misc cleanups, constification, refactoring, etc" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/bgrt: Use efi_mem_type() efi: Move efi_mem_type() to common code efi/reboot: Make function pointer orig_pm_power_off static efi/random: Increase size of firmware supplied randomness efi/libstub: Enable reset attack mitigation firmware/efi/esrt: Constify attribute_group structures firmware/efi: Constify attribute_group structures firmware/dcdbas: Constify attribute_group structures arm/efi: Split zImage code and data into separate PE/COFF sections arm/efi: Replace open coded constants with symbolic ones arm/efi: Remove pointless dummy .reloc section arm/efi: Remove forbidden values from the PE/COFF header drivers/fbdev/efifb: Allow BAR to be moved instead of claiming it efi/reboot: Fall back to original power-off method if EFI_RESET_SHUTDOWN returns efi/arm/arm64: Add missing assignment of efi.config_table efi/libstub/arm64: Set -fpie when building the EFI stub efi/libstub/arm64: Force 'hidden' visibility for section markers efi/libstub/arm64: Use hidden attribute for struct screen_info reference efi/arm: Don't mark ACPI reclaim memory as MEMBLOCK_NOMAP	2017-09-07 09:42:35 -07:00
Mauricio Faria de Oliveira	2a8a98673c	fs: aio: fix the increment of aio-nr and counting against aio-max-nr Currently, aio-nr is incremented in steps of 'num_possible_cpus() * 8' for io_setup(nr_events, ..) with 'nr_events < num_possible_cpus() * 4': ioctx_alloc() ... nr_events = max(nr_events, num_possible_cpus() * 4); nr_events = 2; ... ctx->max_reqs = nr_events; ... aio_nr += ctx->max_reqs; .... This limits the number of aio contexts actually available to much less than aio-max-nr, and is increasingly worse with greater number of CPUs. For example, with 64 CPUs, only 256 aio contexts are actually available (with aio-max-nr = 65536) because the increment is 512 in that scenario. Note: 65536 [max aio contexts] / (6442) [increment per aio context] is 128, but make it 256 (double) as counting against 'aio-max-nr 2': ioctx_alloc() ... if (aio_nr + nr_events > (aio_max_nr * 2UL) \|\| ... goto err_ctx; ... This patch uses the original value of nr_events (from userspace) to increment aio-nr and count against aio-max-nr, which resolves those. Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Reported-by: Lekshmi C. Pillai <lekshmi.cpillai@in.ibm.com> Tested-by: Lekshmi C. Pillai <lekshmi.cpillai@in.ibm.com> Tested-by: Paul Nguyen <nguyenp@us.ibm.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>	2017-09-07 12:28:28 -04:00
Linus Torvalds	57e88b43b8	Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Ingo Molnar: "The main changes include various Hyper-V optimizations such as faster hypercalls and faster/better TLB flushes - and there's also some Intel-MID cleanups" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tracing/hyper-v: Trace hyperv_mmu_flush_tlb_others() x86/hyper-v: Support extended CPU ranges for TLB flush hypercalls x86/platform/intel-mid: Make several arrays static, to make code smaller MAINTAINERS: Add missed file for Hyper-V x86/hyper-v: Use hypercall for remote TLB flush hyper-v: Globalize vp_index x86/hyper-v: Implement rep hypercalls hyper-v: Use fast hypercall for HVCALL_SIGNAL_EVENT x86/hyper-v: Introduce fast hypercall implementation x86/hyper-v: Make hv_do_hypercall() inline x86/hyper-v: Include hyperv/ only when CONFIG_HYPERV is set x86/platform/intel-mid: Make 'bt_sfi_data' const x86/platform/intel-mid: Make IRQ allocation a bit more flexible x86/platform/intel-mid: Group timers callbacks together	2017-09-07 09:25:15 -07:00
Linus Torvalds	3b9f8ed25d	Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata updates from Tejun Heo: "Except for the ahci fix that fixes a boot issue, nothing major in this pull request. Some new platform controller support and device specific changes" * 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: zpodd: make arrays cdb static, reduces object code size ahci: don't use MSI for devices with the silly Intel NVMe remapping scheme dt-bindings: ata: add DT bindings for MediaTek SATA controller ata: mediatek: add support for MediaTek SATA controller pata_octeon_cf: use of_property_read_{bool\|u32}() cs5536: add support for IDE controller variant ata: sata_gemini: Introduce explicit IDE pin control ata: sata_gemini: Retire custom pin control ata: ahci_platform: Add shutdown handler ata: sata_gemini: explicitly request exclusive reset control ata: Drop unnecessary static ata: Convert to using %pOF instead of full_name	2017-09-06 22:41:21 -07:00
Linus Torvalds	608c1d3c17	Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "Several notable changes this cycle: - Thread mode was merged. This will be used for cgroup2 support for CPU and possibly other controllers. Unfortunately, CPU controller cgroup2 support didn't make this pull request but most contentions have been resolved and the support is likely to be merged before the next merge window. - cgroup.stat now shows the number of descendant cgroups. - cpuset now can enable the easier-to-configure v2 behavior on v1 hierarchy" * 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits) cpuset: Allow v2 behavior in v1 cgroup cgroup: Add mount flag to enable cpuset to use v2 behavior in v1 cgroup cgroup: remove unneeded checks cgroup: misc changes cgroup: short-circuit cset_cgroup_from_root() on the default hierarchy cgroup: re-use the parent pointer in cgroup_destroy_locked() cgroup: add cgroup.stat interface with basic hierarchy stats cgroup: implement hierarchy limits cgroup: keep track of number of descent cgroups cgroup: add comment to cgroup_enable_threaded() cgroup: remove unnecessary empty check when enabling threaded mode cgroup: update debug controller to print out thread mode information cgroup: implement cgroup v2 thread support cgroup: implement CSS_TASK_ITER_THREADED cgroup: introduce cgroup->dom_cgrp and threaded css_set handling cgroup: add @flags to css_task_iter_start() and implement CSS_TASK_ITER_PROCS cgroup: reorganize cgroup.procs / task write path cgroup: replace css_set walking populated test with testing cgrp->nr_populated_csets cgroup: distinguish local and children populated states cgroup: remove now unused list_head @pending in cgroup_apply_cftypes() ...	2017-09-06 22:25:25 -07:00
Linus Torvalds	9954d4892a	Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue updates from Tejun Heo: "Nothing major. I introduced a flag collsion bug during v4.13 cycle which is fixed in this pull request. Fortunately, the flag is for debugging / verification and the bug isn't critical" * 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Fix flag collision workqueue: Use TASK_IDLE workqueue: fix path to documentation workqueue: doc change for ST behavior on NUMA systems	2017-09-06 21:59:31 -07:00
Linus Torvalds	a7cbfd05f4	Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu updates from Tejun Heo: "A lot of changes for percpu this time around. percpu inherited the same area allocator from the original pre-virtual-address-mapped implementation. This was from the time when percpu allocator wasn't used all that much and the implementation was focused on simplicity, with the unfortunate computational complexity of O(number of areas allocated from the chunk) per alloc / free. With the increase in percpu usage, we're hitting cases where the lack of scalability is hurting. The most prominent one right now is bpf perpcu map creation / destruction which may allocate and free a lot of entries consecutively and it's likely that the problem will become more prominent in the future. To address the issue, Dennis replaced the area allocator with hinted bitmap allocator which is more consistent. While the new allocator does perform a bit worse in some cases, it outperforms the old allocator way more than an order of magnitude in other more common scenarios while staying mostly flat in CPU overhead and completely flat in memory consumption" * 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (27 commits) percpu: update header to contain bitmap allocator explanation. percpu: update pcpu_find_block_fit to use an iterator percpu: use metadata blocks to update the chunk contig hint percpu: update free path to take advantage of contig hints percpu: update alloc path to only scan if contig hints are broken percpu: keep track of the best offset for contig hints percpu: skip chunks if the alloc does not fit in the contig hint percpu: add first_bit to keep track of the first free in the bitmap percpu: introduce bitmap metadata blocks percpu: replace area map allocator with bitmap percpu: generalize bitmap (un)populated iterators percpu: increase minimum percpu allocation size and align first regions percpu: introduce nr_empty_pop_pages to help empty page accounting percpu: change the number of pages marked in the first_chunk pop bitmap percpu: combine percpu address checks percpu: modify base_addr to be region specific percpu: setup_first_chunk rename schunk/dchunk to chunk percpu: end chunk area maps page aligned for the populated bitmap percpu: unify allocation of schunk and dchunk percpu: setup_first_chunk remove dyn_size and consolidate logic ...	2017-09-06 21:33:12 -07:00
Linus Torvalds	d34fc1adf0	Merge branch 'akpm' (patches from Andrew) Merge updates from Andrew Morton: - various misc bits - DAX updates - OCFS2 - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (119 commits) mm,fork: introduce MADV_WIPEONFORK x86,mpx: make mpx depend on x86-64 to free up VMA flag mm: add /proc/pid/smaps_rollup mm: hugetlb: clear target sub-page last when clearing huge page mm: oom: let oom_reap_task and exit_mmap run concurrently swap: choose swap device according to numa node mm: replace TIF_MEMDIE checks by tsk_is_oom_victim mm, oom: do not rely on TIF_MEMDIE for memory reserves access z3fold: use per-cpu unbuddied lists mm, swap: don't use VMA based swap readahead if HDD is used as swap mm, swap: add sysfs interface for VMA based swap readahead mm, swap: VMA based swap readahead mm, swap: fix swap readahead marking mm, swap: add swap readahead hit statistics mm/vmalloc.c: don't reinvent the wheel but use existing llist API mm/vmstat.c: fix wrong comment selftests/memfd: add memfd_create hugetlbfs selftest mm/shmem: add hugetlbfs support to memfd_create() mm, devm_memremap_pages: use multi-order radix for ZONE_DEVICE lookups mm/vmalloc.c: halve the number of comparisons performed in pcpu_get_vm_areas() ...	2017-09-06 20:49:49 -07:00
Andy Lutomirski	1c9fe4409c	x86/mm: Document how CR4.PCIDE restore works While debugging a problem, I thought that using cr4_set_bits_and_update_boot() to restore CR4.PCIDE would be helpful. It turns out to be counterproductive. Add a comment documenting how this works. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-06 20:12:57 -07:00

1 2 3 4 5 ...

702800 Commits