Pull iommu updates from Joerg Roedel:
- Allow compiling the ARM-SMMU drivers as modules.
- Fixes and cleanups for the ARM-SMMU drivers and io-pgtable code
collected by Will Deacon. The merge-commit (6855d1ba75) has all the
details.
- Cleanup of the iommu_put_resv_regions() call-backs in various
drivers.
- AMD IOMMU driver cleanups.
- Update for the x2APIC support in the AMD IOMMU driver.
- Preparation patches for Intel VT-d nested mode support.
- RMRR and identity domain handling fixes for the Intel VT-d driver.
- More small fixes and cleanups.
* tag 'iommu-updates-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (87 commits)
iommu/amd: Remove the unnecessary assignment
iommu/vt-d: Remove unnecessary WARN_ON_ONCE()
iommu/vt-d: Unnecessary to handle default identity domain
iommu/vt-d: Allow devices with RMRRs to use identity domain
iommu/vt-d: Add RMRR base and end addresses sanity check
iommu/vt-d: Mark firmware tainted if RMRR fails sanity check
iommu/amd: Remove unused struct member
iommu/amd: Replace two consecutive readl calls with one readq
iommu/vt-d: Don't reject Host Bridge due to scope mismatch
PCI/ATS: Add PASID stubs
iommu/arm-smmu-v3: Return -EBUSY when trying to re-add a device
iommu/arm-smmu-v3: Improve add_device() error handling
iommu/arm-smmu-v3: Use WRITE_ONCE() when changing validity of an STE
iommu/arm-smmu-v3: Add second level of context descriptor table
iommu/arm-smmu-v3: Prepare for handling arm_smmu_write_ctx_desc() failure
iommu/arm-smmu-v3: Propagate ssid_bits
iommu/arm-smmu-v3: Add support for Substream IDs
iommu/arm-smmu-v3: Add context descriptor tables allocators
iommu/arm-smmu-v3: Prepare arm_smmu_s1_cfg for SSID support
ACPI/IORT: Parse SSID property of named component node
...
Pull xen updates from Juergen Gross:
- fix a bug introduced in 5.5 in the Xen gntdev driver
- fix the Xen balloon driver when running on ancient Xen versions
- allow Xen stubdoms to control interrupt enable flags of
passed-through PCI cards
- release resources in Xen backends under memory pressure
* tag 'for-linus-5.6-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/blkback: Consistently insert one empty line between functions
xen/blkback: Remove unnecessary static variable name prefixes
xen/blkback: Squeeze page pools if a memory pressure is detected
xenbus/backend: Protect xenbus callback with lock
xenbus/backend: Add memory pressure handler callback
xen/gntdev: Do not use mm notifiers with autotranslating guests
xen/balloon: Support xend-based toolstack take two
xen-pciback: optionally allow interrupt enable flag writes
syzbot managed to send an IPX packet through bond_alb_xmit()
and af_packet and triggered a use-after-free.
First, bond_alb_xmit() was using ipx_hdr() helper to reach
the IPX header, but ipx_hdr() was using the transport offset
instead of the network offset. In the particular syzbot
report transport offset was 0xFFFF
This patch removes ipx_hdr() since it was only (mis)used from bonding.
Then we need to make sure IPv4/IPv6/IPX headers are pulled
in skb->head before dereferencing anything.
BUG: KASAN: use-after-free in bond_alb_xmit+0x153a/0x1590 drivers/net/bonding/bond_alb.c:1452
Read of size 2 at addr ffff8801ce56dfff by task syz-executor.2/18108
(if (ipx_hdr(skb)->ipx_checksum != IPX_NO_CHECKSUM) ...)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
[<ffffffff8441fc42>] __dump_stack lib/dump_stack.c:17 [inline]
[<ffffffff8441fc42>] dump_stack+0x14d/0x20b lib/dump_stack.c:53
[<ffffffff81a7dec4>] print_address_description+0x6f/0x20b mm/kasan/report.c:282
[<ffffffff81a7e0ec>] kasan_report_error mm/kasan/report.c:380 [inline]
[<ffffffff81a7e0ec>] kasan_report mm/kasan/report.c:438 [inline]
[<ffffffff81a7e0ec>] kasan_report.cold+0x8c/0x2a0 mm/kasan/report.c:422
[<ffffffff81a7dc4f>] __asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:469
[<ffffffff82c8c00a>] bond_alb_xmit+0x153a/0x1590 drivers/net/bonding/bond_alb.c:1452
[<ffffffff82c60c74>] __bond_start_xmit drivers/net/bonding/bond_main.c:4199 [inline]
[<ffffffff82c60c74>] bond_start_xmit+0x4f4/0x1570 drivers/net/bonding/bond_main.c:4224
[<ffffffff83baa558>] __netdev_start_xmit include/linux/netdevice.h:4525 [inline]
[<ffffffff83baa558>] netdev_start_xmit include/linux/netdevice.h:4539 [inline]
[<ffffffff83baa558>] xmit_one net/core/dev.c:3611 [inline]
[<ffffffff83baa558>] dev_hard_start_xmit+0x168/0x910 net/core/dev.c:3627
[<ffffffff83bacf35>] __dev_queue_xmit+0x1f55/0x33b0 net/core/dev.c:4238
[<ffffffff83bae3a8>] dev_queue_xmit+0x18/0x20 net/core/dev.c:4278
[<ffffffff84339189>] packet_snd net/packet/af_packet.c:3226 [inline]
[<ffffffff84339189>] packet_sendmsg+0x4919/0x70b0 net/packet/af_packet.c:3252
[<ffffffff83b1ac0c>] sock_sendmsg_nosec net/socket.c:673 [inline]
[<ffffffff83b1ac0c>] sock_sendmsg+0x12c/0x160 net/socket.c:684
[<ffffffff83b1f5a2>] __sys_sendto+0x262/0x380 net/socket.c:1996
[<ffffffff83b1f700>] SYSC_sendto net/socket.c:2008 [inline]
[<ffffffff83b1f700>] SyS_sendto+0x40/0x60 net/socket.c:2004
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull vfs recursive removal updates from Al Viro:
"We have quite a few places where synthetic filesystems do an
equivalent of 'rm -rf', with varying amounts of code duplication,
wrong locking, etc. That really ought to be a library helper.
Only debugfs (and very similar tracefs) are converted here - I have
more conversions, but they'd never been in -next, so they'll have to
wait"
* 'work.recursive_removal' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
simple_recursive_removal(): kernel-side rm -rf for ramfs-style filesystems
When the directory is large and it's being modified by one client
while another client is doing the 'ls -l' on the same directory then
the cache page invalidation from nfs_force_use_readdirplus causes
the reading client to keep restarting READDIRPLUS from cookie 0
which causes the 'ls -l' to take a very long time to complete,
possibly never completing.
Currently when nfs_force_use_readdirplus is called to switch from
READDIR to READDIRPLUS, it invalidates all the cached pages of the
directory. This cache page invalidation causes the next nfs_readdir
to re-read the directory content from cookie 0.
This patch is to optimise the cache invalidation in
nfs_force_use_readdirplus by only truncating the cached pages from
last page index accessed to the end the file. It also marks the
inode to delay invalidating all the cached page of the directory
until the next initial nfs_readdir of the next 'ls' instance.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com>
[Anna - Fix conflicts with Trond's readdir patches]
[Anna - Remove redundant call to nfs_zap_mapping()]
[Anna - Replace d_inode(file_dentry(desc->file)) with file_inode(desc->file)]
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Pull Microblaze update from Michal Simek:
- enable CMA
- add support for MB v11
- defconfig updates
- minor fixes
* tag 'microblaze-v5.6-rc1' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Add ID for Microblaze v11
microblaze: Prevent the overflow of the start
microblaze: Wire CMA allocator
asm-generic: Make dma-contiguous.h a mandatory include/asm header
microblaze: Sync defconfig with latest Kconfig layout
microblaze: defconfig: Disable EXT2 driver and Enable EXT3 & EXT4 drivers
microblaze: Align comments with register usage
Pull overlayfs update from Miklos Szeredi:
- Try to preserve holes in sparse files when copying up, thus saving
disk space and improving performance.
- Fix a performance regression introduced in v4.19 by preserving
asynchronicity of IO when fowarding to underlying layers. Add VFS
helpers to submit async iocbs.
- Fix a regression in lseek(2) introduced in v4.19 that breaks >2G
seeks on 32bit kernels.
- Fix a corner case where st_ino/st_dev was not preserved across copy
up.
- Miscellaneous fixes and cleanups.
* tag 'ovl-update-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fix lseek overflow on 32bit
ovl: add splice file read write helper
ovl: implement async IO routines
vfs: add vfs_iocb_iter_[read|write] helper functions
ovl: layer is const
ovl: fix corner case of non-constant st_dev;st_ino
ovl: fix corner case of conflicting lower layer uuid
ovl: generalize the lower_fs[] array
ovl: simplify ovl_same_sb() helper
ovl: generalize the lower_layers[] array
ovl: improving copy-up efficiency for big sparse file
ovl: use ovl_inode_lock in ovl_llseek()
ovl: use pr_fmt auto generate prefix
ovl: fix wrong WARN_ON() in ovl_cache_update_ino()
Pull remoteproc updates from Bjorn Andersson:
"This adds support for the Mediatek MT8183 SCP, modem remoteproc on
Qualcomm SC7180 platform, audio and sensor remoteprocs on Qualcomm
MSM8998 and audio, compute, modem and sensor remoteprocs on Qualcomm
SM8150.
It adds votes for necessary power-domains for all Qualcomm TrustZone
based remoteproc instances are held, fixes a bug related to remoteproc
drivers registering before the core has been initialized and does
clean up the Qualcomm modem remoteproc driver"
* tag 'rproc-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc: (21 commits)
remoteproc: qcom: q6v5-mss: Improve readability of reset_assert
remoteproc: qcom: q6v5-mss: Use regmap_read_poll_timeout
remoteproc: qcom: q6v5-mss: Rename boot status timeout
remoteproc: qcom: q6v5-mss: Improve readability across clk handling
remoteproc: use struct_size() helper
remoteproc: Initialize rproc_class before use
rpmsg: add rpmsg support for mt8183 SCP.
remoteproc/mediatek: add SCP support for mt8183
dt-bindings: Add a binding for Mediatek SCP
remoteproc: mss: q6v5-mss: Add modem support on SC7180
dt-bindings: remoteproc: qcom: Add Q6V5 Modem PIL binding for SC7180
remoteproc: qcom: pas: Add MSM8998 ADSP and SLPI support
dt-bindings: remoteproc: qcom: Add ADSP and SLPI support for MSM8998 SoC
remoteproc: q6v5-mss: Remove mem clk from the active pool
remoteproc: qcom: Remove unneeded semicolon
remoteproc: qcom: pas: Add auto_boot flag
remoteproc: qcom: pas: Add SM8150 ADSP, CDSP, Modem and SLPI support
dt-bindings: remoteproc: qcom: SM8150 Add ADSP, CDSP, MPSS and SLPI support
remoteproc: qcom: pas: Vote for active/proxy power domains
dt-bindings: remoteproc: qcom: Add power-domain bindings for Q6V5 PAS
...
Merge more updates from Andrew Morton:
"The rest of MM and the rest of everything else: hotfixes, ipc, misc,
procfs, lib, cleanups, arm"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (67 commits)
ARM: dma-api: fix max_pfn off-by-one error in __dma_supported()
treewide: remove redundant IS_ERR() before error code check
include/linux/cpumask.h: don't calculate length of the input string
lib: new testcases for bitmap_parse{_user}
lib: rework bitmap_parse()
lib: make bitmap_parse_user a wrapper on bitmap_parse
lib: add test for bitmap_parse()
bitops: more BITS_TO_* macros
lib/string: add strnchrnul()
proc: convert everything to "struct proc_ops"
proc: decouple proc from VFS with "struct proc_ops"
asm-generic/tlb: provide MMU_GATHER_TABLE_FREE
asm-generic/tlb: rename HAVE_MMU_GATHER_NO_GATHER
asm-generic/tlb: rename HAVE_MMU_GATHER_PAGE_SIZE
asm-generic/tlb: rename HAVE_RCU_TABLE_FREE
asm-generic/tlb: add missing CONFIG symbol
asm-gemeric/tlb: remove stray function declarations
asm-generic/tlb: avoid potential double flush
mm/mmu_gather: invalidate TLB correctly on batch allocation failure and flush
powerpc/mmu_gather: enable RCU_TABLE_FREE even for !SMP case
...
Pull drm ttm/mm updates from Dave Airlie:
"Thomas Hellstrom has some more changes to the TTM layer that needed a
patch to the mm subsystem.
This adds a new mm API vmf_insert_mixed_prot to avoid an ugly hack
that has limitations in the TTM layer"
* tag 'drm-next-2020-02-04' of git://anongit.freedesktop.org/drm/drm:
mm, drm/ttm: Fix vm page protection handling
mm: Add a vmf_insert_mixed_prot() function
Pull chrome platform updates from Benson Leung:
"CrOS EC:
- Refactoring of some of cros_ec's headers:
include/linux/mfd/cros_ec.h now removed, new cros_ec.h added to
drivers/platform/chrome which contains shared operations of cros_ec
transport drivers.
- Response tracing in cros_ec_proto
Wilco EC:
- Fix unregistration order.
- Fix keyboard backlight probing on systems without keyboard
backlight
- Minor cleanup (newlines in printks, COMPILE_TEST)
Misc:
- chromeos_laptop converted to use i2c_new_scanned_device instead of
i2c_new_probed_device"
* tag 'tag-chrome-platform-for-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
platform/chrome: cros_ec: Match implementation with headers
platform/chrome: cros_ec: Drop unaligned.h include
platform/chrome: wilco_ec: Allow wilco to be compiled in COMPILE_TEST
platform/chrome: wilco_ec: Add newlines to printks
platform/chrome: wilco_ec: Fix unregistration order
cros_ec: treewide: Remove 'include/linux/mfd/cros_ec.h'
platform/chrome: cros_ec_ishtp: Make init_lock static
platform/chrome: chromeos_laptop: Convert to i2c_new_scanned_device
platform/chrome: cros_ec_lpc: Use platform_get_irq_optional() for optional IRQs
platform/chrome: cros_ec_proto: Add response tracing
platform/chrome: cros_ec_trace: Match trace commands with EC commands
Pull RTC updates from Alexandre Belloni:
"The VL_READ and VL_CLR ioctls have been reworked to be more useful.
This will not break userspace as there are very few users and they are
using the integer value as a boolean.
Apart from that, two drivers were reworked and a few fixes here and
there for a net reduction of number of lines.
Summary:
Subsystem:
- the VL_READ and VL_CLR ioctls are now documented and their behavior
is unified across all the drivers.
- RTC_I2C_AND_SPI Kconfig option rework to avoid selecting both
REGMAP_I2C and REGMAP_SPI unecessarily.
Drivers:
- at91rm9200: remove deprecated procfs, add sam9x60, sama5d4 and
sama5d2 compatibles.
- cmos: solve lost interrupts issue on MS Surface 3
- hym8563: return proper errno when time is invalid
- rv3029: many fixes, nvram support"
* tag 'rtc-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (63 commits)
dt-bindings: rtc: at91rm9200: document clocks property
rtc: i2c/spi: Avoid inclusion of REGMAP support when not needed
rtc: Kconfig: select REGMAP_I2C when necessary
rtc: Kconfig: properly indent sd3078 entry
rtc: cmos: Refactor code by using the new dmi_get_bios_year() helper
rtc: cmos: Use predefined value for RTC IRQ on legacy x86
rtc: cmos: Stop using shared IRQ
rtc: tps6586x: Use IRQ_NOAUTOEN flag
rtc: at91rm9200: use FIELD_PREP/FIELD_GET
rtc: at91rm9200: avoid time readout in at91_rtc_setalarm
rtc: at91rm9200: move register definitions to C file
rtc: at91rm9200: add sama5d4 and sama5d2 compatibles
dt-bindings: rtc: at91rm9200: convert bindings to json-schema
rtc: at91rm9200: remove procfs information
dt-bindings: atmel, at91rm9200-rtc: add microchip, sam9x60-rtc
rtc: pcf8563: Use BIT
rtc: moxart: Convert to SPDX identifier
rtc: ds1343: Remove unused struct spi_device in struct ds1343_priv
rtc: rx8025: Remove struct i2c_client from struct rx8025_data
rtc: hym8563: Read the valid flag directly instead of caching it
...
Patch series "lib: rework bitmap_parse", v5.
Similarl to the recently revisited bitmap_parselist(), bitmap_parse() is
ineffective and overcomplicated. This series reworks it, aligns its
interface with bitmap_parselist() and makes it simpler to use.
The series also adds a test for the function and fixes usage of it in
cpumask_parse() according to the new design - drops the calculating of
length of an input string.
bitmap_parse() takes the array of numbers to be put into the map in the BE
order which is reversed to the natural LE order for bitmaps. For example,
to construct bitmap containing a bit on the position 42, we have to put a
line '400,0'. Current implementation reads chunk one by one from the
beginning ('400' before '0') and makes bitmap shift after each successful
parse. It makes the complexity of the whole process as O(n^2). We can do
it in reverse direction ('0' before '400') and avoid shifting, but it
requires reverse parsing helpers.
This patch (of 7):
New function works like strchrnul() with a length limited string.
Link: http://lkml.kernel.org/r/20200102043031.30357-2-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Amritha Nambiar <amritha.nambiar@intel.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "Tobin C . Harding" <tobin@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Vineet Gupta <vineet.gupta1@synopsys.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As described in the comment, the correct order for freeing pages is:
1) unhook page
2) TLB invalidate page
3) free page
This order equally applies to page directories.
Currently there are two correct options:
- use tlb_remove_page(), when all page directores are full pages and
there are no futher contraints placed by things like software
walkers (HAVE_FAST_GUP).
- use MMU_GATHER_RCU_TABLE_FREE and tlb_remove_table() when the
architecture does not do IPI based TLB invalidate and has
HAVE_FAST_GUP (or software TLB fill).
This however leaves architectures that don't have page based directories
but don't need RCU in a bind. For those, provide MMU_GATHER_TABLE_FREE,
which provides the independent batching for directories without the
additional RCU freeing.
Link: http://lkml.kernel.org/r/20200116064531.483522-10-aneesh.kumar@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aneesh reported that:
tlb_flush_mmu()
tlb_flush_mmu_tlbonly()
tlb_flush() <-- #1
tlb_flush_mmu_free()
tlb_table_flush()
tlb_table_invalidate()
tlb_flush_mmu_tlbonly()
tlb_flush() <-- #2
does two TLBIs when tlb->fullmm, because __tlb_reset_range() will not
clear tlb->end in that case.
Observe that any caller to __tlb_adjust_range() also sets at least one of
the tlb->freed_tables || tlb->cleared_p* bits, and those are
unconditionally cleared by __tlb_reset_range().
Change the condition for actually issuing TLBI to having one of those bits
set, as opposed to having tlb->end != 0.
Link: http://lkml.kernel.org/r/20200116064531.483522-4-aneesh.kumar@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Architectures for which we have hardware walkers of Linux page table
should flush TLB on mmu gather batch allocation failures and batch flush.
Some architectures like POWER supports multiple translation modes (hash
and radix) and in the case of POWER only radix translation mode needs the
above TLBI. This is because for hash translation mode kernel wants to
avoid this extra flush since there are no hardware walkers of linux page
table. With radix translation, the hardware also walks linux page table
and with that, kernel needs to make sure to TLB invalidate page walk cache
before page table pages are freed.
More details in commit d86564a2f0 ("mm/tlb, x86/mm: Support invalidating
TLB caches for RCU_TABLE_FREE")
The changes to sparc are to make sure we keep the old behavior since we
are now removing HAVE_RCU_TABLE_NO_INVALIDATE. The default value for
tlb_needs_table_invalidate is to always force an invalidate and sparc can
avoid the table invalidate. Hence we define tlb_needs_table_invalidate to
false for sparc architecture.
Link: http://lkml.kernel.org/r/20200116064531.483522-3-aneesh.kumar@linux.ibm.com
Fixes: a46cc7a90f ("powerpc/mm/radix: Improve TLB/PWC flushes")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Cc: <stable@vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "Generic page walk and ptdump", v17.
Many architectures current have a debugfs file for dumping the kernel page
tables. Currently each architecture has to implement custom functions for
this because the details of walking the page tables used by the kernel are
different between architectures.
This series extends the capabilities of walk_page_range() so that it can
deal with the page tables of the kernel (which have no VMAs and can
contain larger huge pages than exist for user space). A generic PTDUMP
implementation is the implemented making use of the new functionality of
walk_page_range() and finally arm64 and x86 are switch to using it,
removing the custom table walkers.
To enable a generic page table walker to walk the unusual mappings of the
kernel we need to implement a set of functions which let us know when the
walker has reached the leaf entry. After a suggestion from Will Deacon
I've chosen the name p?d_leaf() as this (hopefully) describes the purpose
(and is a new name so has no historic baggage). Some architectures have
p?d_large macros but this is easily confused with "large pages".
This series ends with a generic PTDUMP implemention for arm64 and x86.
Mostly this is a clean up and there should be very little functional
change. The exceptions are:
* arm64 PTDUMP debugfs now displays pages which aren't present (patch 22).
* arm64 has the ability to efficiently process KASAN pages (which
previously only x86 implemented). This means that the combination of
KASAN and DEBUG_WX is now useable.
This patch (of 23):
Exposing the pud/pgd levels of the page tables to walk_page_range() means
we may come across the exotic large mappings that come with large areas of
contiguous memory (such as the kernel's linear map).
For architectures that don't provide all p?d_leaf() macros, provide
generic do nothing default that are suitable where there cannot be leaf
pages at that level. Futher patches will add implementations for
individual architectures.
The name p?d_leaf() is chosen to minimize the confusion with existing uses
of "large" pages and "huge" pages which do not necessary mean that the
entry is a leaf (for example it may be a set of contiguous entries that
only take 1 TLB slot). For the purpose of walking the page tables we
don't need to know how it will be represented in the TLB, but we do need
to know for sure if it is a leaf of the tree.
Link: http://lkml.kernel.org/r/20191218162402.45610-2-steven.price@arm.com
Signed-off-by: Steven Price <steven.price@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Zong Li <zong.li@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Let's make sure that all memory holes are actually marked PageReserved(),
that page_to_pfn() produces reliable results, and that these pages are not
detected as "mmap" pages due to the mapcount.
E.g., booting a x86-64 QEMU guest with 4160 MB:
[ 0.010585] Early memory node ranges
[ 0.010586] node 0: [mem 0x0000000000001000-0x000000000009efff]
[ 0.010588] node 0: [mem 0x0000000000100000-0x00000000bffdefff]
[ 0.010589] node 0: [mem 0x0000000100000000-0x0000000143ffffff]
max_pfn is 0x144000.
Before this change:
[root@localhost ~]# ./page-types -r -a 0x144000,
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000000000800 16384 64 ___________M_______________________________ mmap
total 16384 64
After this change:
[root@localhost ~]# ./page-types -r -a 0x144000,
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000100000000 16384 64 ___________________________r_______________ reserved
total 16384 64
IOW, especially the unavailable physical memory ("memory hole") in the
last section would not get properly marked PageReserved() and is indicated
to be "mmap" memory.
Drop the trace of that function from include/linux/mm.h - nobody else
needs it, and rename it accordingly.
Note: The fake zone/node might not be covered by the zone/node span. This
is not an urgent issue (for now, we had the same node/zone due to the
zeroing). We'll need a clean way to mark memory holes (e.g., using a page
type PageHole() if possible or a fake ZONE_INVALID) and eventually stop
marking these memory holes PageReserved().
Link: http://lkml.kernel.org/r/20191211163201.17179-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
eventfd use cases from aio and io_uring can deadlock due to circular
or resursive calling, when eventfd_signal() tries to grab the waitqueue
lock. On top of that, it's also possible to construct notification
chains that are deep enough that we could blow the stack.
Add a percpu counter that tracks the percpu recursion depth, warn if we
exceed it. The counter is also exposed so that users of eventfd_signal()
can do the right thing if it's non-zero in the context where it is
called.
Cc: stable@vger.kernel.org # 4.19+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull percpu updates from Dennis Zhou:
"Separate out variables that can be decrypted into their own page
anytime encryption can be enabled and fix __percpu annotations in
asm-generic for sparse"
* 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
percpu: Separate decrypted varaibles anytime encryption can be enabled
percpu: fix __percpu annotation in asm-generic
Pull clk updates from Stephen Boyd:
"There are a few changes to the core framework this time around, in
addition to the normal collection of driver updates to support new
SoCs, fix incorrect data, and convert various drivers to clk_hw based
APIs.
In the core, we allow clk_ops::init() to return an error code now so
that we can fail clk registration if the callback does something like
fail to allocate memory. We also add a new "terminate" clk_op so that
things done in clk_ops::init() can be undone, e.g. free memory. We
also spit out a warning now when critical clks fail to enable and we
support changing clk rates and enable/disable state through debugfs
when developers compile the kernel themselves.
On the driver front, we get support for what seems like a lot of
Qualcomm and NXP SoCs given that those vendors dominate the diffstat.
There are a couple new drivers for Xilinx and Amlogic SoCs too. The
updates are all small things like fixing the way glitch free muxes
switch parents, avoiding div-by-zero problems, or fixing data like
parent names. See the updates section below for more details.
Finally, the "basic" clk types have been converted to support
specifying parents with clk_hw pointers. This work includes an
overhaul of the fixed-rate clk type to be more modern by using clk_hw
APIs.
Core:
- Let clk_ops::init() return an error code
- Add a clk_ops::terminate() callback to undo clk_ops::init()
- Warn about critical clks that fail to enable or prepare
- Support dangerous debugfs actions on clks with dead code
New Drivers:
- Support for Xilinx Versal platform clks
- Display clk controller on qcom sc7180
- Video clk controller on qcom sc7180
- Graphics clk controller on qcom sc7180
- CPU PLLs for qcom msm8916
- Move qcom msm8974 gfx3d clk to RPM control
- Display port clk support on qcom sdm845 SoCs
- Global clk controller on qcom ipq6018
- Add a driver for BCLK of Freescale SAI cores
- Add cam, vpe and sgx clock support for TI dra7
- Add aess clock support for TI omap5
- Enable clks for CPUfreq on Allwinner A64 SoCs
- Add Amlogic meson8b DDR clock controller
- Add input clocks to Amlogic meson8b controllers
- Add SPIBSC (SPI FLASH) clock on Renesas RZ/A2
- i.MX8MP clk driver support
Updates:
- Convert gpio, fixed-factor, mux, gate, divider basic clks to hw
based APIs
- Detect more PRMCU variants in ux500 driver
- Adjust the composite clk type to new way of describing clk parents
- Fixes for clk controllers on qcom msm8998 SoCs
- Fix gmac main clock for TI dra7
- Move TI dra7-atl clock header to correct location
- Fix hidden node name dependency on TI clkctrl clocks
- Fix Amlogic meson8b mali clock update using the glitch free mux
- Fix Amlogic pll driver division by zero at init
- Prepare for split of Renesas R-Car H3 ES1.x and ES2.0+ config
symbols
- Switch more i.MX clk drivers to clk_hw based APIs
- Disable non-functional divider between pll4_audio_div and
pll4_post_div on imx6q
- Fix watchdog2 clock name typo in imx7ulp clock driver
- Set CLK_GET_RATE_NOCACHE flag for DRAM related clocks on i.MX8M
SoCs
- Suppress bind attrs for i.MX8M clock driver
- Add a big comment in imx8qxp-lpcg driver to tell why
devm_platform_ioremap_resource() shouldn't be used for the driver
- A correction on i.MX8MN usb1_ctrl parent clock setting"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (140 commits)
dt/bindings: clk: fsl,plldig: Drop 'bindings' from schema id
clk: ls1028a: Fix warning on clamp() usage
clk: qoriq: add ls1088a hwaccel clocks support
clk: ls1028a: Add clock driver for Display output interface
dt/bindings: clk: Add YAML schemas for LS1028A Display Clock bindings
clk: fsl-sai: new driver
dt-bindings: clock: document the fsl-sai driver
clk: composite: add _register_composite_pdata() variants
clk: qcom: rpmh: Sort OF match table
dt-bindings: fix warnings in validation of qcom,gcc.yaml
dt-binding: fix compilation error of the example in qcom,gcc.yaml
clk: zynqmp: Add support for clock with CLK_DIVIDER_POWER_OF_TWO flag
clk: zynqmp: Fix divider calculation
clk: zynqmp: Add support for get max divider
clk: zynqmp: Warn user if clock user are more than allowed
clk: zynqmp: Extend driver for versal
dt-bindings: clock: Add bindings for versal clock driver
clk: ti: clkctrl: Fix hidden dependency to node name
clk: ti: add clkctrl data dra7 sgx
clk: ti: omap5: Add missing AESS clock
...
Pull kgdb updates from Daniel Thompson:
"Everything for kgdb this time around is either simplifications or
clean ups.
In particular Douglas Anderson's modifications to the backtrace
machine in the *last* dev cycle have enabled Doug to tidy up some MIPS
specific backtrace code and stop sharing certain data structures
across the kernel. Note that The MIPS folks were on Cc: for the MIPS
patch and reacted positively (but without an explicit Acked-by).
Doug also got rid of the implicit switching between tasks and register
sets during some but not of kdb's backtrace actions (because the
implicit switching was either confusing for users, pointless or both).
Finally there is a coverity fix and patch to replace open coded
console traversal with the proper helper function"
* tag 'kgdb-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
kdb: Use for_each_console() helper
kdb: remove redundant assignment to pointer bp
kdb: Get rid of confusing diag msg from "rd" if current task has no regs
kdb: Gid rid of implicit setting of the current task / regs
kdb: kdb_current_task shouldn't be exported
kdb: kdb_current_regs should be private
MIPS: kdb: Remove old workaround for backtracing on other CPUs
The 'cros_ec' core driver is the common interface for the cros_ec
transport drivers to do the shared operations to register, unregister,
suspend, resume and handle_event. The interface is provided by including
the header 'include/linux/platform_data/cros_ec_proto.h', however, instead
of have the implementation of these functions in cros_ec_proto.c, it is in
'cros_ec.c', which is a different kernel module. Apart from being a bad
practice, this can induce confusions allowing the users of the cros_ec
protocol to call these functions.
The register, unregister, suspend, resume and handle_event functions
*should* only be called by the different transport drivers (i2c, spi, lpc,
etc.), so make this a bit less confusing by moving these functions from
the public in-kernel space to a private include in platform/chrome, and
then, the interface for cros_ec module and for the cros_ec_proto module is
clean.
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Benson Leung <bleung@chromium.org>
Pull backlight updates from Lee Jones:
"Fix-ups:
- Remove superfluous code in ams369fg06
- Convert over to GPIO descriptor (gpiod) in bd6107
Bug Fixes:
- Fix unsigned comparison to less than zero in qcom-wled"
* tag 'backlight-next-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
backlight: qcom-wled: Fix unsigned comparison to zero
backlight: bd6107: Convert to use GPIO descriptor
backlight: ams369fg06: Drop GPIO include
Pull MFD updates from Lee Jones:
"New Drivers:
- Add support for ROHM BD71828 PMICs and GPIOs
- Add support for Qualcomm Aqstic Audio Codecs WCD9340 and WCD9341
New Device Support:
- Add support for BD71828 to BD70528 RTC driver
- Add support for Intel's Jasper Lake to LPSS PCI
New Functionality:
- Add support for Power Key to ROHM BD71828
- Add support for Clocks to ROHM BD71828
- Add support for GPIOs to Dialog DA9062
- Add support for USB PD Notify to ChromiumOS EC
- Allow callers to specify args when requesting regmap lookup; syscon
Fix-ups:
- Improve error handling and sanity checking; atmel-hlcdc, dln2
- Device Tree support/documentation; bd71828, da9062, xylon,logicvc,
ab8500, max14577, atmel-usart
- Match devices using platform IDs; bd7xxxx
- Refactor BD718x7 regulator component; bd718x7-regulator
- Use standard interfaces/helpers; syscon, sm501
- Trivial (whitespace, spelling, etc); ab8500-core, Kconfig
- Remove unused code; db8500-prcmu, tqmx86
- Wait until boot has finished before accessing registers;
madera-core
- Provide missing register value defaults; cs47l15-tables
- Allow more time for hardware to reset; madera-core
Bug Fixes:
- Fix erroneous register values; rohm-bd70528
- Fix register volatility; axp20x, rn5t618
- Fix Kconfig dependencies; MFD_MAX77650
- Fix incorrect compatible string; da9062-core
- Fix syscon_regmap_lookup_by_phandle_args() stub; syscon"
* tag 'mfd-next-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (41 commits)
mfd: syscon: Fix syscon_regmap_lookup_by_phandle_args() dummy
mfd: wcd934x: Add support to wcd9340/wcd9341 codec
mfd: syscon: Add arguments support for syscon reference
mfd: rn5t618: Mark ADC control register volatile
dt-bindings: atmel-usart: Add microchip,sam9x60-{usart, dbgu}
dt-bindings: atmel-usart: Remove wildcard
mfd: cros_ec: Add cros-usbpd-notify subdevice
mfd: da9062: Fix watchdog compatible string
mfd: madera: Allow more time for hardware reset
mfd: cs47l15: Add missing register default
mfd: madera: Wait for boot done before accessing any other registers
mfd: Kconfig: Rename Samsung to lowercase
mfd: tqmx86: remove set but not used variable 'i2c_ien'
mfd: dbx500-prcmu: Drop DSI pll clock functions
mfd: dbx500-prcmu: Drop set_display_clocks()
mfd: max77650: Select REGMAP_IRQ in Kconfig
mfd: axp20x: Mark AXP20X_VBUS_IPSOUT_MGMT as volatile
mfd: ab8500: Fix ab8500-clk typo
mfd: intel-lpss: Add Intel Jasper Lake PCI IDs
dt-bindings: mfd: max14577: Add reference to max14040_battery.txt descriptions
...
Pull Hyper-V updates from Sasha Levin:
- Most of the commits here are work to enable host-initiated
hibernation support by Dexuan Cui.
- Fix for a warning shown when host sends non-aligned balloon requests
by Tianyu Lan.
* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hv_utils: Add the support of hibernation
hv_utils: Support host-initiated hibernation request
hv_utils: Support host-initiated restart request
Tools: hv: Reopen the devices if read() or write() returns errors
video: hyperv: hyperv_fb: Use physical memory for fb on HyperV Gen 1 VMs.
Drivers: hv: vmbus: Ignore CHANNELMSG_TL_CONNECT_RESULT(23)
video: hyperv_fb: Fix hibernation for the deferred IO feature
Input: hyperv-keyboard: Add the support of hibernation
hv_balloon: Balloon up according to request page number