Pull VFS fixes from Al Viro:
"Christoph's and Jan's aio fixes, fixup for generic_file_splice_read
(removal of pointless detritus that actually breaks it when used for
gfs2 ->splice_read()) and fixup for generic_file_read_iter()
interaction with ITER_PIPE destinations."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
splice: remove detritus from generic_file_splice_read()
mm/filemap: don't allow partially uptodate page for pipes
aio: fix freeze protection of aio writes
fs: remove aio_run_iocb
fs: remove the never implemented aio_fsync file operation
aio: hold an extra file reference over AIO read/write operations
Pull Ceph fixes from Ilya Dryomov:
"Ceph's ->read_iter() implementation is incompatible with the new
generic_file_splice_read() code that went into -rc1. Switch to the
less efficient default_file_splice_read() for now; the proper fix is
being held for 4.10.
We also have a fix for a 4.8 regression and a trival libceph fixup"
* tag 'ceph-for-4.9-rc5' of git://github.com/ceph/ceph-client:
libceph: initialize last_linger_id with a large integer
libceph: fix legacy layout decode with pool 0
ceph: use default file splice read callback
Pull NFS client bugfixes from Anna Schumaker:
"Most of these fix regressions in 4.9, and none are going to stable
this time around.
Bugfixes:
- Trim extra slashes in v4 nfs_paths to fix tools that use this
- Fix a -Wmaybe-uninitialized warnings
- Fix suspicious RCU usages
- Fix Oops when mounting multiple servers at once
- Suppress a false-positive pNFS error
- Fix a DMAR failure in NFS over RDMA"
* tag 'nfs-for-4.9-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
xprtrdma: Fix DMAR failure in frwr_op_map() after reconnect
fs/nfs: Fix used uninitialized warn in nfs4_slot_seqid_in_use()
NFS: Don't print a pNFS error if we aren't using pNFS
NFS: Ignore connections that have cl_rpcclient uninitialized
SUNRPC: Fix suspicious RCU usage
NFSv4.1: work around -Wmaybe-uninitialized warning
NFS: Trim extra slash in v4 nfs_path
Pull xfs fix from Dave Chinner:
"This is a fix for an unmount hang (regression) when the filesystem is
shutdown. It was supposed to go to you for -rc3, but I accidentally
tagged the commit prior to it in that pullreq.
Summary:
- fix for aborting deferred transactions on filesystem shutdown"
* tag 'xfs-fixes-for-linus-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: defer should abort intent items if the trans roll fails
While testing OBJFREELIST_SLAB integration with pagealloc, we found a
bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
CFLGS_OBJFREELIST_SLAB. When it happened, critical allocations needed
for loading drivers or creating new caches will fail.
The original kmem_cache is created early making OFF_SLAB not possible.
When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
is enabled it will try to enable it first under certain conditions.
Given kmem_cache(sys) reuses the original flag, you can have both flags
at the same time resulting in allocation failures and odd behaviors.
This fix discards allocator specific flags from memcg before calling
create_cache.
The bug exists since 4.6-rc1 and affects testing debug pagealloc
configurations.
Fixes: b03a017beb ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB")
Link: http://lkml.kernel.org/r/1478553075-120242-1-git-send-email-thgarnie@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Tested-by: Thomas Garnier <thgarnie@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It could be not possible to freeze coredumping task when it waits for
'core_state->startup' completion, because threads are frozen in
get_signal() before they got a chance to complete 'core_state->startup'.
Inability to freeze a task during suspend will cause suspend to fail.
Also CRIU uses cgroup freezer during dump operation. So with an
unfreezable task the CRIU dump will fail because it waits for a
transition from 'FREEZING' to 'FROZEN' state which will never happen.
Use freezer_do_not_count() to tell freezer to ignore coredumping task
while it waits for core_state->startup completion.
Link: http://lkml.kernel.org/r/1475225434-3753-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Starting from 4.9-rc1 kernel, I started noticing some test failures of
sendfile(2) and splice(2) (sendfile0N and splice01 from LTP) when
testing on sub-page block size filesystems (tested both XFS and ext4),
these syscalls start to return EIO in the tests. e.g.
sendfile02 1 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 26, got: -1
sendfile02 2 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 24, got: -1
sendfile02 3 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 22, got: -1
sendfile02 4 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 20, got: -1
This is because that in sub-page block size cases, we don't need the
whole page to be uptodate, only the part we care about is uptodate is OK
(if fs has ->is_partially_uptodate defined).
But page_cache_pipe_buf_confirm() doesn't have the ability to check the
partially-uptodate case, it needs the whole page to be uptodate. So it
returns EIO in this case.
This is a regression introduced by commit 82c156f853 ("switch
generic_file_splice_read() to use of ->read_iter()"). Prior to the
change, generic_file_splice_read() doesn't allow partially-uptodate page
either, so it worked fine.
Fix it by skipping the partially-uptodate check if we're working on a
pipe in do_generic_file_read(), so we read the whole page from disk as
long as the page is not uptodate.
I think the other way to fix it is to add the ability to check & allow
partially-uptodate page to page_cache_pipe_buf_confirm(), but that is
much harder to do and seems gain little.
Link: http://lkml.kernel.org/r/1477986187-12717-1-git-send-email-guaneryu@gmail.com
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Error paths in hugetlb_cow() and hugetlb_no_page() may free a newly
allocated huge page.
If a reservation was associated with the huge page, alloc_huge_page()
consumed the reservation while allocating. When the newly allocated
page is freed in free_huge_page(), it will increment the global
reservation count. However, the reservation entry in the reserve map
will remain.
This is not an issue for shared mappings as the entry in the reserve map
indicates a reservation exists. But, an entry in a private mapping
reserve map indicates the reservation was consumed and no longer exists.
This results in an inconsistency between the reserve map and the global
reservation count. This 'leaks' a reserved huge page.
Create a new routine restore_reserve_on_error() to restore the reserve
entry in these specific error paths. This routine makes use of a new
function vma_add_reservation() which will add a reserve entry for a
specific address/page.
In general, these error paths were rarely (if ever) taken on most
architectures. However, powerpc contained arch specific code that that
resulted in an extra fault and execution of these error paths on all
private mappings.
Fixes: 67961f9db8 ("mm/hugetlb: fix huge page reserve accounting for private mappings)
Link: http://lkml.kernel.org/r/1476933077-23091-2-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill A . Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit 05fd007e46 ("console: don't prefer first
registered if DT specifies stdout-path").
The reverted commit changes existing behavior on which many ARM boards
rely. Many ARM small-board-computers, like e.g. the Raspberry Pi have
both a video output and a serial console. Depending on whether the user
is using the device as a more regular computer; or as a headless device
we need to have the console on either one or the other.
Many users rely on the kernel behavior of the console being present on
both outputs, before the reverted commit the console setup with no
console= kernel arguments on an ARM board which sets stdout-path in dt
would look like this:
[root@localhost ~]# cat /proc/consoles
ttyS0 -W- (EC p a) 4:64
tty0 -WU (E p ) 4:1
Where as after the reverted commit, it looks like this:
[root@localhost ~]# cat /proc/consoles
ttyS0 -W- (EC p a) 4:64
This commit reverts commit 05fd007e46 ("console: don't prefer first
registered if DT specifies stdout-path") restoring the original
behavior.
Fixes: 05fd007e46 ("console: don't prefer first registered if DT specifies stdout-path")
Link: http://lkml.kernel.org/r/20161104121135.4780-2-hdegoede@redhat.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When memory_failure() runs on a thp tail page after pmd is split, we
trigger the following VM_BUG_ON_PAGE():
page:ffffd7cd819b0040 count:0 mapcount:0 mapping: (null) index:0x1
flags: 0x1fffc000400000(hwpoison)
page dumped because: VM_BUG_ON_PAGE(!page_count(p))
------------[ cut here ]------------
kernel BUG at /src/linux-dev/mm/memory-failure.c:1132!
memory_failure() passed refcount and page lock from tail page to head
page, which is not needed because we can pass any subpage to
split_huge_page().
Fixes: 61f5d698cc ("mm: re-enable THP")
Link: http://lkml.kernel.org/r/1477961577-7183-1-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org> [4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CMA allocation request size is represented by size_t that gets truncated
when same is passed as int to bitmap_find_next_zero_area_off.
We observe that during fuzz testing when cma allocation request is too
high, bitmap_find_next_zero_area_off still returns success due to the
truncation. This leads to kernel crash, as subsequent code assumes that
requested memory is available.
Fail cma allocation in case the request breaches the corresponding cma
region size.
Link: http://lkml.kernel.org/r/1478189211-3467-1-git-send-email-shashim@codeaurora.org
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If shmem_alloc_page() does not set PageLocked and PageSwapBacked, then
shmem_replace_page() needs to do so for itself. Without this, it puts
newpage on the wrong lru, re-unlocks the unlocked newpage, and system
descends into "Bad page" reports and freeze; or if CONFIG_DEBUG_VM=y, it
hits an earlier VM_BUG_ON_PAGE(!PageLocked), depending on config.
But shmem_replace_page() is not a common path: it's only called when
swapin (or swapoff) finds the page was already read into an unsuitable
zone: usually all zones are suitable, but gem objects for a few drm
devices (gma500, omapdrm, crestline, broadwater) require zone DMA32 if
there's more than 4GB of ram.
Fixes: 800d8c63b2 ("shmem: add huge pages support")
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1611062003510.11253@eggly.anvils
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org> [4.8.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christian Borntraeger reports:
With commit 8ea1d2a198 ("mm, frontswap: convert frontswap_enabled to
static key") kmemleak complains about a memory leak in swapon
unreferenced object 0x3e09ba56000 (size 32112640):
comm "swapon", pid 7852, jiffies 4294968787 (age 1490.770s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
__vmalloc_node_range+0x194/0x2d8
vzalloc+0x58/0x68
SyS_swapon+0xd60/0x12f8
system_call+0xd6/0x270
Turns out kmemleak is right. We now allocate the frontswap map
depending on the kernel config (and no longer on the enablement)
swapfile.c:
[...]
if (IS_ENABLED(CONFIG_FRONTSWAP))
frontswap_map = vzalloc(BITS_TO_LONGS(maxpages) * sizeof(long));
but later on this is passed along
--> enable_swap_info(p, prio, swap_map, cluster_info, frontswap_map);
and ignored if frontswap is disabled
--> frontswap_init(p->type, frontswap_map);
static inline void frontswap_init(unsigned type, unsigned long *map)
{
if (frontswap_enabled())
__frontswap_init(type, map);
}
Thing is, that frontswap map is never freed.
The leakage is relatively not that bad, because swapon is an infrequent
and privileged operation. However, if the first frontswap backend is
registered after a swap type has been already enabled, it will WARN_ON
in frontswap_register_ops() and frontswap will not be available for the
swap type.
Fix this by making sure the map is assigned by frontswap_init() as long
as CONFIG_FRONTSWAP is enabled.
Fixes: 8ea1d2a198 ("mm, frontswap: convert frontswap_enabled to static key")
Link: http://lkml.kernel.org/r/20161026134220.2566-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
i_size check is a leftover from the horrors that used to play with
the page cache in that function. With the switch to ->read_iter(),
it's neither needed nor correct - for gfs2 it ends up being buggy,
since i_size is not guaranteed to be correct until later (inside
->read_iter()).
Spotted-by: Abhi Das <adas@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
osdc->last_linger_id is a counter for lreq->linger_id, which is used
for watch cookies. Starting with a large integer should ease the task
of telling apart kernel and userspace clients.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
If your data pool was pool 0, ceph_file_layout_from_legacy()
transform that to -1 unconditionally, which broke upgrades.
We only want do that for a fully zeroed ceph_file_layout,
so that it still maps to a file_layout_t. If any fields
are set, though, we trust the fl_pgpool to be a valid pool.
Fixes: 7627151ea3 ("libceph: define new ceph_file_layout structure")
Link: http://tracker.ceph.com/issues/17825
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Splice read/write implementation changed recently. When using
generic_file_splice_read(), iov_iter with type == ITER_PIPE is
passed to filesystem's read_iter callback. But ceph_sync_read()
can't serve ITER_PIPE iov_iter correctly (ITER_PIPE iov_iter
expects pages from page cache).
Fixing ceph_sync_read() requires a big patch. So use default
splice read callback for now.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
When a LOCALINV WR is flushed, the frmr is marked STALE, then
frwr_op_unmap_sync DMA-unmaps the frmr's SGL. These STALE frmrs
are then recovered when frwr_op_map hunts for an INVALID frmr to
use.
All other cases that need frmr recovery leave that SGL DMA-mapped.
The FRMR recovery path unconditionally DMA-unmaps the frmr's SGL.
To avoid DMA unmapping the SGL twice for flushed LOCAL_INV WRs,
alter the recovery logic (rather than the hot frwr_op_unmap_sync
path) to distinguish among these cases. This solution also takes
care of the case where multiple LOCAL_INV WRs are issued for the
same rpcrdma_req, some complete successfully, but some are flushed.
Reported-by: Vasco Steinmetz <linux@kyberraum.net>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Vasco Steinmetz <linux@kyberraum.net>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Pull sound fixes from Takashi Iwai:
"This became a largish pull-request, as we've got a bunch of pending
ASoC fixes at this time. One noticeable change is the removal of error
directive in uapi/sound/asoc.h. We found that the API has been already
used on Chromebooks, so we need to support it even now.
A slight big LOC is found in Qualcomm lpass driver, but the rest are
all small and easy fixes for ASoC drivers (sti, sun4i, Realtek codecs,
Intel, tas571x, etc) in addition to the patches to harden the ALSA
core proc file accesses"
* tag 'sound-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (26 commits)
ALSA: info: Return error for invalid read/write
ALSA: info: Limit the proc text input size
ASoC: samsung: spdif: Fix DMA filter initialization
ASoC: sun4i-codec: Enable bus clock after getting GPIO
ASoC: lpass-cpu: add module licence and description
ASoC: lpass-platform: Fix broken pcm data usage
ASoC: sun4i-codec: return error code instead of NULL when create_card fails
ASoC: hdmi-codec: Fix hdmi_of_xlate_dai_name when #sound-dai-cells = <0>
ASoC: samsung: get access to DMA engine early to defer probe properly
ASoC: da7219: Connect output enable register to DAIOUT
ASoC: Intel: Skylake: Fix to turn off hdmi power on probe failure
ASoC: sti-sas: enable fast io for regmap
ASoC: sti: fix channel status update after playback start
ASoC: PXA: Brownstone needs I2C
ASoC: Intel: Skylake: Always acquire runtime pm ref on unload
ASoC: Intel: Atom: add terminate entry for dmi_system_id tables
ASoC: rt298: fix jack type detect error
ASoC: rt5663: fix a debug statement
ASoC: cs4270: fix DAPM stream name mismatch
ASoC: Intel: haswell depends on sst-firmware
...
Pull orangefs fix from Mike Marshall:
"We recently refactored the Orangefs debugfs code. The refactor seemed
to trigger dan.carpenter@oracle.com's static tester to find a possible
double-free in the code.
While designing the fix we saw a condition under which the buffer
being freed could also be overflowed.
We also realized how to rebuild the related debugfs file's "contents"
(a string) without deleting and re-creating the file.
This fix should eliminate the possible double-free, the potential
overflow and improve code readability"
* tag 'for-linus-4.9-rc4-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
orangefs: clean up debugfs
Pull s390 fixes from Martin Schwidefsky:
"Two bug fixes
- a memory alignment fix in the s390 only hypfs code
- a fix for the generic percpu code that caused ftrace to break on
s390. This is not relevant for x86 but for all architectures that
use the generic percpu code"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
percpu: use notrace variant of preempt_disable/preempt_enable
s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignment
Pull IOMMU fixes from Joerg Roedel:
- Four patches from Robin Murphy fix several issues with the recently
merged generic DT-bindings support for arm-smmu drivers
- A fix for a dead-lock issue in the VT-d driver, which shows up on
iommu hotplug
* tag 'iommu-fixes-v4.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Fix dead-locks in disable_dmar_iommu() path
iommu/arm-smmu: Fix out-of-bounds dereference
iommu/arm-smmu: Check that iommu_fwspecs are ours
iommu/arm-smmu: Don't inadvertently reject multiple SMMUv3s
iommu/arm-smmu: Work around ARM DMA configuration
It turns out that the disable_dmar_iommu() code-path tried
to get the device_domain_lock recursivly, which will
dead-lock when this code runs on dmar removal. Fix both
code-paths that could lead to the dead-lock.
Fixes: 55d940430a ('iommu/vt-d: Get rid of domain->iommu_lock')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When we iterate a master's config entries, what we generally care
about is the entry's stream map index, rather than the entry index
itself, so it's nice to have the iterator automatically assign the
former from the latter. Unfortunately, booting with KASAN reveals
the oversight that using a simple comma operator results in the
entry index being dereferenced before being checked for validity,
so we always access one element past the end of the fwspec array.
Flip things around so that the check always happens before the index
may be dereferenced.
Fixes: adfec2e709 ("iommu/arm-smmu: Convert to iommu_fwspec")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We seem to have forgotten to check that iommu_fwspecs actually belong to
us before we go ahead and dereference their private data. Oops.
Fixes: 021bb8420d ("iommu/arm-smmu: Wire up generic configuration support")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We now delay installing our per-bus iommu_ops until we know an SMMU has
successfully probed, as they don't serve much purpose beforehand, and
doing so also avoids fights between multiple IOMMU drivers in a single
kernel. However, the upshot of passing the return value of bus_set_iommu()
back from our probe function is that if there happens to be more than
one SMMUv3 device in a system, the second and subsequent probes will
wind up returning -EBUSY to the driver core and getting torn down again.
Avoid re-setting ops if ours are already installed, so that any genuine
failures stand out.
Fixes: 08d4ca2a67 ("iommu/arm-smmu: Support non-PCI devices with SMMUv3")
CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The 32-bit ARM DMA configuration code predates the IOMMU core's default
domain functionality, and instead relies on allocating its own domains
and attaching any devices using the generic IOMMU binding to them.
Unfortunately, it does this relatively early on in the creation of the
device, before we've seen our add_device callback, which leads us to
attempt to operate on a half-configured master.
To avoid a crash, check for this situation on attach, but refuse to
play, as there's nothing we can do. This at least allows VFIO to keep
working for people who update their 32-bit DTs to the generic binding,
albeit with a few (innocuous) warnings from the DMA layer on boot.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently the ALSA proc handler allows read or write even if the proc
file were write-only or read-only. It's mostly harmless, does thing
but allocating memory and ignores the input/output. But it doesn't
tell user about the invalid use, and it's confusing and inconsistent
in comparison with other proc files.
This patch adds some sanity checks and let the proc handler returning
an -EIO error when the invalid read/write is performed.
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The ALSA proc handler allows currently the write in the unlimited size
until kmalloc() fails. But basically the write is supposed to be only
for small inputs, mostly for one line inputs, and we don't have to
handle too large sizes at all. Since the kmalloc error results in the
kernel warning, it's better to limit the size beforehand.
This patch adds the limit of 16kB, which must be large enough for the
currently existing code.
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit 345ddcc882 ("ftrace: Have set_ftrace_pid use the bitmap like
events do") added a couple of this_cpu_read calls to the ftrace code.
On x86 this is not a problem, since it has single instructions to read
percpu data. Other architectures which use the generic variant now
have additional preempt_disable and preempt_enable calls in the core
ftrace code. This may lead to recursive calls and in result to a dead
machine, e.g. if preemption and debugging options are enabled.
To fix this use the notrace variant of preempt_disable and
preempt_enable within the generic percpu code.
Reported-and-bisected-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Fixes: 345ddcc882 ("ftrace: Have set_ftrace_pid use the bitmap like events do")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix the following warn:
fs/nfs/nfs4session.c: In function ‘nfs4_slot_seqid_in_use’:
fs/nfs/nfs4session.c:203:54: warning: ‘cur_seq’ may be used uninitialized in this function [-Wmaybe-uninitialized]
if (nfs4_slot_get_seqid(tbl, slotid, &cur_seq) == 0 &&
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
cur_seq == seq_nr && test_bit(slotid, tbl->used_slots))
~~~~~~~~~~~~~~~~~
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
We used to check for a valid layout type id before verifying pNFS flags
as an indicator for if we are using pNFS. This changed in 3132e49ece
with the introduction of multiple layout types, since now we are passing
an array of ids instead of just one. Since then, users have been seeing
a KERN_ERR printk show up whenever mounting NFS v4 without pNFS. This
patch restores the original behavior of exiting set_pnfs_layoutdriver()
early if we aren't using pNFS.
Fixes 3132e49ece ("pnfs: track multiple layout types in fsinfo
structure")
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
cl_rpcclient starts as ERR_PTR(-EINVAL), and connections like that
are floating freely through the system. Most places check whether
pointer is valid before dereferencing it, but newly added code
in nfs_match_client does not.
Which causes crashes when more than one NFS mount point is present.
Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
We need to hold the rcu_read_lock() when calling rcu_dereference(),
otherwise we can't guarantee that the object being dereferenced still
exists.
Fixes: 39e5d2df ("SUNRPC search xprt switch for sockaddr")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Pull arm64 fix from Will Deacon:
"It's been pretty quiet on the fixes side of things for us, but Artem
reported a build failure introduced during the merge window that
appears with older GCCs that do not support asm goto. The fix is
bigger than I'd like, but it's a mechnical move of some constants to
break an include dependency between atomic.h and jump_label.h when
!HAVE_JUMP_LABEL.
Summary:
- Fix build failure on compilers without asm goto"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix circular include of asm/lse.h through linux/jump_label.h
Pull openrisc fix from Guenter Roeck:
"Fix openrisc crash caused by ro_init changes"
* tag 'openrisc-for-linus-v4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
openrisc: Define __ro_after_init to avoid crash
Pull hwmon fix from Guenter Roeck:
"Fix resource leak on devm_kcalloc failure"
* tag 'hwmon-for-linus-v4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (core) fix resource leak on devm_kcalloc failure
Pull HID fixes from Jiri Kosina:
- modprobe-after-rmmod load failure bugfix for intel-ish, from Even Xu
- IRQ probing bugfix for intel-ish, from Srinivas Pandruvada
- attribute parsing fix in hid-sensor, from Ooi, Joyce
- other small misc fixes / quirky device additions
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: sensor: fix attributes in HID sensor interface
HID: intel-ish-hid: request_irq failure
HID: intel-ish-hid: Fix driver reinit failure
HID: intel-ish-hid: Move DMA disable code to new function
HID: intel-ish-hid: consolidate ish wake up operation
HID: usbhid: add ATEN CS962 to list of quirky devices
HID: intel-ish-hid: Fix !CONFIG_PM build warning
HID: sensor-hub: Fix packing of result buffer for feature report
We recently refactored the Orangefs debugfs code.
The refactor seemed to trigger dan.carpenter@oracle.com's
static tester to find a possible double-free in the code.
While designing the fix we saw a condition under which the
buffer being freed could also be overflowed.
We also realized how to rebuild the related debugfs file's
"contents" (a string) without deleting and re-creating the file.
This fix should eliminate the possible double-free, the
potential overflow and improve code readability.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Starting from 4.9-rc1 kernel, I started noticing some test failures
of sendfile(2) and splice(2) (sendfile0N and splice01 from LTP) when
testing on sub-page block size filesystems (tested both XFS and
ext4), these syscalls start to return EIO in the tests. e.g.
sendfile02 1 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 26, got: -1
sendfile02 2 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 24, got: -1
sendfile02 3 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 22, got: -1
sendfile02 4 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 20, got: -1
This is because that in sub-page block size cases, we don't need the
whole page to be uptodate, only the part we care about is uptodate
is OK (if fs has ->is_partially_uptodate defined). But
page_cache_pipe_buf_confirm() doesn't have the ability to check the
partially-uptodate case, it needs the whole page to be uptodate. So
it returns EIO in this case.
This is a regression introduced by commit 82c156f853 ("switch
generic_file_splice_read() to use of ->read_iter()"). Prior to the
change, generic_file_splice_read() doesn't allow partially-uptodate
page either, so it worked fine.
Fix it by skipping the partially-uptodate check if we're working on
a pipe in do_generic_file_read(), so we read the whole page from
disk as long as the page is not uptodate.
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull i2c fix from Wolfram Sang:
"A bugfix for the I2C core fixing a (rare) race condition"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: core: fix NULL pointer dereference under race condition