Commit Graph

766502 Commits

Author SHA1 Message Date
Bob Peterson
dffe12a828 gfs2: Fix gfs2_testbit to use clone bitmaps
Function gfs2_testbit is called in three places. Two of those places,
gfs2_alloc_extent and gfs2_unaligned_extlen, should be using the clone
bitmaps, not the "real" bitmaps. Function gfs2_unaligned_extlen is used
by the block reservations scheme to determine the length of an extent of
free blocks. Before this patch, it wasn't using the clone bitmap, which
means recently-freed blocks were treated as free blocks for the purposes
of an allocation.

This patch adds a new parameter to gfs2_testbit to indicate whether or
not the clone bitmaps should be used (if available).

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-08-07 10:07:00 -05:00
Andreas Gruenbacher
21e2156f3c gfs2: Get rid of gfs2_ea_strlen
Function gfs2_ea_strlen is only called from ea_list_i, so inline it
there.  Remove the duplicate switch statement and the creative use of
memcpy to set a null byte.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-08-03 13:20:02 +01:00
Bob Peterson
3f30f929bb gfs2: cleanup: call gfs2_rgrp_ondisk2lvb from gfs2_rgrp_out
Before this patch gfs2_rgrp_ondisk2lvb was called after every call
to gfs2_rgrp_out. This patch just calls it directly from within
gfs2_rgrp_out, and moves the function to be before it so we don't
need a function prototype.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-26 14:49:43 -05:00
Andreas Gruenbacher
776125785a gfs2: Special-case rindex for gfs2_grow
To speed up the common case of appending to a file,
gfs2_write_alloc_required presumes that writing beyond the end of a file
will always require additional blocks to be allocated.  This assumption
is incorrect for preallocates files, but there are no negative
consequences as long as *some* space is still left on the filesystem.

One special file that always has some space preallocated beyond the end
of the file is the rindex: when growing a filesystem, gfs2_grow adds one
or more new resource groups and appends records describing those
resource groups to the rindex; the preallocated space ensures that this
is always possible.

However, when a filesystem is completely full, gfs2_write_alloc_required
will indicate that an additional allocation is required, and appending
the next record to the rindex will fail even though space for that
record has already been preallocated.  To fix that, skip the incorrect
optimization in gfs2_write_alloc_required, but for the rindex only.
Other writes to preallocated space beyond the end of the file are still
allowed to fail on completely full filesystems.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-25 22:56:14 +02:00
Bob Peterson
f6753df35c GFS2: rgrp free blocks used incorrectly
Before this patch, several functions in rgrp.c checked the value of
rgd->rd_free_clone. That does not take into account blocks that were
reserved by a multi-block reservation. This causes a problem when
space gets tight in the file system. For example, when function
gfs2_inplace_reserve checks to see if a rgrp has enough blocks to
satisfy the request, it can accept a rgrp that it should reject
because, although there are enough blocks to satisfy the request
_now_, those blocks may be reserved for another running process.

A second problem with this occurs when we've reserved the remaining
blocks in an rgrp: function rg_mblk_search() can reject an rgrp
improperly because it calculates:

   u32 free_blocks = rgd->rd_free_clone - rgd->rd_reserved;

But rd_reserved includes blocks that the current process just
reserved in its own call to inplace_reserve. For example, it can
reserve the last 128 blocks of an rgrp, then reject that same rgrp
because the above calculates out to free_blocks = 0;

Consequences include, but are not limited to, (1) leaving holes,
and thus increasing file system fragmentation, and (2) reporting
file system is full long before it actually is.

This patch introduces a new function, rgd_free, which returns the
number of clone-free blocks (blocks that are truly free as opposed
to blocks that are still being used because an unlinked file is
still open) minus the number of blocks reserved by processes, but
not counting the blocks we ourselves reserved (because obviously
we need to allocate them).

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-25 00:09:09 +02:00
Colin Ian King
d1b0cb933c gfs2: remove redundant variable 'moved'
Variable 'moved' s being assigned but is never used hence it is
redundant and can be removed.  This has been the case ever since commit
c752666c.

Cleans up clang warning:
warning: variable 'moved' set but not used [-Wunused-but-set-variable]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-25 00:08:59 +02:00
Andreas Gruenbacher
f95cbb44ab gfs2: use iomap_readpage for blocksize == PAGE_SIZE
We only use iomap_readpage for pages that don't have buffer heads
attached yet: iomap_readpage would otherwise read pages from disk that
are marked buffer_uptodate() but not PageUptodate().  Those pages may
actually contain data more recent than what's on disk.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-25 00:08:49 +02:00
Andreas Gruenbacher
1d45bb7f9d gfs2: Use iomap for stuffed direct I/O reads
Remove the fallback code from direct to buffered I/O for stuffed reads.

For stuffed writes, we must keep the fallback code: the deferred glock
we are holding under direct I/O doesn't allow to write to the inode or
change the file size.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-25 00:08:40 +02:00
Andreas Gruenbacher
0ed91eca11 Merge branch 'iomap-4.19-merge' into linux-gfs2/for-next
Merge xfs branch 'iomap-4.19-merge' into linux-gfs2/for-next.  This
brings in readpage and direct I/O support for inline data.

The IOMAP_F_BUFFER_HEAD flag introduced in commit "iomap: add initial
support for writes without buffer heads" needs to be set for gfs2 as
well, so do that in the merge.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-25 00:08:20 +02:00
Andreas Gruenbacher
c25892827c gfs2: fallocate_chunk: Always initialize struct iomap
In fallocate_chunk, always initialize the iomap before calling
gfs2_iomap_get_alloc: future changes could otherwise cause things like
iomap.flags to leak across calls.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-25 00:06:48 +02:00
Bob Peterson
4a7727725d GFS2: Fix recovery issues for spectators
This patch fixes a couple problems dealing with spectators who
remain with gfs2 mounts after the last non-spectator node fails.

Before this patch, spectator mounts would try to acquire the dlm's
mounted lock EX as part of its normal recovery sequence.
The mounted lock is only used to determine whether the node is
the first mounter, the first node to mount the file system, for
the purposes of file system recovery and journal replay.

It's not necessary for spectators: they should never do journal
recovery. If they acquire the lock it will prevent another "real"
first-mounter from acquiring the lock in EX mode, which means it
also cannot do journal recovery because it doesn't think it's the
first node to mount the file system.

This patch checks if the mounter is a spectator, and if so, avoids
grabbing the mounted lock. This allows a secondary mounter who is
really the first non-spectator mounter, to do journal recovery:
since the spectator doesn't acquire the lock, it can grab it in
EX mode, and therefore consider itself to be the first mounter
both as a "real" first mount, and as a first-real-after-spectator.

Note that the control lock still needs to be taken in PR mode
in order to fetch the lvb value so it has the current status of
all journal's recovery. This is used as it is today by a first
mounter to replay the journals. For spectators, it's merely
used to fetch the status bits. All recovery is bypassed and the
node waits until recovery is completed by a non-spectator node.

I also improved the cryptic message given by control_mount when
a spectator is waiting for a non-spectator to perform recovery.

It also fixes a problem in gfs2_recover_set whereby spectators
were never queueing recovery work for their own journal.
They cannot do recovery themselves, but they still need to queue
the work so they can check the recovery bits and clear the
DFL_BLOCK_LOCKS bit once the recovery happens on another node.

When the work queue runs on a spectator, it bypasses most of the
work so it won't print a bunch of annoying messages. All it will
print is a bunch of messages that look like this until recovery
completes on the non-spectator node:

GFS2: fsid=mycluster:scratch.s: recover generation 3 jid 0
GFS2: fsid=mycluster:scratch.s: recover jid 0 result busy

These continue every 1.5 seconds until the recovery is done by
the non-spectator, at which time it says:

GFS2: fsid=mycluster:scratch.s: recover generation 4 done

Then it proceeds with its mount.

If the file system is mounted in spectator node and the last
remaining non-spectator is fenced, any IO to the file system is
blocked by dlm and the spectator waits until recovery is
performed by a non-spectator.

If a spectator tries to mount the file system before any
non-spectators, it blocks and repeatedly gives this kernel
message:

GFS2: fsid=mycluster:scratch: Recovery is required. Waiting for a non-spectator to mount.
GFS2: fsid=mycluster:scratch: Recovery is required. Waiting for a non-spectator to mount.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-25 00:06:24 +02:00
Andreas Gruenbacher
a3479c7fc0 Merge branch 'iomap-write' into linux-gfs2/for-next
Pull in the gfs2 iomap-write changes: Tweak the existing code to
properly support iomap write and eliminate an unnecessary special case
in gfs2_block_map.  Implement iomap write support for buffered and
direct I/O.  Simplify some of the existing code and eliminate code that
is no longer used:

  gfs2: Remove gfs2_write_{begin,end}
  gfs2: iomap direct I/O support
  gfs2: gfs2_extent_length cleanup
  gfs2: iomap buffered write support
  gfs2: Further iomap cleanups

This is based on the following changes on the xfs 'iomap-4.19-merge'
branch:

  iomap: add private pointer to struct iomap
  iomap: add a page_done callback
  iomap: generic inline data handling
  iomap: complete partial direct I/O writes synchronously
  iomap: mark newly allocated buffer heads as new
  fs: factor out a __generic_write_end helper

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-24 20:02:40 +02:00
Souptick Joarder
109dbb1e6f fs: gfs2: Adding new return type vm_fault_t
Use new return type vm_fault_t for gfs2_page_mkwrite
handler.

see commit 1c8f422059 ("mm: change return type to
vm_fault_t") for reference.

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-24 20:02:11 +02:00
Chengguang Xu
910f3d58d0 gfs2: using posix_acl_xattr_size instead of posix_acl_to_xattr
It seems better to get size by calling posix_acl_xattr_size() instead of
calling posix_acl_to_xattr() with NULL buffer argument.

posix_acl_xattr_size() never returns 0, so remove the unnecessary check.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-24 20:02:11 +02:00
Bob Peterson
e79e0e1428 gfs2: Don't reject a supposedly full bitmap if we have blocks reserved
Before this patch, you could get into situations like this:

1. Process 1 searches for X free blocks, finds them, makes a reservation
2. Process 2 searches for free blocks in the same rgrp, but now the
   bitmap is full because process 1's reservation is skipped over.
   So it marks the bitmap as GBF_FULL.
3. Process 1 tries to allocate blocks from its own reservation, but
   since the GBF_FULL bit is set, it skips over the rgrp and searches
   elsewhere, thus not using its own reservation.

This patch adds an additional check to allow processes to use their
own reservations.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-24 20:02:11 +02:00
Andreas Gruenbacher
b7eba890a2 gfs2: Eliminate redundant ip->i_rgd
GFS2 remembers the last rgrp used for allocations in ip->i_rgd.
However, block allocations are made by way of a reservations structure,
ip->i_res, which keeps the last rgrp in ip->i_res.rs_rgd, and ip->i_res
is kept in sync with ip->i_res.rs_rgd, so it's redundant.  Get rid of
ip->i_rgd and just use ip->i_res.rs_rgd in its place.

Based on patches by Robert Peterson.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-07-05 17:47:16 +02:00
Andreas Gruenbacher
03f8c41c73 gfs2: Stop messing with ip->i_rgd in the rlist code
In the resource group list code, keep the last resource group added in
the last position in the array.  Check against that instead of messing
with ip->i_rgd.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-07-04 21:38:42 +01:00
Andreas Gruenbacher
806a1477b1 iomap: add inline data support to iomap_readpage_actor
Just copy the inline data into the page using the existing helper.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-03 09:07:47 -07:00
Andreas Gruenbacher
ec181f6782 iomap: support direct I/O to inline data
Add support for reading from and writing to inline data to iomap_dio_rw.
This saves filesystems from having to implement fallback code for this
case.

The inline data is actually cached in the inode, so the I/O is only
direct in the sense that it doesn't go through the page cache.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-03 09:07:47 -07:00
Christoph Hellwig
09230435df iomap: refactor iomap_dio_actor
Split the function up into two helpers for the bio based I/O and hole
case, and a small helper to call the two.  This separates the code a
little better in preparation for supporting I/O to inline data.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-03 09:07:46 -07:00
Andreas Gruenbacher
025d0e7f73 gfs2: Remove gfs2_write_{begin,end}
Now that generic_file_write_iter is no longer used, there are no
remaining users of these address space operations.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-02 16:27:38 +01:00
Andreas Gruenbacher
967bcc91b0 gfs2: iomap direct I/O support
The page unmapping previously done in gfs2_direct_IO is now done
generically in iomap_dio_rw.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-02 16:27:32 +01:00
Andreas Gruenbacher
bcfe94139a gfs2: gfs2_extent_length cleanup
Now that gfs2_extent_length is no longer used for determining the size
of a hole and always with an upper size limit, the function can be
simplified.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-02 16:27:24 +01:00
Andreas Gruenbacher
64bc06bb32 gfs2: iomap buffered write support
With the traditional page-based writes, blocks are allocated separately
for each page written to.  With iomap writes, we can allocate a lot more
blocks at once, with a fraction of the allocation overhead for each
page.

Split calculating the number of blocks that can be allocated at a given
position (gfs2_alloc_size) off from gfs2_iomap_alloc: that size
determines the number of blocks to allocate and reserve in the journal.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-02 16:27:17 +01:00
Andreas Gruenbacher
d505a96a3b gfs2: Further iomap cleanups
In gfs2_iomap_alloc, set the type of newly allocated extents to
IOMAP_MAPPED so that iomap_to_bh will set the bh states correctly:
otherwise, the bhs would not be marked as mapped, confusing
__mpage_writepage.  This means that we need to check for the IOMAP_F_NEW
flag in fallocate_chunk now.

Further clean up gfs2_iomap_get and implement gfs2_stuffed_iomap here
directly.  For reads beyond the end of the file, return holes instead of
failing with -ENOENT so that we can get rid of that special case in
gfs2_block_map.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-02 16:26:01 +01:00
Arnd Bergmann
ee9c7f9ae3 gfs2: call ktime_get_coarse_real_ts64() directly
current_kernel_time64() is now just a deprecated wrapper around
ktime_get_coarse_real_ts64(), so let's just call that directly.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-21 07:40:23 -05:00
Andreas Gruenbacher
00251a16d7 gfs2: Minor clarification to __gfs2_punch_hole
Rename end_off to end_len to make the code less confusing.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-21 07:40:00 -05:00
Andreas Gruenbacher
9e1a9ecd13 gfs2: Don't withdraw under a spin lock
In two places, the gfs2_io_error_bh macro is called while holding the
sd_ail_lock spin lock.  This isn't allowed because gfs2_io_error_bh
withdraws the filesystem, which can sleep because it issues a uevent.
To fix that, add a gfs2_io_error_bh_wd macro that does withdraw the
filesystem and change gfs2_io_error_bh to not withdraw the filesystem.
In those places where the new gfs2_io_error_bh is used, withdraw the
filesystem after releasing sd_ail_lock.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
2018-06-21 07:39:44 -05:00
Bob Peterson
f85c10e24a gfs2: eliminate rs_inum and reduce the size of gfs2 inodes
Before this patch, block reservations kept track of the inode
number. At one point, that was a valid thing to do. However, since
we made the reservation a part of the inode (rather than a pointer
to a separate allocated object) the reservation can determine the
inode number by using container_of. This saves us a little memory
in our inode.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-06-21 07:39:31 -05:00
Christoph Hellwig
c03cea4214 iomap: add initial support for writes without buffer heads
For now just limited to blocksize == PAGE_SIZE, where we can simply read
in the full page in write begin, and just set the whole page dirty after
copying data into it.  This code is enabled by default and XFS will now
be feed pages without buffer heads in ->writepage and ->writepages.

If a file system sets the IOMAP_F_BUFFER_HEAD flag on the iomap the old
path will still be used, this both helps the transition in XFS and
prepares for the gfs2 migration to the iomap infrastructure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-20 09:32:41 -07:00
Andreas Gruenbacher
e184fde6f3 iomap: add private pointer to struct iomap
Add a private pointer to struct iomap to allow filesystems to pass data
from iomap_begin to iomap_end.  Will be used by gfs2 for passing on the
on-disk inode buffer head.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-19 15:10:57 -07:00
Christoph Hellwig
72b4daa241 iomap: add an iomap-based readpage and readpages implementation
Simply use iomap_apply to iterate over the file and a submit a bio for
each non-uptodate but mapped region and zero everything else.  Note that
as-is this can not be used for file systems with a blocksize smaller than
the page size, but that support will be added later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-19 15:10:57 -07:00
Christoph Hellwig
63899c6f88 iomap: add a page_done callback
This will be used by gfs2 to attach data to transactions for the journaled
data mode.  But the concept is generic enough that we might be able to
use it for other purposes like encryption/integrity post-processing in the
future.

Based on a patch from Andreas Gruenbacher.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-19 15:10:56 -07:00
Andreas Gruenbacher
19e0c58f65 iomap: generic inline data handling
Add generic inline data handling by adding a pointer to the inline data
region to struct iomap.  When handling a buffered IOMAP_INLINE write,
iomap_write_begin will copy the current inline data from the inline data
region into the page cache, and iomap_write_end will copy the changes in
the page cache back to the inline data region.

This doesn't cover inline data reads and direct I/O yet because so far,
we have no users.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
[hch: small cleanups to better fit in with other iomap work]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-19 15:10:56 -07:00
Andreas Gruenbacher
ebf00be37d iomap: complete partial direct I/O writes synchronously
According to xfstest generic/240, applications seem to expect direct I/O
writes to either complete as a whole or to fail; short direct I/O writes
are apparently not appreciated.  This means that when only part of an
asynchronous direct I/O write succeeds, we can either fail the entire
write, or we can wait for the partial write to complete and retry the
remaining write as buffered I/O.  The old __blockdev_direct_IO helper
has code for waiting for partial writes to complete; the new
iomap_dio_rw iomap helper does not.

The above mentioned fallback mode is needed for gfs2, which doesn't
allow block allocations under direct I/O to avoid taking cluster-wide
exclusive locks.  As a consequence, an asynchronous direct I/O write to
a file range that contains a hole will result in a short write.  In that
case, wait for the short write to complete to allow gfs2 to recover.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-19 15:10:55 -07:00
Andreas Gruenbacher
3d7b6b21f6 iomap: mark newly allocated buffer heads as new
In iomap_to_bh, not only mark buffer heads in IOMAP_UNWRITTEN maps as
new, but also buffer heads in IOMAP_MAPPED maps with the IOMAP_F_NEW
flag set.  This will be used by filesystems like gfs2, which allocate
blocks in iomap->begin.

Minor corrections to the comment for IOMAP_UNWRITTEN maps.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-19 15:10:55 -07:00
Christoph Hellwig
a6d639da63 fs: factor out a __generic_write_end helper
Bits of the buffer.c based write_end implementations that don't know
about buffer_heads and can be reused by other implementations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-19 15:10:55 -07:00
Linus Torvalds
ce397d215c Linux 4.18-rc1 v4.18-rc1 2018-06-17 08:04:49 +09:00
Linus Torvalds
265c5596da Merge tag 'for-linus-20180616' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A collection of fixes that should go into -rc1. This contains:

   - bsg_open vs bsg_unregister race fix (Anatoliy)

   - NVMe pull request from Christoph, with fixes for regressions in
     this window, FC connect/reconnect path code unification, and a
     trace point addition.

   - timeout fix (Christoph)

   - remove a few unused functions (Christoph)

   - blk-mq tag_set reinit fix (Roman)"

* tag 'for-linus-20180616' of git://git.kernel.dk/linux-block:
  bsg: fix race of bsg_open and bsg_unregister
  block: remov blk_queue_invalidate_tags
  nvme-fabrics: fix and refine state checks in __nvmf_check_ready
  nvme-fabrics: handle the admin-only case properly in nvmf_check_ready
  nvme-fabrics: refactor queue ready check
  blk-mq: remove blk_mq_tagset_iter
  nvme: remove nvme_reinit_tagset
  nvme-fc: fix nulling of queue data on reconnect
  nvme-fc: remove reinit_request routine
  blk-mq: don't time out requests again that are in the timeout handler
  nvme-fc: change controllers first connect to use reconnect path
  nvme: don't rely on the changed namespace list log
  nvmet: free smart-log buffer after use
  nvme-rdma: fix error flow during mapping request data
  nvme: add bio remapping tracepoint
  nvme: fix NULL pointer dereference in nvme_init_subsystem
  blk-mq: reinit q->tag_set_list entry only after grace period
2018-06-17 05:37:55 +09:00
Linus Torvalds
5e7b9212a4 Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental
Pull documentation fixes from Mauro Carvalho Chehab:
 "This solves a series of broken links for files under Documentation,
  and improves a script meant to detect such broken links (see
  scripts/documentation-file-ref-check).

  The changes on this series are:

   - can.rst: fix a footnote reference;

   - crypto_engine.rst: Fix two parsing warnings;

   - Fix a lot of broken references to Documentation/*;

   - improve the scripts/documentation-file-ref-check script, in order
     to help detecting/fixing broken references, preventing
     false-positives.

  After this patch series, only 33 broken references to doc files are
  detected by scripts/documentation-file-ref-check"

* tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits)
  fix a series of Documentation/ broken file name references
  Documentation: rstFlatTable.py: fix a broken reference
  ABI: sysfs-devices-system-cpu: remove a broken reference
  devicetree: fix a series of wrong file references
  devicetree: fix name of pinctrl-bindings.txt
  devicetree: fix some bindings file names
  MAINTAINERS: fix location of DT npcm files
  MAINTAINERS: fix location of some display DT bindings
  kernel-parameters.txt: fix pointers to sound parameters
  bindings: nvmem/zii: Fix location of nvmem.txt
  docs: Fix more broken references
  scripts/documentation-file-ref-check: check tools/*/Documentation
  scripts/documentation-file-ref-check: get rid of false-positives
  scripts/documentation-file-ref-check: hint: dash or underline
  scripts/documentation-file-ref-check: add a fix logic for DT
  scripts/documentation-file-ref-check: accept more wildcards at filenames
  scripts/documentation-file-ref-check: fix help message
  media: max2175: fix location of driver's companion documentation
  media: v4l: fix broken video4linux docs locations
  media: dvb: point to the location of the old README.dvb-usb file
  ...
2018-06-17 05:25:18 +09:00
Linus Torvalds
dbb2816fc7 Merge tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara:
 "fsnotify cleanups unifying handling of different watch types.

  This is the shortened fsnotify series from Amir with the last five
  patches pulled out. Amir has modified those patches to not change
  struct inode but obviously it's too late for those to go into this
  merge window"

* tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: add fsnotify_add_inode_mark() wrappers
  fanotify: generalize fanotify_should_send_event()
  fsnotify: generalize send_to_group()
  fsnotify: generalize iteration of marks by object type
  fsnotify: introduce marks iteration helpers
  fsnotify: remove redundant arguments to handle_event()
  fsnotify: use type id to identify connector object type
2018-06-17 05:06:18 +09:00
Linus Torvalds
644f2639ae Merge tag 'fbdev-v4.18' of git://github.com/bzolnier/linux
Pull fbdev updates from Bartlomiej Zolnierkiewicz:
 "There is nothing really major here, few small fixes, some cleanups and
  dead drivers removal:

   - mark omapfb drivers as orphans in MAINTAINERS file (Tomi Valkeinen)

   - add missing module license tags to omap/omapfb driver (Arnd
     Bergmann)

   - add missing GPIOLIB dependendy to omap2/omapfb driver (Arnd
     Bergmann)

   - convert savagefb, aty128fb & radeonfb drivers to use msleep & co.
     (Jia-Ju Bai)

   - allow COMPILE_TEST build for viafb driver (media part was reviewed
     by media subsystem Maintainer)

   - remove unused MERAM support from sh_mobile_lcdcfb and shmob-drm
     drivers (drm parts were acked by shmob-drm driver Maintainer)

   - remove unused auo_k190xfb drivers

   - misc cleanups (Souptick Joarder, Wolfram Sang, Markus Elfring, Andy
     Shevchenko, Colin Ian King)"

* tag 'fbdev-v4.18' of git://github.com/bzolnier/linux: (26 commits)
  fb_omap2: add gpiolib dependency
  video/omap: add module license tags
  MAINTAINERS: make omapfb orphan
  video: fbdev: pxafb: match_string() conversion fixup
  video: fbdev: nvidia: fix spelling mistake: "scaleing" -> "scaling"
  video: fbdev: fix spelling mistake: "frambuffer" -> "framebuffer"
  video: fbdev: pxafb: Convert to use match_string() helper
  video: fbdev: via: allow COMPILE_TEST build
  video: fbdev: remove unused sh_mobile_meram driver
  drm: shmobile: remove unused MERAM support
  video: fbdev: sh_mobile_lcdcfb: remove unused MERAM support
  video: fbdev: remove unused auo_k190xfb drivers
  video: omap: Improve a size determination in omapfb_do_probe()
  video: sm501fb: Improve a size determination in sm501fb_probe()
  video: fbdev-MMP: Improve a size determination in path_init()
  video: fbdev-MMP: Delete an error message for a failed memory allocation in two functions
  video: auo_k190x: Delete an error message for a failed memory allocation in auok190x_common_probe()
  video: sh_mobile_lcdcfb: Delete an error message for a failed memory allocation in two functions
  video: sh_mobile_meram: Delete an error message for a failed memory allocation in sh_mobile_meram_probe()
  video: fbdev: sh_mobile_meram: Drop SUPERH platform dependency
  ...
2018-06-17 05:00:24 +09:00
Linus Torvalds
35773c9381 Merge branch 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull AFS updates from Al Viro:
 "Assorted AFS stuff - ended up in vfs.git since most of that consists
  of David's AFS-related followups to Christoph's procfs series"

* 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  afs: Optimise callback breaking by not repeating volume lookup
  afs: Display manually added cells in dynamic root mount
  afs: Enable IPv6 DNS lookups
  afs: Show all of a server's addresses in /proc/fs/afs/servers
  afs: Handle CONFIG_PROC_FS=n
  proc: Make inline name size calculation automatic
  afs: Implement network namespacing
  afs: Mark afs_net::ws_cell as __rcu and set using rcu functions
  afs: Fix a Sparse warning in xdr_decode_AFSFetchStatus()
  proc: Add a way to make network proc files writable
  afs: Rearrange fs/afs/proc.c to remove remaining predeclarations.
  afs: Rearrange fs/afs/proc.c to move the show routines up
  afs: Rearrange fs/afs/proc.c by moving fops and open functions down
  afs: Move /proc management functions to the end of the file
2018-06-16 16:32:04 +09:00
Linus Torvalds
29d6849d88 Merge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat updates from Al Viro:
 "Some biarch patches - getting rid of assorted (mis)uses of
  compat_alloc_user_space().

  Not much in that area this cycle..."

* 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  orangefs: simplify compat ioctl handling
  signalfd: lift sigmask copyin and size checks to callers of do_signalfd4()
  vmsplice(): lift importing iovec into vmsplice(2) and compat counterpart
2018-06-16 16:21:50 +09:00
Linus Torvalds
a5b729ea18 Merge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull aio fixes from Al Viro:
 "Assorted AIO followups and fixes"

* 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  eventpoll: switch to ->poll_mask
  aio: only return events requested in poll_mask() for IOCB_CMD_POLL
  eventfd: only return events requested in poll_mask()
  aio: mark __aio_sigset::sigmask const
2018-06-16 16:11:40 +09:00
Linus Torvalds
9215310cf1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Various netfilter fixlets from Pablo and the netfilter team.

 2) Fix regression in IPVS caused by lack of PMTU exceptions on local
    routes in ipv6, from Julian Anastasov.

 3) Check pskb_trim_rcsum for failure in DSA, from Zhouyang Jia.

 4) Don't crash on poll in TLS, from Daniel Borkmann.

 5) Revert SO_REUSE{ADDR,PORT} change, it regresses various things
    including Avahi mDNS. From Bart Van Assche.

 6) Missing of_node_put in qcom/emac driver, from Yue Haibing.

 7) We lack checking of the TCP checking in one special case during SYN
    receive, from Frank van der Linden.

 8) Fix module init error paths of mac80211 hwsim, from Johannes Berg.

 9) Handle 802.1ad properly in stmmac driver, from Elad Nachman.

10) Must grab HW caps before doing quirk checks in stmmac driver, from
    Jose Abreu.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits)
  net: stmmac: Run HWIF Quirks after getting HW caps
  neighbour: skip NTF_EXT_LEARNED entries during forced gc
  net: cxgb3: add error handling for sysfs_create_group
  tls: fix waitall behavior in tls_sw_recvmsg
  tls: fix use-after-free in tls_push_record
  l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()
  l2tp: reject creation of non-PPP sessions on L2TPv2 tunnels
  mlxsw: spectrum_switchdev: Fix port_vlan refcounting
  mlxsw: spectrum_router: Align with new route replace logic
  mlxsw: spectrum_router: Allow appending to dev-only routes
  ipv6: Only emit append events for appended routes
  stmmac: added support for 802.1ad vlan stripping
  cfg80211: fix rcu in cfg80211_unregister_wdev
  mac80211: Move up init of TXQs
  mac80211_hwsim: fix module init error paths
  cfg80211: initialize sinfo in cfg80211_get_station
  nl80211: fix some kernel doc tag mistakes
  hv_netvsc: Fix the variable sizes in ipsecv2 and rsc offload
  rds: avoid unenecessary cong_update in loop transport
  l2tp: clean up stale tunnel or session in pppol2tp_connect's error path
  ...
2018-06-16 07:39:34 +09:00
Linus Torvalds
de7f01c22a Merge tag 'modules-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull module updates from Jessica Yu:
 "Minor code cleanup and also allow sig_enforce param to be shown in
  sysfs with CONFIG_MODULE_SIG_FORCE"

* tag 'modules-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: Allow to always show the status of modsign
  module: Do not access sig_enforce directly
2018-06-16 07:36:39 +09:00
Linus Torvalds
8d1e5133bf Merge branch 'for-linus-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull uml updates from Richard Weinberger:
 "Minor updates for UML:

   - fixes for our new vector network driver by Anton

   - initcall cleanup by Alexander

   - We have a new mailinglist, sourceforge.net sucks"

* 'for-linus-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Fix raw interface options
  um: Fix initialization of vector queues
  um: remove uml initcalls
  um: Update mailing list address
2018-06-16 06:50:51 +09:00
Linus Torvalds
6a4d4b3253 Merge tag 'riscv-for-linus-4.18-merge_window' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
Pull RISC-V updates from Palmer Dabbelt:
 "This contains some small RISC-V updates I'd like to target for 4.18.

  They are all fairly small this time. Here's a short summary, there's
  more info in the commits/merges:

   - a fix to __clear_user to respect the passed arguments.

   - enough support for the perf subsystem to work with RISC-V's ISA
     defined performance counters.

   - support for sparse and cleanups suggested by it.

   - support for R_RISCV_32 (a relocation, not the 32-bit ISA).

   - some MAINTAINERS cleanups.

   - the addition of CONFIG_HVC_RISCV_SBI to our defconfig, as it's
     always present.

  I've given these a simple build+boot test"

* tag 'riscv-for-linus-4.18-merge_window' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
  RISC-V: Add CONFIG_HVC_RISCV_SBI=y to defconfig
  RISC-V: Handle R_RISCV_32 in modules
  riscv/ftrace: Export _mcount when DYNAMIC_FTRACE isn't set
  riscv: add riscv-specific predefines to CHECKFLAGS
  riscv: split the declaration of __copy_user
  riscv: no __user for probe_kernel_address()
  riscv: use NULL instead of a plain 0
  perf: riscv: Add Document for Future Porting Guide
  perf: riscv: preliminary RISC-V support
  MAINTAINERS: Update Albert's email, he's back at Berkeley
  MAINTAINERS: Add myself as a maintainer for SiFive's drivers
  riscv: Fix the bug in memory access fixup code
2018-06-16 06:42:43 +09:00
Linus Torvalds
8949170cf4 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more kvm updates from Paolo Bonzini:
 "Mostly the PPC part of the release, but also switching to Arnd's fix
  for the hyperv config issue and a typo fix.

  Main PPC changes:

   - reimplement the MMIO instruction emulation

   - transactional memory support for PR KVM

   - improve radix page table handling"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (63 commits)
  KVM: x86: VMX: redo fix for link error without CONFIG_HYPERV
  KVM: x86: fix typo at kvm_arch_hardware_setup comment
  KVM: PPC: Book3S PR: Fix failure status setting in tabort. emulation
  KVM: PPC: Book3S PR: Enable use on POWER9 bare-metal hosts in HPT mode
  KVM: PPC: Book3S PR: Don't let PAPR guest set MSR hypervisor bit
  KVM: PPC: Book3S PR: Fix failure status setting in treclaim. emulation
  KVM: PPC: Book3S PR: Fix MSR setting when delivering interrupts
  KVM: PPC: Book3S PR: Handle additional interrupt types
  KVM: PPC: Book3S PR: Enable kvmppc_get/set_one_reg_pr() for HTM registers
  KVM: PPC: Book3S: Remove load/put vcpu for KVM_GET_REGS/KVM_SET_REGS
  KVM: PPC: Remove load/put vcpu for KVM_GET/SET_ONE_REG ioctl
  KVM: PPC: Move vcpu_load/vcpu_put down to each ioctl case in kvm_arch_vcpu_ioctl
  KVM: PPC: Book3S PR: Enable HTM for PR KVM for KVM_CHECK_EXTENSION ioctl
  KVM: PPC: Book3S PR: Support TAR handling for PR KVM HTM
  KVM: PPC: Book3S PR: Add guard code to prevent returning to guest with PR=0 and Transactional state
  KVM: PPC: Book3S PR: Add emulation for tabort. in privileged state
  KVM: PPC: Book3S PR: Add emulation for trechkpt.
  KVM: PPC: Book3S PR: Add emulation for treclaim.
  KVM: PPC: Book3S PR: Restore NV regs after emulating mfspr from TM SPRs
  KVM: PPC: Book3S PR: Always fail transactions in guest privileged state
  ...
2018-06-16 06:37:04 +09:00