Commit Graph

121047 Commits

Author SHA1 Message Date
Nicholas Piggin
20c0e8269e powerpc/pseries: Implement paravirt qspinlocks for SPLPAR
This implements the generic paravirt qspinlocks using H_PROD and
H_CONFER to kick and wait.

This uses an un-directed yield to any CPU rather than the directed
yield to a pre-empted lock holder that paravirtualised simple
spinlocks use, that requires no kick hcall. This is something that
could be investigated and improved in future.

Performance results can be found in the commit which added queued
spinlocks.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200724131423.1362108-5-npiggin@gmail.com
2020-07-27 00:01:29 +10:00
Nicholas Piggin
aa65ff6b18 powerpc/64s: Implement queued spinlocks and rwlocks
These have shown significantly improved performance and fairness when
spinlock contention is moderate to high on very large systems.

With this series including subsequent patches, on a 16 socket 1536
thread POWER9, a stress test such as same-file open/close from all
CPUs gets big speedups, 11620op/s aggregate with simple spinlocks vs
384158op/s (33x faster), where the difference in throughput between
the fastest and slowest thread goes from 7x to 1.4x.

Thanks to the fast path being identical in terms of atomics and
barriers (after a subsequent optimisation patch), single threaded
performance is not changed (no measurable difference).

On smaller systems, performance and fairness seems to be generally
improved. Using dbench on tmpfs as a test (that starts to run into
kernel spinlock contention), a 2-socket OpenPOWER POWER9 system was
tested with bare metal and KVM guest configurations. Results can be
found here:

https://github.com/linuxppc/issues/issues/305#issuecomment-663487453

Observations are:

- Queued spinlocks are equal when contention is insignificant, as
  expected and as measured with microbenchmarks.

- When there is contention, on bare metal queued spinlocks have better
  throughput and max latency at all points.

- When virtualised, queued spinlocks are slightly worse approaching
  peak throughput, but significantly better throughput and max latency
  at all points beyond peak, until queued spinlock maximum latency
  rises when clients are 2x vCPUs.

The regressions haven't been analysed very well yet, there are a lot
of things that can be tuned, particularly the paravirtualised locking,
but the numbers already look like a good net win even on relatively
small systems.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200724131423.1362108-4-npiggin@gmail.com
2020-07-27 00:01:23 +10:00
Nicholas Piggin
5c9fa16e8a powerpc/64s: Remove PROT_SAO support
ISA v3.1 does not support the SAO storage control attribute required to
implement PROT_SAO. PROT_SAO was used by specialised system software
(Lx86) that has been discontinued for about 7 years, and is not thought
to be used elsewhere, so removal should not cause problems.

We rather remove it than keep support for older processors, because
live migrating guest partitions to newer processors may not be possible
if SAO is in use (or worse allowed with silent races).

- PROT_SAO stays in the uapi header so code using it would still build.
- arch_validate_prot() is removed, the generic version rejects PROT_SAO
  so applications would get a failure at mmap() time.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Drop KVM change for the time being]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200703011958.1166620-3-npiggin@gmail.com
2020-07-22 00:01:25 +10:00
Kajol Jain
1a8f0886a6 powerpc/perf/hv-24x7: Add cpu hotplug support
Patch here adds cpu hotplug functions to hv_24x7 pmu.
A new cpuhp_state "CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE" enum
is added.

The online callback function updates the cpumask only if its
empty. As the primary intention of adding hotplug support
is to designate a CPU to make HCALL to collect the
counter data.

The offline function test and clear corresponding cpu in a cpumask
and update cpumask to any other active cpu.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200709051836.723765-2-kjain@linux.ibm.com
2020-07-16 13:12:41 +10:00
Aneesh Kumar K.V
3e79f082eb libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier
Architectures like ppc64 provide persistent memory specific barriers
that will ensure that all stores for which the modifications are
written to persistent storage by preceding dcbfps and dcbstps
instructions have updated persistent storage before any data
access or data transfer caused by subsequent instructions is initiated.
This is in addition to the ordering done by wmb()

Update nvdimm core such that architecture can use barriers other than
wmb to ensure all previous writes are architecturally visible for
the platform buffer flush.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200701072235.223558-5-aneesh.kumar@linux.ibm.com
2020-07-16 13:00:22 +10:00
Philippe Bergheaud
87db7579eb ocxl: control via sysfs whether the FPGA is reloaded on a link reset
Some opencapi FPGA images allow to control if the FPGA should be reloaded
on the next adapter reset. If it is supported, the image specifies it
through a Vendor Specific DVSEC in the config space of function 0.

Signed-off-by: Philippe Bergheaud <felix@linux.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200619140439.153962-1-fbarrat@linux.ibm.com
2020-07-15 11:07:19 +10:00
Linus Torvalds
7561393908 Merge tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:

 - One fix for the interrupt rework we did last release which broke
   KVM-PR

 - Three commits fixing some fallout from the READ_ONCE() changes
   interacting badly with our 8xx 16K pages support, which uses a pte_t
   that is a structure of 4 actual PTEs

 - A cleanup of the 8xx pte_update() to use the newly added pmd_off()

 - A fix for a crash when handling an oops if CONFIG_DEBUG_VIRTUAL is
   enabled

 - A minor fix for the SPU syscall generation

Thanks to Aneesh Kumar K.V, Christian Zigotzky, Christophe Leroy, Mike
Rapoport, Nicholas Piggin.

* tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/8xx: Provide ptep_get() with 16k pages
  mm: Allow arches to provide ptep_get()
  mm/gup: Use huge_ptep_get() in gup_hugepte()
  powerpc/syscalls: Use the number when building SPU syscall table
  powerpc/8xx: use pmd_off() to access a PMD entry in pte_update()
  powerpc/64s: Fix KVM interrupt using wrong save area
  powerpc: Fix kernel crash in show_instructions() w/DEBUG_VIRTUAL
2020-06-21 10:02:53 -07:00
Linus Torvalds
93bbca271a Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:

 - NULL dereference in octeontx

 - PM reference imbalance in ks-sa

 - deadlock in crypto manager

 - memory leak in drbg

 - missing socket limit check on receive SG list size in algif_skcipher

 - typos in caam

 - warnings in ccp and hisilicon

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: drbg - always try to free Jitter RNG instance
  crypto: marvell/octeontx - Fix a potential NULL dereference
  crypto: algboss - don't wait during notifier callback
  crypto: caam - fix typos
  crypto: ccp - Fix sparse warnings in sev-dev
  crypto: hisilicon - Cap block size at 2^31
  crypto: algif_skcipher - Cap recv SG list at ctx->used
  hwrng: ks-sa - Fix runtime PM imbalance on error
2020-06-21 10:01:03 -07:00
Linus Torvalds
64677779e8 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "One minor fix and two patches reworking the ata dma drain for the
  !CONFIG_LIBATA case. The latter is a 5.7 regression fix"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: Wire up ata_scsi_dma_need_drain for SAS HBA drivers
  scsi: libata: Provide an ata_scsi_dma_need_drain stub for !CONFIG_ATA
  scsi: ufs-bsg: Fix runtime PM imbalance on error
2020-06-20 19:23:13 -07:00
Linus Torvalds
a5c6a1f0fe Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:

 - a small collection of remaining API conversion patches (all acked)
   which allow to finally remove the deprecated API

 - some documentation fixes and a MAINTAINERS addition

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  MAINTAINERS: Add robert and myself as qcom i2c cci maintainers
  i2c: smbus: Fix spelling mistake in the comments
  Documentation/i2c: SMBus start signal is S not A
  i2c: remove deprecated i2c_new_device API
  Documentation: media: convert to use i2c_new_client_device()
  video: backlight: tosa_lcd: convert to use i2c_new_client_device()
  x86/platform/intel-mid: convert to use i2c_new_client_device()
  drm: encoder_slave: use new I2C API
  drm: encoder_slave: fix refcouting error for modules
2020-06-20 19:18:27 -07:00
Linus Torvalds
8b6ddd10d6 Merge tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:

 - Have recordmcount work with > 64K sections (to support LTO)

 - kprobe RCU fixes

 - Correct a kprobe critical section with missing mutex

 - Remove redundant arch_disarm_kprobe() call

 - Fix lockup when kretprobe triggers within kprobe_flush_task()

 - Fix memory leak in fetch_op_data operations

 - Fix sleep in atomic in ftrace trace array sample code

 - Free up memory on failure in sample trace array code

 - Fix incorrect reporting of function_graph fields in format file

 - Fix quote within quote parsing in bootconfig

 - Fix return value of bootconfig tool

 - Add testcases for bootconfig tool

 - Fix maybe uninitialized warning in ftrace pid file code

 - Remove unused variable in tracing_iter_reset()

 - Fix some typos

* tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Fix maybe-uninitialized compiler warning
  tools/bootconfig: Add testcase for show-command and quotes test
  tools/bootconfig: Fix to return 0 if succeeded to show the bootconfig
  tools/bootconfig: Fix to use correct quotes for value
  proc/bootconfig: Fix to use correct quotes for value
  tracing: Remove unused event variable in tracing_iter_reset
  tracing/probe: Fix memleak in fetch_op_data operations
  trace: Fix typo in allocate_ftrace_ops()'s comment
  tracing: Make ftrace packed events have align of 1
  sample-trace-array: Remove trace_array 'sample-instance'
  sample-trace-array: Fix sleeping function called from invalid context
  kretprobe: Prevent triggering kretprobe from within kprobe_flush_task
  kprobes: Remove redundant arch_disarm_kprobe() call
  kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex
  kprobes: Use non RCU traversal APIs on kprobe_tables if possible
  kprobes: Suppress the suspicious RCU warning on kprobes
  recordmcount: support >64k sections
2020-06-20 13:17:47 -07:00
Linus Torvalds
eede2b9b3f Merge tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
 "A feature (papr_scm health retrieval) and a fix (sysfs attribute
  visibility) for v5.8.

  Vaibhav explains in the merge commit below why missing v5.8 would be
  painful and I agreed to try a -rc2 pull because only cosmetics kept
  this out of -rc1 and his initial versions were posted in more than
  enough time for v5.8 consideration:

   'These patches are tied to specific features that were committed to
    customers in upcoming distros releases (RHEL and SLES) whose
    time-lines are tied to 5.8 kernel release.

    Being able to track the health of an nvdimm is critical for our
    customers that are running workloads leveraging papr-scm nvdimms.
    Missing the 5.8 kernel would mean missing the distro timelines and
    shifting forward the availability of this feature in distro kernels
    by at least 6 months'

  Summary:

   - Fix the visibility of the region 'align' attribute.

     The new unit tests for region alignment handling caught a corner
     case where the alignment cannot be specified if the region is
     converted from static to dynamic provisioning at runtime.

   - Add support for device health retrieval for the persistent memory
     supported by the papr_scm driver.

     This includes both the standard sysfs "health flags" that the nfit
     persistent memory driver publishes and a mechanism for the ndctl
     tool to retrieve a health-command payload"

* tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm/region: always show the 'align' attribute
  powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
  ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
  powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
  powerpc/papr_scm: Fetch nvdimm health information from PHYP
  seq_buf: Export seq_buf_printf
  powerpc: Document details on H_SCM_HEALTH hcall
2020-06-20 13:13:21 -07:00
Christophe Leroy
481e980a7c mm: Allow arches to provide ptep_get()
Since commit 9e343b467c ("READ_ONCE: Enforce atomicity for
{READ,WRITE}_ONCE() memory accesses") it is not possible anymore to
use READ_ONCE() to access complex page table entries like the one
defined for powerpc 8xx with 16k size pages.

Define a ptep_get() helper that architectures can override instead
of performing a READ_ONCE() on the page table entry pointer.

Fixes: 9e343b467c ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/087fa12b6e920e32315136b998aa834f99242695.1592225558.git.christophe.leroy@csgroup.eu
2020-06-20 22:14:53 +10:00
Linus Torvalds
d2b1c81f5f Merge tag 'block-5.8-2020-06-19' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Use import_uuid() where appropriate (Andy)

 - bcache fixes (Coly, Mauricio, Zhiqiang)

 - blktrace sparse warnings fix (Jan)

 - blktrace concurrent setup fix (Luis)

 - blkdev_get use-after-free fix (Jason)

 - Ensure all blk-mq maps are updated (Weiping)

 - Loop invalidate bdev fix (Zheng)

* tag 'block-5.8-2020-06-19' of git://git.kernel.dk/linux-block:
  block: make function 'kill_bdev' static
  loop: replace kill_bdev with invalidate_bdev
  partitions/ldm: Replace uuid_copy() with import_uuid() where it makes sense
  block: update hctx map when use multiple maps
  blktrace: Avoid sparse warnings when assigning q->blk_trace
  blktrace: break out of blktrace setup on concurrent calls
  block: Fix use-after-free in blkdev_get()
  trace/events/block.h: drop kernel-doc for dropped function parameter
  blk-mq: Remove redundant 'return' statement
  bcache: pr_info() format clean up in bcache_device_init()
  bcache: use delayed kworker fo asynchronous devices registration
  bcache: check and adjust logical block size for backing devices
  bcache: fix potential deadlock problem in btree_gc_coalesce
2020-06-19 13:11:26 -07:00
Linus Torvalds
592be758f1 Merge tag 'libata-5.8-2020-06-19' of git://git.kernel.dk/linux-block
Pull libata fixes from Jens Axboe:
 "A few minor changes that should go into this release"

* tag 'libata-5.8-2020-06-19' of git://git.kernel.dk/linux-block:
  libata: Use per port sync for detach
  ata/libata: Fix usage of page address by page_address in ata_scsi_mode_select_xlat function
  sata_rcar: handle pm_runtime_get_sync failure cases
2020-06-19 13:09:40 -07:00
Linus Torvalds
672f9255a7 Merge tag 'ceph-for-5.8-rc2' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
 "An important follow-up for replica reads support that went into -rc1
  and two target_copy() fixups"

* tag 'ceph-for-5.8-rc2' of git://github.com/ceph/ceph-client:
  libceph: don't omit used_replica in target_copy()
  libceph: don't omit recovery_deletes in target_copy()
  libceph: move away from global osd_req_flags
2020-06-19 12:25:04 -07:00
Linus Torvalds
98b769942c Merge tag 'overflow-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull flex-array size helper from Kees Cook:
 "During the treewide clean-ups of zero-length "flexible arrays", the
  struct_size() helper was heavily used, but it was noticed that many
  times it would have been nice to have an additional helper to get the
  size of just the flexible array itself.

  This need appears to be even more common when cleaning up the 1-byte
  array "flexible arrays", so Gustavo implemented it.

  I'd love to get this landed early so it can be used during the v5.9
  dev cycle to ease the 1-byte array cleanups."

* tag 'overflow-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  overflow.h: Add flex_array_size() helper
2020-06-19 11:45:03 -07:00
Wolfram Sang
390fd0475a i2c: remove deprecated i2c_new_device API
All in-tree users have been converted to the new i2c_new_client_device
function, so remove this deprecated one.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-06-19 09:20:28 +02:00
Linus Torvalds
5e857ce6ea Merge branch 'hch' (maccess patches from Christoph Hellwig)
Merge non-faulting memory access cleanups from Christoph Hellwig:
 "Andrew and I decided to drop the patches implementing your suggested
  rename of the probe_kernel_* and probe_user_* helpers from -mm as
  there were way to many conflicts.

  After -rc1 might be a good time for this as all the conflicts are
  resolved now"

This also adds a type safety checking patch on top of the renaming
series to make the subtle behavioral difference between 'get_user()' and
'get_kernel_nofault()' less potentially dangerous and surprising.

* emailed patches from Christoph Hellwig <hch@lst.de>:
  maccess: make get_kernel_nofault() check for minimal type compatibility
  maccess: rename probe_kernel_address to get_kernel_nofault
  maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault
  maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault
2020-06-18 12:35:51 -07:00
Linus Torvalds
0c389d89ab maccess: make get_kernel_nofault() check for minimal type compatibility
Now that we've renamed probe_kernel_address() to get_kernel_nofault()
and made it look and behave more in line with get_user(), some of the
subtle type behavior differences end up being more obvious and possibly
dangerous.

When you do

        get_user(val, user_ptr);

the type of the access comes from the "user_ptr" part, and the above
basically acts as

        val = *user_ptr;

by design (except, of course, for the fact that the actual dereference
is done with a user access).

Note how in the above case, the type of the end result comes from the
pointer argument, and then the value is cast to the type of 'val' as
part of the assignment.

So the type of the pointer is ultimately the more important type both
for the access itself.

But 'get_kernel_nofault()' may now _look_ similar, but it behaves very
differently.  When you do

        get_kernel_nofault(val, kernel_ptr);

it behaves like

        val = *(typeof(val) *)kernel_ptr;

except, of course, for the fact that the actual dereference is done with
exception handling so that a faulting access is suppressed and returned
as the error code.

But note how different the casting behavior of the two superficially
similar accesses are: one does the actual access in the size of the type
the pointer points to, while the other does the access in the size of
the target, and ignores the pointer type entirely.

Actually changing get_kernel_nofault() to act like get_user() is almost
certainly the right thing to do eventually, but in the meantime this
patch adds logit to at least verify that the pointer type is compatible
with the type of the result.

In many cases, this involves just casting the pointer to 'void *' to
make it obvious that the type of the pointer is not the important part.
It's not how 'get_user()' acts, but at least the behavioral difference
is now obvious and explicit.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-18 12:10:37 -07:00
Christoph Hellwig
25f12ae45f maccess: rename probe_kernel_address to get_kernel_nofault
Better describe what this helper does, and match the naming of
copy_from_kernel_nofault.

Also switch the argument order around, so that it acts and looks
like get_user().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-18 11:14:40 -07:00
Luc Van Oostenryck
670d0a4b10 sparse: use identifiers to define address spaces
Currently, address spaces in warnings are displayed as '<asn:X>' with
'X' being the address space's arbitrary number.

But since sparse v0.6.0-rc1 (late December 2018), sparse allows you to
define the address spaces using an identifier instead of a number.  This
identifier is then directly used in the warnings.

So, use the identifiers '__user', '__iomem', '__percpu' & '__rcu' for
the corresponding address spaces.  The default address space, __kernel,
being not displayed in warnings, stays defined as '0'.

With this change, warnings that used to be displayed as:

	cast removes address space '<asn:1>' of expression
	... void [noderef] <asn:2> *

will now be displayed as:

	cast removes address space '__user' of expression
	... void [noderef] __iomem *

This also moves the __kernel annotation to be the first one, since it is
quite different from the others because it's the default one, and so:

 - it's never displayed

 - it's normally not needed, nor in type annotations, nor in cast
   between address spaces. The only time it's needed is when it's
   combined with a typeof to express "the same type as this one but
   without the address space"

 - it can't be defined with a name, '0' must be used.

So, it seemed strange to me to have it in the middle of the other
ones.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-18 10:05:23 -07:00
Zheng Bin
3373a3461a block: make function 'kill_bdev' static
kill_bdev does not have any external user, so make it static.

Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-18 09:24:35 -06:00
Kai-Heng Feng
b5292111de libata: Use per port sync for detach
Commit 130f4caf14 ("libata: Ensure ata_port probe has completed before
detach") may cause system freeze during suspend.

Using async_synchronize_full() in PM callbacks is wrong, since async
callbacks that are already scheduled may wait for not-yet-scheduled
callbacks, causes a circular dependency.

Instead of using big hammer like async_synchronize_full(), use async
cookie to make sure port probe are synced, without affecting other
scheduled PM callbacks.

Fixes: 130f4caf14 ("libata: Ensure ata_port probe has completed before detach")
Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: John Garry <john.garry@huawei.com>
BugLink: https://bugs.launchpad.net/bugs/1867983
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-18 09:21:40 -06:00
Christoph Hellwig
c0ee37e85e maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault
Better describe what these functions do.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-17 10:57:41 -07:00
Christoph Hellwig
fe557319aa maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault
Better describe what these functions do.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-17 10:57:41 -07:00
Gustavo A. R. Silva
b19d57d0f3 overflow.h: Add flex_array_size() helper
Add flex_array_size() helper for the calculation of the size, in bytes,
of a flexible array member contained within an enclosing structure.

Example of usage:

struct something {
	size_t count;
	struct foo items[];
};

struct something *instance;

instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
instance->count = count;
memcpy(instance->items, src, flex_array_size(instance, items, instance->count));

The helper returns SIZE_MAX on overflow instead of wrapping around.

Additionally replaces parameter "n" with "count" in struct_size() helper
for greater clarity and unification.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20200609012233.GA3371@embeddedor
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-06-16 20:45:08 -07:00
Jiri Olsa
9b38cc704e kretprobe: Prevent triggering kretprobe from within kprobe_flush_task
Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave.
My test was also able to trigger lockdep output:

 ============================================
 WARNING: possible recursive locking detected
 5.6.0-rc6+ #6 Not tainted
 --------------------------------------------
 sched-messaging/2767 is trying to acquire lock:
 ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0

 but task is already holding lock:
 ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&(kretprobe_table_locks[i].lock));
   lock(&(kretprobe_table_locks[i].lock));

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 1 lock held by sched-messaging/2767:
  #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50

 stack backtrace:
 CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6
 Call Trace:
  dump_stack+0x96/0xe0
  __lock_acquire.cold.57+0x173/0x2b7
  ? native_queued_spin_lock_slowpath+0x42b/0x9e0
  ? lockdep_hardirqs_on+0x590/0x590
  ? __lock_acquire+0xf63/0x4030
  lock_acquire+0x15a/0x3d0
  ? kretprobe_hash_lock+0x52/0xa0
  _raw_spin_lock_irqsave+0x36/0x70
  ? kretprobe_hash_lock+0x52/0xa0
  kretprobe_hash_lock+0x52/0xa0
  trampoline_handler+0xf8/0x940
  ? kprobe_fault_handler+0x380/0x380
  ? find_held_lock+0x3a/0x1c0
  kretprobe_trampoline+0x25/0x50
  ? lock_acquired+0x392/0xbc0
  ? _raw_spin_lock_irqsave+0x50/0x70
  ? __get_valid_kprobe+0x1f0/0x1f0
  ? _raw_spin_unlock_irqrestore+0x3b/0x40
  ? finish_task_switch+0x4b9/0x6d0
  ? __switch_to_asm+0x34/0x70
  ? __switch_to_asm+0x40/0x70

The code within the kretprobe handler checks for probe reentrancy,
so we won't trigger any _raw_spin_lock_irqsave probe in there.

The problem is in outside kprobe_flush_task, where we call:

  kprobe_flush_task
    kretprobe_table_lock
      raw_spin_lock_irqsave
        _raw_spin_lock_irqsave

where _raw_spin_lock_irqsave triggers the kretprobe and installs
kretprobe_trampoline handler on _raw_spin_lock_irqsave return.

The kretprobe_trampoline handler is then executed with already
locked kretprobe_table_locks, and first thing it does is to
lock kretprobe_table_locks ;-) the whole lockup path like:

  kprobe_flush_task
    kretprobe_table_lock
      raw_spin_lock_irqsave
        _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed

        ---> kretprobe_table_locks locked

        kretprobe_trampoline
          trampoline_handler
            kretprobe_hash_lock(current, &head, &flags);  <--- deadlock

Adding kprobe_busy_begin/end helpers that mark code with fake
probe installed to prevent triggering of another kprobe within
this code.

Using these helpers in kprobe_flush_task, so the probe recursion
protection check is hit and the probe is never set to prevent
above lockup.

Link: http://lkml.kernel.org/r/158927059835.27680.7011202830041561604.stgit@devnote2

Fixes: ef53d9c5e4 ("kprobes: improve kretprobe scalability with hashed locking")
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Reported-by: "Ziqian SUN (Zamir)" <zsun@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-06-16 21:21:01 -04:00
Linus Torvalds
69119673bd Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Don't get per-cpu pointer with preemption enabled in nft_set_pipapo,
    fix from Stefano Brivio.

 2) Fix memory leak in ctnetlink, from Pablo Neira Ayuso.

 3) Multiple definitions of MPTCP_PM_MAX_ADDR, from Geliang Tang.

 4) Accidently disabling NAPI in non-error paths of macb_open(), from
    Charles Keepax.

 5) Fix races between alx_stop and alx_remove, from Zekun Shen.

 6) We forget to re-enable SRIOV during resume in bnxt_en driver, from
    Michael Chan.

 7) Fix memory leak in ipv6_mc_destroy_dev(), from Wang Hai.

 8) rxtx stats use wrong index in mvpp2 driver, from Sven Auhagen.

 9) Fix memory leak in mptcp_subflow_create_socket error path, from Wei
    Yongjun.

10) We should not adjust the TCP window advertised when sending dup acks
    in non-SACK mode, because it won't be counted as a dup by the sender
    if the window size changes. From Eric Dumazet.

11) Destroy the right number of queues during remove in mvpp2 driver,
    from Sven Auhagen.

12) Various WOL and PM fixes to e1000 driver, from Chen Yu, Vaibhav
    Gupta, and Arnd Bergmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits)
  e1000e: fix unused-function warning
  e1000: use generic power management
  e1000e: Do not wake up the system via WOL if device wakeup is disabled
  lan743x: add MODULE_DEVICE_TABLE for module loading alias
  mlxsw: spectrum: Adjust headroom buffers for 8x ports
  bareudp: Fixed configuration to avoid having garbage values
  mvpp2: remove module bugfix
  tcp: grow window for OOO packets only for SACK flows
  mptcp: fix memory leak in mptcp_subflow_create_socket()
  netfilter: flowtable: Make nf_flow_table_offload_add/del_cb inline
  net/sched: act_ct: Make tcf_ct_flow_table_restore_skb inline
  net: dsa: sja1105: fix PTP timestamping with large tc-taprio cycles
  mvpp2: ethtool rxtx stats fix
  MAINTAINERS: switch to my private email for Renesas Ethernet drivers
  rocker: fix incorrect error handling in dma_rings_init
  test_objagg: Fix potential memory leak in error handling
  net: ethernet: mtk-star-emac: simplify interrupt handling
  mld: fix memory leak in ipv6_mc_destroy_dev()
  bnxt_en: Return from timer if interface is not in open state.
  bnxt_en: Fix AER reset logic on 57500 chips.
  ...
2020-06-16 17:44:54 -07:00
Linus Torvalds
ffbc93768e Merge tag 'flex-array-conversions-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull flexible-array member conversions from Gustavo A. R. Silva:
 "Replace zero-length arrays with flexible-array members.

  Notice that all of these patches have been baking in linux-next for
  two development cycles now.

  There is a regular need in the kernel to provide a way to declare
  having a dynamically sized set of trailing elements in a structure.
  Kernel code should always use “flexible array members”[1] for these
  cases. The older style of one-element or zero-length arrays should no
  longer be used[2].

  C99 introduced “flexible array members”, which lacks a numeric size
  for the array declaration entirely:

        struct something {
                size_t count;
                struct foo items[];
        };

  This is the way the kernel expects dynamically sized trailing elements
  to be declared. It allows the compiler to generate errors when the
  flexible array does not occur last in the structure, which helps to
  prevent some kind of undefined behavior[3] bugs from being
  inadvertently introduced to the codebase.

  It also allows the compiler to correctly analyze array sizes (via
  sizeof(), CONFIG_FORTIFY_SOURCE, and CONFIG_UBSAN_BOUNDS). For
  instance, there is no mechanism that warns us that the following
  application of the sizeof() operator to a zero-length array always
  results in zero:

        struct something {
                size_t count;
                struct foo items[0];
        };

        struct something *instance;

        instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
        instance->count = count;

        size = sizeof(instance->items) * instance->count;
        memcpy(instance->items, source, size);

  At the last line of code above, size turns out to be zero, when one
  might have thought it represents the total size in bytes of the
  dynamic memory recently allocated for the trailing array items. Here
  are a couple examples of this issue[4][5].

  Instead, flexible array members have incomplete type, and so the
  sizeof() operator may not be applied[6], so any misuse of such
  operators will be immediately noticed at build time.

  The cleanest and least error-prone way to implement this is through
  the use of a flexible array member:

        struct something {
                size_t count;
                struct foo items[];
        };

        struct something *instance;

        instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
        instance->count = count;

        size = sizeof(instance->items[0]) * instance->count;
        memcpy(instance->items, source, size);

  instead"

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
[4] commit f2cd32a443 ("rndis_wlan: Remove logically dead code")
[5] commit ab91c2a89f ("tpm: eventlog: Replace zero-length array with flexible-array member")
[6] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html

* tag 'flex-array-conversions-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (41 commits)
  w1: Replace zero-length array with flexible-array
  tracing/probe: Replace zero-length array with flexible-array
  soc: ti: Replace zero-length array with flexible-array
  tifm: Replace zero-length array with flexible-array
  dmaengine: tegra-apb: Replace zero-length array with flexible-array
  stm class: Replace zero-length array with flexible-array
  Squashfs: Replace zero-length array with flexible-array
  ASoC: SOF: Replace zero-length array with flexible-array
  ima: Replace zero-length array with flexible-array
  sctp: Replace zero-length array with flexible-array
  phy: samsung: Replace zero-length array with flexible-array
  RxRPC: Replace zero-length array with flexible-array
  rapidio: Replace zero-length array with flexible-array
  media: pwc: Replace zero-length array with flexible-array
  firmware: pcdp: Replace zero-length array with flexible-array
  oprofile: Replace zero-length array with flexible-array
  block: Replace zero-length array with flexible-array
  tools/testing/nvdimm: Replace zero-length array with flexible-array
  libata: Replace zero-length array with flexible-array
  kprobes: Replace zero-length array with flexible-array
  ...
2020-06-16 17:23:57 -07:00
Ilya Dryomov
22d2cfdffa libceph: move away from global osd_req_flags
osd_req_flags is overly general and doesn't suit its only user
(read_from_replica option) well:

- applying osd_req_flags in account_request() affects all OSD
  requests, including linger (i.e. watch and notify).  However,
  linger requests should always go to the primary even though
  some of them are reads (e.g. notify has side effects but it
  is a read because it doesn't result in mutation on the OSDs).

- calls to class methods that are reads are allowed to go to
  the replica, but most such calls issued for "rbd map" and/or
  exclusive lock transitions are requested to be resent to the
  primary via EAGAIN, doubling the latency.

Get rid of global osd_req_flags and set read_from_replica flag
only on specific OSD requests instead.

Fixes: 8ad44d5e0d ("libceph: read_from_replica option")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2020-06-16 16:01:53 +02:00
Gustavo A. R. Silva
5cab1634e4 tifm: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:32 -05:00
Gustavo A. R. Silva
af6bb61cc0 sctp: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:32 -05:00
Gustavo A. R. Silva
18bdc20be1 RxRPC: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:32 -05:00
Gustavo A. R. Silva
9c5fbf05cb libata: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:31 -05:00
Gustavo A. R. Silva
67a862a94d kprobes: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:31 -05:00
Gustavo A. R. Silva
ad8cb1654d keys: encrypted-type: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:31 -05:00
Gustavo A. R. Silva
50b6951feb kexec: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:31 -05:00
Gustavo A. R. Silva
764e515f41 KVM: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:31 -05:00
Gustavo A. R. Silva
67cd462446 FS-Cache: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:31 -05:00
Gustavo A. R. Silva
6b5679d237 cb710: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:31 -05:00
Gustavo A. R. Silva
ec4ac36939 drm/edid: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:31 -05:00
Gustavo A. R. Silva
d6562f1ca8 can: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:31 -05:00
Gustavo A. R. Silva
466f966b1f dmaengine: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:30 -05:00
Christoph Hellwig
7bb7ee8704 scsi: libata: Provide an ata_scsi_dma_need_drain stub for !CONFIG_ATA
SAS drivers can be compiled with ata support disabled.  Provide a stub so
that the drivers don't have to ifdef around wiring up
ata_scsi_dma_need_drain.

Link: https://lore.kernel.org/r/20200615064624.37317-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-06-15 23:40:41 -04:00
Vaibhav Jain
f517f7925b ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white list
of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
nvdimm command mask and implement necessary scaffolding in the module
to handle ND_CMD_CALL ioctl and PDSM requests that we receive.

The layout of the PDSM request as we expect from libnvdimm/libndctl is
described in newly introduced uapi header 'papr_pdsm.h' which
defines a 'struct nd_pkg_pdsm' and a maximal union named
'nd_pdsm_payload'. These new structs together with 'struct nd_cmd_pkg'
for a pdsm envelop thats sent by libndctl to libnvdimm and serviced by
papr_scm in 'papr_scm_service_pdsm()'. The PDSM request is
communicated by member 'struct nd_cmd_pkg.nd_command' together with
other information on the pdsm payload (size-in, size-out).

The patch also introduces 'struct pdsm_cmd_desc' instances of which
are stored in an array __pdsm_cmd_descriptors[] indexed with PDSM cmd
and corresponding access function pdsm_cmd_desc() is
introduced. 'struct pdsm_cdm_desc' holds the service function for a
given PDSM and corresponding payload in/out sizes.

A new function papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
command from libnvdimm. The function performs validation on the PDSM
payload based on info present in corresponding PDSM descriptor and if
valid calls the 'struct pdcm_cmd_desc.service' function to service the
PDSM.

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20200615124407.32596-6-vaibhav@linux.ibm.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-06-15 18:22:44 -07:00
Alaa Hleihel
505ee3a1ca netfilter: flowtable: Make nf_flow_table_offload_add/del_cb inline
Currently, nf_flow_table_offload_add/del_cb are exported by nf_flow_table
module, therefore modules using them will have hard-dependency
on nf_flow_table and will require loading it all the time.

This can lead to an unnecessary overhead on systems that do not
use this API.

To relax the hard-dependency between the modules, we unexport these
functions and make them static inline.

Fixes: 978703f425 ("netfilter: flowtable: Add API for registering to flow table events")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 18:06:52 -07:00
Alaa Hleihel
762f926d6f net/sched: act_ct: Make tcf_ct_flow_table_restore_skb inline
Currently, tcf_ct_flow_table_restore_skb is exported by act_ct
module, therefore modules using it will have hard-dependency
on act_ct and will require loading it all the time.

This can lead to an unnecessary overhead on systems that do not
use hardware connection tracking action (ct_metadata action) in
the first place.

To relax the hard-dependency between the modules, we unexport this
function and make it a static inline one.

Fixes: 30b0cf90c6 ("net/sched: act_ct: Support restoring conntrack info on skbs")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15 18:06:52 -07:00
Randy Dunlap
5bcc066c05 trace/events/block.h: drop kernel-doc for dropped function parameter
Fix kernel-doc warning: the parameter was removed, so also remove
the kernel-doc notation for it.

../include/trace/events/block.h:278: warning: Excess function parameter 'error' description in 'trace_block_bio_complete'

Fixes: d24de76af8 ("block: remove the error argument to the block_bio_complete tracepoint")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-15 16:51:46 -06:00
Linus Torvalds
3be20b6fc1 Merge tag 'ext4-for-linus-5.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull more ext4 updates from Ted Ts'o:
 "This is the second round of ext4 commits for 5.8 merge window [1].

  It includes the per-inode DAX support, which was dependant on the DAX
  infrastructure which came in via the XFS tree, and a number of
  regression and bug fixes; most notably the "BUG: using
  smp_processor_id() in preemptible code in ext4_mb_new_blocks" reported
  by syzkaller"

[1] The pull request actually came in 15 minutes after I had tagged the
    rc1 release. Tssk, tssk, late..   - Linus

* tag 'ext4-for-linus-5.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4, jbd2: ensure panic by fix a race between jbd2 abort and ext4 error handlers
  ext4: support xattr gnu.* namespace for the Hurd
  ext4: mballoc: Use this_cpu_read instead of this_cpu_ptr
  ext4: avoid utf8_strncasecmp() with unstable name
  ext4: stop overwrite the errcode in ext4_setup_super
  ext4: fix partial cluster initialization when splitting extent
  ext4: avoid race conditions when remounting with options that change dax
  Documentation/dax: Update DAX enablement for ext4
  fs/ext4: Introduce DAX inode flag
  fs/ext4: Remove jflag variable
  fs/ext4: Make DAX mount option a tri-state
  fs/ext4: Only change S_DAX on inode load
  fs/ext4: Update ext4_should_use_dax()
  fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS
  fs/ext4: Disallow verity if inode is DAX
  fs/ext4: Narrow scope of DAX check in setflags
2020-06-15 09:32:10 -07:00