linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-03-22 01:31:18 -04:00

Author	SHA1	Message	Date
Michal Kalderon	eb7f84e379	RDMA/qedr: Add EDPM max size to alloc ucontext response User space should receive the maximum edpm size from kernel driver, similar to other edpm/ldpm related limits. Add an additional parameter to the alloc_ucontext_resp structure for the edpm maximum size. In addition, pass an indication from user-space to kernel (and not just kernel to user) that the DPM sizes are supported. This is for supporting backward-forward compatibility between driver and lib for everything related to DPM transaction and limit sizes. This should have been part of commit mentioned in Fixes tag. Link: https://lore.kernel.org/r/20200707063100.3811-3-michal.kalderon@marvell.com Fixes: `93a3d05f9d` ("RDMA/qedr: Add kernel capability flags for dpm enabled mode") Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-16 16:01:55 -03:00
Michal Kalderon	bbe4f42452	RDMA/qedr: Add EDPM mode type for user-fw compatibility In older FW versions the completion flag was treated as the ack flag in edpm messages. commit `ff937b916e` ("qed: Add EDPM mode type for user-fw compatibility") exposed the FW option of setting which mode the QP is in by adding a flag to the qedr <-> qed API. This patch adds the qedr <-> libqedr interface so that the libqedr can set the flag appropriately and qedr can pass it down to FW. Flag is added for backward compatibility with libqedr. For older libs, this flag didn't exist and therefore set to zero. Fixes: `ac1b36e55a` ("qedr: Add support for user context verbs") Link: https://lore.kernel.org/r/20200707063100.3811-2-michal.kalderon@marvell.com Signed-off-by: Yuval Bason <yuval.bason@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-16 16:01:55 -03:00
Mark Zhang	7c97f3aded	RDMA/counter: Add PID category support in auto mode With the "PID" category QPs have same PID will be bound to same counter; If this category is not set then QPs have different PIDs will be bound to same counter. This is implemented for 2 reasons: 1. The counter is a limited resource, while there may be dozens of applications, each of which creates several types of QPs, which means it may doesn't have enough counter. 2. The system administrator needs all QPs created by all applications with same type bound to one counter. The counter name and PID is only make sense when "PID" category are configured. This category can also be used in combine with others, e.g. QP type. Link: https://lore.kernel.org/r/20200702082933.424537-2-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-10 16:50:53 -03:00
Leon Romanovsky	28ad5f65c3	RDMA: Move XRCD to be under ib_core responsibility Update the code to allocate and free ib_xrcd structure in the ib_core instead of inside drivers. Link: https://lore.kernel.org/r/20200630101855.368895-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 20:11:24 -03:00
Leon Romanovsky	3b023e1b68	RDMA/core: Create and destroy counters in the ib_core Move allocation and destruction of counters under ib_core responsibility Link: https://lore.kernel.org/r/20200630101855.368895-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 20:04:40 -03:00
Yishai Hadas	6c01e6b218	IB/uverbs: Expose UAPI to query MR Expose UAPI to query MR, this will let user space application that didn't allocate the MR but has access to by owning the matching command FD to retrieve its information. Link: https://lore.kernel.org/r/20200630093916.332097-8-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 19:50:34 -03:00
Yishai Hadas	05f71ef979	RDMA/mlx5: Introduce UAPI to query PD attributes Introduce UAPI to query PD attributes, this can be used to retrieve PD attributes by having the PD handle of the created one and owning the command FD for the ucontxet. Link: https://lore.kernel.org/r/20200630093916.332097-7-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 19:50:34 -03:00
Yishai Hadas	0fb556b2b5	RDMA/mlx5: Implement the query ucontext functionality Implement the query ucontext functionality by returning the original ucontext data as part of an extra mlx5 attribute that holds the driver UAPI response. Link: https://lore.kernel.org/r/20200630093916.332097-6-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 19:50:33 -03:00
Yishai Hadas	1c8fb1ea5a	IB/uverbs: Expose UAPI to query ucontext Expose UAPI to query ucontext, this will let user space application that didn't allocate the ucontext but has access to by owning the matching command FD to retrieve the ucontext information. Link: https://lore.kernel.org/r/20200630093916.332097-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 19:50:33 -03:00
Maor Gottlieb	6f3ca6f4f5	RDMA/core: Optimize XRC target lookup Replace the mutex with read write semaphore and use xarray instead of linked list for XRC target QPs. This will give faster XRC target lookup. In addition, when QP is closed, don't insert it back to the xarray if the destroy command failed. Link: https://lore.kernel.org/r/20200706122716.647338-4-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 19:32:23 -03:00
Maor Gottlieb	b73efcb26e	RDMA/core: Clean ib_alloc_xrcd() and reuse it to allocate XRC domain ib_alloc_xrcd() already does the required initialization, so move the uverbs to call it and save code duplication, while cleaning the function argument lists of that function. Link: https://lore.kernel.org/r/20200706122716.647338-3-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 19:32:23 -03:00
Gal Pressman	42a3b15396	RDMA: Remove the udata parameter from alloc_mr callback Allocating an MR flow can only be initiated by kernel users, and not from userspace so a udata parameter is redundant. Link: https://lore.kernel.org/r/20200706120343.10816-4-galpress@amazon.com Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 19:25:53 -03:00
Gal Pressman	b64b74b1d5	RDMA/core: Remove ib_alloc_mr_user function Allocating an MR flow can only be initiated by kernel users, and not from userspace. As a result, the udata parameter is always being passed as NULL. Rename ib_alloc_mr_user function to ib_alloc_mr and remove the udata parameter. Link: https://lore.kernel.org/r/20200706120343.10816-3-galpress@amazon.com Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 19:25:53 -03:00
Jason Gunthorpe	f11f3f76c7	Merge branch 'mlx5_ipoib_qpn' into rdma.git for-next Michael Guralnik says: ==================== This series handles IPoIB child interface creation with setting interface's HW address. In current implementation, lladdr requested by user is ignored and overwritten. Child interface gets the same GID as the parent interface and a QP number which is assigned by the underlying drivers. In this series we fix this behavior so that user's requested address is assigned to the newly created interface. As specific QP number request is not supported for all vendors, QP number requested by user will still be overwritten when this is not supported. Behavior of creation of child interfaces through the sysfs mechanism or without specifying a requested address, stays the same. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies. * branch 'mlx5_ipoib_qpn': RDMA/ipoib: Handle user-supplied address when creating child net/mlx5: Enable QP number request when creating IPoIB underlay QP Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 14:29:58 -03:00
Michael Guralnik	4dca650991	net/mlx5: Enable QP number request when creating IPoIB underlay QP If in the process of creating the underlay QP for an IPoIB interface the user has set the address and specifically the 1st-3rd bytes representing the QP number, use the requested QP number when creating the underlay QP. For a user to be able to request a QP number on QP creation, the MKEY_BY_NAME NVCONFIG should be set. As mkey_by_name and qp_by_name are coupled in FW. This requires driver to query the mkey_by_name max cap during initialization and set the current cap if it was enabled in FW. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-07-03 18:38:01 +03:00
Maor Gottlieb	d473f4dc2f	RDMA/mlx5: Introduce ODP prefetch counter For debugging purpose it will be easier to understand if prefetch works okay if it has its own counter. Introduce ODP prefetch counter and count per MR the total number of prefetched pages. In addition remove comment which is not relevant anymore and anyway not in the correct place. Link: https://lore.kernel.org/r/20200621104147.53795-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-03 09:16:25 -03:00
Tariq Toukan	2d1b69ed65	net/mlx5: kTLS, Improve TLS params layout structures Add explicit WQE segment structures for the TLS static and progress params. According to the HW spec, TISN is not part of the progress params context, take it out of it. Rename the control segment tisn field as it could hold either a TIS or a TIR number. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-06-27 13:50:46 -07:00
Parav Pandit	9205d7b1c1	net/mlx5: Avoid RDMA file inclusion in core driver mlx5 cq.h does not depend on RDMA verbs. Remove RDMA verbs file inclusion. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-06-27 13:50:46 -07:00
Leon Romanovsky	14c2b89634	RDMA/core: Delete not-used create RWQ table function The RWQ table is used for RSS uverbs and not in used for the kernel consumers, delete ib_create_rwq_ind_table() routine that is not called at all. Link: https://lore.kernel.org/r/20200624105422.1452290-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-06-24 16:46:18 -03:00
Maor Gottlieb	65959522f8	RDMA: Add support to dump resource tracker in RAW format Add support to get resource dump in raw format. It enable drivers to return the entire device specific QP/CQ/MR context without a need from the driver to set each field separately. The raw query returns only the device specific data, general data is still returned by using the existing queries. Example: $ rdma res show mr dev mlx5_1 mrn 2 -r -j [{"ifindex":7,"ifname":"mlx5_1", "data":[0,4,255,254,0,0,0,0,0,0,0,0,16,28,0,216,...]}] Link: https://lore.kernel.org/r/20200623113043.1228482-9-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-06-24 08:52:29 -03:00
Maor Gottlieb	211cd9459f	RDMA: Add dedicated CM_ID resource tracker function In order to avoid double multiplexing of the resource when it is a cm id, add a dedicated callback function. In addition remove fill_res_entry which is not used anymore. Link: https://lore.kernel.org/r/20200623113043.1228482-8-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-06-23 11:46:27 -03:00
Maor Gottlieb	5cc34116cc	RDMA: Add dedicated QP resource tracker function In order to avoid double multiplexing of the resource when it is a QP, add a dedicated callback function. Link: https://lore.kernel.org/r/20200623113043.1228482-7-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-06-23 11:46:27 -03:00
Maor Gottlieb	9e2a187a93	RDMA: Add a dedicated CQ resource tracker function In order to avoid double multiplexing of the resource when it is a CQ, add a dedicated callback function. Link: https://lore.kernel.org/r/20200623113043.1228482-6-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-06-23 11:46:27 -03:00
Maor Gottlieb	f443452900	RDMA: Add dedicated MR resource tracker function In order to avoid double multiplexing of the resource when it is a MR, add a dedicated callback function. Link: https://lore.kernel.org/r/20200623113043.1228482-5-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-06-23 11:46:27 -03:00
Maor Gottlieb	608ca553c9	net/mlx5: Add support in query QP, CQ and MKEY segments Introduce new resource dump segments - PRM_QUERY_QP, PRM_QUERY_CQ and PRM_QUERY_MKEY. These segments contains the resource dump in PRM query format. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-06-23 17:26:10 +03:00
Maor Gottlieb	d63cc24933	net/mlx5: Export resource dump interface Export some of the resource dump API. mlx5_ib driver will use it in downstream patches. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-06-23 17:26:10 +03:00
Linus Torvalds	7561393908	Merge tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - One fix for the interrupt rework we did last release which broke KVM-PR - Three commits fixing some fallout from the READ_ONCE() changes interacting badly with our 8xx 16K pages support, which uses a pte_t that is a structure of 4 actual PTEs - A cleanup of the 8xx pte_update() to use the newly added pmd_off() - A fix for a crash when handling an oops if CONFIG_DEBUG_VIRTUAL is enabled - A minor fix for the SPU syscall generation Thanks to Aneesh Kumar K.V, Christian Zigotzky, Christophe Leroy, Mike Rapoport, Nicholas Piggin. * tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/8xx: Provide ptep_get() with 16k pages mm: Allow arches to provide ptep_get() mm/gup: Use huge_ptep_get() in gup_hugepte() powerpc/syscalls: Use the number when building SPU syscall table powerpc/8xx: use pmd_off() to access a PMD entry in pte_update() powerpc/64s: Fix KVM interrupt using wrong save area powerpc: Fix kernel crash in show_instructions() w/DEBUG_VIRTUAL	2020-06-21 10:02:53 -07:00
Linus Torvalds	93bbca271a	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - NULL dereference in octeontx - PM reference imbalance in ks-sa - deadlock in crypto manager - memory leak in drbg - missing socket limit check on receive SG list size in algif_skcipher - typos in caam - warnings in ccp and hisilicon * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: drbg - always try to free Jitter RNG instance crypto: marvell/octeontx - Fix a potential NULL dereference crypto: algboss - don't wait during notifier callback crypto: caam - fix typos crypto: ccp - Fix sparse warnings in sev-dev crypto: hisilicon - Cap block size at 2^31 crypto: algif_skcipher - Cap recv SG list at ctx->used hwrng: ks-sa - Fix runtime PM imbalance on error	2020-06-21 10:01:03 -07:00
Linus Torvalds	64677779e8	Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "One minor fix and two patches reworking the ata dma drain for the !CONFIG_LIBATA case. The latter is a 5.7 regression fix" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: Wire up ata_scsi_dma_need_drain for SAS HBA drivers scsi: libata: Provide an ata_scsi_dma_need_drain stub for !CONFIG_ATA scsi: ufs-bsg: Fix runtime PM imbalance on error	2020-06-20 19:23:13 -07:00
Linus Torvalds	a5c6a1f0fe	Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - a small collection of remaining API conversion patches (all acked) which allow to finally remove the deprecated API - some documentation fixes and a MAINTAINERS addition * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: MAINTAINERS: Add robert and myself as qcom i2c cci maintainers i2c: smbus: Fix spelling mistake in the comments Documentation/i2c: SMBus start signal is S not A i2c: remove deprecated i2c_new_device API Documentation: media: convert to use i2c_new_client_device() video: backlight: tosa_lcd: convert to use i2c_new_client_device() x86/platform/intel-mid: convert to use i2c_new_client_device() drm: encoder_slave: use new I2C API drm: encoder_slave: fix refcouting error for modules	2020-06-20 19:18:27 -07:00
Linus Torvalds	8b6ddd10d6	Merge tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Have recordmcount work with > 64K sections (to support LTO) - kprobe RCU fixes - Correct a kprobe critical section with missing mutex - Remove redundant arch_disarm_kprobe() call - Fix lockup when kretprobe triggers within kprobe_flush_task() - Fix memory leak in fetch_op_data operations - Fix sleep in atomic in ftrace trace array sample code - Free up memory on failure in sample trace array code - Fix incorrect reporting of function_graph fields in format file - Fix quote within quote parsing in bootconfig - Fix return value of bootconfig tool - Add testcases for bootconfig tool - Fix maybe uninitialized warning in ftrace pid file code - Remove unused variable in tracing_iter_reset() - Fix some typos * tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Fix maybe-uninitialized compiler warning tools/bootconfig: Add testcase for show-command and quotes test tools/bootconfig: Fix to return 0 if succeeded to show the bootconfig tools/bootconfig: Fix to use correct quotes for value proc/bootconfig: Fix to use correct quotes for value tracing: Remove unused event variable in tracing_iter_reset tracing/probe: Fix memleak in fetch_op_data operations trace: Fix typo in allocate_ftrace_ops()'s comment tracing: Make ftrace packed events have align of 1 sample-trace-array: Remove trace_array 'sample-instance' sample-trace-array: Fix sleeping function called from invalid context kretprobe: Prevent triggering kretprobe from within kprobe_flush_task kprobes: Remove redundant arch_disarm_kprobe() call kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex kprobes: Use non RCU traversal APIs on kprobe_tables if possible kprobes: Suppress the suspicious RCU warning on kprobes recordmcount: support >64k sections	2020-06-20 13:17:47 -07:00
Linus Torvalds	eede2b9b3f	Merge tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: "A feature (papr_scm health retrieval) and a fix (sysfs attribute visibility) for v5.8. Vaibhav explains in the merge commit below why missing v5.8 would be painful and I agreed to try a -rc2 pull because only cosmetics kept this out of -rc1 and his initial versions were posted in more than enough time for v5.8 consideration: 'These patches are tied to specific features that were committed to customers in upcoming distros releases (RHEL and SLES) whose time-lines are tied to 5.8 kernel release. Being able to track the health of an nvdimm is critical for our customers that are running workloads leveraging papr-scm nvdimms. Missing the 5.8 kernel would mean missing the distro timelines and shifting forward the availability of this feature in distro kernels by at least 6 months' Summary: - Fix the visibility of the region 'align' attribute. The new unit tests for region alignment handling caught a corner case where the alignment cannot be specified if the region is converted from static to dynamic provisioning at runtime. - Add support for device health retrieval for the persistent memory supported by the papr_scm driver. This includes both the standard sysfs "health flags" that the nfit persistent memory driver publishes and a mechanism for the ndctl tool to retrieve a health-command payload" * tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm/region: always show the 'align' attribute powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl() powerpc/papr_scm: Fetch nvdimm health information from PHYP seq_buf: Export seq_buf_printf powerpc: Document details on H_SCM_HEALTH hcall	2020-06-20 13:13:21 -07:00
Christophe Leroy	481e980a7c	mm: Allow arches to provide ptep_get() Since commit `9e343b467c` ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses") it is not possible anymore to use READ_ONCE() to access complex page table entries like the one defined for powerpc 8xx with 16k size pages. Define a ptep_get() helper that architectures can override instead of performing a READ_ONCE() on the page table entry pointer. Fixes: `9e343b467c` ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/087fa12b6e920e32315136b998aa834f99242695.1592225558.git.christophe.leroy@csgroup.eu	2020-06-20 22:14:53 +10:00
Linus Torvalds	d2b1c81f5f	Merge tag 'block-5.8-2020-06-19' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: - Use import_uuid() where appropriate (Andy) - bcache fixes (Coly, Mauricio, Zhiqiang) - blktrace sparse warnings fix (Jan) - blktrace concurrent setup fix (Luis) - blkdev_get use-after-free fix (Jason) - Ensure all blk-mq maps are updated (Weiping) - Loop invalidate bdev fix (Zheng) * tag 'block-5.8-2020-06-19' of git://git.kernel.dk/linux-block: block: make function 'kill_bdev' static loop: replace kill_bdev with invalidate_bdev partitions/ldm: Replace uuid_copy() with import_uuid() where it makes sense block: update hctx map when use multiple maps blktrace: Avoid sparse warnings when assigning q->blk_trace blktrace: break out of blktrace setup on concurrent calls block: Fix use-after-free in blkdev_get() trace/events/block.h: drop kernel-doc for dropped function parameter blk-mq: Remove redundant 'return' statement bcache: pr_info() format clean up in bcache_device_init() bcache: use delayed kworker fo asynchronous devices registration bcache: check and adjust logical block size for backing devices bcache: fix potential deadlock problem in btree_gc_coalesce	2020-06-19 13:11:26 -07:00
Linus Torvalds	592be758f1	Merge tag 'libata-5.8-2020-06-19' of git://git.kernel.dk/linux-block Pull libata fixes from Jens Axboe: "A few minor changes that should go into this release" * tag 'libata-5.8-2020-06-19' of git://git.kernel.dk/linux-block: libata: Use per port sync for detach ata/libata: Fix usage of page address by page_address in ata_scsi_mode_select_xlat function sata_rcar: handle pm_runtime_get_sync failure cases	2020-06-19 13:09:40 -07:00
Linus Torvalds	672f9255a7	Merge tag 'ceph-for-5.8-rc2' of git://github.com/ceph/ceph-client Pull ceph fixes from Ilya Dryomov: "An important follow-up for replica reads support that went into -rc1 and two target_copy() fixups" * tag 'ceph-for-5.8-rc2' of git://github.com/ceph/ceph-client: libceph: don't omit used_replica in target_copy() libceph: don't omit recovery_deletes in target_copy() libceph: move away from global osd_req_flags	2020-06-19 12:25:04 -07:00
Linus Torvalds	98b769942c	Merge tag 'overflow-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull flex-array size helper from Kees Cook: "During the treewide clean-ups of zero-length "flexible arrays", the struct_size() helper was heavily used, but it was noticed that many times it would have been nice to have an additional helper to get the size of just the flexible array itself. This need appears to be even more common when cleaning up the 1-byte array "flexible arrays", so Gustavo implemented it. I'd love to get this landed early so it can be used during the v5.9 dev cycle to ease the 1-byte array cleanups." * tag 'overflow-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: overflow.h: Add flex_array_size() helper	2020-06-19 11:45:03 -07:00
Wolfram Sang	390fd0475a	i2c: remove deprecated i2c_new_device API All in-tree users have been converted to the new i2c_new_client_device function, so remove this deprecated one. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2020-06-19 09:20:28 +02:00
Linus Torvalds	5e857ce6ea	Merge branch 'hch' (maccess patches from Christoph Hellwig) Merge non-faulting memory access cleanups from Christoph Hellwig: "Andrew and I decided to drop the patches implementing your suggested rename of the probe_kernel_* and probe_user_* helpers from -mm as there were way to many conflicts. After -rc1 might be a good time for this as all the conflicts are resolved now" This also adds a type safety checking patch on top of the renaming series to make the subtle behavioral difference between 'get_user()' and 'get_kernel_nofault()' less potentially dangerous and surprising. * emailed patches from Christoph Hellwig <hch@lst.de>: maccess: make get_kernel_nofault() check for minimal type compatibility maccess: rename probe_kernel_address to get_kernel_nofault maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault	2020-06-18 12:35:51 -07:00
Linus Torvalds	0c389d89ab	maccess: make get_kernel_nofault() check for minimal type compatibility Now that we've renamed probe_kernel_address() to get_kernel_nofault() and made it look and behave more in line with get_user(), some of the subtle type behavior differences end up being more obvious and possibly dangerous. When you do get_user(val, user_ptr); the type of the access comes from the "user_ptr" part, and the above basically acts as val = user_ptr; by design (except, of course, for the fact that the actual dereference is done with a user access). Note how in the above case, the type of the end result comes from the pointer argument, and then the value is cast to the type of 'val' as part of the assignment. So the type of the pointer is ultimately the more important type both for the access itself. But 'get_kernel_nofault()' may now _look_ similar, but it behaves very differently. When you do get_kernel_nofault(val, kernel_ptr); it behaves like val = (typeof(val) )kernel_ptr; except, of course, for the fact that the actual dereference is done with exception handling so that a faulting access is suppressed and returned as the error code. But note how different the casting behavior of the two superficially similar accesses are: one does the actual access in the size of the type the pointer points to, while the other does the access in the size of the target, and ignores the pointer type entirely. Actually changing get_kernel_nofault() to act like get_user() is almost certainly the right thing to do eventually, but in the meantime this patch adds logit to at least verify that the pointer type is compatible with the type of the result. In many cases, this involves just casting the pointer to 'void ' to make it obvious that the type of the pointer is not the important part. It's not how 'get_user()' acts, but at least the behavioral difference is now obvious and explicit. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-18 12:10:37 -07:00
Christoph Hellwig	25f12ae45f	maccess: rename probe_kernel_address to get_kernel_nofault Better describe what this helper does, and match the naming of copy_from_kernel_nofault. Also switch the argument order around, so that it acts and looks like get_user(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-18 11:14:40 -07:00
Luc Van Oostenryck	670d0a4b10	sparse: use identifiers to define address spaces Currently, address spaces in warnings are displayed as '<asn:X>' with 'X' being the address space's arbitrary number. But since sparse v0.6.0-rc1 (late December 2018), sparse allows you to define the address spaces using an identifier instead of a number. This identifier is then directly used in the warnings. So, use the identifiers '__user', '__iomem', '__percpu' & '__rcu' for the corresponding address spaces. The default address space, __kernel, being not displayed in warnings, stays defined as '0'. With this change, warnings that used to be displayed as: cast removes address space '<asn:1>' of expression ... void [noderef] <asn:2> * will now be displayed as: cast removes address space '__user' of expression ... void [noderef] __iomem * This also moves the __kernel annotation to be the first one, since it is quite different from the others because it's the default one, and so: - it's never displayed - it's normally not needed, nor in type annotations, nor in cast between address spaces. The only time it's needed is when it's combined with a typeof to express "the same type as this one but without the address space" - it can't be defined with a name, '0' must be used. So, it seemed strange to me to have it in the middle of the other ones. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-18 10:05:23 -07:00
Zheng Bin	3373a3461a	block: make function 'kill_bdev' static kill_bdev does not have any external user, so make it static. Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-06-18 09:24:35 -06:00
Kai-Heng Feng	b5292111de	libata: Use per port sync for detach Commit `130f4caf14` ("libata: Ensure ata_port probe has completed before detach") may cause system freeze during suspend. Using async_synchronize_full() in PM callbacks is wrong, since async callbacks that are already scheduled may wait for not-yet-scheduled callbacks, causes a circular dependency. Instead of using big hammer like async_synchronize_full(), use async cookie to make sure port probe are synced, without affecting other scheduled PM callbacks. Fixes: `130f4caf14` ("libata: Ensure ata_port probe has completed before detach") Suggested-by: John Garry <john.garry@huawei.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: John Garry <john.garry@huawei.com> BugLink: https://bugs.launchpad.net/bugs/1867983 Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-06-18 09:21:40 -06:00
Christoph Hellwig	c0ee37e85e	maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault Better describe what these functions do. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-17 10:57:41 -07:00
Christoph Hellwig	fe557319aa	maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault Better describe what these functions do. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-17 10:57:41 -07:00
Gustavo A. R. Silva	b19d57d0f3	overflow.h: Add flex_array_size() helper Add flex_array_size() helper for the calculation of the size, in bytes, of a flexible array member contained within an enclosing structure. Example of usage: struct something { size_t count; struct foo items[]; }; struct something *instance; instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL); instance->count = count; memcpy(instance->items, src, flex_array_size(instance, items, instance->count)); The helper returns SIZE_MAX on overflow instead of wrapping around. Additionally replaces parameter "n" with "count" in struct_size() helper for greater clarity and unification. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20200609012233.GA3371@embeddedor Signed-off-by: Kees Cook <keescook@chromium.org>	2020-06-16 20:45:08 -07:00
Jiri Olsa	9b38cc704e	kretprobe: Prevent triggering kretprobe from within kprobe_flush_task Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave. My test was also able to trigger lockdep output: ============================================ WARNING: possible recursive locking detected 5.6.0-rc6+ #6 Not tainted -------------------------------------------- sched-messaging/2767 is trying to acquire lock: ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0 but task is already holding lock: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(kretprobe_table_locks[i].lock)); lock(&(kretprobe_table_locks[i].lock)); * DEADLOCK * May be due to missing lock nesting notation 1 lock held by sched-messaging/2767: #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50 stack backtrace: CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6 Call Trace: dump_stack+0x96/0xe0 __lock_acquire.cold.57+0x173/0x2b7 ? native_queued_spin_lock_slowpath+0x42b/0x9e0 ? lockdep_hardirqs_on+0x590/0x590 ? __lock_acquire+0xf63/0x4030 lock_acquire+0x15a/0x3d0 ? kretprobe_hash_lock+0x52/0xa0 _raw_spin_lock_irqsave+0x36/0x70 ? kretprobe_hash_lock+0x52/0xa0 kretprobe_hash_lock+0x52/0xa0 trampoline_handler+0xf8/0x940 ? kprobe_fault_handler+0x380/0x380 ? find_held_lock+0x3a/0x1c0 kretprobe_trampoline+0x25/0x50 ? lock_acquired+0x392/0xbc0 ? _raw_spin_lock_irqsave+0x50/0x70 ? __get_valid_kprobe+0x1f0/0x1f0 ? _raw_spin_unlock_irqrestore+0x3b/0x40 ? finish_task_switch+0x4b9/0x6d0 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 The code within the kretprobe handler checks for probe reentrancy, so we won't trigger any _raw_spin_lock_irqsave probe in there. The problem is in outside kprobe_flush_task, where we call: kprobe_flush_task kretprobe_table_lock raw_spin_lock_irqsave _raw_spin_lock_irqsave where _raw_spin_lock_irqsave triggers the kretprobe and installs kretprobe_trampoline handler on _raw_spin_lock_irqsave return. The kretprobe_trampoline handler is then executed with already locked kretprobe_table_locks, and first thing it does is to lock kretprobe_table_locks ;-) the whole lockup path like: kprobe_flush_task kretprobe_table_lock raw_spin_lock_irqsave _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed ---> kretprobe_table_locks locked kretprobe_trampoline trampoline_handler kretprobe_hash_lock(current, &head, &flags); <--- deadlock Adding kprobe_busy_begin/end helpers that mark code with fake probe installed to prevent triggering of another kprobe within this code. Using these helpers in kprobe_flush_task, so the probe recursion protection check is hit and the probe is never set to prevent above lockup. Link: http://lkml.kernel.org/r/158927059835.27680.7011202830041561604.stgit@devnote2 Fixes: `ef53d9c5e4` ("kprobes: improve kretprobe scalability with hashed locking") Cc: Ingo Molnar <mingo@kernel.org> Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org> Cc: Anders Roxell <anders.roxell@linaro.org> Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Reported-by: "Ziqian SUN (Zamir)" <zsun@redhat.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-06-16 21:21:01 -04:00
Linus Torvalds	69119673bd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from David Miller: 1) Don't get per-cpu pointer with preemption enabled in nft_set_pipapo, fix from Stefano Brivio. 2) Fix memory leak in ctnetlink, from Pablo Neira Ayuso. 3) Multiple definitions of MPTCP_PM_MAX_ADDR, from Geliang Tang. 4) Accidently disabling NAPI in non-error paths of macb_open(), from Charles Keepax. 5) Fix races between alx_stop and alx_remove, from Zekun Shen. 6) We forget to re-enable SRIOV during resume in bnxt_en driver, from Michael Chan. 7) Fix memory leak in ipv6_mc_destroy_dev(), from Wang Hai. 8) rxtx stats use wrong index in mvpp2 driver, from Sven Auhagen. 9) Fix memory leak in mptcp_subflow_create_socket error path, from Wei Yongjun. 10) We should not adjust the TCP window advertised when sending dup acks in non-SACK mode, because it won't be counted as a dup by the sender if the window size changes. From Eric Dumazet. 11) Destroy the right number of queues during remove in mvpp2 driver, from Sven Auhagen. 12) Various WOL and PM fixes to e1000 driver, from Chen Yu, Vaibhav Gupta, and Arnd Bergmann. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits) e1000e: fix unused-function warning e1000: use generic power management e1000e: Do not wake up the system via WOL if device wakeup is disabled lan743x: add MODULE_DEVICE_TABLE for module loading alias mlxsw: spectrum: Adjust headroom buffers for 8x ports bareudp: Fixed configuration to avoid having garbage values mvpp2: remove module bugfix tcp: grow window for OOO packets only for SACK flows mptcp: fix memory leak in mptcp_subflow_create_socket() netfilter: flowtable: Make nf_flow_table_offload_add/del_cb inline net/sched: act_ct: Make tcf_ct_flow_table_restore_skb inline net: dsa: sja1105: fix PTP timestamping with large tc-taprio cycles mvpp2: ethtool rxtx stats fix MAINTAINERS: switch to my private email for Renesas Ethernet drivers rocker: fix incorrect error handling in dma_rings_init test_objagg: Fix potential memory leak in error handling net: ethernet: mtk-star-emac: simplify interrupt handling mld: fix memory leak in ipv6_mc_destroy_dev() bnxt_en: Return from timer if interface is not in open state. bnxt_en: Fix AER reset logic on 57500 chips. ...	2020-06-16 17:44:54 -07:00
Linus Torvalds	ffbc93768e	Merge tag 'flex-array-conversions-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull flexible-array member conversions from Gustavo A. R. Silva: "Replace zero-length arrays with flexible-array members. Notice that all of these patches have been baking in linux-next for two development cycles now. There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. C99 introduced “flexible array members”, which lacks a numeric size for the array declaration entirely: struct something { size_t count; struct foo items[]; }; This is the way the kernel expects dynamically sized trailing elements to be declared. It allows the compiler to generate errors when the flexible array does not occur last in the structure, which helps to prevent some kind of undefined behavior[3] bugs from being inadvertently introduced to the codebase. It also allows the compiler to correctly analyze array sizes (via sizeof(), CONFIG_FORTIFY_SOURCE, and CONFIG_UBSAN_BOUNDS). For instance, there is no mechanism that warns us that the following application of the sizeof() operator to a zero-length array always results in zero: struct something { size_t count; struct foo items[0]; }; struct something instance; instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL); instance->count = count; size = sizeof(instance->items) instance->count; memcpy(instance->items, source, size); At the last line of code above, size turns out to be zero, when one might have thought it represents the total size in bytes of the dynamic memory recently allocated for the trailing array items. Here are a couple examples of this issue[4][5]. Instead, flexible array members have incomplete type, and so the sizeof() operator may not be applied[6], so any misuse of such operators will be immediately noticed at build time. The cleanest and least error-prone way to implement this is through the use of a flexible array member: struct something { size_t count; struct foo items[]; }; struct something instance; instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL); instance->count = count; size = sizeof(instance->items[0]) instance->count; memcpy(instance->items, source, size); instead" [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") [4] commit `f2cd32a443` ("rndis_wlan: Remove logically dead code") [5] commit `ab91c2a89f` ("tpm: eventlog: Replace zero-length array with flexible-array member") [6] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html * tag 'flex-array-conversions-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (41 commits) w1: Replace zero-length array with flexible-array tracing/probe: Replace zero-length array with flexible-array soc: ti: Replace zero-length array with flexible-array tifm: Replace zero-length array with flexible-array dmaengine: tegra-apb: Replace zero-length array with flexible-array stm class: Replace zero-length array with flexible-array Squashfs: Replace zero-length array with flexible-array ASoC: SOF: Replace zero-length array with flexible-array ima: Replace zero-length array with flexible-array sctp: Replace zero-length array with flexible-array phy: samsung: Replace zero-length array with flexible-array RxRPC: Replace zero-length array with flexible-array rapidio: Replace zero-length array with flexible-array media: pwc: Replace zero-length array with flexible-array firmware: pcdp: Replace zero-length array with flexible-array oprofile: Replace zero-length array with flexible-array block: Replace zero-length array with flexible-array tools/testing/nvdimm: Replace zero-length array with flexible-array libata: Replace zero-length array with flexible-array kprobes: Replace zero-length array with flexible-array ...	2020-06-16 17:23:57 -07:00

1 2 3 4 5 ...

121067 Commits