linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-06-16 21:58:07 -04:00

Author	SHA1	Message	Date
David S. Miller	446bf64b61	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Merge conflict of mlx5 resolved using instructions in merge commit `9566e650bf`. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-19 11:54:03 -07:00
Maxime Jourdan	60a039eb27	media: videodev2.h: add V4L2_FMT_FLAG_DYN_RESOLUTION Add an enum_fmt format flag to specifically tag coded formats where dynamic resolution switching is supported by the device. This is useful for some codec drivers that can support dynamic resolution switching for one or more of their listed coded formats. It allows userspace to know whether it should extract the video parameters itself, or if it can rely on the device to send V4L2_EVENT_SOURCE_CHANGE when such changes are detected. Signed-off-by: Maxime Jourdan <mjourdan@baylibre.com> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Reviewed-by: Alexandre Courbot <acourbot@chromium.org> Acked-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>	2019-08-19 14:56:31 -03:00
Hans Verkuil	2b770bee78	media: videodev2.h: add V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM Add an enum_fmt format flag to specifically tag coded formats where full bytestream parsing is supported by the device. Some stateful decoders are capable of fully parsing a bytestream, but others require that userspace pre-parses the bytestream into frames or fields (see the corresponding pixelformat descriptions for details). If this flag is set, then this pre-parsing step is not required (but still possible, of course). Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Reviewed-by: Alexandre Courbot <acourbot@chromium.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>	2019-08-19 14:41:45 -03:00
Linus Torvalds	06821504fd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from David Miller: 1) Fix jmp to 1st instruction in x64 JIT, from Alexei Starovoitov. 2) Severl kTLS fixes in mlx5 driver, from Tariq Toukan. 3) Fix severe performance regression due to lack of SKB coalescing of fragments during local delivery, from Guillaume Nault. 4) Error path memory leak in sch_taprio, from Ivan Khoronzhuk. 5) Fix batched events in skbedit packet action, from Roman Mashak. 6) Propagate VLAN TX offload to hw_enc_features in bond and team drivers, from Yue Haibing. 7) RXRPC local endpoint refcounting fix and read after free in rxrpc_queue_local(), from David Howells. 8) Fix endian bug in ibmveth multicast list handling, from Thomas Falcon. 9) Oops, make nlmsg_parse() wrap around the correct function, __nlmsg_parse not __nla_parse(). Fix from David Ahern. 10) Memleak in sctp_scend_reset_streams(), fro Zheng Bin. 11) Fix memory leak in cxgb4, from Wenwen Wang. 12) Yet another race in AF_PACKET, from Eric Dumazet. 13) Fix false detection of retransmit failures in tipc, from Tuong Lien. 14) Use after free in ravb_tstamp_skb, from Tho Vu. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits) ravb: Fix use-after-free ravb_tstamp_skb netfilter: nf_tables: map basechain priority to hardware priority net: sched: use major priority number as hardware priority wimax/i2400m: fix a memory leak bug net: cavium: fix driver name ibmvnic: Unmap DMA address of TX descriptor buffers after use bnxt_en: Fix to include flow direction in L2 key bnxt_en: Use correct src_fid to determine direction of the flow bnxt_en: Suppress HWRM errors for HWRM_NVM_GET_VARIABLE command bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails bnxt_en: Improve RX doorbell sequence. bnxt_en: Fix VNIC clearing logic for 57500 chips. net: kalmia: fix memory leaks cx82310_eth: fix a memory leak bug bnx2x: Fix VF's VLAN reconfiguration in reload. Bluetooth: Add debug setting for changing minimum encryption key size tipc: fix false detection of retransmit failures lan78xx: Fix memory leaks MAINTAINERS: r8169: Update path to the driver MAINTAINERS: PHY LIBRARY: Update files in the record ...	2019-08-19 10:00:01 -07:00
David Howells	555df336c7	keys: Fix description size The maximum key description size is 4095. Commit `f771fde820` ("keys: Simplify key description management") inadvertantly reduced that to 255 and made sizes between 256 and 4095 work weirdly, and any size whereby size & 255 == 0 would cause an assertion in __key_link_begin() at the following line: BUG_ON(index_key->desc_len == 0); This can be fixed by simply increasing the size of desc_len in struct keyring_index_key to a u16. Note the argument length test in keyutils only checked empty descriptions and descriptions with a size around the limit (ie. 4095) and not for all the values in between, so it missed this. This has been addressed and https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/commit/?id=066bf56807c26cd3045a25f355b34c1d8a20a5aa now exhaustively tests all possible lengths of type, description and payload and then some. The assertion failure looks something like: kernel BUG at security/keys/keyring.c:1245! ... RIP: 0010:__key_link_begin+0x88/0xa0 ... Call Trace: key_create_or_update+0x211/0x4b0 __x64_sys_add_key+0x101/0x200 do_syscall_64+0x5b/0x1e0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 It can be triggered by: keyctl add user "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" a @s Fixes: `f771fde820` ("keys: Simplify key description management") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-08-19 09:43:57 -07:00
Boris Brezillon	c3adb85745	media: uapi: h264: Get rid of the p0/b0/b1 ref-lists Those lists can be extracted from the dpb, let's simplify userspace life and build that list kernel-side (generic helpers will be provided for drivers that need this list). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Reviewed-by: Ezequiel Garcia <ezequiel@collabora.com> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>	2019-08-19 13:24:04 -03:00
Ezequiel Garcia	8cae93e090	media: uapi: h264: Add the concept of start code Stateless decoders have different expectations about the start code that is prepended on H264 slices. Add a menu control to express the supported start code types (including no start code). Drivers are allowed to support only one start code type, but they can support both too. Note that this is independent of the H264 decoding mode, which specifies the granularity of the decoding operations. Either in frame-based or slice-based mode, this new control will allow to define the start code expected on H264 slices. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>	2019-08-19 13:23:12 -03:00
Boris Brezillon	5604be66a5	media: uapi: h264: Add the concept of decoding mode Some stateless decoders don't support per-slice decoding granularity (or at least not in a way that would make them efficient or easy to use). Expose a menu to control the supported decoding modes. Drivers are allowed to support only one decoding but they can support both too. To fully specify the decoding operation, we need to introduce a start_byte_offset, to indicate where slices start. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>	2019-08-19 13:21:51 -03:00
Ezequiel Garcia	7bb3c32abd	media: uapi: h264: Rename pixel format The V4L2_PIX_FMT_H264_SLICE_RAW name was originally suggested because the pixel format would represent H264 slices without any start code. However, as we will now introduce a start code menu control, give the pixel format a more meaningful name, while it's still early enough to do so. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>	2019-08-19 13:16:08 -03:00
Rasmus Villemoes	4333fb96ca	media: lib/sort.c: implement sort() variant taking context argument Our list_sort() utility has always supported a context argument that is passed through to the comparison routine. Now there's a use case for the similar thing for sort(). This implements sort_r by simply extending the existing sort function in the obvious way. To avoid code duplication, we want to implement sort() in terms of sort_r(). The naive way to do that is static int cmp_wrapper(const void a, const void b, const void ctx) { int (real_cmp)(const void, const void) = ctx; return real_cmp(a, b); } sort(..., cmp) { sort_r(..., cmp_wrapper, cmp) } but this would do two indirect calls for each comparison. Instead, do as is done for the default swap functions - that only adds a cost of a single easily predicted branch to each comparison call. Aside from introducing support for the context argument, this also serves as preparation for patches that will eliminate the indirect comparison calls in common cases. Requested-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>	2019-08-19 13:14:53 -03:00
Trond Myklebust	b72679ee89	notify: export symbols for use by the knfsd file cache The knfsd file cache will need to detect when files are unlinked, so that it can close the associated cached files. Export a minimal set of notifier functions to allow it to do so. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:00:39 -04:00
Jeff Layton	18f6622ebb	locks: create a new notifier chain for lease attempts With the new file caching infrastructure in nfsd, we can end up holding files open for an indefinite period of time, even when they are still idle. This may prevent the kernel from handing out leases on the file, which is something we don't want to block. Fix this by running a SRCU notifier call chain whenever on any lease attempt. nfsd can then purge the cache for that inode before returning. Since SRCU is only conditionally compiled in, we must only define the new chain if it's enabled, and users of the chain must ensure that SRCU is enabled. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:00:39 -04:00
Jeff Layton	f69d6d8eef	sunrpc: add a new cache_detail operation for when a cache is flushed When the exports table is changed, exportfs will usually write a new time to the "flush" file in the nfsd.export cache procfile. This tells the kernel to flush any entries that are older than that value. This gives us a mechanism to tell whether an unexport might have occurred. Add a new ->flush cache_detail operation that is called after flushing the cache whenever someone writes to a "flush" file. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:00:39 -04:00
Chuck Lever	4866073e6d	svcrdma: Use llist for managing cache of recv_ctxts Use a wait-free mechanism for managing the svc_rdma_recv_ctxts free list. Subsequently, sc_recv_lock can be eliminated. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 10:59:28 -04:00
Chuck Lever	d6dfe43ec6	svcrdma: Remove svc_rdma_wq Clean up: the system workqueue will work just as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 10:59:28 -04:00
Junxiao Bi	988721db93	block: remove struct request_queue queue_head The dispatch list is not used any more, as the legacy block IO stack has been removed. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-08-19 08:55:10 -06:00
Thomas Gleixner	b6a32bbd87	genirq: Force interrupt threading on RT Switch force_irqthreads from a boot time modifiable variable to a compile time constant when CONFIG_PREEMPT_RT is enabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190816160923.12855-1-bigeasy@linutronix.de	2019-08-19 15:45:48 +02:00
Masahiro Yamada	38a429c898	netfilter: add include guard to nf_conntrack_h323_types.h Add a header include guard just in case. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-08-19 13:59:57 +02:00
Leonard Crestez	be378b6007	clk: imx8mn: Add GIC clock This is enabled by default but if it's not explicitly defined and marked as critical then its parent might get turned off. Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>	2019-08-19 13:54:40 +02:00
Eric W. Biederman	33da8e7c81	signal: Allow cifs and drbd to receive their terminating signals My recent to change to only use force_sig for a synchronous events wound up breaking signal reception cifs and drbd. I had overlooked the fact that by default kthreads start out with all signals set to SIG_IGN. So a change I thought was safe turned out to have made it impossible for those kernel thread to catch their signals. Reverting the work on force_sig is a bad idea because what the code was doing was very much a misuse of force_sig. As the way force_sig ultimately allowed the signal to happen was to change the signal handler to SIG_DFL. Which after the first signal will allow userspace to send signals to these kernel threads. At least for wake_ack_receiver in drbd that does not appear actively wrong. So correct this problem by adding allow_kernel_signal that will allow signals whose siginfo reports they were sent by the kernel through, but will not allow userspace generated signals, and update cifs and drbd to call allow_kernel_signal in an appropriate place so that their thread can receive this signal. Fixing things this way ensures that userspace won't be able to send signals and cause problems, that it is clear which signals the threads are expecting to receive, and it guarantees that nothing else in the system will be affected. This change was partly inspired by similar cifs and drbd patches that added allow_signal. Reported-by: ronnie sahlberg <ronniesahlberg@gmail.com> Reported-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Tested-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Cc: Steve French <smfrench@gmail.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: David Laight <David.Laight@ACULAB.COM> Fixes: `247bc9470b` ("cifs: fix rmmod regression in cifs.ko caused by force_sig changes") Fixes: `72abe3bcf0` ("signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig") Fixes: `fee109901f` ("signal/drbd: Use send_sig not force_sig") Fixes: `3cf5d076fb` ("signal: Remove task parameter from force_sig") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2019-08-19 06:34:13 -05:00
Juliana Rodrigueiro	89a26cd4b5	netfilter: xt_nfacct: Fix alignment mismatch in xt_nfacct_match_info When running a 64-bit kernel with a 32-bit iptables binary, the size of the xt_nfacct_match_info struct diverges. kernel: sizeof(struct xt_nfacct_match_info) : 40 iptables: sizeof(struct xt_nfacct_match_info)) : 36 Trying to append nfacct related rules results in an unhelpful message. Although it is suggested to look for more information in dmesg, nothing can be found there. # iptables -A <chain> -m nfacct --nfacct-name <acct-object> iptables: Invalid argument. Run `dmesg' for more information. This patch fixes the memory misalignment by enforcing 8-byte alignment within the struct's first revision. This solution is often used in many other uapi netfilter headers. Signed-off-by: Juliana Rodrigueiro <juliana.rodrigueiro@intra2net.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-08-19 09:34:21 +02:00
Greg Kroah-Hartman	7ffc95e90e	Merge 5.3-rc5 into usb-next We need the usb fixes in here as well for other patches to build on. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-08-19 07:15:42 +02:00
Greg Kroah-Hartman	c6d6832ce3	Merge 5.3-rc5 into staging-next We need the staging fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-08-19 07:13:42 +02:00
Greg Kroah-Hartman	e70c971d7d	Merge 5.3-rc5 into char-misc-next We need the char/misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-08-19 07:11:53 +02:00
Pablo Neira Ayuso	3bc158f8d0	netfilter: nf_tables: map basechain priority to hardware priority This patch adds initial support for offloading basechains using the priority range from 1 to 65535. This is restricting the netfilter priority range to 16-bit integer since this is what most drivers assume so far from tc. It should be possible to extend this range of supported priorities later on once drivers are updated to support for 32-bit integer priorities. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-18 14:13:23 -07:00
Pablo Neira Ayuso	ef01adae0e	net: sched: use major priority number as hardware priority tc transparently maps the software priority number to hardware. Update it to pass the major priority which is what most drivers expect. Update drivers too so they do not need to lshift the priority field of the flow_cls_common_offload object. The stmmac driver is an exception, since this code assumes the tc software priority is fine, therefore, lshift it just to be conservative. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-18 14:13:23 -07:00
Marc Zyngier	24cab82c34	KVM: arm/arm64: vgic: Add LPI translation cache definition Add the basic data structure that expresses an MSI to LPI translation as well as the allocation/release hooks. The size of the cache is arbitrarily defined as 16*nr_vcpus. Tested-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2019-08-18 18:38:35 +01:00
Linus Torvalds	359334caf7	Merge tag 'usb-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are number of small USB fixes for 5.3-rc5. Syzbot has been on a tear recently now that it has some good USB debugging hooks integrated, so there's a number of fixes in here found by those tools for some _very_ old bugs. Also a handful of gadget driver fixes for reported issues, some hopefully-final dma fixes for host controller drivers, and some new USB serial gadget driver ids. All of these have been in linux-next this week with no reported issues (the usb-serial ones were in linux-next in its own branch, but merged into mine on Friday)" * tag 'usb-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: add a hcd_uses_dma helper usb: don't create dma pools for HCDs with a localmem_pool usb: chipidea: imx: fix EPROBE_DEFER support during driver probe usb: host: fotg2: restart hcd after port reset USB: CDC: fix sanity checks in CDC union parser usb: cdc-acm: make sure a refcount is taken early enough USB: serial: option: add the BroadMobi BM818 card USB: serial: option: Add Motorola modem UARTs USB: core: Fix races in character device registration and deregistraion usb: gadget: mass_storage: Fix races between fsg_disable and fsg_set_alt usb: gadget: composite: Clear "suspended" on reset/disconnect usb: gadget: udc: renesas_usb3: Fix sysfs interface of "role" USB: serial: option: add D-Link DWM-222 device ID USB: serial: option: Add support for ZTE MF871A	2019-08-18 09:11:29 -07:00
Linus Torvalds	8fde2832bd	Merge tag 'for-linus-2019-08-17' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "A collection of fixes that should go into this series. This contains: - Revert of the REQ_NOWAIT_INLINE and associated dio changes. There were still corner cases there, and even though I had a solution for it, it's too involved for this stage. (me) - Set of NVMe fixes (via Sagi) - io_uring fix for fixed buffers (Anthony) - io_uring defer issue fix (Jackie) - Regression fix for queue sync at exit time (zhengbin) - xen blk-back memory leak fix (Wenwen)" * tag 'for-linus-2019-08-17' of git://git.kernel.dk/linux-block: io_uring: fix an issue when IOSQE_IO_LINK is inserted into defer list block: remove REQ_NOWAIT_INLINE io_uring: fix manual setup of iov_iter for fixed buffers xen/blkback: fix memory leaks blk-mq: move cancel of requeue_work to the front of blk_exit_queue nvme-pci: Fix async probe remove race nvme: fix controller removal race with scan work nvme-rdma: fix possible use-after-free in connect error flow nvme: fix a possible deadlock when passthru commands sent to a multipath device nvme-core: Fix extra device_put() call on error path nvmet-file: fix nvmet_file_flush() always returning an error nvmet-loop: Flush nvme_delete_wq when removing the port nvmet: Fix use-after-free bug when a port is removed nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns	2019-08-17 19:39:54 -07:00
Björn Töpel	0402acd683	xsk: remove AF_XDP socket from map when the socket is released When an AF_XDP socket is released/closed the XSKMAP still holds a reference to the socket in a "released" state. The socket will still use the netdev queue resource, and block newly created sockets from attaching to that queue, but no user application can access the fill/complete/rx/tx queues. This results in that all applications need to explicitly clear the map entry from the old "zombie state" socket. This should be done automatically. In this patch, the sockets tracks, and have a reference to, which maps it resides in. When the socket is released, it will remove itself from all maps. Suggested-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-17 23:24:45 +02:00
Stanislav Fomichev	8f51dfc73b	bpf: support cloning sk storage on accept() Add new helper bpf_sk_storage_clone which optionally clones sk storage and call it from sk_clone_lock. Cc: Martin KaFai Lau <kafai@fb.com> Cc: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-17 23:18:54 +02:00
Stanislav Fomichev	b0e4701ce1	bpf: export bpf_map_inc_not_zero Rename existing bpf_map_inc_not_zero to __bpf_map_inc_not_zero to indicate that it's caller's responsibility to do proper locking. Create and export bpf_map_inc_not_zero wrapper that properly locks map_idr_lock. Will be used in the next commit to hold a map while cloning a socket. Cc: Martin KaFai Lau <kafai@fb.com> Cc: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-17 23:18:54 +02:00
Magnus Karlsson	77cd0d7b3f	xsk: add support for need_wakeup flag in AF_XDP rings This commit adds support for a new flag called need_wakeup in the AF_XDP Tx and fill rings. When this flag is set, it means that the application has to explicitly wake up the kernel Rx (for the bit in the fill ring) or kernel Tx (for bit in the Tx ring) processing by issuing a syscall. Poll() can wake up both depending on the flags submitted and sendto() will wake up tx processing only. The main reason for introducing this new flag is to be able to efficiently support the case when application and driver is executing on the same core. Previously, the driver was just busy-spinning on the fill ring if it ran out of buffers in the HW and there were none on the fill ring. This approach works when the application is running on another core as it can replenish the fill ring while the driver is busy-spinning. Though, this is a lousy approach if both of them are running on the same core as the probability of the fill ring getting more entries when the driver is busy-spinning is zero. With this new feature the driver now sets the need_wakeup flag and returns to the application. The application can then replenish the fill queue and then explicitly wake up the Rx processing in the kernel using the syscall poll(). For Tx, the flag is only set to one if the driver has no outstanding Tx completion interrupts. If it has some, the flag is zero as it will be woken up by a completion interrupt anyway. As a nice side effect, this new flag also improves the performance of the case where application and driver are running on two different cores as it reduces the number of syscalls to the kernel. The kernel tells user space if it needs to be woken up by a syscall, and this eliminates many of the syscalls. This flag needs some simple driver support. If the driver does not support this, the Rx flag is always zero and the Tx flag is always one. This makes any application relying on this feature default to the old behaviour of not requiring any syscalls in the Rx path and always having to call sendto() in the Tx path. For backwards compatibility reasons, this feature has to be explicitly turned on using a new bind flag (XDP_USE_NEED_WAKEUP). I recommend that you always turn it on as it so far always have had a positive performance impact. The name and inspiration of the flag has been taken from io_uring by Jens Axboe. Details about this feature in io_uring can be found in http://kernel.dk/io_uring.pdf, section 8.3. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-17 23:07:32 +02:00
Magnus Karlsson	9116e5e2b1	xsk: replace ndo_xsk_async_xmit with ndo_xsk_wakeup This commit replaces ndo_xsk_async_xmit with ndo_xsk_wakeup. This new ndo provides the same functionality as before but with the addition of a new flags field that is used to specifiy if Rx, Tx or both should be woken up. The previous ndo only woke up Tx, as implied by the name. The i40e and ixgbe drivers (which are all the supported ones) are updated with this new interface. This new ndo will be used by the new need_wakeup functionality of XDP sockets that need to be able to wake up both Rx and Tx driver processing. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-17 23:07:31 +02:00
Ido Schimmel	f3047ca01f	Documentation: Add devlink-trap documentation Add initial documentation of the devlink-trap mechanism, explaining the background, motivation and the semantics of the interface. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-17 12:40:08 -07:00
Ido Schimmel	391203ab11	devlink: Add generic packet traps and groups Add generic packet traps and groups that can report dropped packets as well as exceptions such as TTL error. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-17 12:40:08 -07:00
Ido Schimmel	0f420b6c52	devlink: Add packet trap infrastructure Add the basic packet trap infrastructure that allows device drivers to register their supported packet traps and trap groups with devlink. Each driver is expected to provide basic information about each supported trap, such as name and ID, but also the supported metadata types that will accompany each packet trapped via the trap. The currently supported metadata type is just the input port, but more will be added in the future. For example, output port and traffic class. Trap groups allow users to set the action of all member traps. In addition, users can retrieve per-group statistics in case per-trap statistics are too narrow. In the future, the trap group object can be extended with more attributes, such as policer settings which will limit the amount of traffic generated by member traps towards the CPU. Beside registering their packet traps with devlink, drivers are also expected to report trapped packets to devlink along with relevant metadata. devlink will maintain packets and bytes statistics for each packet trap and will potentially report the trapped packet with its metadata to user space via drop monitor netlink channel. The interface towards the drivers is simple and allows devlink to set the action of the trap. Currently, only two actions are supported: 'trap' and 'drop'. When set to 'trap', the device is expected to provide the sole copy of the packet to the driver which will pass it to devlink. When set to 'drop', the device is expected to drop the packet and not send a copy to the driver. In the future, more actions can be added, such as 'mirror'. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-17 12:40:08 -07:00
Ido Schimmel	8e94c3bc92	drop_monitor: Allow user to start monitoring hardware drops Drop monitor has start and stop commands, but so far these were only used to start and stop monitoring of software drops. Now that drop monitor can also monitor hardware drops, we should allow the user to control these as well. Do that by adding SW and HW flags to these commands. If no flag is specified, then only start / stop monitoring software drops. This is done in order to maintain backward-compatibility with existing user space applications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-17 12:40:08 -07:00
Ido Schimmel	d40e1deb93	drop_monitor: Add support for summary alert mode for hardware drops In summary alert mode a notification is sent with a list of recent drop reasons and a count of how many packets were dropped due to this reason. To avoid expensive operations in the context in which packets are dropped, each CPU holds an array whose number of entries is the maximum number of drop reasons that can be encoded in the netlink notification. Each entry stores the drop reason and a count. When a packet is dropped the array is traversed and a new entry is created or the count of an existing entry is incremented. Later, in process context, the array is replaced with a newly allocated copy and the old array is encoded in a netlink notification. To avoid breaking user space, the notification includes the ancillary header, which is 'struct net_dm_alert_msg' with number of entries set to '0'. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-17 12:40:08 -07:00
Ido Schimmel	5e58109b1e	drop_monitor: Add support for packet alert mode for hardware drops In a similar fashion to software drops, extend drop monitor to send netlink events when packets are dropped by the underlying hardware. The main difference is that instead of encoding the program counter (PC) from which kfree_skb() was called in the netlink message, we encode the hardware trap name. The two are mostly equivalent since they should both help the user understand why the packet was dropped. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-17 12:40:08 -07:00
Ido Schimmel	edd3d0074c	drop_monitor: Add basic infrastructure for hardware drops Export a function that can be invoked in order to report packets that were dropped by the underlying hardware along with metadata. Subsequent patches will add support for the different alert modes. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-17 12:40:08 -07:00
David S. Miller	42eb455470	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== pull request: bluetooth 2019-08-17 Here's a set of Bluetooth fixes for the 5.3-rc series: - Multiple fixes for Qualcomm (btqca & hci_qca) drivers - Minimum encryption key size debugfs setting (this is required for Bluetooth Qualification) - Fix hidp_send_message() to have a meaningful return value ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-17 12:37:47 -07:00
Heiner Kallweit	4b9cb2a5ce	net: phy: remove genphy_config_init Now that all users have been removed we can remove genphy_config_init. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-17 12:34:50 -07:00
Chris Wilson	f2cb60e9a3	dma-fence: Store the timestamp in the same union as the cb_list The timestamp and the cb_list are mutually exclusive, the cb_list can only be added to prior to being signaled (and once signaled we drain), while the timestamp is only valid upon being signaled. Both the timestamp and the cb_list are only valid while the fence is alive, and as soon as no references are held can be replaced by the rcu_head. By reusing the union for the timestamp, we squeeze the base dma_fence struct to 64 bytes on x86-64. v2: Sort the union chronologically Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Christian König <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com>. Link: https://patchwork.freedesktop.org/patch/msgid/20190817153022.5749-1-chris@chris-wilson.co.uk	2019-08-17 18:46:33 +01:00
Chris Wilson	4fe3997a68	dma-fence: Shrink size of struct dma_fence Rearrange the couple of 32-bit atomics hidden amongst the field of pointers that unnecessarily caused the compiler to insert some padding, shrinks the size of the base struct dma_fence from 80 to 72 bytes on x86-64. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190817144736.7826-3-chris@chris-wilson.co.uk	2019-08-17 18:02:49 +01:00
Marcel Holtmann	58a96fc353	Bluetooth: Add debug setting for changing minimum encryption key size For testing and qualification purposes it is useful to allow changing the minimum encryption key size value that the host stack is going to enforce. This adds a new debugfs setting min_encrypt_key_size to achieve this functionality. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2019-08-17 13:54:40 +03:00
Christoph Hellwig	f7bc6e42bf	drivers: remove the SGI SN2 IOC4 base support The IOC4 is a multi-function chip seen on SGI SN2 and some SGI MIPS systems. This removes the base driver, which while not having an SN2 Kconfig dependency was only for sub-drivers that had one. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lkml.kernel.org/r/20190813072514.23299-15-hch@lst.de Signed-off-by: Tony Luck <tony.luck@intel.com>	2019-08-16 11:33:57 -07:00
Stephen Boyd	0214f33c4e	clk: Overwrite clk_hw::init with NULL during clk_register() We don't want clk provider drivers to use the init structure after clk registration time, but we leave a dangling reference to it by means of clk_hw::init. Let's overwrite the member with NULL during clk_register() so that this can't be used anymore after registration time. Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Doug Anderson <dianders@chromium.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lkml.kernel.org/r/20190731193517.237136-10-sboyd@kernel.org Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com>	2019-08-16 10:27:29 -07:00
Linus Torvalds	2d63ba3e41	Merge tag 'pm-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These add a check to avoid recent suspend-to-idle power regression on systems with NVMe drives where the PCIe ASPM policy is "performance" (or when the kernel is built without ASPM support), fix an issue related to frequency limits in the schedutil cpufreq governor and fix a mistake related to the PM QoS usage in the cpufreq core introduced recently. Specifics: - Disable NVMe power optimization related to suspend-to-idle added recently on systems where PCIe ASPM is not able to put PCIe links into low-power states to prevent excess power from being drawn by the system while suspended (Rafael Wysocki). - Make the schedutil governor handle frequency limits changes properly in all cases (Viresh Kumar). - Prevent the cpufreq core from treating positive values returned by dev_pm_qos_update_request() as errors (Viresh Kumar)" * tag 'pm-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: nvme-pci: Allow PCI bus-level PM to be used if ASPM is disabled PCI/ASPM: Add pcie_aspm_enabled() cpufreq: schedutil: Don't skip freq update when limits change cpufreq: dev_pm_qos_update_request() can return 1 on success	2019-08-16 09:13:16 -07:00
Jason Gunthorpe	2c7933f53f	mm/mmu_notifiers: add a get/put scheme for the registration Many places in the kernel have a flow where userspace will create some object and that object will need to connect to the subsystem's mmu_notifier subscription for the duration of its lifetime. In this case the subsystem is usually tracking multiple mm_structs and it is difficult to keep track of what struct mmu_notifier's have been allocated for what mm's. Since this has been open coded in a variety of exciting ways, provide core functionality to do this safely. This approach uses the struct mmu_notifier_ops * as a key to determine if the subsystem has a notifier registered on the mm or not. If there is a registration then the existing notifier struct is returned, otherwise the ops->alloc_notifiers() is used to create a new per-subsystem notifier for the mm. The destroy side incorporates an async call_srcu based destruction which will avoid bugs in the callers such as commit `6d7c3cde93` ("mm/hmm: fix use after free with struct hmm in the mmu notifiers"). Since we are inside the mmu notifier core locking is fairly simple, the allocation uses the same approach as for mmu_notifier_mm, the write side of the mmap_sem makes everything deterministic and we only need to do hlist_add_head_rcu() under the mm_take_all_locks(). The new users count and the discoverability in the hlist is fully serialized by the mmu_notifier_mm->lock. Link: https://lore.kernel.org/r/20190806231548.25242-4-jgg@ziepe.ca Co-developed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-08-16 12:02:52 -03:00

... 18 19 20 21 22 ...

114223 Commits