linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-02 03:57:34 -04:00

Author	SHA1	Message	Date
Mike Christie	efd838fec1	vhost scsi: Add support for LUN resets. In newer versions of virtio-scsi we just reset the timer when an a command times out, so TMFs are never sent for the cmd time out case. However, in older kernels and for the TMF inject cases, we can still get resets and we end up just failing immediately so the guest might see the device get offlined and IO errors. For the older kernel cases, we want the same end result as the modern virtio-scsi driver where we let the lower levels fire their error handling and handle the problem. And at the upper levels we want to wait. This patch ties the LUN reset handling into the LIO TMF code which will just wait for outstanding commands to complete like we are doing in the modern virtio-scsi case. Note: I did not handle the ABORT case to keep this simple. For ABORTs LIO just waits on the cmd like how it does for the RESET case. If an ABORT fails, the guest OS ends up escalating to LUN RESET, so in the end we get the same behavior where we wait on the outstanding cmds. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/1604986403-4931-6-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-11-15 17:30:55 -05:00
Mike Christie	18f1becb69	vhost scsi: add lun parser helper Move code to parse lun from req's lun_buf to helper, so tmf code can use it in the next patch. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/1604986403-4931-5-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-11-15 17:30:55 -05:00
Mike Christie	47a3565e8b	vhost scsi: fix cmd completion race We might not do the final se_cmd put from vhost_scsi_complete_cmd_work. When the last put happens a little later then we could race where vhost_scsi_complete_cmd_work does vhost_signal, the guest runs and sends more IO, and vhost_scsi_handle_vq runs but does not find any free cmds. This patch has us delay completing the cmd until the last lio core ref is dropped. We then know that once we signal to the guest that the cmd is completed that if it queues a new command it will find a free cmd. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/1604986403-4931-4-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-11-15 17:30:55 -05:00
Mike Christie	25b98b64e2	vhost scsi: alloc cmds per vq instead of session We currently are limited to 256 cmds per session. This leads to problems where if the user has increased virtqueue_size to more than 2 or cmd_per_lun to more than 256 vhost_scsi_get_tag can fail and the guest will get IO errors. This patch moves the cmd allocation to per vq so we can easily match whatever the user has specified for num_queues and virtqueue_size/cmd_per_lun. It also makes it easier to control how much memory we preallocate. For cases, where perf is not as important and we can use the current defaults (1 vq and 128 cmds per vq) memory use from preallocate cmds is cut in half. For cases, where we are willing to use more memory for higher perf, cmd mem use will now increase as the num queues and queue depth increases. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/1604986403-4931-3-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-11-15 17:30:55 -05:00
Mike Christie	6bcf34224a	vhost: add helper to check if a vq has been setup This adds a helper check if a vq has been setup. The next patches will use this when we move the vhost scsi cmd preallocation from per session to per vq. In the per vq case, we only want to allocate cmds for vqs that have actually been setup and not for all the possible vqs. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/1604986403-4931-2-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-11-15 17:30:54 -05:00
Laurent Vivier	a312db697c	vdpasim: fix "mac_pton" undefined error ERROR: modpost: "mac_pton" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined! mac_pton() is defined in lib/net_utils.c and is not built if NET is not set. Select GENERIC_NET_UTILS as vdpasim doesn't depend on NET. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Link: https://lore.kernel.org/r/20201113155706.599434-1-lvivier@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested	2020-11-15 17:30:54 -05:00
Stephen Rothwell	f51778db08	swiotlb: using SIZE_MAX needs limits.h included After merging the drm-misc tree, linux-next build (arm multi_v7_defconfig) failed like this: In file included from drivers/gpu/drm/nouveau/nouveau_ttm.c:26: include/linux/swiotlb.h: In function 'swiotlb_max_mapping_size': include/linux/swiotlb.h:99:9: error: 'SIZE_MAX' undeclared (first use in this function) 99 \| return SIZE_MAX; \| ^~~~~~~~ include/linux/swiotlb.h:7:1: note: 'SIZE_MAX' is defined in header '<stdint.h>'; did you forget to '#include <stdint.h>'? 6 \| #include <linux/init.h> +++ \|+#include <stdint.h> 7 \| #include <linux/types.h> include/linux/swiotlb.h:99:9: note: each undeclared identifier is reported only once for each function it appears in 99 \| return SIZE_MAX; \| ^~~~~~~~ Caused by commit `abe420bfae` ("swiotlb: Introduce swiotlb_max_mapping_size()") but only exposed by commit "drm/nouveu: fix swiotlb include" Fix it by including linux/limits.h as appropriate. Fixes: `abe420bfae` ("swiotlb: Introduce swiotlb_max_mapping_size()") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/r/20201102124327.2f82b2a7@canb.auug.org.au Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-11-02 06:05:05 -05:00
Laurent Vivier	0c86d77488	vdpasim: allow to assign a MAC address Add macaddr parameter to the module to set the MAC address to use Signed-off-by: Laurent Vivier <lvivier@redhat.com> Link: https://lore.kernel.org/r/20201029122050.776445-3-lvivier@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2020-10-30 04:04:35 -04:00
Laurent Vivier	4a6a42db53	vdpasim: fix MAC address configuration vdpa_sim generates a ramdom MAC address but it is never used by upper layers because the VIRTIO_NET_F_MAC bit is not set in the features list. Because of that, virtio-net always regenerates a random MAC address each time it is loaded whereas the address should only change on vdpa_sim load/unload. Fix that by adding VIRTIO_NET_F_MAC in the features list of vdpa_sim. Fixes: `2c53d0f64c` ("vdpasim: vDPA device simulator") Cc: jasowang@redhat.com Signed-off-by: Laurent Vivier <lvivier@redhat.com> Link: https://lore.kernel.org/r/20201029122050.776445-2-lvivier@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2020-10-30 04:04:35 -04:00
Zhu Lingshan	e01afe36df	vdpa: handle irq bypass register failure case LKP considered variable 'ret' in vhost_vdpa_setup_vq_irq() as a unused variable, so suggest we remove it. Actually it stores return value of irq_bypass_register_producer(), but we did not check it, we should handle the failure case. This commit will print a message if irq bypass register producer fail, in this case, vqs still remain functional. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20201023104046.404794-1-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2020-10-30 04:02:53 -04:00
Laurent Vivier	1eca16b231	vdpa_sim: Fix DMA mask Since commit `f959dcd6dd` ("dma-direct: Fix potential NULL pointer dereference") an error is reported when we load vdpa_sim and virtio-vdpa: [ 129.351207] net eth0: Unexpected TXQ (0) queue failure: -12 It seems that dma_mask is not initialized. This patch initializes dma_mask() and calls dma_set_mask_and_coherent() to fix the problem. Full log: [ 128.548628] ------------[ cut here ]------------ [ 128.553268] WARNING: CPU: 23 PID: 1105 at kernel/dma/mapping.c:149 dma_map_page_attrs+0x14c/0x1d0 [ 128.562139] Modules linked in: virtio_net net_failover failover virtio_vdpa vdpa_sim vringh vhost_iotlb vdpa xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun bridge stp llc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi rfkill intel_rapl_msr intel_rapl_common isst_if_common sunrpc skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm mgag200 i2c_algo_bit irqbypass drm_kms_helper crct10dif_pclmul crc32_pclmul syscopyarea ghash_clmulni_intel iTCO_wdt sysfillrect iTCO_vendor_support sysimgblt rapl fb_sys_fops dcdbas intel_cstate drm acpi_ipmi ipmi_si mei_me dell_smbios intel_uncore ipmi_devintf mei i2c_i801 dell_wmi_descriptor wmi_bmof pcspkr lpc_ich i2c_smbus ipmi_msghandler acpi_power_meter ip_tables xfs libcrc32c sd_mod t10_pi sg ahci libahci libata megaraid_sas tg3 crc32c_intel wmi dm_mirror dm_region_hash dm_log [ 128.562188] dm_mod [ 128.651334] CPU: 23 PID: 1105 Comm: NetworkManager Tainted: G S I 5.10.0-rc1+ #59 [ 128.659939] Hardware name: Dell Inc. PowerEdge R440/04JN2K, BIOS 2.8.1 06/30/2020 [ 128.667419] RIP: 0010:dma_map_page_attrs+0x14c/0x1d0 [ 128.672384] Code: 1c 25 28 00 00 00 0f 85 97 00 00 00 48 83 c4 10 5b 5d 41 5c 41 5d c3 4c 89 da eb d7 48 89 f2 48 2b 50 18 48 89 d0 eb 8d 0f 0b <0f> 0b 48 c7 c0 ff ff ff ff eb c3 48 89 d9 48 8b 40 40 e8 2d a0 aa [ 128.691131] RSP: 0018:ffffae0f0151f3c8 EFLAGS: 00010246 [ 128.696357] RAX: ffffffffc06b7400 RBX: 00000000000005fa RCX: 0000000000000000 [ 128.703488] RDX: 0000000000000040 RSI: ffffcee3c7861200 RDI: ffff9e2bc16cd000 [ 128.710620] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000 [ 128.717754] R10: 0000000000000002 R11: 0000000000000000 R12: ffff9e472cb291f8 [ 128.724886] R13: ffff9e2bc14da780 R14: ffff9e472bc20000 R15: ffff9e2bc1b14940 [ 128.732020] FS: 00007f887bae23c0(0000) GS:ffff9e4ac01c0000(0000) knlGS:0000000000000000 [ 128.740105] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 128.745852] CR2: 0000562bc09de998 CR3: 00000003c156c006 CR4: 00000000007706e0 [ 128.752982] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 128.760114] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 128.767247] PKRU: 55555554 [ 128.769961] Call Trace: [ 128.772418] virtqueue_add+0x81e/0xb00 [ 128.776176] virtqueue_add_inbuf_ctx+0x26/0x30 [ 128.780625] try_fill_recv+0x3a2/0x6e0 [virtio_net] [ 128.785509] virtnet_open+0xf9/0x180 [virtio_net] [ 128.790217] __dev_open+0xe8/0x180 [ 128.793620] __dev_change_flags+0x1a7/0x210 [ 128.797808] dev_change_flags+0x21/0x60 [ 128.801646] do_setlink+0x328/0x10e0 [ 128.805227] ? __nla_validate_parse+0x121/0x180 [ 128.809757] ? __nla_parse+0x21/0x30 [ 128.813338] ? inet6_validate_link_af+0x5c/0xf0 [ 128.817871] ? cpumask_next+0x17/0x20 [ 128.821535] ? __snmp6_fill_stats64.isra.54+0x6b/0x110 [ 128.826676] ? __nla_validate_parse+0x47/0x180 [ 128.831120] __rtnl_newlink+0x541/0x8e0 [ 128.834962] ? __nla_reserve+0x38/0x50 [ 128.838713] ? security_sock_rcv_skb+0x2a/0x40 [ 128.843158] ? netlink_deliver_tap+0x2c/0x1e0 [ 128.847518] ? netlink_attachskb+0x1d8/0x220 [ 128.851793] ? skb_queue_tail+0x1b/0x50 [ 128.855641] ? fib6_clean_node+0x43/0x170 [ 128.859652] ? _cond_resched+0x15/0x30 [ 128.863406] ? kmem_cache_alloc_trace+0x3a3/0x420 [ 128.868110] rtnl_newlink+0x43/0x60 [ 128.871602] rtnetlink_rcv_msg+0x12c/0x380 [ 128.875701] ? rtnl_calcit.isra.39+0x110/0x110 [ 128.880147] netlink_rcv_skb+0x50/0x100 [ 128.883987] netlink_unicast+0x1a5/0x280 [ 128.887913] netlink_sendmsg+0x23d/0x470 [ 128.891839] sock_sendmsg+0x5b/0x60 [ 128.895331] ____sys_sendmsg+0x1ef/0x260 [ 128.899255] ? copy_msghdr_from_user+0x5c/0x90 [ 128.903702] ___sys_sendmsg+0x7c/0xc0 [ 128.907369] ? dev_forward_change+0x130/0x130 [ 128.911731] ? sysctl_head_finish.part.29+0x24/0x40 [ 128.916616] ? new_sync_write+0x11f/0x1b0 [ 128.920628] ? mntput_no_expire+0x47/0x240 [ 128.924727] __sys_sendmsg+0x57/0xa0 [ 128.928309] do_syscall_64+0x33/0x40 [ 128.931887] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 128.936937] RIP: 0033:0x7f88792e3857 [ 128.940518] Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b ed ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 44 ed ff ff 48 [ 128.959263] RSP: 002b:00007ffdca60dea0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e [ 128.966827] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f88792e3857 [ 128.973960] RDX: 0000000000000000 RSI: 00007ffdca60def0 RDI: 000000000000000c [ 128.981095] RBP: 00007ffdca60def0 R08: 0000000000000000 R09: 0000000000000000 [ 128.988224] R10: 0000000000000001 R11: 0000000000000293 R12: 0000000000000000 [ 128.995357] R13: 0000000000000000 R14: 00007ffdca60e0a8 R15: 00007ffdca60e09c [ 129.002492] CPU: 23 PID: 1105 Comm: NetworkManager Tainted: G S I 5.10.0-rc1+ #59 [ 129.011093] Hardware name: Dell Inc. PowerEdge R440/04JN2K, BIOS 2.8.1 06/30/2020 [ 129.018571] Call Trace: [ 129.021027] dump_stack+0x57/0x6a [ 129.024346] __warn.cold.14+0xe/0x3d [ 129.027925] ? dma_map_page_attrs+0x14c/0x1d0 [ 129.032283] report_bug+0xbd/0xf0 [ 129.035602] handle_bug+0x44/0x80 [ 129.038922] exc_invalid_op+0x13/0x60 [ 129.042589] asm_exc_invalid_op+0x12/0x20 [ 129.046602] RIP: 0010:dma_map_page_attrs+0x14c/0x1d0 [ 129.051566] Code: 1c 25 28 00 00 00 0f 85 97 00 00 00 48 83 c4 10 5b 5d 41 5c 41 5d c3 4c 89 da eb d7 48 89 f2 48 2b 50 18 48 89 d0 eb 8d 0f 0b <0f> 0b 48 c7 c0 ff ff ff ff eb c3 48 89 d9 48 8b 40 40 e8 2d a0 aa [ 129.070311] RSP: 0018:ffffae0f0151f3c8 EFLAGS: 00010246 [ 129.075536] RAX: ffffffffc06b7400 RBX: 00000000000005fa RCX: 0000000000000000 [ 129.082669] RDX: 0000000000000040 RSI: ffffcee3c7861200 RDI: ffff9e2bc16cd000 [ 129.089803] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000 [ 129.096936] R10: 0000000000000002 R11: 0000000000000000 R12: ffff9e472cb291f8 [ 129.104068] R13: ffff9e2bc14da780 R14: ffff9e472bc20000 R15: ffff9e2bc1b14940 [ 129.111200] virtqueue_add+0x81e/0xb00 [ 129.114952] virtqueue_add_inbuf_ctx+0x26/0x30 [ 129.119399] try_fill_recv+0x3a2/0x6e0 [virtio_net] [ 129.124280] virtnet_open+0xf9/0x180 [virtio_net] [ 129.128984] __dev_open+0xe8/0x180 [ 129.132390] __dev_change_flags+0x1a7/0x210 [ 129.136575] dev_change_flags+0x21/0x60 [ 129.140415] do_setlink+0x328/0x10e0 [ 129.143994] ? __nla_validate_parse+0x121/0x180 [ 129.148528] ? __nla_parse+0x21/0x30 [ 129.152107] ? inet6_validate_link_af+0x5c/0xf0 [ 129.156639] ? cpumask_next+0x17/0x20 [ 129.160306] ? __snmp6_fill_stats64.isra.54+0x6b/0x110 [ 129.165443] ? __nla_validate_parse+0x47/0x180 [ 129.169890] __rtnl_newlink+0x541/0x8e0 [ 129.173731] ? __nla_reserve+0x38/0x50 [ 129.177483] ? security_sock_rcv_skb+0x2a/0x40 [ 129.181928] ? netlink_deliver_tap+0x2c/0x1e0 [ 129.186286] ? netlink_attachskb+0x1d8/0x220 [ 129.190560] ? skb_queue_tail+0x1b/0x50 [ 129.194401] ? fib6_clean_node+0x43/0x170 [ 129.198411] ? _cond_resched+0x15/0x30 [ 129.202163] ? kmem_cache_alloc_trace+0x3a3/0x420 [ 129.206869] rtnl_newlink+0x43/0x60 [ 129.210361] rtnetlink_rcv_msg+0x12c/0x380 [ 129.214462] ? rtnl_calcit.isra.39+0x110/0x110 [ 129.218908] netlink_rcv_skb+0x50/0x100 [ 129.222747] netlink_unicast+0x1a5/0x280 [ 129.226672] netlink_sendmsg+0x23d/0x470 [ 129.230599] sock_sendmsg+0x5b/0x60 [ 129.234090] ____sys_sendmsg+0x1ef/0x260 [ 129.238015] ? copy_msghdr_from_user+0x5c/0x90 [ 129.242461] ___sys_sendmsg+0x7c/0xc0 [ 129.246128] ? dev_forward_change+0x130/0x130 [ 129.250487] ? sysctl_head_finish.part.29+0x24/0x40 [ 129.255368] ? new_sync_write+0x11f/0x1b0 [ 129.259381] ? mntput_no_expire+0x47/0x240 [ 129.263478] __sys_sendmsg+0x57/0xa0 [ 129.267058] do_syscall_64+0x33/0x40 [ 129.270639] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 129.275689] RIP: 0033:0x7f88792e3857 [ 129.279268] Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b ed ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 44 ed ff ff 48 [ 129.298015] RSP: 002b:00007ffdca60dea0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e [ 129.305581] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f88792e3857 [ 129.312712] RDX: 0000000000000000 RSI: 00007ffdca60def0 RDI: 000000000000000c [ 129.319846] RBP: 00007ffdca60def0 R08: 0000000000000000 R09: 0000000000000000 [ 129.326978] R10: 0000000000000001 R11: 0000000000000293 R12: 0000000000000000 [ 129.334109] R13: 0000000000000000 R14: 00007ffdca60e0a8 R15: 00007ffdca60e09c [ 129.341249] ---[ end trace c551e8028fbaf59d ]--- [ 129.351207] net eth0: Unexpected TXQ (0) queue failure: -12 [ 129.360445] net eth0: Unexpected TXQ (0) queue failure: -12 [ 129.824428] net eth0: Unexpected TXQ (0) queue failure: -12 Fixes: `2c53d0f64c` ("vdpasim: vDPA device simulator") Signed-off-by: Laurent Vivier <lvivier@redhat.com> Link: https://lore.kernel.org/r/20201027175914.689278-1-lvivier@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: stable@vger.kernel.org Acked-by: Jason Wang <jasowang@redhat.com>	2020-10-30 04:02:45 -04:00
Michael S. Tsirkin	5e1a3149ee	Revert "vhost-vdpa: fix page pinning leakage in error path" This reverts commit `7ed9e3d97c`. The patch creates a DoS risk since it can result in a high order memory allocation. Fixes: `7ed9e3d97c` ("vhost-vdpa: fix page pinning leakage in error path") Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-10-30 04:02:39 -04:00
Jing Xiangfeng	7ba08e81cb	vdpa/mlx5: Fix error return in map_direct_mr() Fix to return the variable "err" from the error handling case instead of "ret". Fixes: `94abbccdf2` ("vdpa/mlx5: Add shared memory registration code") Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Link: https://lore.kernel.org/r/20201026070637.164321-1-jingxiangfeng@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eli Cohen <elic@nvidia.com> Cc: stable@vger.kernel.org Acked-by: Jason Wang <jasowang@redhat.com>	2020-10-30 04:02:34 -04:00
Dan Carpenter	7922460e33	vhost_vdpa: Return -EFAULT if copy_from_user() fails The copy_to/from_user() functions return the number of bytes which we weren't able to copy but the ioctl should return -EFAULT if they fail. Fixes: `a127c5bbb6` ("vhost-vdpa: fix backend feature ioctls") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20201023120853.GI282278@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: stable@vger.kernel.org Acked-by: Jason Wang <jasowang@redhat.com>	2020-10-30 04:02:25 -04:00
Jason Wang	70a62fce26	vdpa_sim: implement get_iova_range() This implements a sample get_iova_range() for the simulator which advertise [0, ULLONG_MAX] as the valid range. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20201023090043.14430-4-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-10-23 11:55:28 -04:00
Jason Wang	1b48dc03e5	vhost: vdpa: report iova range This patch introduces a new ioctl for vhost-vdpa device that can report the iova range by the device. For device that implements get_iova_range() method, we fetch it from the vDPA device. If device doesn't implement get_iova_range() but depends on platform IOMMU, we will query via DOMAIN_ATTR_GEOMETRY, otherwise [0, ULLONG_MAX] is assumed. For safety, this patch also rules out the map request which is not in the valid range. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20201023090043.14430-3-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-10-23 11:55:28 -04:00
Jason Wang	3f1b623a1b	vdpa: introduce config op to get valid iova range This patch introduce a config op to get valid iova range from the vDPA device. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20201023090043.14430-2-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-10-23 11:55:27 -04:00
David Hildenbrand	88a0d60c64	MAINTAINERS: add URL for virtio-mem Let's add the status/info page, which is still under construction, however, already contains valuable documentation/information. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200617104756.6312-1-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-10-21 10:48:11 -04:00
Zhu Lingshan	86e182fe12	vhost_vdpa: remove unnecessary spin_lock in vhost_vring_call This commit removed unnecessary spin_locks in vhost_vring_call and related operations. Because we manipulate irq offloading contents in vhost_vdpa ioctl code path which is already protected by dev mutex and vq mutex. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20200909065234.3313-1-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2020-10-21 10:48:10 -04:00
Stefano Garzarella	5745bcfbbf	vringh: fix __vringh_iov() when riov and wiov are different If riov and wiov are both defined and they point to different objects, only riov is initialized. If the wiov is not initialized by the caller, the function fails returning -EINVAL and printing "Readable desc 0x... after writable" error message. This issue happens when descriptors have both readable and writable buffers (eg. virtio-blk devices has virtio_blk_outhdr in the readable buffer and status as last byte of writable buffer) and we call __vringh_iov() to get both type of buffers in two different iovecs. Let's replace the 'else if' clause with 'if' to initialize both riov and wiov if they are not NULL. As checkpatch pointed out, we also avoid crashing the kernel when riov and wiov are both NULL, replacing BUG() with WARN_ON() and returning -EINVAL. Fixes: `f87d0fbb57` ("vringh: host-side implementation of virtio rings.") Cc: stable@vger.kernel.org Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201008204256.162292-1-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-10-21 10:38:45 -04:00
Eli Cohen	1897f0b618	vdpa/mlx5: Setup driver only if VIRTIO_CONFIG_S_DRIVER_OK set_map() is used by mlx5 vdpa to create a memory region based on the address map passed by the iotlb argument. If we get successive calls, we will destroy the current memory region and build another one based on the new address mapping. We also need to setup the hardware resources since they depend on the memory region. If these calls happen before DRIVER_OK, It means that driver VQs may also not been setup and we may not create them yet. In this case we want to avoid setting up the other resources and defer this till we get DRIVER OK. Fixes: `1a86b377aa` ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20200908123346.GA169007@mtl-vdi-166.wap.labs.mlnx Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-10-21 10:36:54 -04:00
Pierre Morel	4ce1cf7b02	s390: virtio: PV needs VIRTIO I/O device protection If protected virtualization is active on s390, VIRTIO has only retricted access to the guest memory. Define CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS and export arch_has_restricted_virtio_memory_access to advertize VIRTIO if that's the case. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Link: https://lore.kernel.org/r/1599728030-17085-3-git-send-email-pmorel@linux.ibm.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>	2020-10-21 10:34:13 -04:00
Pierre Morel	0afa15e1a5	virtio: let arch advertise guest's memory access restrictions An architecture may restrict host access to guest memory, e.g. IBM s390 Secure Execution or AMD SEV. Provide a new Kconfig entry the architecture can select, CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS, when it provides the arch_has_restricted_virtio_memory_access callback to advertise to VIRTIO common code when the architecture restricts memory access from the host. The common code can then fail the probe for any device where VIRTIO_F_ACCESS_PLATFORM is required, but not set. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Link: https://lore.kernel.org/r/1599728030-17085-2-git-send-email-pmorel@linux.ibm.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>	2020-10-21 10:34:12 -04:00
Tian Tao	b9747fdf0c	vhost_vdpa: Fix duplicate included kernel.h linux/kernel.h is included more than once, Remove the one that isn't necessary. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Link: https://lore.kernel.org/r/1600131102-24672-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2020-10-21 10:34:11 -04:00
Li Wang	5e5e8736ad	vhost: reduce stack usage in log_used Fix the warning: [-Werror=-Wframe-larger-than=] drivers/vhost/vhost.c: In function log_used: drivers/vhost/vhost.c:1906:1: warning: the frame size of 1040 bytes is larger than 1024 bytes Signed-off-by: Li Wang <li.wang@windriver.com> Link: https://lore.kernel.org/r/1600106889-25013-1-git-send-email-li.wang@windriver.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2020-10-21 10:34:10 -04:00
Rikard Falkeborn	7ab4de6002	virtio-mem: Constify mem_id_table mem_id_table is not modified, so make it const to allow the compiler to put it in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20200911203509.26505-4-rikard.falkeborn@gmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>	2020-10-21 10:34:10 -04:00
Rikard Falkeborn	7f90611693	virtio_input: Constify id_table id_table is not modified, so make it const to allow the compiler to put it in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20200911203509.26505-3-rikard.falkeborn@gmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>	2020-10-21 10:34:09 -04:00
Rikard Falkeborn	bfec6c8307	virtio-balloon: Constify id_table id_table is not modified, so make it const to allow the compiler to put it in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20200911203509.26505-2-rikard.falkeborn@gmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>	2020-10-21 10:34:08 -04:00
Eli Cohen	36b02df2d2	vdpa/mlx5: Fix failure to bring link up Set VIRTIO_NET_S_LINK_UP in config status to allow the get the bring the net device's link up. Fixes: `1a86b377aa` ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20200917121540.GA98184@mtl-vdi-166.wap.labs.mlnx Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2020-10-21 10:34:07 -04:00
Eli Cohen	36bdcf318b	vdpa/mlx5: Make use of a specific 16 bit endianness API Introduce a dedicated function to be used for setting 16 bit fields per virio endianness requirements and use it to set the mtu field. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20200917121425.GA98139@mtl-vdi-166.wap.labs.mlnx Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2020-10-21 10:34:06 -04:00
Linus Torvalds	bbf5c97901	Linux 5.9 v5.9	2020-10-11 14:15:50 -07:00
Linus Torvalds	3dd0130f24	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "Five fixes. Subsystems affected by this patch series: MAINTAINERS, mm/pagemap, mm/swap, and mm/hugetlb" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged mm: validate inode in mapping_set_error() mm: mmap: Fix general protection fault in unlink_file_vma() MAINTAINERS: Antoine Tenart's email address MAINTAINERS: change hardening mailing list	2020-10-11 11:18:04 -07:00
Linus Torvalds	5b697f86f9	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fix from Al Viro: "Fixes an obvious bug (memory leak introduced in 5.8)" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: pipe: Fix memory leaks in create_pipe_files()	2020-10-11 11:11:35 -07:00
Linus Torvalds	c120ec12e2	Merge tag 'x86-urgent-2020-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two fixes: - Fix a (hopefully final) IRQ state tracking bug vs MCE handling - Fix a documentation link" * tag 'x86-urgent-2020-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation/x86: Fix incorrect references to zero-page.txt x86/mce: Use idtentry_nmi_enter/exit()	2020-10-11 10:53:37 -07:00
Linus Torvalds	aa5c3a2911	Merge tag 'perf-urgent-2020-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Ingo Molnar: "Fix an error handling bug that can cause a lockup if a CPU is offline (doh ...)" * tag 'perf-urgent-2020-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix task_function_call() error handling	2020-10-11 10:43:37 -07:00
Vijay Balakrishna	4aab2be098	mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged When memory is hotplug added or removed the min_free_kbytes should be recalculated based on what is expected by khugepaged. Currently after hotplug, min_free_kbytes will be set to a lower default and higher default set when THP enabled is lost. This change restores min_free_kbytes as expected for THP consumers. [vijayb@linux.microsoft.com: v5] Link: https://lkml.kernel.org/r/1601398153-5517-1-git-send-email-vijayb@linux.microsoft.com Fixes: `f000565adb` ("thp: set recommended min free kbytes") Signed-off-by: Vijay Balakrishna <vijayb@linux.microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Allen Pais <apais@microsoft.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Song Liu <songliubraving@fb.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/1600305709-2319-2-git-send-email-vijayb@linux.microsoft.com Link: https://lkml.kernel.org/r/1600204258-13683-1-git-send-email-vijayb@linux.microsoft.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-11 10:31:11 -07:00
Minchan Kim	8b7b2eb131	mm: validate inode in mapping_set_error() The swap address_space doesn't have host. Thus, it makes kernel crash once swap write meets error. Fix it. Fixes: `735e4ae5ba` ("vfs: track per-sb writeback errors and report them to syncfs") Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Jeff Layton <jlayton@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Andres Freund <andres@anarazel.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201010000650.750063-1-minchan@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-11 10:31:10 -07:00
Miaohe Lin	bc4fe4cdd6	mm: mmap: Fix general protection fault in unlink_file_vma() The syzbot reported the below general protection fault: general protection fault, probably for non-canonical address 0xe00eeaee0000003b: 0000 [#1] PREEMPT SMP KASAN KASAN: maybe wild-memory-access in range [0x00777770000001d8-0x00777770000001df] CPU: 1 PID: 10488 Comm: syz-executor721 Not tainted 5.9.0-rc3-syzkaller #0 RIP: 0010:unlink_file_vma+0x57/0xb0 mm/mmap.c:164 Call Trace: free_pgtables+0x1b3/0x2f0 mm/memory.c:415 exit_mmap+0x2c0/0x530 mm/mmap.c:3184 __mmput+0x122/0x470 kernel/fork.c:1076 mmput+0x53/0x60 kernel/fork.c:1097 exit_mm kernel/exit.c:483 [inline] do_exit+0xa8b/0x29f0 kernel/exit.c:793 do_group_exit+0x125/0x310 kernel/exit.c:903 get_signal+0x428/0x1f00 kernel/signal.c:2757 arch_do_signal+0x82/0x2520 arch/x86/kernel/signal.c:811 exit_to_user_mode_loop kernel/entry/common.c:136 [inline] exit_to_user_mode_prepare+0x1ae/0x200 kernel/entry/common.c:167 syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:242 entry_SYSCALL_64_after_hwframe+0x44/0xa9 It's because the ->mmap() callback can change vma->vm_file and fput the original file. But the commit `d70cec8983` ("mm: mmap: merge vma after call_mmap() if possible") failed to catch this case and always fput() the original file, hence add an extra fput(). [ Thanks Hillf for pointing this extra fput() out. ] Fixes: `d70cec8983` ("mm: mmap: merge vma after call_mmap() if possible") Reported-by: syzbot+c5d5a51dcbb558ca0cb5@syzkaller.appspotmail.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christian König <ckoenig.leichtzumerken@gmail.com> Cc: Hongxiang Lou <louhongxiang@huawei.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Airlie <airlied@redhat.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: John Hubbard <jhubbard@nvidia.com> Link: https://lkml.kernel.org/r/20200916090733.31427-1-linmiaohe@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-11 10:31:10 -07:00
Antoine Tenart	512b557ac8	MAINTAINERS: Antoine Tenart's email address Use my kernel.org address instead of my bootlin.com one. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20201005164533.16811-1-atenart@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-11 10:31:10 -07:00
Kees Cook	ae4a380109	MAINTAINERS: change hardening mailing list As more email from git history gets aimed at the OpenWall kernel-hardening@ list, there has been a desire to separate "new topics" from "on-going" work. To handle this, the superset of hardening email topics are now to be directed to linux-hardening@vger.kernel.org. Update the MAINTAINERS file and the .mailmap to accomplish this, so that linux-hardening@ can be treated like any other regular upstream kernel development list. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Emese Revfy <re.emese@gmail.com> Cc: "Tobin C. Harding" <me@tobin.cc> Cc: Tycho Andersen <tycho@tycho.pizza> Cc: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/linux-hardening/202010051443.279CC265D@keescook/ Link: https://lkml.kernel.org/r/20201006000012.2768958-1-keescook@chromium.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-11 10:31:10 -07:00
Linus Torvalds	da690031a5	Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Some more driver bugfixes for I2C. Including a revert - the updated series for it will come during the next merge window" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: owl: Clear NACK and BUS error bits Revert "i2c: imx: Fix reset of I2SR_IAL flag" i2c: meson: fixup rate calculation with filter delay i2c: meson: keep peripheral clock enabled i2c: meson: fix clock setting overwrite i2c: imx: Fix reset of I2SR_IAL flag	2020-10-10 16:09:12 -07:00
Vladimir Zapolskiy	64b7f674c2	cifs: Fix incomplete memory allocation on setxattr path On setxattr() syscall path due to an apprent typo the size of a dynamically allocated memory chunk for storing struct smb2_file_full_ea_info object is computed incorrectly, to be more precise the first addend is the size of a pointer instead of the wanted object size. Coincidentally it makes no difference on 64-bit platforms, however on 32-bit targets the following memcpy() writes 4 bytes of data outside of the dynamically allocated memory. ============================================================================= BUG kmalloc-16 (Not tainted): Redzone overwritten ----------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: 0x79e69a6f-0x9e5cdecf @offset=368. First byte 0x73 instead of 0xcc INFO: Slab 0xd36d2454 objects=85 used=51 fp=0xf7d0fc7a flags=0x35000201 INFO: Object 0x6f171df3 @offset=352 fp=0x00000000 Redzone 5d4ff02d: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................ Object 6f171df3: 00 00 00 00 00 05 06 00 73 6e 72 75 62 00 66 69 ........snrub.fi Redzone 79e69a6f: 73 68 32 0a sh2. Padding 56254d82: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ CPU: 0 PID: 8196 Comm: attr Tainted: G B 5.9.0-rc8+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 Call Trace: dump_stack+0x54/0x6e print_trailer+0x12c/0x134 check_bytes_and_report.cold+0x3e/0x69 check_object+0x18c/0x250 free_debug_processing+0xfe/0x230 __slab_free+0x1c0/0x300 kfree+0x1d3/0x220 smb2_set_ea+0x27d/0x540 cifs_xattr_set+0x57f/0x620 __vfs_setxattr+0x4e/0x60 __vfs_setxattr_noperm+0x4e/0x100 __vfs_setxattr_locked+0xae/0xd0 vfs_setxattr+0x4e/0xe0 setxattr+0x12c/0x1a0 path_setxattr+0xa4/0xc0 __ia32_sys_lsetxattr+0x1d/0x20 __do_fast_syscall_32+0x40/0x70 do_fast_syscall_32+0x29/0x60 do_SYSENTER_32+0x15/0x20 entry_SYSENTER_32+0x9f/0xf2 Fixes: `5517554e43` ("cifs: Add support for writing attributes on SMB2+") Signed-off-by: Vladimir Zapolskiy <vladimir@tuxera.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-10 15:52:54 -07:00
Hugh Dickins	033b5d7755	mm/khugepaged: fix filemap page_to_pgoff(page) != offset There have been elusive reports of filemap_fault() hitting its VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page) on kernels built with CONFIG_READ_ONLY_THP_FOR_FS=y. Suren has hit it on a kernel with CONFIG_READ_ONLY_THP_FOR_FS=y and CONFIG_NUMA is not set: and he has analyzed it down to how khugepaged without NUMA reuses the same huge page after collapse_file() failed (whereas NUMA targets its allocation to the respective node each time). And most of us were usually testing with CONFIG_NUMA=y kernels. collapse_file(old start) new_page = khugepaged_alloc_page(hpage) __SetPageLocked(new_page) new_page->index = start // hpage->index=old offset new_page->mapping = mapping xas_store(&xas, new_page) filemap_fault page = find_get_page(mapping, offset) // if offset falls inside hpage then // compound_head(page) == hpage lock_page_maybe_drop_mmap() __lock_page(page) // collapse fails xas_store(&xas, old page) new_page->mapping = NULL unlock_page(new_page) collapse_file(new start) new_page = khugepaged_alloc_page(hpage) __SetPageLocked(new_page) new_page->index = start // hpage->index=new offset new_page->mapping = mapping // mapping becomes valid again // since compound_head(page) == hpage // page_to_pgoff(page) got changed VM_BUG_ON_PAGE(page_to_pgoff(page) != offset) An initial patch replaced __SetPageLocked() by lock_page(), which did fix the race which Suren illustrates above. But testing showed that it's not good enough: if the racing task's __lock_page() gets delayed long after its find_get_page(), then it may follow collapse_file(new start)'s successful final unlock_page(), and crash on the same VM_BUG_ON_PAGE. It could be fixed by relaxing filemap_fault()'s VM_BUG_ON_PAGE to a check and retry (as is done for mapping), with similar relaxations in find_lock_entry() and pagecache_get_page(): but it's not obvious what else might get caught out; and khugepaged non-NUMA appears to be unique in exposing a page to page cache, then revoking, without going through a full cycle of freeing before reuse. Instead, non-NUMA khugepaged_prealloc_page() release the old page if anyone else has a reference to it (1% of cases when I tested). Although never reported on huge tmpfs, I believe its find_lock_entry() has been at similar risk; but huge tmpfs does not rely on khugepaged for its normal working nearly so much as READ_ONLY_THP_FOR_FS does. Reported-by: Denis Lisov <dennis.lissov@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206569 Link: https://lore.kernel.org/linux-mm/?q=20200219144635.3b7417145de19b65f258c943%40linux-foundation.org Reported-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/linux-xfs/?q=20200616013309.GB815%40lca.pw Reported-and-analyzed-by: Suren Baghdasaryan <surenb@google.com> Fixes: `87c460a0bd` ("mm/khugepaged: collapse_shmem() without freezing new_page") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org # v4.9+ Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-10 15:52:54 -07:00
Cristian Ciocaltea	f5b3f43364	i2c: owl: Clear NACK and BUS error bits When the NACK and BUS error bits are set by the hardware, the driver is responsible for clearing them by writing "1" into the corresponding status registers. Hence perform the necessary operations in owl_i2c_interrupt(). Fixes: `d211e62af4` ("i2c: Add Actions Semiconductor Owl family S900 I2C driver") Reported-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2020-10-10 13:15:46 +02:00
Wolfram Sang	5a02e7c429	Revert "i2c: imx: Fix reset of I2SR_IAL flag" This reverts commit `fa4d305568`. An updated version was sent. So, revert this version and give the new version more time for testing. Signed-off-by: Wolfram Sang <wsa@kernel.org>	2020-10-10 13:03:54 +02:00
Linus Torvalds	6f2f486d57	Merge tag 'spi-fix-v5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fix from Mark Brown: "One last minute fix for v5.9 which has been causing crashes in test systems with the fsl-dspi driver when they hit deferred probe (and which I probably let cook in next a bit longer than is ideal). And an update to MAINTAINERS reflecting Serge's extensive and detailed recent work on the DesignWare driver" * tag 'spi-fix-v5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: MAINTAINERS: Add maintainer of DW APB SSI driver spi: fsl-dspi: fix NULL pointer dereference	2020-10-09 18:05:12 -07:00
Linus Torvalds	8a5f78d98c	Merge tag 'riscv-for-linus-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: "Two fixes this week: - A fix to actually reserve the device tree's memory. Without this the device tree can be overwritten on systems that don't otherwise reserve it. This issue should only manifest on !MMU systems. - A workaround for a BUG() that triggers when the memory that originally contained initdata is freed and later repurposed. This triggers a BUG() on builds that had HARDENED_USERCOPY enabled" * tag 'riscv-for-linus-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Fixup bootup failure with HARDENED_USERCOPY RISC-V: Make sure memblock reserves the memory containing DT	2020-10-09 11:49:22 -07:00
Linus Torvalds	277e570ae1	Merge tag 'for-v5.9-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply fix from Sebastian Reichel: "Just a single change to revert enablement of packet error checking for battery data on Chromebooks, since some of their embedded controllers do not handle it correctly" * tag 'for-v5.9-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: power: supply: sbs-battery: chromebook workaround for PEC	2020-10-09 11:38:07 -07:00
Linus Torvalds	d813a8cb8d	Merge tag 'gpio-v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Some late fixes: one IRQ issue and one compilation issue for UML. - Fix a compilation issue with User Mode Linux - Handle spurious interrupts properly in the PCA953x driver" * tag 'gpio-v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: pca953x: Survive spurious interrupts gpiolib: Disable compat ->read() code in UML case	2020-10-09 11:33:48 -07:00
Linus Torvalds	f318052ef2	Merge tag 'mmc-v5.9-rc4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fix from Ulf Hansson: "Assign a proper discard granularity rather than incorrectly set it to zero" * tag 'mmc-v5.9-rc4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: core: don't set limits.discard_granularity as 0	2020-10-09 10:10:52 -07:00

1 2 3 4 5 ...

951302 Commits