linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-09 19:51:25 -04:00

Author	SHA1	Message	Date
Abhishek Sahu	453e6c98fd	vfio/pci: Implement VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP This patch implements VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP device feature. In the VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY, if there is any access for the VFIO device on the host side, then the device will be moved out of the low power state without the user's guest driver involvement. Once the device access has been finished, then the host can move the device again into low power state. With the low power entry happened through VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP, the device will not be moved back into the low power state and a notification will be sent to the user by triggering wakeup eventfd. vfio_pci_core_pm_entry() will be called for both the variants of low power feature entry so add an extra argument for wakeup eventfd context and store locally in 'struct vfio_pci_core_device'. For the entry happened without wakeup eventfd, all the exit related handling will be done by the LOW_POWER_EXIT device feature only. When the LOW_POWER_EXIT will be called, then the vfio core layer vfio_device_pm_runtime_get() will increment the usage count and will resume the device. In the driver runtime_resume callback, the 'pm_wake_eventfd_ctx' will be NULL. Then vfio_pci_core_pm_exit() will call vfio_pci_runtime_pm_exit() and all the exit related handling will be done. For the entry happened with wakeup eventfd, in the driver resume callback, eventfd will be triggered and all the exit related handling will be done. When vfio_pci_runtime_pm_exit() will be called by vfio_pci_core_pm_exit(), then it will return early. But if the runtime suspend has not happened on the host side, then all the exit related handling will be done in vfio_pci_core_pm_exit() only. Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220829114850.4341-6-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Abhishek Sahu	cc2742fe36	vfio/pci: Implement VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY/EXIT Currently, if the runtime power management is enabled for vfio-pci based devices in the guest OS, then the guest OS will do the register write for PCI_PM_CTRL register. This write request will be handled in vfio_pm_config_write() where it will do the actual register write of PCI_PM_CTRL register. With this, the maximum D3hot state can be achieved for low power. If we can use the runtime PM framework, then we can achieve the D3cold state (on the supported systems) which will help in saving maximum power. 1. D3cold state can't be achieved by writing PCI standard PM config registers. This patch implements the following newly added low power related device features: - VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY - VFIO_DEVICE_FEATURE_LOW_POWER_EXIT The VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY feature will allow the device to make use of low power platform states on the host while the VFIO_DEVICE_FEATURE_LOW_POWER_EXIT will prevent further use of those power states. 2. The vfio-pci driver uses runtime PM framework for low power entry and exit. On the platforms where D3cold state is supported, the runtime PM framework will put the device into D3cold otherwise, D3hot or some other power state will be used. There are various cases where the device will not go into the runtime suspended state. For example, - The runtime power management is disabled on the host side for the device. - The user keeps the device busy after calling LOW_POWER_ENTRY. - There are dependent devices that are still in runtime active state. For these cases, the device will be in the same power state that has been configured by the user through PCI_PM_CTRL register. 3. The hypervisors can implement virtual ACPI methods. For example, in guest linux OS if PCI device ACPI node has _PR3 and _PR0 power resources with _ON/_OFF method, then guest linux OS invokes the _OFF method during D3cold transition and then _ON during D0 transition. The hypervisor can tap these virtual ACPI calls and then call the low power device feature IOCTL. 4. The 'pm_runtime_engaged' flag tracks the entry and exit to runtime PM. This flag is protected with 'memory_lock' semaphore. 5. All the config and other region access are wrapped under pm_runtime_resume_and_get() and pm_runtime_put(). So, if any device access happens while the device is in the runtime suspended state, then the device will be resumed first before access. Once the access has been finished, then the device will again go into the runtime suspended state. 6. The memory region access through mmap will not be allowed in the low power state. Since __vfio_pci_memory_enabled() is a common function, so check for 'pm_runtime_engaged' has been added explicitly in vfio_pci_mmap_fault() to block only mmap'ed access. Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220829114850.4341-5-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Abhishek Sahu	4813724c4b	vfio/pci: Mask INTx during runtime suspend This patch adds INTx handling during runtime suspend/resume. All the suspend/resume related code for the user to put the device into the low power state will be added in subsequent patches. The INTx lines may be shared among devices. Whenever any INTx interrupt comes for the VFIO devices, then vfio_intx_handler() will be called for each device sharing the interrupt. Inside vfio_intx_handler(), it calls pci_check_and_mask_intx() and checks if the interrupt has been generated for the current device. Now, if the device is already in the D3cold state, then the config space can not be read. Attempt to read config space in D3cold state can cause system unresponsiveness in a few systems. To prevent this, mask INTx in runtime suspend callback, and unmask the same in runtime resume callback. If INTx has been already masked, then no handling is needed in runtime suspend/resume callbacks. 'pm_intx_masked' tracks this, and vfio_pci_intx_mask() has been updated to return true if the INTx vfio_pci_irq_ctx.masked value is changed inside this function. For the runtime suspend which is triggered for the no user of VFIO device, the 'irq_type' will be VFIO_PCI_NUM_IRQS and these callbacks won't do anything. The MSI/MSI-X are not shared so similar handling should not be needed for MSI/MSI-X. vfio_msihandler() triggers eventfd_signal() without doing any device-specific config access. When the user performs any config access or IOCTL after receiving the eventfd notification, then the device will be moved to the D0 state first before servicing any request. Another option was to check this flag 'pm_intx_masked' inside vfio_intx_handler() instead of masking the interrupts. This flag is being set inside the runtime_suspend callback but the device can be in non-D3cold state (for example, if the user has disabled D3cold explicitly by sysfs, the D3cold is not supported in the platform, etc.). Also, in D3cold supported case, the device will be in D0 till the PCI core moves the device into D3cold. In this case, there is a possibility that the device can generate an interrupt. Adding check in the IRQ handler will not clear the IRQ status and the interrupt line will still be asserted. This can cause interrupt flooding. Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220829114850.4341-4-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Abhishek Sahu	8e5c699511	vfio: Increment the runtime PM usage count during IOCTL call The vfio-pci based drivers will have runtime power management support where the user can put the device into the low power state and then PCI devices can go into the D3cold state. If the device is in the low power state and the user issues any IOCTL, then the device should be moved out of the low power state first. Once the IOCTL is serviced, then it can go into the low power state again. The runtime PM framework manages this with help of usage count. One option was to add the runtime PM related API's inside vfio-pci driver but some IOCTL (like VFIO_DEVICE_FEATURE) can follow a different path and more IOCTL can be added in the future. Also, the runtime PM will be added for vfio-pci based drivers variant currently, but the other VFIO based drivers can use the same in the future. So, this patch adds the runtime calls runtime-related API in the top-level IOCTL function itself. For the VFIO drivers which do not have runtime power management support currently, the runtime PM API's won't be invoked. Only for vfio-pci based drivers currently, the runtime PM API's will be invoked to increment and decrement the usage count. In the vfio-pci drivers also, the variant drivers can opt-out by incrementing the usage count during device-open. The pm_runtime_resume_and_get() checks the device current status and will return early if the device is already in the ACTIVE state. Taking this usage count incremented while servicing IOCTL will make sure that the user won't put the device into the low power state when any other IOCTL is being serviced in parallel. Let's consider the following scenario: 1. Some other IOCTL is called. 2. The user has opened another device instance and called the IOCTL for low power entry. 3. The low power entry IOCTL moves the device into the low power state. 4. The other IOCTL finishes. If we don't keep the usage count incremented then the device access will happen between step 3 and 4 while the device has already gone into the low power state. The pm_runtime_resume_and_get() will be the first call so its error should not be propagated to user space directly. For example, if pm_runtime_resume_and_get() can return -EINVAL for the cases where the user has passed the correct argument. So the pm_runtime_resume_and_get() errors have been masked behind -EIO. Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220829114850.4341-3-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Abhishek Sahu	385ecfdfb5	vfio: Add the device features for the low power entry and exit This patch adds the following new device features for the low power entry and exit in the header file. The implementation for the same will be added in the subsequent patches. - VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY - VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP - VFIO_DEVICE_FEATURE_LOW_POWER_EXIT For vfio-pci based devices, with the standard PCI PM registers, all power states cannot be achieved. The platform-based power management needs to be involved to go into the lowest power state. For doing low power entry and exit with platform-based power management, these device features can be used. The entry device feature has two variants. These two variants are mainly to support the different behaviour for the low power entry. If there is any access for the VFIO device on the host side, then the device will be moved out of the low power state without the user's guest driver involvement. Some devices (for example NVIDIA VGA or 3D controller) require the user's guest driver involvement for each low-power entry. In the first variant, the host can return the device to low power automatically. The device will continue to attempt to reach low power until the low power exit feature is called. In the second variant, if the device exits low power due to an access, the host kernel will signal the user via the provided eventfd and will not return the device to low power without a subsequent call to one of the low power entry features. A call to the low power exit feature is optional if the user provided eventfd is signaled. These device features only support VFIO_DEVICE_FEATURE_SET and VFIO_DEVICE_FEATURE_PROBE operations. Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220829114850.4341-2-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Jason Gunthorpe	99a27c088b	vfio: Split VFIO_GROUP_GET_STATUS into a function This is the last sizable implementation in vfio_group_fops_unl_ioctl(), move it to a function so vfio_group_fops_unl_ioctl() is emptied out. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/8-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Jason Gunthorpe	b3b43590fa	vfio: Follow the naming pattern for vfio_group_ioctl_unset_container() Make it clear that this is the body of the ioctl. Fold the locking into the function so it is self contained like the other ioctls. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/7-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Jason Gunthorpe	67671f153e	vfio: Fold VFIO_GROUP_SET_CONTAINER into vfio_group_set_container() No reason to split it up like this, just have one function to process the ioctl. Move the lock into the function as well to avoid having a lockdep annotation. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Jason Gunthorpe	150ee2f9cd	vfio: Fold VFIO_GROUP_GET_DEVICE_FD into vfio_group_get_device_fd() No reason to split it up like this, just have one function to process the ioctl. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/5-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Jason Gunthorpe	663eab456e	vfio-pci: Replace 'void __user *' with proper types in the ioctl functions This makes the code clearer and replaces a few places trying to access a flex array with an actual flex array. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Jason Gunthorpe	ea3fc04d4f	vfio-pci: Re-indent what was vfio_pci_core_ioctl() Done mechanically with: $ git clang-format-14 -i --lines 675:1210 drivers/vfio/pci/vfio_pci_core.c And manually reflow the multi-line comments clang-format doesn't fix. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Jason Gunthorpe	2ecf3b58ed	vfio-pci: Break up vfio_pci_core_ioctl() into one function per ioctl 500 lines is a bit long for a single function, move the bodies of each ioctl into separate functions and leave behind a switch statement to dispatch them. This patch just adds the function declarations and does not fix the indenting. The next patch will restore the indenting. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Jason Gunthorpe	16f4cbd9e1	vfio-pci: Fix vfio_pci_ioeventfd() to return int This only returns 0 or -ERRNO, it should return int like all the other ioctl dispatch functions. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v2-0f9e632d54fb+d6-vfio_ioctl_split_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Jason Gunthorpe	c462a8c5d9	vfio/pci: Simplify the is_intx/msi/msix/etc defines Only three of these are actually used, simplify to three inline functions, and open code the if statement in vfio_pci_config.c. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/3-v2-1bd95d72f298+e0e-vfio_pci_priv_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Jason Gunthorpe	1e979ef5df	vfio/pci: Rename vfio_pci_register_dev_region() As this is part of the vfio_pci_core component it should be called vfio_pci_core_register_dev_region() like everything else exported from this module. Suggested-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/2-v2-1bd95d72f298+e0e-vfio_pci_priv_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Jason Gunthorpe	e34a0425b8	vfio/pci: Split linux/vfio_pci_core.h The header in include/linux should have only the exported interface for other vfio_pci modules to use. Internal definitions for vfio_pci.ko should be in a "priv" header along side the .c files. Move the internal declarations out of vfio_pci_core.h. They either move to vfio_pci_priv.h or to the C file that is the only user. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/1-v2-1bd95d72f298+e0e-vfio_pci_priv_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-01 15:29:11 -06:00
Linus Torvalds	b90cb10531	Linux 6.0-rc3 v6.0-rc3	2022-08-28 15:05:29 -07:00
Linus Torvalds	b467192ec7	Merge tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more hotfixes from Andrew Morton: "Seventeen hotfixes. Mostly memory management things. Ten patches are cc:stable, addressing pre-6.0 issues" * tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: .mailmap: update Luca Ceresoli's e-mail address mm/mprotect: only reference swap pfn page if type match squashfs: don't call kmalloc in decompressors mm/damon/dbgfs: avoid duplicate context directory creation mailmap: update email address for Colin King asm-generic: sections: refactor memory_intersects bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown Revert "memcg: cleanup racy sum avoidance code" mm/zsmalloc: do not attempt to free IS_ERR handle binder_alloc: add missing mmap_lock calls when using the VMA mm: re-allow pinning of zero pfns (again) vmcoreinfo: add kallsyms_num_syms symbol mailmap: update Guilherme G. Piccoli's email addresses writeback: avoid use-after-free after removing device shmem: update folio if shmem_replace_page() updates the page mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte	2022-08-28 14:49:59 -07:00
Linus Torvalds	373eff576e	Merge tag 'bitmap-6.0-rc3' of github.com:/norov/linux Pull bitmap fixes from Yury Norov: "Fix the reported issues, and implements the suggested improvements, for the version of the cpumask tests [1] that was merged with commit `c41e8866c2` ("lib/test: introduce cpumask KUnit test suite"). These changes include fixes for the tests, and better alignment with the KUnit style guidelines" * tag 'bitmap-6.0-rc3' of github.com:/norov/linux: lib/cpumask_kunit: add tests file to MAINTAINERS lib/cpumask_kunit: log mask contents lib/test_cpumask: follow KUnit style guidelines lib/test_cpumask: fix cpu_possible_mask last test lib/test_cpumask: drop cpu_possible_mask full test	2022-08-28 14:36:27 -07:00
Luca Ceresoli	0ebafe2ea8	.mailmap: update Luca Ceresoli's e-mail address My Bootlin address is preferred from now on. Link: https://lkml.kernel.org/r/20220826130515.3011951-1-luca.ceresoli@bootlin.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Atish Patra <atishp@atishpatra.org> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:46 -07:00
Peter Xu	3d2f78f08c	mm/mprotect: only reference swap pfn page if type match Yu Zhao reported a bug after the commit "mm/swap: Add swp_offset_pfn() to fetch PFN from swap entry" added a check in swp_offset_pfn() for swap type [1]: kernel BUG at include/linux/swapops.h:117! CPU: 46 PID: 5245 Comm: EventManager_De Tainted: G S O L 6.0.0-dbg-DEV #2 RIP: 0010:pfn_swap_entry_to_page+0x72/0xf0 Code: c6 48 8b 36 48 83 fe ff 74 53 48 01 d1 48 83 c1 08 48 8b 09 f6 c1 01 75 7b 66 90 48 89 c1 48 8b 09 f6 c1 01 74 74 5d c3 eb 9e <0f> 0b 48 ba ff ff ff ff 03 00 00 00 eb ae a9 ff 0f 00 00 75 13 48 RSP: 0018:ffffa59e73fabb80 EFLAGS: 00010282 RAX: 00000000ffffffe8 RBX: 0c00000000000000 RCX: ffffcd5440000000 RDX: 1ffffffffff7a80a RSI: 0000000000000000 RDI: 0c0000000000042b RBP: ffffa59e73fabb80 R08: ffff9965ca6e8bb8 R09: 0000000000000000 R10: ffffffffa5a2f62d R11: 0000030b372e9fff R12: ffff997b79db5738 R13: 000000000000042b R14: 0c0000000000042b R15: 1ffffffffff7a80a FS: 00007f549d1bb700(0000) GS:ffff99d3cf680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000440d035b3180 CR3: 0000002243176004 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> change_pte_range+0x36e/0x880 change_p4d_range+0x2e8/0x670 change_protection_range+0x14e/0x2c0 mprotect_fixup+0x1ee/0x330 do_mprotect_pkey+0x34c/0x440 __x64_sys_mprotect+0x1d/0x30 It triggers because pfn_swap_entry_to_page() could be called upon e.g. a genuine swap entry. Fix it by only calling it when it's a write migration entry where the page* is used. [1] https://lore.kernel.org/lkml/CAOUHufaVC2Za-p8m0aiHw6YkheDcrO-C3wRGixwDS32VTS+k1w@mail.gmail.com/ Link: https://lkml.kernel.org/r/20220823221138.45602-1-peterx@redhat.com Fixes: `6c287605fd` ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive") Signed-off-by: Peter Xu <peterx@redhat.com> Reported-by: Yu Zhao <yuzhao@google.com> Tested-by: Yu Zhao <yuzhao@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:46 -07:00
Phillip Lougher	1f13dff09f	squashfs: don't call kmalloc in decompressors The decompressors may be called while in an atomic section. So move the kmalloc() out of this path, and into the "page actor" init function. This fixes a regression introduced by commit `f268eedddf` ("squashfs: extend "page actor" to handle missing pages") Link: https://lkml.kernel.org/r/20220822215430.15933-1-phillip@squashfs.org.uk Fixes: `f268eedddf` ("squashfs: extend "page actor" to handle missing pages") Reported-by: Chris Murphy <lists@colorremedies.com> Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:45 -07:00
Badari Pulavarty	d26f607036	mm/damon/dbgfs: avoid duplicate context directory creation When user tries to create a DAMON context via the DAMON debugfs interface with a name of an already existing context, the context directory creation fails but a new context is created and added in the internal data structure, due to absence of the directory creation success check. As a result, memory could leak and DAMON cannot be turned on. An example test case is as below: # cd /sys/kernel/debug/damon/ # echo "off" > monitor_on # echo paddr > target_ids # echo "abc" > mk_context # echo "abc" > mk_context # echo $$ > abc/target_ids # echo "on" > monitor_on <<< fails Return value of 'debugfs_create_dir()' is expected to be ignored in general, but this is an exceptional case as DAMON feature is depending on the debugfs functionality and it has the potential duplicate name issue. This commit therefore fixes the issue by checking the directory creation failure and immediately return the error in the case. Link: https://lkml.kernel.org/r/20220821180853.2400-1-sj@kernel.org Fixes: `75c1c2b53c` ("mm/damon/dbgfs: support multiple contexts") Signed-off-by: Badari Pulavarty <badari.pulavarty@intel.com> Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> [ 5.15.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:45 -07:00
Colin Ian King	ac733f6558	mailmap: update email address for Colin King Colin King is working on kernel janitorial fixes in his spare time and using his Intel email is confusing. Use his gmail account as the default email address. Link: https://lkml.kernel.org/r/20220817212753.101109-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:45 -07:00
Quanyang Wang	0c7d7cc2b4	asm-generic: sections: refactor memory_intersects There are two problems with the current code of memory_intersects: First, it doesn't check whether the region (begin, end) falls inside the region (virt, vend), that is (virt < begin && vend > end). The second problem is if vend is equal to begin, it will return true but this is wrong since vend (virt + size) is not the last address of the memory region but (virt + size -1) is. The wrong determination will trigger the misreporting when the function check_for_illegal_area calls memory_intersects to check if the dma region intersects with stext region. The misreporting is as below (stext is at 0x80100000): WARNING: CPU: 0 PID: 77 at kernel/dma/debug.c:1073 check_for_illegal_area+0x130/0x168 DMA-API: chipidea-usb2 e0002000.usb: device driver maps memory from kernel text or rodata [addr=800f0000] [len=65536] Modules linked in: CPU: 1 PID: 77 Comm: usb-storage Not tainted 5.19.0-yocto-standard #5 Hardware name: Xilinx Zynq Platform unwind_backtrace from show_stack+0x18/0x1c show_stack from dump_stack_lvl+0x58/0x70 dump_stack_lvl from __warn+0xb0/0x198 __warn from warn_slowpath_fmt+0x80/0xb4 warn_slowpath_fmt from check_for_illegal_area+0x130/0x168 check_for_illegal_area from debug_dma_map_sg+0x94/0x368 debug_dma_map_sg from __dma_map_sg_attrs+0x114/0x128 __dma_map_sg_attrs from dma_map_sg_attrs+0x18/0x24 dma_map_sg_attrs from usb_hcd_map_urb_for_dma+0x250/0x3b4 usb_hcd_map_urb_for_dma from usb_hcd_submit_urb+0x194/0x214 usb_hcd_submit_urb from usb_sg_wait+0xa4/0x118 usb_sg_wait from usb_stor_bulk_transfer_sglist+0xa0/0xec usb_stor_bulk_transfer_sglist from usb_stor_bulk_srb+0x38/0x70 usb_stor_bulk_srb from usb_stor_Bulk_transport+0x150/0x360 usb_stor_Bulk_transport from usb_stor_invoke_transport+0x38/0x440 usb_stor_invoke_transport from usb_stor_control_thread+0x1e0/0x238 usb_stor_control_thread from kthread+0xf8/0x104 kthread from ret_from_fork+0x14/0x2c Refactor memory_intersects to fix the two problems above. Before the `1d7db834a0` ("dma-debug: use memory_intersects() directly"), memory_intersects is called only by printk_late_init: printk_late_init -> init_section_intersects ->memory_intersects. There were few places where memory_intersects was called. When commit `1d7db834a0` ("dma-debug: use memory_intersects() directly") was merged and CONFIG_DMA_API_DEBUG is enabled, the DMA subsystem uses it to check for an illegal area and the calltrace above is triggered. [akpm@linux-foundation.org: fix nearby comment typo] Link: https://lkml.kernel.org/r/20220819081145.948016-1-quanyang.wang@windriver.com Fixes: `9795593625` ("asm/sections: add helpers to check for section data") Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thierry Reding <treding@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:45 -07:00
Liu Shixin	dd0ff4d12d	bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem The vmemmap pages is marked by kmemleak when allocated from memblock. Remove it from kmemleak when freeing the page. Otherwise, when we reuse the page, kmemleak may report such an error and then stop working. kmemleak: Cannot insert 0xffff98fb6eab3d40 into the object search tree (overlaps existing) kmemleak: Kernel memory leak detector disabled kmemleak: Object 0xffff98fb6be00000 (size 335544320): kmemleak: comm "swapper", pid 0, jiffies 4294892296 kmemleak: min_count = 0 kmemleak: count = 0 kmemleak: flags = 0x1 kmemleak: checksum = 0 kmemleak: backtrace: Link: https://lkml.kernel.org/r/20220819094005.2928241-1-liushixin2@huawei.com Fixes: `f41f2ed43c` (mm: hugetlb: free the vmemmap pages associated with each HugeTLB page) Signed-off-by: Liu Shixin <liushixin2@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:45 -07:00
Heming Zhao	550842cc60	ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown After commit `0737e01de9` ("ocfs2: ocfs2_mount_volume does cleanup job before return error"), any procedure after ocfs2_dlm_init() fails will trigger crash when calling ocfs2_dlm_shutdown(). ie: On local mount mode, no dlm resource is initialized. If ocfs2_mount_volume() fails in ocfs2_find_slot(), error handling will call ocfs2_dlm_shutdown(), then does dlm resource cleanup job, which will trigger kernel crash. This solution should bypass uninitialized resources in ocfs2_dlm_shutdown(). Link: https://lkml.kernel.org/r/20220815085754.20417-1-heming.zhao@suse.com Fixes: `0737e01de9` ("ocfs2: ocfs2_mount_volume does cleanup job before return error") Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:45 -07:00
Shakeel Butt	dbb16df644	Revert "memcg: cleanup racy sum avoidance code" This reverts commit `96e51ccf1a`. Recently we started running the kernel with rstat infrastructure on production traffic and begin to see negative memcg stats values. Particularly the 'sock' stat is the one which we observed having negative value. $ grep "sock " /mnt/memory/job/memory.stat sock 253952 total_sock 18446744073708724224 Re-run after couple of seconds $ grep "sock " /mnt/memory/job/memory.stat sock 253952 total_sock 53248 For now we are only seeing this issue on large machines (256 CPUs) and only with 'sock' stat. I think the networking stack increase the stat on one cpu and decrease it on another cpu much more often. So, this negative sock is due to rstat flusher flushing the stats on the CPU that has seen the decrement of sock but missed the CPU that has increments. A typical race condition. For easy stable backport, revert is the most simple solution. For long term solution, I am thinking of two directions. First is just reduce the race window by optimizing the rstat flusher. Second is if the reader sees a negative stat value, force flush and restart the stat collection. Basically retry but limited. Link: https://lkml.kernel.org/r/20220817172139.3141101-1-shakeelb@google.com Fixes: `96e51ccf1a` ("memcg: cleanup racy sum avoidance code") Signed-off-by: Shakeel Butt <shakeelb@google.com> Cc: "Michal Koutný" <mkoutny@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: David Hildenbrand <david@redhat.com> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: <stable@vger.kernel.org> [5.15] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:44 -07:00
Sergey Senozhatsky	a5d2172180	mm/zsmalloc: do not attempt to free IS_ERR handle zsmalloc() now returns ERR_PTR values as handles, which zram accidentally can pass to zs_free(). Another bad scenario is when zcomp_compress() fails - handle has default -ENOMEM value, and zs_free() will try to free that "pointer value". Add the missing check and make sure that zs_free() bails out when ERR_PTR() is passed to it. Link: https://lkml.kernel.org/r/20220816050906.2583956-1-senozhatsky@chromium.org Fixes: `c7e6f17b52` ("zsmalloc: zs_malloc: return ERR_PTR on failure") Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org>, Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:44 -07:00
Liam Howlett	44e602b4e5	binder_alloc: add missing mmap_lock calls when using the VMA Take the mmap_read_lock() when using the VMA in binder_alloc_print_pages() and when checking for a VMA in binder_alloc_new_buf_locked(). It is worth noting binder_alloc_new_buf_locked() drops the VMA read lock after it verifies a VMA exists, but may be taken again deeper in the call stack, if necessary. Link: https://lkml.kernel.org/r/20220810160209.1630707-1-Liam.Howlett@oracle.com Fixes: `a43cfc87ca` (android: binder: stop saving a pointer to the VMA) Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Reported-by: <syzbot+a7b60a176ec13cafb793@syzkaller.appspotmail.com> Acked-by: Carlos Llamas <cmllamas@google.com> Tested-by: Ondrej Mosnacek <omosnace@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Christian Brauner (Microsoft) <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hridya Valsaraju <hridya@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Martijn Coenen <maco@android.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Todd Kjos <tkjos@android.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: "Arve Hjønnevåg" <arve@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:44 -07:00
Alex Williamson	fcab34b433	mm: re-allow pinning of zero pfns (again) The below referenced commit makes the same error as `1c56343258` ("mm: fix is_pinnable_page against a cma page"), re-interpreting the logic to exclude pinning of the zero page, which breaks device assignment with vfio. To avoid further subtle mistakes, split the logic into discrete tests. [akpm@linux-foundation.org: simplify comment, per John] Link: https://lkml.kernel.org/r/166015037385.760108.16881097713975517242.stgit@omen Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen Fixes: `f25cbb7a95` ("mm: add zone device coherent type memory support") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Tested-by: Slawomir Laba <slawomirx.laba@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Alistair Popple <apopple@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:44 -07:00
Stephen Brennan	f09bddbd86	vmcoreinfo: add kallsyms_num_syms symbol The rest of the kallsyms symbols are useless without knowing the number of symbols in the table. In an earlier patch, I somehow dropped the kallsyms_num_syms symbol, so add it back in. Link: https://lkml.kernel.org/r/20220808205410.18590-1-stephen.s.brennan@oracle.com Fixes: `5fd8fea935` ("vmcoreinfo: include kallsyms symbols") Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:44 -07:00
Guilherme G. Piccoli	6c26d17eea	mailmap: update Guilherme G. Piccoli's email addresses Both @canonical and @ibm email addresses are invalid now; use my personal address instead. Link: https://lkml.kernel.org/r/20220804202207.439427-1-gpiccoli@igalia.com Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:43 -07:00
Khazhismel Kumykov	f87904c075	writeback: avoid use-after-free after removing device When a disk is removed, bdi_unregister gets called to stop further writeback and wait for associated delayed work to complete. However, wb_inode_writeback_end() may schedule bandwidth estimation dwork after this has completed, which can result in the timer attempting to access the just freed bdi_writeback. Fix this by checking if the bdi_writeback is alive, similar to when scheduling writeback work. Since this requires wb->work_lock, and wb_inode_writeback_end() may get called from interrupt, switch wb->work_lock to an irqsafe lock. Link: https://lkml.kernel.org/r/20220801155034.3772543-1-khazhy@google.com Fixes: `45a2966fd6` ("writeback: fix bandwidth estimate for spiky workload") Signed-off-by: Khazhismel Kumykov <khazhy@google.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Michael Stapelberg <stapelberg+linux@google.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:43 -07:00
Matthew Wilcox (Oracle)	9dfb3b8d65	shmem: update folio if shmem_replace_page() updates the page If we allocate a new page, we need to make sure that our folio matches that new page. If we do end up in this code path, we store the wrong page in the shmem inode's page cache, and I would rather imagine that data corruption ensues. This will be solved by changing shmem_replace_page() to shmem_replace_folio(), but this is the minimal fix. Link: https://lkml.kernel.org/r/20220730042518.1264767-1-willy@infradead.org Fixes: `da08e9b793` ("mm/shmem: convert shmem_swapin_page() to shmem_swapin_folio()") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:43 -07:00
Miaohe Lin	ab74ef708d	mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte In MCOPY_ATOMIC_CONTINUE case with a non-shared VMA, pages in the page cache are installed in the ptes. But hugepage_add_new_anon_rmap is called for them mistakenly because they're not vm_shared. This will corrupt the page->mapping used by page cache code. Link: https://lkml.kernel.org/r/20220712130542.18836-1-linmiaohe@huawei.com Fixes: `f619147104` ("userfaultfd: add UFFDIO_CONTINUE ioctl") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-08-28 14:02:43 -07:00
Linus Torvalds	8379c0b31f	Merge tag 'for-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Fixes: - check that subvolume is writable when changing xattrs from security namespace - fix memory leak in device lookup helper - update generation of hole file extent item when merging holes - fix space cache corruption and potential double allocations; this is a rare bug but can be serious once it happens, stable backports and analysis tool will be provided - fix error handling when deleting root references - fix crash due to assert when attempting to cancel suspended device replace, add message what to do if mount fails due to missing replace item Regressions: - don't merge pages into bio if their page offset is not contiguous - don't allow large NOWAIT direct reads, this could lead to short reads eg. in io_uring" * tag 'for-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: add info when mount fails due to stale replace target btrfs: replace: drop assert for suspended replace btrfs: fix silent failure when deleting root reference btrfs: fix space cache corruption and potential double allocations btrfs: don't allow large NOWAIT direct reads btrfs: don't merge pages into bio if their page offset is not contiguous btrfs: update generation of hole file extent item when merging holes btrfs: fix possible memory leak in btrfs_get_dev_args_from_path() btrfs: check if root is readonly while setting security xattr	2022-08-28 10:44:04 -07:00
Linus Torvalds	c7bb3fbc1b	Merge tag '6.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull cfis fixes from Steve French: - two locking fixes (zero range, punch hole) - DFS 9 fix (padding), affecting some servers - three minor cleanup changes * tag '6.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Add helper function to check smb1+ server cifs: Use help macro to get the mid header size cifs: Use help macro to get the header preamble size cifs: skip extra NULL byte in filenames smb3: missing inode locks in punch hole smb3: missing inode locks in zero range	2022-08-28 10:35:16 -07:00
Linus Torvalds	2f23a7c914	Merge tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Ingo Molnar: - Fix PAT on Xen, which caused i915 driver failures - Fix compat INT 80 entry crash on Xen PV guests - Fix 'MMIO Stale Data' mitigation status reporting on older Intel CPUs - Fix RSB stuffing regressions - Fix ORC unwinding on ftrace trampolines - Add Intel Raptor Lake CPU model number - Fix (work around) a SEV-SNP bootloader bug providing bogus values in boot_params->cc_blob_address, by ignoring the value on !SEV-SNP bootups. - Fix SEV-SNP early boot failure - Fix the objtool list of noreturn functions and annotate snp_abort(), which bug confused objtool on gcc-12. - Fix the documentation for retbleed * tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation/ABI: Mention retbleed vulnerability info file for sysfs x86/sev: Mark snp_abort() noreturn x86/sev: Don't use cc_platform_has() for early SEV-SNP calls x86/boot: Don't propagate uninitialized boot_params->cc_blob_address x86/cpu: Add new Raptor Lake CPU model number x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry x86/nospec: Fix i386 RSB stuffing x86/nospec: Unwreck the RSB stuffing x86/bugs: Add "unknown" reporting for MMIO Stale Data x86/entry: Fix entry_INT80_compat for Xen PV guests x86/PAT: Have pat_enabled() properly reflect state when running on Xen	2022-08-28 10:10:23 -07:00
Linus Torvalds	4459d800f7	Merge tag 'perf-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fixes from Ingo Molnar: "Misc fixes: an Arch-LBR fix, a PEBS enumeration fix, an Intel DS fix, PEBS constraints fix on Alder Lake CPUs and an Intel uncore PMU fix" * tag 'perf-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU perf/x86/intel: Fix pebs event constraints for ADL perf/x86/intel/ds: Fix precise store latency handling perf/x86/core: Set pebs_capable and PMU_FL_PEBS_ALL for the Baseline perf/x86/lbr: Enable the branch type for the Arch LBR by default	2022-08-28 10:05:42 -07:00
Linus Torvalds	611875d510	Merge tag 'perf-tools-fixes-for-v6.0-2022-08-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fixup setup of weak groups when using 'perf stat --repeat', add a 'perf test' for it. - Fix memory leaks in 'perf sched record' detected with -fsanitize=address. - Fix build when PYTHON_CONFIG is user supplied. - Capitalize topdown metrics' names in 'perf stat', so that the output, sometimes parsed, matches the Intel SDM docs. - Make sure the documentation for the save_type filter about Intel systems with Arch LBR support (12th-Gen+ client or 4th-Gen Xeon+ server) reflects recent related kernel changes. - Fix 'perf record' man page formatting of description of support to hybrid systems. - Update arm64´s KVM header from the kernel sources. * tag 'perf-tools-fixes-for-v6.0-2022-08-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf stat: Capitalize topdown metrics' names perf docs: Update the documentation for the save_type filter perf sched: Fix memory leaks in __cmd_record detected with -fsanitize=address perf record: Fix manpage formatting of description of support to hybrid systems perf test: Stat test for repeat with a weak group perf stat: Clear evsel->reset_group for each stat run tools kvm headers arm64: Update KVM header from the kernel sources perf python: Fix build when PYTHON_CONFIG is user supplied	2022-08-28 09:58:00 -07:00
Linus Torvalds	10d4879f9e	Merge tag 'thermal-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fixes from Rafael Wysocki: "Fix two issues introduced recently and one driver problem leading to a NULL pointer dereference in some cases. Specifics: - Add missing EXPORT_SYMBOL_GPL in the thermal core and add back the required 'trips' property to the thermal zone DT bindings (Daniel Lezcano) - Prevent the int340x_thermal driver from crashing when a package with a buffer of 0 length is returned by an ACPI control method evaluated by it (Lee, Chun-Yi)" * tag 'thermal-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal/int340x_thermal: handle data_vault when the value is ZERO_SIZE_PTR dt-bindings: thermal: Fix missing required property thermal/core: Add missing EXPORT_SYMBOL_GPL	2022-08-27 15:58:38 -07:00
Linus Torvalds	b98f602df7	Merge tag 'pm-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Make __resolve_freq() check the presence of the frequency table instead of checking whether or not the ->target_index() callback is implemented by the driver, because that need not be the case when __resolve_freq() is used (Lukasz Luba)" * tag 'pm-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: check only freq_table in __resolve_freq()	2022-08-27 15:53:49 -07:00
Linus Torvalds	2b1ddb5950	Merge tag 'acpi-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix issues introduced by recent changes related to the handling of ACPI device properties and a coding mistake in the exit path of the ACPI processor driver. Specifics: - Prevent acpi_thermal_cpufreq_exit() from attempting to remove the same frequency QoS request multiple times (Riwen Lu) - Fix type detection for integer ACPI device properties (Stefan Binding) - Avoid emitting false-positive warnings when processing ACPI device properties and drop the useless default case from the acpi_copy_property_array_uint() macro (Sakari Ailus)" * tag 'acpi-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: property: Remove default association from integer maximum values ACPI: property: Ignore already existing data node tags ACPI: property: Fix type detection of unified integer reading functions ACPI: processor: Remove freq Qos request for all CPUs	2022-08-27 15:47:02 -07:00
Linus Torvalds	dee1873796	Merge tag 's390-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix double free of guarded storage and runtime instrumentation control blocks on fork() failure - Fix triggering write fault when VMA does not allow VM_WRITE * tag 's390-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: do not trigger write fault when vma does not allow VM_WRITE s390: fix double free of GS and RI CBs on fork() failure	2022-08-27 15:40:51 -07:00
Linus Torvalds	05519f2480	Merge tag 'for-linus-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - two minor cleanups - a fix of the xen/privcmd driver avoiding a possible NULL dereference in an error case * tag 'for-linus-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/privcmd: fix error exit of privcmd_ioctl_dm_op() xen: move from strlcpy with unused retval to strscpy xen: x86: remove setting the obsolete config XEN_MAX_DOMAIN_MEMORY	2022-08-27 15:38:00 -07:00
Linus Torvalds	17b28d4267	Merge tag 'audit-pr-20220826' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit fix from Paul Moore: "Another small audit patch, this time to fix a bug where the return codes were not properly set before the audit filters were run, potentially resulting in missed audit records" * tag 'audit-pr-20220826' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: move audit_return_fixup before the filters	2022-08-27 15:31:12 -07:00
Linus Torvalds	89b749d855	Merge tag 'fbdev-for-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev fixes and updates from Helge Deller: "Mostly just small patches, with the exception of the bigger indenting cleanups in the sisfb and radeonfb drivers. Two patches should be mentioned though: A fix-up for fbdev if the screen resize fails (by Shigeru Yoshida), and a potential divide by zero fix in fb_pm2fb (by Letu Ren). Summary: Major fixes: - Revert the changes for fbcon console when vc_resize() fails [Shigeru Yoshida] - Avoid a potential divide by zero error in fb_pm2fb [Letu Ren] Minor fixes: - Add missing pci_disable_device() in chipsfb_pci_init() [Yang Yingliang] - Fix tests for platform_get_irq() failure in omapfb [Yu Zhe] - Destroy mutex on freeing struct fb_info in fbsysfs [Shigeru Yoshida] Cleanups: - Move fbdev drivers from strlcpy to strscpy [Wolfram Sang] - Indenting fixes, comment fixes, ... [Jiapeng Chong & Jilin Yuan]" * tag 'fbdev-for-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: fbdev: fbcon: Properly revert changes when vc_resize() failed fbdev: Move fbdev drivers from strlcpy to strscpy fbdev: omap: Remove unnecessary print function dev_err() fbdev: chipsfb: Add missing pci_disable_device() in chipsfb_pci_init() fbdev: fbcon: Destroy mutex on freeing struct fb_info fbdev: radeon: Clean up some inconsistent indenting fbdev: sisfb: Clean up some inconsistent indenting fbdev: fb_pm2fb: Avoid potential divide by zero error fbdev: ssd1307fb: Fix repeated words in comments fbdev: omapfb: Fix tests for platform_get_irq() failure	2022-08-27 09:57:58 -07:00
Mikulas Patocka	d6ffe6067a	provide arch_test_bit_acquire for architectures that define test_bit Some architectures define their own arch_test_bit and they also need arch_test_bit_acquire, otherwise they won't compile. We also clean up the code by using the generic test_bit if that is equivalent to the arch-specific version. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: `8238b45798` ("wait_on_bit: add an acquire memory barrier") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-08-27 09:49:54 -07:00
Zhengjun Xing	48648548ef	perf stat: Capitalize topdown metrics' names Capitalize topdown metrics' names to follow the intel SDM. Before: # ./perf stat -a sleep 1 Performance counter stats for 'system wide': 228,094.05 msec cpu-clock # 225.026 CPUs utilized 842 context-switches # 3.691 /sec 224 cpu-migrations # 0.982 /sec 70 page-faults # 0.307 /sec 23,164,105 cycles # 0.000 GHz 29,403,446 instructions # 1.27 insn per cycle 5,268,185 branches # 23.097 K/sec 33,239 branch-misses # 0.63% of all branches 136,248,990 slots # 597.337 K/sec 32,976,450 topdown-retiring # 24.2% retiring 4,651,918 topdown-bad-spec # 3.4% bad speculation 26,148,695 topdown-fe-bound # 19.2% frontend bound 72,515,776 topdown-be-bound # 53.2% backend bound 6,008,540 topdown-heavy-ops # 4.4% heavy operations # 19.8% light operations 3,934,049 topdown-br-mispredict # 2.9% branch mispredict # 0.5% machine clears 16,655,439 topdown-fetch-lat # 12.2% fetch latency # 7.0% fetch bandwidth 41,635,972 topdown-mem-bound # 30.5% memory bound # 22.7% Core bound 1.013634593 seconds time elapsed After: # ./perf stat -a sleep 1 Performance counter stats for 'system wide': 228,081.94 msec cpu-clock # 225.003 CPUs utilized 824 context-switches # 3.613 /sec 224 cpu-migrations # 0.982 /sec 67 page-faults # 0.294 /sec 22,647,423 cycles # 0.000 GHz 28,870,551 instructions # 1.27 insn per cycle 5,167,099 branches # 22.655 K/sec 32,383 branch-misses # 0.63% of all branches 133,411,074 slots # 584.926 K/sec 32,352,607 topdown-retiring # 24.3% Retiring 4,456,977 topdown-bad-spec # 3.3% Bad Speculation 25,626,487 topdown-fe-bound # 19.2% Frontend Bound 70,955,316 topdown-be-bound # 53.2% Backend Bound 5,834,844 topdown-heavy-ops # 4.4% Heavy Operations # 19.9% Light Operations 3,738,781 topdown-br-mispredict # 2.8% Branch Mispredict # 0.5% Machine Clears 16,286,803 topdown-fetch-lat # 12.2% Fetch Latency # 7.0% Fetch Bandwidth 40,802,069 topdown-mem-bound # 30.6% Memory Bound # 22.6% Core Bound 1.013683125 seconds time elapsed Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220825015458.3252239-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-08-27 11:55:17 -03:00

1 2 3 4 5 ...

1122280 Commits