Pull irq cleanups from Thomas Gleixner:
"A series of treewide cleanups to ensure interrupt request consistency.
- Add the missing IRQF_COND_ONESHOT flag to devm_request_irq()
This is inconsistent with request_irq() and causes the same issues
that were addressed by the introduction of this flag
- Cleanup IRQF_ONESHOT and IRQF_NO_THREAD usage
Quite a few drivers have inconsistent interrupt request flags
related to interrupt threading, namely IRQF_ONESHOT and
IRQF_NO_THREAD. This leads to warnings and/or malfunction when
forced interrupt threading is enabled.
- Remove stub primary (hard interrupt) handlers
A bunch of drivers implement a stub primary (hard interrupt)
handler which just returns IRQ_WAKE_THREAD. The same functionality
is provided by the core code when the primary handler argument of
request_threaded_irq() is set to NULL"
* tag 'irq-cleanups-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
media: pci: mg4b: Use IRQF_NO_THREAD
mfd: wm8350-core: Use IRQF_ONESHOT
thermal/qcom/lmh: Replace IRQF_ONESHOT with IRQF_NO_THREAD
rtc: amlogic-a4: Remove IRQF_ONESHOT
usb: typec: fusb302: Remove IRQF_ONESHOT
EDAC/altera: Remove IRQF_ONESHOT
char: tpm: cr50: Remove IRQF_ONESHOT
ARM: versatile: Remove IRQF_ONESHOT
scsi: efct: Use IRQF_ONESHOT and default primary handler
Bluetooth: btintel_pcie: Use IRQF_ONESHOT and default primary handler
bus: fsl-mc: Use default primary handler
mailbox: bcm-flexrm-mailbox: Use default primary handler
iommu/amd: Use core's primary handler and set IRQF_ONESHOT
platform/x86: int0002: Remove IRQF_ONESHOT from request_irq()
genirq: Set IRQF_COND_ONESHOT in devm_request_irq().
Pull block updates from Jens Axboe:
- Support for batch request processing for ublk, improving the
efficiency of the kernel/ublk server communication. This can yield
nice 7-12% performance improvements
- Support for integrity data for ublk
- Various other ublk improvements and additions, including a ton of
selftests additions and updates
- Move the handling of blk-crypto software fallback from below the
block layer to above it. This reduces the complexity of dealing with
bio splitting
- Series fixing a number of potential deadlocks in blk-mq related to
the queue usage counter and writeback throttling and rq-qos debugfs
handling
- Add an async_depth queue attribute, to resolve a performance
regression that's been around for a while related to the scheduler
depth handling
- Only use task_work for IOPOLL completions on NVMe if it is necessary
to do so. An earlier fix for an issue resulted in all these
completions being punted to task_work, to guarantee that completions
were only run for a given io_uring ring when it was local to that
ring. With the new changes, we can detect if it's necessary to use
task_work or not, and avoid it if possible.
- rnbd fixes:
- Fix refcount underflow in device unmap path
- Handle PREFLUSH and NOUNMAP flags properly in protocol
- Fix server-side bi_size for special IOs
- Zero response buffer before use
- Fix trace format for flags
- Add .release to rnbd_dev_ktype
- MD pull requests via Yu Kuai
- Fix raid5_run() to return error when log_init() fails
- Fix IO hang with degraded array with llbitmap
- Fix percpu_ref not resurrected on suspend timeout in llbitmap
- Fix GPF in write_page caused by resize race
- Fix NULL pointer dereference in process_metadata_update
- Fix hang when stopping arrays with metadata through dm-raid
- Fix any_working flag handling in raid10_sync_request
- Refactor sync/recovery code path, improve error handling for
badblocks, and remove unused recovery_disabled field
- Consolidate mddev boolean fields into mddev_flags
- Use mempool to allocate stripe_request_ctx and make sure
max_sectors is not less than io_opt in raid5
- Fix return value of mddev_trylock
- Fix memory leak in raid1_run()
- Add Li Nan as mdraid reviewer
- Move phys_vec definitions to the kernel types, mostly in preparation
for some VFIO and RDMA changes
- Improve the speed for secure erase for some devices
- Various little rust updates
- Various other minor fixes, improvements, and cleanups
* tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits)
blk-mq: ABI/sysfs-block: fix docs build warnings
selftests: ublk: organize test directories by test ID
block: decouple secure erase size limit from discard size limit
block: remove redundant kill_bdev() call in set_blocksize()
blk-mq: add documentation for new queue attribute async_depth
block, bfq: convert to use request_queue->async_depth
mq-deadline: convert to use request_queue->async_depth
kyber: convert to use request_queue->async_depth
blk-mq: add a new queue sysfs attribute async_depth
blk-mq: factor out a helper blk_mq_limit_depth()
blk-mq-sched: unify elevators checking for async requests
block: convert nr_requests to unsigned int
block: don't use strcpy to copy blockdev name
blk-mq-debugfs: warn about possible deadlock
blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs()
blk-mq-debugfs: remove blk_mq_debugfs_unregister_rqos()
blk-mq-debugfs: make blk_mq_debugfs_register_rqos() static
blk-rq-qos: fix possible debugfs_mutex deadlock
blk-mq-debugfs: factor out a helper to register debugfs for all rq_qos
blk-wbt: fix possible deadlock to nest pcpu_alloc_mutex under q_usage_counter
...
There is no added value in efct_intr_msix() compared to
irq_default_primary_handler().
Using a threaded interrupt without a dedicated primary handler mandates
the IRQF_ONESHOT flag to mask the interrupt source while the threaded
handler is active. Otherwise the interrupt can fire again before the
threaded handler has had a chance to run.
Use the default primary interrupt handler by specifying NULL and set
IRQF_ONESHOT so the interrupt source is masked until the secondary
handler is done.
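As a sketch of the shape of this change (the thread handler name and the
exact request call are assumptions, not the actual efct diff):

  /* before: stub primary handler */
  static irqreturn_t efct_intr_msix(int irq, void *handle)
  {
          return IRQ_WAKE_THREAD;
  }

  ret = request_threaded_irq(irq, efct_intr_msix, efct_intr_thread,
                             0, "efct", efct);

  /* after: let the core supply the primary handler and keep the
   * source masked until the thread finishes
   */
  ret = request_threaded_irq(irq, NULL, efct_intr_thread,
                             IRQF_ONESHOT, "efct", efct);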
Fixes: 4df84e8466 ("scsi: elx: efct: Driver initialization routines")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260128095540.863589-8-bigeasy@linutronix.de
Add a third parameter 'const struct io_comp_batch *' to the rq_end_io_fn
callback signature. This allows end_io handlers to access the completion
batch context when requests are completed via blk_mq_end_request_batch().
The io_comp_batch is passed from blk_mq_end_request_batch(), while NULL
is passed from __blk_mq_end_request() and blk_mq_put_rq_ref() which don't
have batch context.
This infrastructure change enables drivers to detect whether they're
being called from a batched completion path (like iopoll) and access
additional context stored in the io_comp_batch.
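A hedged sketch of the resulting callback shape (the handler name is
hypothetical; the return values follow the existing rq_end_io_ret
convention):

  typedef enum rq_end_io_ret (rq_end_io_fn)(struct request *rq,
                                            blk_status_t error,
                                            const struct io_comp_batch *iob);

  static enum rq_end_io_ret my_end_io(struct request *rq, blk_status_t error,
                                      const struct io_comp_batch *iob)
  {
          if (iob) {
                  /* completed via blk_mq_end_request_batch(), e.g. iopoll */
          }
          return RQ_END_IO_NONE;
  }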
Update all rq_end_io_fn implementations:
- block/blk-mq.c: blk_end_sync_rq
- block/blk-flush.c: flush_end_io, mq_flush_data_end_io
- drivers/nvme/host/ioctl.c: nvme_uring_cmd_end_io
- drivers/nvme/host/core.c: nvme_keep_alive_end_io
- drivers/nvme/host/pci.c: abort_endio, nvme_del_queue_end, nvme_del_cq_end
- drivers/nvme/target/passthru.c: nvmet_passthru_req_done
- drivers/scsi/scsi_error.c: eh_lock_door_done
- drivers/scsi/sg.c: sg_rq_end_io
- drivers/scsi/st.c: st_scsi_execute_end
- drivers/target/target_core_pscsi.c: pscsi_req_done
- drivers/md/dm-rq.c: end_clone_request
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In qla27xx_copy_fpin_pkt() and qla27xx_copy_multiple_pkt(), the frame_size
reported by firmware is used to calculate the copy length into
item->iocb. However, the iocb member is defined as a fixed-size 64-byte
array within struct purex_item.
If the reported frame_size exceeds 64 bytes, subsequent memcpy calls will
overflow the iocb member boundary. While extra memory might be allocated,
this cross-member write is unsafe and triggers warnings under
CONFIG_FORTIFY_SOURCE.
Fix this by capping total_bytes to the size of the iocb member (64 bytes)
before allocation and copying. This ensures all copies remain within the
bounds of the destination structure member.
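A minimal sketch of the cap (variable and field names are taken from the
description above, not the exact diff):

  /* cap before allocating and copying into the fixed-size member */
  total_bytes = min_t(u32, total_bytes, sizeof(item->iocb));
  memcpy(&item->iocb, pkt, total_bytes);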
Fixes: 875386b988 ("scsi: qla2xxx: Add Unsolicited LS Request and Response Support for NVMe")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com>
Link: https://patch.msgid.link/20260106205344.18031-1-jiashengjiangcool@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The fragile ordering between marking commands completed or failed so
that the error handler only wakes when the last running command
completes or times out has race conditions. These race conditions can
cause the SCSI layer to fail to wake the error handler, leaving I/O
through the SCSI host stuck as the error state cannot advance.
First, there is a memory ordering issue within scsi_dec_host_busy().
The write which clears SCMD_STATE_INFLIGHT may be reordered with reads
counting in scsi_host_busy(). While the local CPU will see its own
write, reordering can allow other CPUs in scsi_dec_host_busy() or
scsi_eh_inc_host_failed() to see a raised busy count, causing no CPU to
see a host busy equal to the host_failed count.
This race condition can be prevented with a memory barrier on the error
path to force the write to be visible before counting host busy
commands.
Second, there is a general ordering issue with scsi_eh_inc_host_failed(). By
counting busy commands before incrementing host_failed, it can race with a
final command completing in scsi_dec_host_busy(): scsi_dec_host_busy() does
not yet see host_failed incremented, while scsi_eh_inc_host_failed() counts
busy commands before SCMD_STATE_INFLIGHT is cleared by scsi_dec_host_busy(),
so neither side wakes the error handler task.
This needs the call to scsi_host_busy() to be moved after host_failed is
incremented to close the race condition.
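A simplified sketch of the two fixes (in the real code the wakeup goes
through scsi_eh_wakeup(); names here are abbreviated):

  /* scsi_dec_host_busy(), error path */
  clear_bit(SCMD_STATE_INFLIGHT, &scmd->state);
  smp_mb(); /* order the clear before counting busy commands */
  if (shost->host_failed == scsi_host_busy(shost))
          wake_up_process(shost->ehandler);

  /* scsi_eh_inc_host_failed() */
  shost->host_failed++;                             /* increment first */
  if (shost->host_failed == scsi_host_busy(shost))  /* ... then count  */
          wake_up_process(shost->ehandler);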
Fixes: 6eb045e092 ("scsi: core: avoid host-wide host_busy counter for scsi_mq")
Signed-off-by: David Jeffery <djeffery@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260113161036.6730-1-djeffery@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some low-level drivers (LLDs) access block layer crypto fields, such as
rq->crypt_keyslot and rq->crypt_ctx within `struct request`, to
configure hardware for inline encryption. However, SCSI Error Handling
(EH) commands (e.g., TEST UNIT READY, START STOP UNIT) should not
involve any encryption setup.
To prevent drivers from erroneously applying crypto settings during EH,
this patch saves the original values of rq->crypt_keyslot and
rq->crypt_ctx before an EH command is prepared via scsi_eh_prep_cmnd().
These fields in the 'struct request' are then set to NULL. The original
values are restored in scsi_eh_restore_cmnd() after the EH command
completes.
This ensures that the block layer crypto context does not leak into EH
command execution.
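A sketch of the save/restore pairing (whether these fields live in
struct scsi_eh_save exactly like this is an assumption):

  /* scsi_eh_prep_cmnd(): hide the crypto context from the LLD */
  ses->crypt_ctx = rq->crypt_ctx;
  ses->crypt_keyslot = rq->crypt_keyslot;
  rq->crypt_ctx = NULL;
  rq->crypt_keyslot = NULL;

  /* scsi_eh_restore_cmnd(): put it back once the EH command is done */
  rq->crypt_ctx = ses->crypt_ctx;
  rq->crypt_keyslot = ses->crypt_keyslot;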
Signed-off-by: Brian Kao <powenkao@google.com>
Link: https://patch.msgid.link/20251218031726.2642834-1-powenkao@google.com
Cc: stable@vger.kernel.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A race condition was found in sg_proc_debug_helper(). It was observed on
a system using an IBM LTO-9 SAS Tape Drive (ULTRIUM-TD9) and monitoring
/proc/scsi/sg/debug every second. A very large elapsed time would
sometimes appear. This is caused by two race conditions.
We reproduced the issue with an IBM ULTRIUM-HH9 tape drive on an x86_64
architecture. A patched kernel was built, and the race condition could
not be observed anymore after the application of this patch. A
reproducer C program utilising the scsi_debug module was also built by
Changhui Zhong and can be viewed here:
https://github.com/MichaelRabek/linux-tests/blob/master/drivers/scsi/sg/sg_race_trigger.c
The first race happens between the reading of hp->duration in
sg_proc_debug_helper() and request completion in sg_rq_end_io(). The
hp->duration member variable may hold either of two types of
information:
#1 - The start time of the request. This value is present while
the request is not yet finished.
#2 - The total execution time of the request (end_time - start_time).
If sg_proc_debug_helper() executes *after* the value of hp->duration was
changed from #1 to #2, but *before* srp->done is set to 1 in
sg_rq_end_io(), a fresh timestamp is taken in the else branch, and the
elapsed time (value type #2) is subtracted from a timestamp, which
cannot yield a valid elapsed time (which is a type #2 value as well).
To fix this issue, the value of hp->duration must change under the
protection of the sfp->rq_list_lock in sg_rq_end_io(). Since
sg_proc_debug_helper() takes this read lock, the change to srp->done and
srp->header.duration will happen atomically from the perspective of
sg_proc_debug_helper() and the race condition is thus eliminated.
The second race condition happens between sg_proc_debug_helper() and
sg_new_write(). Even though hp->duration is set to the current time
stamp in sg_add_request() under the write lock's protection, it gets
overwritten by a call to get_sg_io_hdr(), which calls copy_from_user()
to copy struct sg_io_hdr from userspace into kernel space. hp->duration
is set to the start time again in sg_common_write(). If
sg_proc_debug_helper() is called between these two calls, an arbitrary
value set by userspace (usually zero) is used to compute the elapsed
time.
To fix this issue, hp->duration must be set to the current timestamp
again after get_sg_io_hdr() returns successfully. A small race window
still exists between get_sg_io_hdr() and setting hp->duration, but this
window is only a few instructions wide and does not result in observable
issues in practice, as confirmed by testing.
Additionally, we fix the format specifier from %d to %u for printing
unsigned int values in sg_proc_debug_helper().
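A sketch of both fixes (lock and field names come from the description;
the exact diff may differ):

  /* sg_rq_end_io(): flip #1 -> #2 and set done under the same lock
   * that sg_proc_debug_helper() takes for reading
   */
  write_lock_irqsave(&sfp->rq_list_lock, iflags);
  srp->header.duration = jiffies_to_msecs(jiffies) -
                         srp->header.duration;
  srp->done = 1;
  write_unlock_irqrestore(&sfp->rq_list_lock, iflags);

  /* sg_new_write(): restore the start time that copy_from_user()
   * inside get_sg_io_hdr() overwrote
   */
  hp->duration = jiffies_to_msecs(jiffies);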
Signed-off-by: Michal Rábek <mrabek@redhat.com>
Suggested-by: Tomas Henzl <thenzl@redhat.com>
Tested-by: Changhui Zhong <czhong@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Link: https://patch.msgid.link/20251212160900.64924-1-mrabek@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull SCSI fixes from James Bottomley:
"The only core fix is in doc; all the others are in drivers, with the
biggest impacts in libsas being the rollback on error handling and in
ufs coming from a couple of error handling fixes, one causing a crash
if it's activated before scanning and the other fixing W-LUN
resumption"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: qcom: Fix confusing cleanup.h syntax
scsi: libsas: Add rollback handling when an error occurs
scsi: device_handler: Return error pointer in scsi_dh_attached_handler_name()
scsi: ufs: core: Fix a deadlock in the frequency scaling code
scsi: ufs: core: Fix an error handler crash
scsi: Revert "scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed"
scsi: ufs: core: Fix RPMB link error by reversing Kconfig dependencies
scsi: qla4xxx: Use time conversion macros
scsi: qla2xxx: Enable/disable IRQD_NO_BALANCING during reset
scsi: ipr: Enable/disable IRQD_NO_BALANCING during reset
scsi: imm: Fix use-after-free bug caused by unfinished delayed work
scsi: target: sbp: Remove KMSG_COMPONENT macro
scsi: core: Correct documentation for scsi_device_quiesce()
scsi: mpi3mr: Prevent duplicate SAS/SATA device entries in channel 1
scsi: target: Reset t_task_cdb pointer in error case
scsi: ufs: core: Fix EH failure after W-LUN resume error
If scsi_dh_attached_handler_name() fails to allocate the handler name,
dm-multipath (its only caller) assumes there is no attached device
handler, and sets the device up incorrectly. Return an error pointer
instead, so multipath can distinguish between allocation failure, success
with no attached device handler, and the case where the path device is
not a SCSI device at all.
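A sketch of how the caller can now tell the cases apart (the exact
error-pointer mapping is an assumption based on the description):

  handler_name = scsi_dh_attached_handler_name(q, GFP_KERNEL);
  if (IS_ERR(handler_name))
          return PTR_ERR(handler_name); /* allocation failure / not SCSI */
  if (!handler_name)
          return 0;                     /* no attached device handler */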
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Link: https://patch.msgid.link/20251206010015.1595225-1-bmarzins@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull block updates from Jens Axboe:
"Followup set of fixes and updates for block for the 6.19 merge window.
NVMe had some last-minute debates which led to dropping some patches
from that tree, which is why the initial PR didn't have NVMe included.
It's here now. This pull request contains:
- NVMe pull request via Keith:
- Subsystem usage cleanups (Max)
- Endpoint device fixes (Shin'ichiro)
- Debug statements (Gerd)
- FC fabrics cleanups and fixes (Daniel)
- Consistent alloc API usages (Israel)
- Code comment updates (Chu)
- Authentication retry fix (Justin)
- Fix a memory leak in the discard ioctl code, if the task is being
interrupted by a signal at just the wrong time
- Zoned write plugging fixes
- Add ioctls for persistent reservations
- Enable per-cpu bio caching by default
- Various little fixes and tweaks"
* tag 'block-6.19-20251208' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (27 commits)
nvme-fabrics: add ENOKEY to no retry criteria for authentication failures
nvme-auth: use kvfree() for memory allocated with kvcalloc()
nvmet-tcp: use kvcalloc for commands array
nvmet-rdma: use kvcalloc for commands and responses arrays
nvme: fix typo error in nvme target
nvmet-fc: use pr_* print macros instead of dev_*
nvmet-fcloop: remove unused lsdir member.
nvmet-fcloop: check all request and response have been processed
nvme-fc: check all request and response have been processed
block: fix memory leak in __blkdev_issue_zero_pages
block: fix comment for op_is_zone_mgmt() to include RESET_ALL
block: Clear BLK_ZONE_WPLUG_PLUGGED when aborting plugged BIOs
blk-mq: Abort suspend when wakeup events are pending
blk-mq: add blk_rq_nr_bvec() helper
block: add IOC_PR_READ_RESERVATION ioctl
block: add IOC_PR_READ_KEYS ioctl
nvme: reject invalid pr_read_keys() num_keys values
scsi: sd: reject invalid pr_read_keys() num_keys values
block: enable per-cpu bio cache by default
block: use bio_alloc_bioset for passthru IO by default
...
Pull SCSI updates from James Bottomley:
"Usual driver updates (ufs, lpfc, target, qla2xxx) plus assorted
cleanups and fixes including the WQ_PERCPU series.
The biggest core change is the new allocation of pseudo-devices which
allow the sending of internal commands to a given SCSI target"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (147 commits)
scsi: MAINTAINERS: Add the UFS include directory
scsi: scsi_debug: Support injecting unaligned write errors
scsi: qla2xxx: Fix improper freeing of purex item
scsi: ufs: rockchip: Fix compile error without CONFIG_GPIOLIB
scsi: ufs: rockchip: Reset controller on PRE_CHANGE of hce enable notify
scsi: ufs: core: Use scsi_device_busy()
scsi: ufs: core: Fix single doorbell mode support
scsi: pm80xx: Add WQ_PERCPU to alloc_workqueue() users
scsi: target: Add WQ_PERCPU to alloc_workqueue() users
scsi: qedi: Add WQ_PERCPU to alloc_workqueue() users
scsi: target: ibmvscsi: Add WQ_PERCPU to alloc_workqueue() users
scsi: qedf: Add WQ_PERCPU to alloc_workqueue() users
scsi: bnx2fc: Add WQ_PERCPU to alloc_workqueue() users
scsi: be2iscsi: Add WQ_PERCPU to alloc_workqueue() users
scsi: message: fusion: Add WQ_PERCPU to alloc_workqueue() users
scsi: lpfc: WQ_PERCPU added to alloc_workqueue() users
scsi: scsi_transport_fc: WQ_PERCPU added to alloc_workqueue() users
scsi: scsi_dh_alua: WQ_PERCPU added to alloc_workqueue() users
scsi: qla2xxx: WQ_PERCPU added to alloc_workqueue() users
scsi: target: sbp: Replace use of system_unbound_wq with system_dfl_wq
...
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms (Dan
Williams)
- Switch vmd from custom domain number allocator to the common
allocator to prevent a potential race with new non-VMD buses (Dan
Williams)
- Enable Precision Time Measurement (PTM) only if device advertises
support for a relevant role, to prevent invalid PTM Requests that
cause ACS violations that are reported as AER Uncorrectable
Non-Fatal errors (Mika Westerberg)
Resource management:
- Prevent resource tree corruption when BAR resize fails (Ilpo
Järvinen)
- Restore BARs to the original size if a BAR resize fails (Ilpo
Järvinen)
- Remove BAR release from BAR resize attempts by the xe, i915, and
amdgpu drivers so the PCI core can restore BARs if the resize fails
(Ilpo Järvinen)
- Move Resizable BAR code to rebar.c (Ilpo Järvinen)
- Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo
Järvinen)
- Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo
Järvinen)
Power management and error handling:
- For drivers using PCI legacy suspend, save config state at suspend
so that state (not any earlier state from enumeration, probe, or
error recovery) will be restored when resuming (Lukas Wunner)
- For devices with no driver or a driver that lacks power management,
save config state at hibernate so that state (not any earlier state
from enumeration, probe, or error recovery) will be restored when
resuming (Lukas Wunner)
- Save device config space on device addition, before driver binding,
so error recovery works more reliably (Lukas Wunner)
- Drop pci_save_state() from several drivers that no longer need it
since the PCI core always does it and pci_restore_state() no longer
invalidates the saved state (Lukas Wunner)
- Document use of pci_save_state() by drivers to capture the state
they want restored during error recovery (Lukas Wunner)
Power control:
- Add a struct pci_ops.assert_perst() function pointer to
assert/deassert PCIe PERST# and implement it for the qcom driver
(Krishna Chaitanya Chundru)
- Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe
switch, which must be held in reset after poweron so the pwrctrl
driver can configure the switch via I2C before bringing up the
links (Krishna Chaitanya Chundru)
Endpoint framework:
- Convert the endpoint doorbell test to use a threaded IRQ to fix a
'sleeping while atomic' issue (Bhanu Seshu Kumar Valluri)
- Add endpoint VNTB MSI doorbell support to reduce latency between
host and endpoint (Frank Li)
New native PCIe controller drivers:
- Add CIX Sky1 host controller DT binding and driver (Hans Zhang)
- Add NXP S32G host controller DT binding and driver (Vincent
Guittot)
- Add Renesas RZ/G3S host controller DT binding and driver (Claudiu
Beznea)
- Add SpacemiT K1 host controller DT binding and driver (Alex Elder)
Amlogic Meson PCIe controller driver:
- Update DT binding to name DBI region 'dbi', not 'elbi', and update
driver to support both (Manivannan Sadhasivam)
Apple PCIe controller driver:
- Move struct pci_host_bridge allocation from pci_host_common_init()
to callers, which significantly simplifies pcie-apple (Marc
Zyngier)
Broadcom STB PCIe controller driver:
- Disable advertising ASPM L0s support correctly (Jim Quinlan)
- Add a panic/die handler to print diagnostic info in case PCIe
caused an unrecoverable abort (Jim Quinlan)
Cadence PCIe controller driver:
- Add module support for Cadence platform host and endpoint
controller driver (Manikandan K Pillai)
- Split headers into 'legacy' (LGA) and 'high perf' (HPA) to prepare
for new CIX Sky1 driver (Manikandan K Pillai)
MediaTek PCIe controller driver:
- Convert DT binding to YAML schema (Christian Marangi)
- Add Airoha AN7583 DT compatible and driver support (Christian
Marangi)
Qualcomm PCIe controller driver:
- Add Qualcomm Kaanapali to SM8550 DT binding (Qiang Yu)
- Add required 'power-domains' and 'resets' to qcom sa8775p, sc7280,
sc8280xp, sm8150, sm8250, sm8350, sm8450, sm8550, x1e80100 DT
schemas (Krzysztof Kozlowski)
- Look up OPP using both frequency and data rate (not just frequency)
so RPMh votes can account for both (Krishna Chaitanya Chundru)
Rockchip DesignWare PCIe controller driver:
- Add Rockchip RK3528 compatible strings in DT binding (Yao Zi)
STMicroelectronics STM32MP25 PCIe controller driver:
- Fix a race between link training and endpoint register
initialization (Christian Bruel)
- Align endpoint allocations to match the ATU requirements (Christian
Bruel)
Synopsys DesignWare PCIe controller driver:
- Clear L1 PM Substate Capability 'Supported' bits unless glue driver
says it's supported, which prevents users from enabling non-working
L1SS. Currently only qcom and tegra194 support L1SS (Bjorn Helgaas)
- Remove now-superfluous L1SS disable code from tegra194 (Bjorn
Helgaas)
- Configure L1SS support in dw-rockchip when DT says
'supports-clkreq' (Shawn Lin)
TI Keystone PCIe controller driver:
- Fail the probe instead of silently succeeding if ks_pcie_of_data
didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli)
- Make keystone buildable as a loadable module, except on ARM32 where
hook_fault_code() is __init (Siddharth Vadapalli)"
* tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (100 commits)
MAINTAINERS: Add Manivannan Sadhasivam as PCI/pwrctrl maintainer
MAINTAINERS: Add CIX Sky1 PCIe controller driver maintainer
PCI: sky1: Add PCIe host support for CIX Sky1
dt-bindings: PCI: Add CIX Sky1 PCIe Root Complex bindings
PCI: cadence: Add support for High Perf Architecture (HPA) controller
MAINTAINERS: Add NXP S32G PCIe controller driver maintainer
PCI: s32g: Add NXP S32G PCIe controller driver (RC)
PCI: dwc: Add register and bitfield definitions
dt-bindings: PCI: s32g: Add NXP S32G PCIe controller
PCI: Add Renesas RZ/G3S host controller driver
PCI: host-generic: Move bridge allocation outside of pci_host_common_init()
dt-bindings: PCI: Add Renesas RZ/G3S PCIe controller binding
PCI: Validate pci_rebar_size_supported() input
Documentation: PCI: Amend error recovery doc with pci_save_state() rules
treewide: Drop pci_save_state() after pci_restore_state()
PCI/ERR: Ensure error recoverability at all times
PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths
PCI: dw-rockchip: Configure L1SS support
PCI: tegra194: Remove unnecessary L1SS disable code
...
The pr_read_keys() interface has a u32 num_keys parameter. The SCSI
PERSISTENT RESERVE IN command has a maximum READ KEYS service action
size of 65536 bytes. Reject num_keys values that are too large to fit
into the SCSI command.
This will become important when pr_read_keys() is exposed to untrusted
userspace via an <linux/pr.h> ioctl.
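A sketch of the bound (the 8-byte READ KEYS header and 8 bytes per key
follow SPC; the exact check and errno in the patch may differ):

  /* 65536-byte max transfer: 8-byte header plus 8 bytes per key */
  if (num_keys > (65536 - 8) / 8)
          return -EINVAL;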
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull block updates from Jens Axboe:
- Fix head insertion for mq-deadline, a regression from when priority
support was added
- Series simplifying and improving the ublk user copy code
- Various ublk related cleanups
- Fixup REQ_NOWAIT handling in loop/zloop, clearing NOWAIT when the
request is punted to a thread for handling
- Merge and then later revert loop dio nowait support, as it ended up
causing excessive stack usage when the inline issue code needs to
dip back into the full file system code
- Improve auto integrity code, making it less deadlock prone
- Speed up polled IO handling by manually managing the hctx lookups
- Fixes for blk-throttle for SSD devices
- Small series with fixes for the S390 dasd driver
- Add support for caching zones, avoiding unnecessary report zone
queries
- MD pull requests via Yu:
- fix null-ptr-dereference regression for dm-raid0
- fix IO hang for raid5 when array is broken with IO inflight
- remove legacy 1s delay to speed up system shutdown
- change maintainer's email address
- data can be lost if an array is created from devices with different
logical block sizes (lbs); fix this problem and record the lbs of the
array in metadata
- fix rcu protection for md_thread
- fix mddev kobject lifetime regression
- enable atomic writes for md-linear
- some cleanups
- bcache updates via Coly
- remove useless discard and cache device code
- improve usage of per-cpu workqueues
- Reorganize the IO scheduler switching code, fixing some lockdep
reports as well
- Improve the block layer P2P DMA support
- Add support to the block tracing code for zoned devices
- Segment calculation improvements, and memory alignment flexibility
improvements
- Set of prep and cleanup patches for ublk batching support. The
actual batching hasn't been added yet, but helps shrink down the
workload of getting that patchset ready for 6.20
- Fix for how the ps3 block driver handles segment offsets
- Improve how block plugging handles batch tag allocations
- nbd fixes for use-after-free of the configuration on device clear/put
- Set of improvements and fixes for zloop
- Add Damien as maintainer of the block zoned device code handling
- Various other fixes and cleanups
* tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits)
block/rnbd: correct all kernel-doc complaints
blk-mq: use queue_hctx in blk_mq_map_queue_type
md: remove legacy 1s delay in md_notify_reboot
md/raid5: fix IO hang when array is broken with IO inflight
md: warn about updating super block failure
md/raid0: fix NULL pointer dereference in create_strip_zones() for dm-raid
sbitmap: fix all kernel-doc warnings
ublk: add helper of __ublk_fetch()
ublk: pass const pointer to ublk_queue_is_zoned()
ublk: refactor auto buffer register in ublk_dispatch_req()
ublk: add `union ublk_io_buf` with improved naming
ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg()
kfifo: add kfifo_alloc_node() helper for NUMA awareness
blk-mq: fix potential uaf for 'queue_hw_ctx'
blk-mq: use array manage hctx map instead of xarray
ublk: prevent invalid access with DEBUG
s390/dasd: Use scnprintf() instead of sprintf()
s390/dasd: Move device name formatting into separate function
s390/dasd: Remove unnecessary debugfs_create() return checks
s390/dasd: Fix gendisk parent after copy pair swap
...
Pull printk updates from Petr Mladek:
- Allow creating nbcon console drivers with an unsafe write_atomic()
callback that can only be called by the final nbcon_atomic_flush_unsafe().
Otherwise, the driver would rely on the kthread.
It is going to be used as a best-effort approach for an
experimental nbcon netconsole driver, see
https://lore.kernel.org/r/20251121-nbcon-v1-2-503d17b2b4af@debian.org
Note that a safe .write_atomic() callback is supposed to work in NMI
context. But some networking drivers are not safe even in IRQ
context:
https://lore.kernel.org/r/oc46gdpmmlly5o44obvmoatfqo5bhpgv7pabpvb6sjuqioymcg@gjsma3ghoz35
In an ideal world, all networking drivers would be fixed first and
the atomic flush would be blocked only in NMI context. But this raises
the question of how reliable networking drivers are when the system is
in a bad state. They might block flushing more reliable serial
consoles which are more suitable for serious debugging anyway.
- Allow using the last 4 bytes of the printk ring buffer.
- Prevent queuing IRQ work and block printk kthreads when consoles are
suspended. Otherwise, they create unnecessary churn or even block
the suspend.
- Release console_lock() between each record in the kthread used for
legacy consoles on RT. It might significantly speed up the boot.
- Release nbcon context between each record in the atomic flush. It
prevents stalls of the related printk kthread after it has lost the
ownership in the middle of a record
- Add support for NBCON consoles into KDB
- Add %ptSp modifier for printing struct timespec64 and use it where
possible (see the sketch after this list)
- Misc code clean up
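A minimal usage sketch of the new modifier (the call shape and output
layout are assumptions based on the description above):

  struct timespec64 ts = ktime_to_timespec64(ktime_get());

  pr_info("uptime: %ptSp\n", &ts);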
* tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (48 commits)
printk: Use console_is_usable on console_unblank
arch: um: kmsg_dump: Use console_is_usable
drivers: serial: kgdboc: Drop checks for CON_ENABLED and CON_BOOT
lib/vsprintf: Unify FORMAT_STATE_NUM handlers
printk: Avoid irq_work for printk_deferred() on suspend
printk: Avoid scheduling irq_work on suspend
printk: Allow printk_trigger_flush() to flush all types
tracing: Switch to use %ptSp
scsi: snic: Switch to use %ptSp
scsi: fnic: Switch to use %ptSp
s390/dasd: Switch to use %ptSp
ptp: ocp: Switch to use %ptSp
pps: Switch to use %ptSp
PCI: epf-test: Switch to use %ptSp
net: dsa: sja1105: Switch to use %ptSp
mmc: mmc_test: Switch to use %ptSp
media: av7110: Switch to use %ptSp
ipmi: Switch to use %ptSp
igb: Switch to use %ptSp
e1000e: Switch to use %ptSp
...
This reverts commit ab2068a6fb.
When probing an exp-attached SATA device, libsas/libata will issue a
hard reset in sas_probe_sata() -> ata_sas_async_probe(). A broadcast
event is then received after the disk probe fails, and this commit
causes the probe to be re-executed on the disk, so a faulty disk may
get into an indefinite probe loop.
Therefore, revert this commit, although it does fix some transient
disk probe failures.
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Link: https://patch.msgid.link/20251202065627.140361-1-yangxingui@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A dynamic remove/add storage adapter test hits EEH on PowerPC:
EEH: [c00000000004f77c] __eeh_send_failure_event+0x7c/0x160
EEH: [c000000000048464] eeh_dev_check_failure.part.0+0x254/0x660
EEH: [c000000000934e0c] __pci_read_msi_msg+0x1ac/0x280
EEH: [c000000000100f68] pseries_msi_compose_msg+0x28/0x40
EEH: [c00000000020e1cc] irq_chip_compose_msi_msg+0x5c/0x90
EEH: [c000000000214b1c] msi_domain_set_affinity+0xbc/0x100
EEH: [c000000000206be4] irq_do_set_affinity+0x214/0x2c0
EEH: [c000000000206e04] irq_set_affinity_locked+0x174/0x230
EEH: [c000000000207044] irq_set_affinity+0x64/0xa0
EEH: [c000000000212890] write_irq_affinity.constprop.0.isra.0+0x130/0x150
EEH: [c00000000068868c] proc_reg_write+0xfc/0x160
EEH: [c0000000005adb48] vfs_write+0xf8/0x4e0
EEH: [c0000000005ae234] ksys_write+0x84/0x140
EEH: [c00000000002e994] system_call_exception+0x164/0x310
EEH: [c00000000000bfe8] system_call_vectored_common+0xe8/0x278
The irqbalance daemon kicks in before invoking qla2xxx->slot_reset
during the EEH recovery process.
irqbalance daemon
->irq_set_affinity()
->msi_domain_set_affinity()
->irq_chip_set_affinity_parent()
->xive_irq_set_affinity()
->pseries_msi_compose_msg()
->__pci_read_msi_msg()
->irq_chip_compose_msi_msg()
In __pci_read_msi_msg(), the first MSI-X vector is set to all Fs by the
irqbalance daemon:
  pci_write_msg_msix: index=0, lo=ffffffff hi=fffffff
IRQ balancing is not required during adapter reset.
Enable "IRQ_NO_BALANCING" bit before starting adapter reset and disable
it calling pci_restore_state(). The irqbalance daemon is disabled for
this short period of time (~2s).
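A hedged sketch using the generic genirq helper (whether the drivers
toggle the flag exactly this way is an assumption):

  irq_modify_status(irq, 0, IRQ_NO_BALANCING);  /* before the reset */
  /* ... adapter reset, pci_restore_state() ... */
  irq_modify_status(irq, IRQ_NO_BALANCING, 0);  /* balancing allowed again */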
Co-developed-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>
Signed-off-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>
Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com>
Link: https://patch.msgid.link/20251028142427.3969819-3-wenxiong@linux.ibm.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The delayed work item 'imm_tq' is initialized in imm_attach() and
scheduled via imm_queuecommand() for processing SCSI commands. When the
IMM parallel port SCSI host adapter is detached through imm_detach(),
the imm_struct device instance is deallocated.
However, the delayed work might still be pending or executing
when imm_detach() is called, leading to use-after-free bugs
when the work function imm_interrupt() accesses the already
freed imm_struct memory.
The race condition can occur as follows:
CPU 0 (detach thread)         | CPU 1
                              | imm_queuecommand()
                              |   imm_queuecommand_lck()
imm_detach()                  |     schedule_delayed_work()
  kfree(dev) //FREE           | imm_interrupt()
                              |   dev = container_of(...) //USE
                              |   dev->... //USE
Add disable_delayed_work_sync() in imm_detach() to guarantee proper
cancellation of the delayed work item before imm_struct is deallocated.
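A minimal sketch of the resulting ordering in imm_detach():

  disable_delayed_work_sync(&dev->imm_tq); /* wait out imm_interrupt() */
  kfree(dev);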
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://patch.msgid.link/20251028100149.40721-1-duoming@zju.edu.cn
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Merge updates related to system suspend and hibernation for 6.19-rc1:
- Replace snprintf() with scnprintf() in show_trace_dev_match()
(Kaushlendra Kumar)
- Fix memory allocation error handling in pm_vt_switch_required()
(Malaya Kumar Rout)
- Introduce CALL_PM_OP() macro and use it to simplify code in
generic PM operations (Kaushlendra Kumar)
- Add module param to backtrace all CPUs in the device power management
watchdog (Sergey Senozhatsky)
- Rework message printing in swsusp_save() (Rafael Wysocki)
- Make it possible to change the number of hibernation compression
threads (Xueqin Luo)
- Clarify that only cgroup1 freezer uses PM freezer (Tejun Heo)
- Add document on debugging shutdown hangs to PM documentation and
correct a mistaken configuration option in it (Mario Limonciello)
- Shut down wakeup source timer before removing the wakeup source from
the list (Kaushlendra Kumar, Rafael Wysocki)
- Introduce new PMSG_POWEROFF event for system shutdown handling with
the help of PM device callbacks (Mario Limonciello)
- Make pm_test delay interruptible by wakeup events (Riwen Lu)
- Clean up kernel-doc comment style usage in the core hibernation
code and remove unhelpful comments from it (Sunday Adelodun, Rafael
Wysocki)
- Add support for handling wakeup events and aborting the suspend
process while it is syncing file systems (Samuel Wu, Rafael Wysocki)
* pm-sleep: (21 commits)
PM: hibernate: Extra cleanup of comments in swap handling code
PM: sleep: Call pm_sleep_fs_sync() instead of ksys_sync_helper()
PM: sleep: Add support for wakeup during filesystem sync
PM: hibernate: Clean up kernel-doc comment style usage
PM: suspend: Make pm_test delay interruptible by wakeup events
usb: sl811-hcd: Add PM_EVENT_POWEROFF into suspend callbacks
scsi: Add PM_EVENT_POWEROFF into suspend callbacks
PM: Introduce new PMSG_POWEROFF event
PM: wakeup: Update after recent wakeup source removal ordering change
PM: wakeup: Delete timer before removing wakeup source from list
Documentation: power: Correct a mistaken configuration option
Documentation: power: Add document on debugging shutdown hangs
freezer: Clarify that only cgroup1 freezer uses PM freezer
PM: hibernate: add sysfs interface for hibernate_compression_threads
PM: hibernate: make compression threads configurable
PM: hibernate: dynamically allocate crc->unc_len/unc for configurable threads
PM: hibernate: Rework message printing in swsusp_save()
PM: dpm_watchdog: add module param to backtrace all CPUs
PM: sleep: Introduce CALL_PM_OP() macro to simplify code
PM: console: Fix memory allocation error handling in pm_vt_switch_required()
...
In 2009, commit c82f63e411 ("PCI: check saved state before restore")
changed the behavior of pci_restore_state() such that it became necessary
to call pci_save_state() afterwards, lest recovery from subsequent PCI
errors fail.
The commit has just been reverted and so all the pci_save_state() after
pci_restore_state() calls that have accumulated in the tree are now
superfluous. Drop them.
Two drivers chose a different approach to achieve the same result:
drivers/scsi/ipr.c and drivers/net/ethernet/intel/e1000e/netdev.c set the
pci_dev's "state_saved" flag to true before calling pci_restore_state().
Drop this as well.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> # qat
Link: https://patch.msgid.link/c2b28cc4defa1b743cf1dedee23c455be98b397a.1760274044.git.lukas@wunner.de
In qla2xxx_process_purls_iocb(), an item is allocated via
qla27xx_copy_multiple_pkt(), which internally calls
qla24xx_alloc_purex_item().
The qla24xx_alloc_purex_item() function may return a pre-allocated item
from a per-adapter pool for small allocations, instead of dynamically
allocating memory with kzalloc().
An error handling path in qla2xxx_process_purls_iocb() incorrectly uses
kfree() to release the item. If the item was from the pre-allocated
pool, calling kfree() on it is a bug that can lead to memory corruption.
Fix this by using the correct deallocation function,
qla24xx_free_purex_item(), which properly handles both dynamically
allocated and pre-allocated items.
Fixes: 875386b988 ("scsi: qla2xxx: Add Unsolicited LS Request and Response Support for NVMe")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com>
Link: https://patch.msgid.link/20251113151246.762510-1-zilin@seu.edu.cn
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Marco Crivellari <marco.crivellari@suse.com> says:
Hi,
=== Current situation: problems ===
Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, while for !WQ_UNBOUND the local CPU is
selected. This leads to different scenarios when a work item is scheduled
on an isolated CPU, depending on whether the "delay" value is 0 or greater
than 0:
schedule_delayed_work(, 0);
This will be handled by __queue_work(), which will queue the work item on
the current local (isolated) CPU, while:
schedule_delayed_work(, 1);
Will move the timer to a housekeeping CPU, and schedule the work there.
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
=== Recent changes to the WQ API ===
The following, address the recent changes in the Workqueue API:
- commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
- commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
The old workqueues will be removed in a future release cycle.
=== Introduced Changes by this series ===
1) [P 1] Replace uses of system_wq and system_unbound_wq
system_unbound_wq is meant for work where locality is not
required; it has therefore been replaced with system_dfl_wq to
make that intent explicit.
2) [P 2-3-4] WQ_PERCPU added to alloc_workqueue()
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
Thanks!
Link: https://patch.msgid.link/20251031095643.74246-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed
without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
This adds a new WQ_PERCPU flag to explicitly request alloc_workqueue()
to be per-cpu when WQ_UNBOUND has not been specified.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
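As a sketch, a caller that previously relied on the implicit per-cpu
default now states it explicitly ("foo_wq" and the companion flag are
placeholders):

  wq = alloc_workqueue("foo_wq", WQ_MEM_RECLAIM | WQ_PERCPU, 0);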
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107155257.316728-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed
without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107151618.281250-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed
without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107150542.271229-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed
without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107150155.267651-3-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed
without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107150155.267651-2-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed
without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251107144949.256894-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND. This lack of consistency cannot be
addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
This patch continues the effort to refactor workqueue APIs, which
began with the changes introducing new workqueues and a new
alloc_workqueue flag:
commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Link: https://patch.msgid.link/20251104110808.123424-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Link: https://patch.msgid.link/20251031095643.74246-5-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>