We needed fake nodes in __io_run_local_work() and to avoid unecessary wake
ups while the task already running task_works, but we don't need them
anymore since wake ups are protected by cq_waiting, which is always
cleared by the time we're executing deferred task_work items.
Note that because of loose sync around cq_waiting clearing
io_req_local_work_add() may wake the task more than once, but that's
fine and should be rare to not hurt perf.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8839534891f0a2f1076e78554a31ea7e099f7de5.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Flush completions is done either from the submit syscall or by the
task_work, both are in the context of the submitter task, and when it
goes for a single threaded rings like implied by ->task_complete, there
won't be any waiters on ->cq_wait but the master task. That means that
there can be no tasks sleeping on cq_wait while we run
__io_submit_flush_completions() and so waking up can be skipped.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/60ad9768ec74435a0ddaa6eec0ffa7729474f69f.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Even though io_poll_wq_wake()'s waitqueue_active reuses a barrier we do
for another waitqueue, it's not going to be the case in the future and
so we want to have a fast path for it when the ring has never been
polled.
Move poll_wq wake ups into __io_commit_cqring_flush() using a new flag
called ->poll_activated. The idea behind the flag is to set it when the
ring was polled for the first time. This requires additional sync to not
miss events, which is done here by using task_work for ->task_complete
rings, and by default enabling the flag for all other types of rings.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/060785e8e9137a920b232c0c7f575b131af19cac.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch adds a new flag (IORING_MSG_RING_FLAGS_PASS) in the message
ring operations (IORING_OP_MSG_RING). This new flag enables the sender
to specify custom flags, which will be copied over to cqe->flags in the
receiving ring. These custom flags should be specified using the
sqe->file_index field.
This mechanism provides additional flexibility when sending messages
between rings.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20230103160507.617416-1-leitao@debian.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_cqring_wait_schedule() is called after we started waiting on the cq
wq and set the state to TASK_INTERRUPTIBLE, for that reason we have to
constantly worry whether we has returned the state back to running or
not. Leave only quick checks in io_cqring_wait_schedule() and move the
rest including running task work to the callers. Note, we run tw in the
loop after the sched checks because of the fast path in the beginning of
the function.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2814fabe75e2e019e7ca43ea07daa94564349805.1672916894.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull irq fix from Borislav Petkov:
- Cleanup the firmware node for the new IRQ MSI domain properly, to
avoid leaking memory
* tag 'irq_urgent_for_v6.2_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/msi: Free the fwnode created by msi_create_device_irq_domain()
Pull x86 fixes from Borislav Petkov:
- Start checking for -mindirect-branch-cs-prefix clang support too now
that LLVM 16 will support it
- Fix a NULL ptr deref when suspending with Xen PV
- Have a SEV-SNP guest check explicitly for features enabled by the
hypervisor and fail gracefully if some are unsupported by the guest
instead of failing in a non-obvious and hard-to-debug way
- Fix a MSI descriptor leakage under Xen
- Mark Xen's MSI domain as supporting MSI-X
- Prevent legacy PIC interrupts from being resent in software by
marking them level triggered, as they should be, which lead to a NULL
ptr deref
* tag 'x86_urgent_for_v6.2_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build: Move '-mindirect-branch-cs-prefix' out of GCC-only block
acpi: Fix suspend with Xen PV
x86/sev: Add SEV-SNP guest feature negotiation support
x86/pci/xen: Fixup fallout from the PCI/MSI overhaul
x86/pci/xen: Set MSI_FLAG_PCI_MSIX support in Xen MSI domain
x86/i8259: Mark legacy PIC interrupts with IRQ_LEVEL
Pull input fixes from Dmitry Torokhov:
- touchpads on HP 15-* laptops switched back to PS/2 emulation mode
- a quirk for Clevo PCX0DX/TUXEDO XP1511 to make sure keyboard is
responding after resume
* tag 'input-for-v6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: i8042 - add Clevo PCX0DX to i8042 quirk table
Revert "Input: synaptics - switch touchpad on HP Laptop 15-da3001TU to RMI mode"
Pull cxl fixes from Dan Williams:
"A couple of fixes for bugs introduced during the merge window. One is
a regression, the other was a bug in the CXL AER handler:
- Fix a crash regression due to module load order of cxl_pmem.ko
- Fix wrong register offset read in CXL AER handling path"
* tag 'cxl-fixes-for-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/pmem: Fix nvdimm unregistration when cxl_pmem driver is absent
cxl: fix cxl_report_and_clear() RAS UE addr mis-assignment
This reverts commit 7efc3b7261.
We have got openSUSE reports (Link 1) for 6.1 kernel with khugepaged
stalling CPU for long periods of time. Investigation of tracepoint data
shows that compaction is stuck in repeating fast_find_migrateblock()
based migrate page isolation, and then fails to migrate all isolated
pages.
Commit 7efc3b7261 ("mm/compaction: fix set skip in fast_find_migrateblock")
was suspected as it was merged in 6.1 and in theory can indeed remove a
termination condition for fast_find_migrateblock() under certain
conditions, as it removes a place that always marks a scanned pageblock
from being re-scanned. There are other such places, but those can be
skipped under certain conditions, which seems to match the tracepoint
data.
Testing of revert also appears to have resolved the issue, thus revert
the commit until a more robust solution for the original problem is
developed.
It's also likely this will fix qemu stalls with 6.1 kernel reported in
Link 2, but that is not yet confirmed.
Link: https://bugzilla.suse.com/show_bug.cgi?id=1206848
Link: https://lore.kernel.org/kvm/b8017e09-f336-3035-8344-c549086c2340@kernel.org/
Link: https://lore.kernel.org/lkml/20230125134434.18017-1-mgorman@techsingularity.net/
Fixes: 7efc3b7261 ("mm/compaction: fix set skip in fast_find_migrateblock")
Cc: <stable@vger.kernel.org>
Tested-by: Pedro Falcato <pedro.falcato@gmail.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe found another DT file that shouldn't be executable, and that
frustrated me enough that I went hunting with this script:
git ls-files -s |
grep '^100755' |
cut -f2 |
xargs grep -L '^#!'
and that found another file that shouldn't have been marked executable
either, despite being in the scripts directory.
Maybe these two are the last ones at least for now. But I'm sure we'll
be back in a few years, fixing things up again.
Fixes: 8c6789f4e2 ("ASoC: dt-bindings: Add Everest ES8326 audio CODEC")
Fixes: 4d8e5cd233 ("locking/atomics: Fix scripts/atomic/ script permissions")
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ksmbd server fixes from Steve French:
"Four smb3 server fixes, all also for stable:
- fix for signing bug
- fix to more strictly check packet length
- add a max connections parm to limit simultaneous connections
- fix error message flood that can occur with newer Samba xattr
format"
* tag '6.2-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: downgrade ndr version error message to debug
ksmbd: limit pdu length size according to connection status
ksmbd: do not sign response to session request for guest login
ksmbd: add max connections parameter
Pull cifs fix from Steve French:
"Fix for reconnect oops in smbdirect (RDMA), also is marked for stable"
* tag '6.2-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix oops due to uncleared server->smbd_conn in reconnect
Pull io_uring fixes from Jens Axboe:
"Two small fixes for this release:
- Sanitize how async prep is done for drain requests, so we ensure
that it always gets done (Dylan)
- A ring provided buffer recycling fix for multishot receive (me)"
* tag 'io_uring-6.2-2023-01-27' of git://git.kernel.dk/linux:
io_uring: always prep_async for drain requests
io_uring/net: cache provided buffer group value for multishot receives
Pull tracing fixes from Steven Rostedt:
- Fix filter memory leak by calling ftrace_free_filter()
- Initialize trace_printk() earlier so that ftrace_dump_on_oops shows
data on early crashes.
- Update the outdated instructions in scripts/tracing/ftrace-bisect.sh
- Add lockdep_is_held() to fix lockdep warning
- Add allocation failure check in create_hist_field()
- Don't initialize pointer that gets set right away in enabled_monitors_write()
- Update MAINTAINER entries
- Fix help messages in Kconfigs
- Fix kernel-doc header for update_preds()
* tag 'trace-v6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
bootconfig: Update MAINTAINERS file to add tree and mailing list
rv: remove redundant initialization of pointer ptr
ftrace: Maintain samples/ftrace
tracing/filter: fix kernel-doc warnings
lib: Kconfig: fix spellos
trace_events_hist: add check for return value of 'create_hist_field'
tracing/osnoise: Use built-in RCU list checking
tracing: Kconfig: Fix spelling/grammar/punctuation
ftrace/scripts: Update the instructions for ftrace-bisect.sh
tracing: Make sure trace_printk() can output as soon as it can be used
ftrace: Export ftrace_free_filter() to modules
Pull i2c fixes from Wolfram Sang:
"A bunch of driver fixes with a tiny bit of new IDs"
* tag 'i2c-for-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: rk3x: fix a bunch of kernel-doc warnings
i2c: axxia: use 'struct' for kernel-doc notation
dt-bindings: i2c: renesas,rzv2m: Fix SoC specific string
i2c: mxs: suppress probe-deferral error message
i2c: designware-pci: Add new PCI IDs for AMD NAVI GPU
i2c: designware: Fix unbalanced suspended flag
i2c: designware: use casting of u64 in clock multiplication to avoid overflow
Pull gpio fixes from Bartosz Golaszewski:
- fix the -c option in the gpio-event-mode user-space example program
- fix the irq number translation in gpio-ep93xx and make its irqchip
immutable
- add a missing spin_unlock in error path in gpio-mxc
- fix a suspend breakage on System76 and Lenovo Gen2a introduced in
GPIO ACPI
* tag 'gpio-fixes-for-v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
tools: gpio: fix -c option of gpio-event-mon
gpio: ep93xx: remove unused variable
gpio: ep93xx: Make irqchip immutable
gpio: ep93xx: Fix port F hwirq numbers in handler
gpio: mxc: Unlock on error path in mxc_flip_edge()
gpiolib-acpi: Don't set GPIOs for wakeup in S3 mode
Pull regulator fix from Mark Brown:
"A fix for the DT binding documentation which dropped a property when
being converted to YAML format causing spurious errors validating
device trees for platforms using the device"
* tag 'regulator-fix-v6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: dt-bindings: samsung,s2mps14: add lost samsung,ext-control-gpios
Pull overlayfs fixes from Miklos Szeredi:
"Fix two bugs, a recent one introduced in the last cycle, and an older
one from v5.11"
* tag 'ovl-fixes-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fail on invalid uid/gid mapping at copy up
ovl: fix tmpfile leak
Pull drm fixes from Dave Airlie:
"Fairly small this week as well, i915 has a memory leak fix and some
minor changes, and amdgpu has some MST fixes, and some other minor
ones:
drm:
- DP MST kref fix
- fb_helper: check return value
i915:
- Fix BSC default context for Meteor Lake
- Fix selftest-scheduler's modify_type
- memory leak fix
amdgpu:
- GC11.x fixes
- SMU13.0.0 fix
- Freesync video fix
- DP MST fixes
- build fix"
* tag 'drm-fixes-2023-01-27' of git://anongit.freedesktop.org/drm/drm:
amdgpu: fix build on non-DCN platforms.
drm/amd/display: Fix timing not changning when freesync video is enabled
drm/display/dp_mst: Correct the kref of port.
drm/amdgpu/display/mst: update mst_mgr relevant variable when long HPD
drm/amdgpu/display/mst: limit payload to be updated one by one
drm/amdgpu/display/mst: Fix mst_state->pbn_div and slot count assignments
drm/amdgpu: declare firmware for new MES 11.0.4
drm/amdgpu: enable imu firmware for GC 11.0.4
drm/amd/pm: add missing AllowIHInterrupt message mapping for SMU13.0.0
drm/amdgpu: remove unconditional trap enable on add gfx11 queues
drm/fb-helper: Use a per-driver FB deferred I/O handler
drm/fb-helper: Check fb_deferred_io_init() return value
drm/i915/selftest: fix intel_selftest_modify_policy argument types
drm/i915/mtl: Fix bcs default context
drm/i915: Fix a memory leak with reused mmap_offset
drm/drm_vma_manager: Add drm_vma_node_allow_once()
Pull ACPI fixes from Rafael Wysocki:
"Add ACPI backlight handling quirks for 3 machines (Hans de Goede)"
* tag 'acpi-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: video: Add backlight=native DMI quirk for Asus U46E
ACPI: video: Add backlight=native DMI quirk for HP EliteBook 8460p
ACPI: video: Add backlight=native DMI quirk for HP Pavilion g6-1d80nr
Pull thermal control fixes from Rafael Wysocki:
"Add locking to the Intel int340x thermal control driver to prevent its
thermal zone callbacks from racing with firmware-induced thermal trip
point updates (Srinivas Pandruvada, Rafael Wysocki)"
* tag 'thermal-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: intel: int340x: Add locking to int340x_thermal_get_trip_type()
thermal: intel: int340x: Protect trip temperature from concurrent updates
Pull arm64 fix from Will Deacon:
- Fix event counting regression in Arm CMN PMU driver due to broken
optimisation
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
Partially revert "perf/arm-cmn: Optimise DTC counter accesses"
Pull RISC-V fixes from Palmer Dabbelt:
- A few DT bindings fixes to more closely align the ISA string
requirements between the bindings and the ISA manual.
- A handful of build error/warning fixes.
- A fix to move init_cpu_topology() later in the boot flow, so it can
allocate memory.
- The IRC channel is now in the MAINTAINERS file, so it's easier to
find.
* tag 'riscv-for-linus-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Move call to init_cpu_topology() to later initialization stage
riscv/kprobe: Fix instruction simulation of JALR
riscv: fix -Wundef warning for CONFIG_RISCV_BOOT_SPINWAIT
MAINTAINERS: add an IRC entry for RISC-V
RISC-V: fix compile error from deduplicated __ALTERNATIVE_CFG_2
dt-bindings: riscv: fix single letter canonical order
dt-bindings: riscv: fix underscore requirement for multi-letter extensions
Pull ARM fixes from Russell King:
- fix nommu assignment build warning
- fix -Wundef preprocessor warning
- reduce __thumb2__ definitions for crypto files that require it
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9287/1: Reduce __thumb2__ definition to crypto files that require it
ARM: 9284/1: include <asm/pgtable.h> from proc-macros.S to fix -Wundef warnings
ARM: 9280/1: mm: fix warning on phys_addr_t to void pointer assignment
Pull Kselftest fixes from Shuah Khan:
"A single fix to a amd-pstate test Makefile bug that deletes source
files during make clean run"
* tag 'linux-kselftest-fixes-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: amd-pstate: Don't delete source files via Makefile