This commit queues a work item for IT/IR events in the hardIRQ handler to
operate the corresponding isochronous context. The work item is queued to
any of the worker-pools.
The callback for either the implementation of unit protocol or user space
clients is executed in sleepable work process context. The change could
result in errors of concurrent processing as well as sleeping in atomic
context. These errors are fixed by the following commits.
Tested-by: Edmund Raile <edmund.raile@protonmail.com>
Link: https://lore.kernel.org/r/20240904125155.461886-4-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
This commit adds a workqueue dedicated to isochronous context processing.
The workqueue is allocated per instance of fw_card structure to satisfy the
following characteristics derived from the 1394 OHCI specification:
In the 1394 OHCI specification, memory pages are reserved for each
isochronous context dedicated to DMA transmission. This allows these
per-context pages to be operated concurrently. Software can schedule
hardware interrupts for several isochronous contexts in the same cycle,
thus WQ_UNBOUND is specified. Additionally, operating on the content of
the pages may sleep, thus WQ_BH is not used.
The isochronous context delivers the packets with time stamp, thus
WQ_HIGHPRI is specified for semi real-time data such as IEC 61883-1/6
protocol implemented by ALSA firewire stack. The isochronous context is not
used by the implementation of SCSI over IEEE1394 protocol (sbp2), thus
WQ_MEM_RECLAIM is not specified.
It is useful for users to adjust the CPU affinity of the workqueue
depending on their workloads, thus WQ_SYSFS is specified to expose the
attributes to user space.
Tested-by: Edmund Raile <edmund.raile@protonmail.com>
Link: https://lore.kernel.org/r/20240904125155.461886-2-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Many tracepoint events have been added in the 6.10 and 6.11 kernels. They
are available as an alternative to the debug parameter of the
firewire-ohci module.
The logging messages enabled by the parameter are cumbersome to maintain;
e.g. the code to decode transaction frames.
This commit adds deprecation text to direct users to the tracepoint events.
Link: https://lore.kernel.org/r/20240903101455.317067-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
When detecting updates of bus topology, the data of fw_device is newly
allocated and caches the content of configuration ROM from the
corresponding node. Then, the tree of devices is searched for the
previous data of fw_device corresponding to the node. If found, the
previous data is updated and reused, and the newly allocated data of
fw_device is released.
The above procedure is done in the call of device_find_child(); however,
this somewhat abuses the intention of the helper function, which is
meant only to find an entry, not to update it.
This commit moves the update outside of the call.
Cc: Zijun Hu <zijun_hu@icloud.com>
Link: https://lore.kernel.org/r/20240820132132.28839-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Commit 404957c1e207 ("firewire: ohci: use guard macro to serialize
accesses to phy registers") refactored the initiated_reset() helper
function, but changed the error path incorrectly.
This commit fixes the bug.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 80f3401dfeb2 ("firewire: ohci: use guard macro to serialize accesses to phy registers")
Link: https://lore.kernel.org/r/20240817091128.180303-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
In core function, the instances of some client resource structures are
maintained by IDR. As of kernel v6.0, IDR has been superseded by XArray
and deprecated.
This commit replaces the usage of IDR with XArray to maintain the
resource instances. The instance of XArray is allocated per client with
XA_FLAGS_ALLOC1 so that the index of an allocated entry is greater than
zero and can be returned to the user space client as the handle of the
resource.
Link: https://lore.kernel.org/r/20240812235210.28458-6-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
This commit is a preparation for using the xa_for_each() macro. The
current implementation uses the idr_for_each() function, which is
inconvenient to replace with the macro directly. The IDR framework has
the idr_for_each_entry() macro for a similar purpose. This commit
replaces the function with that macro, with minor code refactoring.
Link: https://lore.kernel.org/r/20240812235210.28458-5-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
All of the local resource structures have data of client_resource type
as their first member. This design sometimes requires the use of
container_of() to retrieve the parent structure from the first member.
This commit adds some helper functions for this purpose.
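The pattern described above can be sketched in userspace C as follows. This
is only an illustration: the macro is a simplified stand-in for the kernel's
container_of(), and the structure layout and the helper name
to_address_handler_resource() are assumptions made for this example, not
the definitions used by the commit.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, reduced version of the common first member. */
struct client_resource {
	int handle;
};

/* Hypothetical resource structure embedding the common member first. */
struct address_handler_resource {
	struct client_resource resource;
	unsigned long long offset;
};

/* Hypothetical helper: recover the parent from the embedded member. */
static inline struct address_handler_resource *
to_address_handler_resource(struct client_resource *resource)
{
	return container_of(resource, struct address_handler_resource,
			    resource);
}
```

A helper of this shape keeps the container_of() arguments in one place, so
callers holding only a pointer to the common member do not repeat the type
and member names.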
Link: https://lore.kernel.org/r/20240812235210.28458-3-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
In core function, the instances of fw_device corresponding to firewire device
node in system are maintained by IDR. As of kernel v6.0, IDR has been
superseded by XArray and deprecated.
This commit replaces the usage of IDR with XArray to maintain the device
instances. The instance of XArray is allocated statically and initialized
with XA_FLAGS_ALLOC so that the index of an allocated entry starts at zero
and is available as the minor identifier of the device node.
Link: https://lore.kernel.org/r/20240812014251.165492-2-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Commit d8527cab6c ("firewire: cdev: implement new event to notify
response subaction with time stamp") added an additional case,
FW_CDEV_EVENT_RESPONSE2, to the switch statement in complete_transaction().
However, the block extends beyond the case label and reaches the
neighboring default label.
This commit corrects the range of the block. Fortunately, it has little
impact in practice since the local variable in the scope under the label
is not used by the code under the default label.
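The shape of the problem can be sketched in plain C. The function names,
the event constant, and the values below are hypothetical and chosen only
for illustration; they are not taken from complete_transaction(). In the
first function the braces opened under the case label also enclose the
default label; in the second the block is closed before it. Both compile
and behave the same here because the default path never touches the local
variable.

```c
#define EVENT_RESPONSE2 1	/* hypothetical event code */

/* Block opened under the case label also encloses the default label. */
static int handle_event_misplaced(int type)
{
	int result = 0;

	switch (type) {
	case EVENT_RESPONSE2:
	{
		int tstamp = 42;	/* local to the braces */

		result = tstamp;
		break;
	default:			/* still inside the braces above */
		result = -1;
		break;
	}
	}
	return result;
}

/* Block closed before the default label. */
static int handle_event_corrected(int type)
{
	int result = 0;

	switch (type) {
	case EVENT_RESPONSE2:
	{
		int tstamp = 42;

		result = tstamp;
		break;
	}
	default:
		result = -1;
		break;
	}
	return result;
}
```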
Fixes: d8527cab6c ("firewire: cdev: implement new event to notify response subaction with time stamp")
Link: https://lore.kernel.org/r/20240810070403.36801-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
The core function provides UAPI to maintain isochronous resources allocated
by userspace clients across bus resets automatically. The resources are
maintained by IDR and the concurrent access to it is protected by spinlock
in the instance of client.
This commit uses guard macro to maintain the spinlock.
Link: https://lore.kernel.org/r/20240805085408.251763-11-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
The core function provides an operation for userspace applications to
retrieve the current value of the CYCLE_TIMER register with several types
of system time. In the operation, local interrupts are disabled so that
the accesses to the register and ktime are done atomically.
This commit uses guard macro to disable/enable local interrupts.
Link: https://lore.kernel.org/r/20240805085408.251763-9-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
The core function maintains the instance of fw_device structure by IDR.
The concurrent access to IDR is protected by static read/write semaphore.
The semaphore is also utilized to protect concurrent access to the
content of configuration ROM cached to the instance so that the cache is
swapped to the latest one.
This commit uses guard macro to maintain the semaphore.
Link: https://lore.kernel.org/r/20240805085408.251763-7-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
The core function provides a kernel API to send a phy configuration
packet. The current implementation of the feature uses a statically
allocated packet object. The concurrent access to the object is protected
by a static mutex.
This commit uses guard macro to maintain the mutex.
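The kernel's guard macro is built on the compiler's cleanup attribute, so
the idea can be sketched in userspace C with GCC or Clang. Everything
below is an assumption made for illustration: the scoped_mutex() macro,
the pthread mutex, and send_phy_config() are hypothetical stand-ins, not
the kernel API or the code changed by this commit.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the static mutex and packet object. */
static pthread_mutex_t phy_config_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int phy_config_packet;

static pthread_mutex_t *lock_mutex(pthread_mutex_t *m)
{
	pthread_mutex_lock(m);
	return m;
}

/* Called automatically when the guard variable leaves its scope. */
static void unlock_mutex(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Hypothetical scope guard; at most one use per scope in this sketch. */
#define scoped_mutex(m) \
	pthread_mutex_t *scoped_mutex_guard \
		__attribute__((cleanup(unlock_mutex))) = lock_mutex(m)

static int send_phy_config(unsigned int node_id)
{
	scoped_mutex(&phy_config_mutex);

	if (node_id > 63)
		return -1;	/* mutex is released on this path */

	phy_config_packet = node_id;
	return 0;		/* and on this path as well */
}
```

The benefit is that early returns no longer need an explicit unlock or a
goto label, which is the kind of error path simplification the guard
conversions in this series aim for.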
Link: https://lore.kernel.org/r/20240805085408.251763-2-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
In 1394 OHCI specification, the format of data for IT DMA is different from
the format of isochronous packet in IEEE 1394 specification, in its spd
field.
This commit adds some static inline functions to serialize/deserialize the
data of IT DMA.
Link: https://lore.kernel.org/r/20240802003606.109402-4-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
In 1394 OHCI specification, the format of data for AT DMA is different from
the format of asynchronous packet in IEEE 1394 specification, in its spd
and srcBusID fields.
This commit adds some static inline functions to serialize/deserialize the
data of AT DMA.
Link: https://lore.kernel.org/r/20240802003606.109402-2-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
The string table for tcode is used only by log_ar_at_event(). In that
case, it is suitable to move the table inside the function definition.
This commit does so. Additionally, hard-coded values for tcode are
replaced with defined macros wherever possible.
Link: https://lore.kernel.org/r/20240729134631.127189-3-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
In IEEE 1394 specification, 0x0e in tcode field is reserved for internal
purpose depending on link layer. In 1394 OHCI specification, it is used to
express phy packet in AT/AR contexts.
Current implementation of 1394 OHCI driver has several macros for the code.
They can be simply replaced with a macro in core code.
This commit obsoletes the macros.
Link: https://lore.kernel.org/r/20240729134631.127189-2-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
The kernel sleep profile is no longer working due to a recursive locking
bug introduced by commit 42a20f86dc ("sched: Add wrapper for get_wchan()
to keep task blocked").
Booting with the 'profile=sleep' kernel command line option added or
executing
# echo -n sleep > /sys/kernel/profiling
after boot causes the system to lock up.
Lockdep reports
kthreadd/3 is trying to acquire lock:
ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: get_wchan+0x32/0x70
but task is already holding lock:
ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x53/0x370
with the call trace being
lock_acquire+0xc8/0x2f0
get_wchan+0x32/0x70
__update_stats_enqueue_sleeper+0x151/0x430
enqueue_entity+0x4b0/0x520
enqueue_task_fair+0x92/0x6b0
ttwu_do_activate+0x73/0x140
try_to_wake_up+0x213/0x370
swake_up_locked+0x20/0x50
complete+0x2f/0x40
kthread+0xfb/0x180
However, since nobody noticed this regression for more than two years,
let's remove 'profile=sleep' support based on the assumption that nobody
needs this functionality.
Fixes: 42a20f86dc ("sched: Add wrapper for get_wchan() to keep task blocked")
Cc: stable@vger.kernel.org # v5.16+
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull x86 fixes from Thomas Gleixner:
- Prevent a deadlock on cpu_hotplug_lock in the aperf/mperf driver.
A recent change in the ACPI code which consolidated code paths moved
the invocation of init_freq_invariance_cppc() to a CPU hotplug
handler. The first invocation on AMD CPUs ends up enabling a static
branch, which deadlocks because the static branch enable tries to
acquire cpu_hotplug_lock but that lock is already write-held by the
hotplug machinery.
Use static_branch_enable_cpuslocked() instead and take the hotplug
lock for reading in the Intel code path which is invoked from the
architecture code outside of the CPU hotplug operations.
- Fix the number of reserved bits in the sev_config structure bit field
so that the bitfield does not exceed 64 bits.
- Add missing Zen5 model numbers
- Fix the alignment assumptions of pti_clone_pgtable() and
clone_entry_text() on 32-bit:
The code assumes PMD aligned code sections, but on 32-bit the kernel
entry text is not PMD aligned. So depending on the code size and
location, which is configuration and compiler dependent, entry text
can cross a PMD boundary. As the start is not PMD aligned, adding PMD
size to the start address yields an address larger than the end
address, which results in partially mapped entry code for user space.
That causes endless recursion on the first entry from userspace
(usually #PF).
Cure this by aligning the start address in the addition so it ends up
at the next PMD start address.
clone_entry_text() enforces PMD mapping, but on 32-bit the tail might
eventually be PTE mapped, which causes a map fail because the PMD for
the tail is not a large page mapping. Use PTI_LEVEL_KERNEL_IMAGE for
the clone() invocation which resolves to PTE on 32-bit and PMD on
64-bit.
- Zero the 8-byte case for get_user() on range check failure on 32-bit
The recent consolidation of the 8-byte get_user() case broke the
zeroing in the failure case again. Establish it by clearing ECX
before the range check and not afterwards as that obviously can't be
reached when the range check fails
* tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/uaccess: Zero the 8-byte get_range case on failure on 32-bit
x86/mm: Fix pti_clone_entry_text() for i386
x86/mm: Fix pti_clone_pgtable() alignment assumption
x86/setup: Parse the builtin command line before merging
x86/CPU/AMD: Add models 0x60-0x6f to the Zen5 range
x86/sev: Fix __reserved field in sev_config
x86/aperfmperf: Fix deadlock on cpu_hotplug_lock
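The pti_clone_pgtable() alignment issue described in the pull message
above can be sketched with plain address arithmetic. The 2 MiB PMD size,
the function name, and the sample addresses are assumptions chosen for
illustration; the loop only counts how many PMD-sized steps a walk from
start to end takes and is not the kernel code.

```c
#include <stdint.h>

/* Illustrative value; the real PMD size depends on the paging mode. */
#define PMD_SIZE  (UINT64_C(1) << 21)
#define PMD_MASK  (~(PMD_SIZE - 1))

/*
 * Count the PMD-sized steps needed to cover [start, end).
 * With align_next == 0 the walk adds PMD_SIZE to an unaligned address;
 * with align_next != 0 the result is aligned down to the next PMD start.
 */
static int count_pmd_steps(uint64_t start, uint64_t end, int align_next)
{
	int steps = 0;
	uint64_t addr = start;

	while (addr < end) {
		steps++;
		addr += PMD_SIZE;
		if (align_next)
			addr &= PMD_MASK;
	}
	return steps;
}
```

For a range starting at 0x1ff000 and ending at 0x201000, which crosses the
PMD boundary at 0x200000, the unaligned walk jumps straight to 0x3ff000
and stops after one step, skipping the second PMD. The aligned walk lands
on 0x200000 first and so takes two steps.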
Pull timer fixes from Thomas Gleixner:
"Two fixes for the timer/clocksource code:
- The recent fix to make the take over of the broadcast timer more
reliable retrieves a per CPU pointer in preemptible context.
This went unnoticed in testing as some compilers hoist the access
into the non-preemptible section where the pointer is actually
used, but obviously compilers can rightfully invoke it where the
code put it.
Move it into the non-preemptible section right to the actual usage
site to cure it.
- The clocksource watchdog is supposed to emit a warning when the
retry count is greater than one and the number of retries reaches
the limit.
The condition is backwards and warns always when the count is
greater than one. Fixup the condition to prevent spamming dmesg"
* tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: Fix brown-bag boolean thinko in cs_watchdog_read()
tick/broadcast: Move per CPU pointer access into the atomic section
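The clocksource watchdog condition described in the pull message above
can be sketched as two predicates. The function names are hypothetical,
and the use of a logical OR in the first variant is an assumption about
what "backwards" means; the pull message only states the intended and the
observed behavior.

```c
#include <stdbool.h>

/* Assumed form of the inverted check: warns whenever nretries > 1. */
static bool should_warn_inverted(int nretries, int max_retries)
{
	return nretries > 1 || nretries >= max_retries;
}

/* Intended check: more than one retry AND the retry limit was reached. */
static bool should_warn_intended(int nretries, int max_retries)
{
	return nretries > 1 && nretries >= max_retries;
}
```

With a limit of 3, a read that needed 2 retries triggers a warning under
the first predicate but not under the second, which matches the dmesg
spamming the fix is meant to prevent.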
Pull scheduler fixes from Thomas Gleixner:
- When stime is larger than rtime due to accounting imprecision, then
utime = rtime - stime becomes negative. As this is unsigned math, the
result becomes a huge positive number.
Cure it by resetting stime to rtime in that case, so utime becomes 0.
- Restore consistent state when sched_cpu_deactivate() fails.
When offlining a CPU fails in sched_cpu_deactivate() after the SMT
present counter has been decremented, then the function aborts but
fails to increment the SMT present counter and leaves it imbalanced.
Consecutive operations cause it to underflow. Add the missing fixup
for the error path.
For SMT accounting the runqueue needs to be marked online again in the
error exit path to restore consistent state.
* tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate()
sched/core: Introduce sched_set_rq_on/offline() helper
sched/smt: Fix unbalance sched_smt_present dec/inc
sched/smt: Introduce sched_smt_present_inc/dec() helper
sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime
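The cputime item in the pull message above comes down to unsigned
wraparound. A minimal sketch, assuming 64-bit unsigned counters and using
hypothetical function names rather than the scheduler's own code:

```c
#include <stdint.h>

/* Without the fix: wraps around when stime exceeds rtime. */
static uint64_t utime_unclamped(uint64_t rtime, uint64_t stime)
{
	return rtime - stime;
}

/* With the fix described above: reset stime to rtime first. */
static uint64_t utime_clamped(uint64_t rtime, uint64_t stime)
{
	if (stime > rtime)
		stime = rtime;
	return rtime - stime;
}
```

With rtime = 100 and stime = 101, the unclamped version yields UINT64_MAX
while the clamped version yields 0.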
Pull x86 perf fixes from Thomas Gleixner:
- Move the smp_processor_id() invocation back into the non-preemptible
region, so that the result is valid to use
- Add the missing package C2 residency counters for Sierra Forest CPUs
to make the newly added support actually useful
* tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Fix smp_processor_id()-in-preemptible warnings
perf/x86/intel/cstate: Add pkg C2 residency counter for Sierra Forest