This patch makes some improvement in how IOCTLs behave when the device is
disabled or under reset.
The new code checks, at the start of every IOCTL, if the device is
disabled or in reset. If so, it prints an appropriate kernel message and
returns -EBUSY to user-space.
In addition, the code modifies the location of where the
hard_reset_pending flag is being set or cleared:
1. It is now cleared immediately after the reset *tear-down* flow is
finished but before the re-initialization flow begins.
2. It is being set in the remove function of the device, to make the
behavior the same with the hard-reset flow
There are two exceptions to the disable or in reset check:
1. The HL_INFO_DEVICE_STATUS opcode in the INFO IOCTL. This opcode allows
the user to inquire about the status of the device, whether it is
operational, in reset or malfunction (disabled). If the driver will
block this IOCTL, the user won't be able to retrieve the status in
case of malfunction or in reset.
2. The WAIT_FOR_CS IOCTL. This IOCTL allows the user to inquire about the
status of a CS. We want to allow the user to continue to do so, even if
we started a soft-reset process because it will allow the user to get
the correct error code for each CS he submitted.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch fixes a bug in the implementation of the function that removes
the device.
The bug can happen when the device is removed but not the driver itself
(e.g. remove by the OS due to PCI freeze in Power architecture).
In that case, there maybe open users that are calling IOCTLs while the
device is removed. This is a possible race condition that the driver must
handle. Otherwise, a kernel panic may occur.
This race is prevented in the hard-reset flow, because the driver makes
sure the users are closed before continuing with the hard-reset. This
race can not occur when the driver itself is removed because the OS makes
sure all the file descriptors are closed.
The fix is to make sure the open users close their file descriptors and if
they don't (after a certain amount of time), the driver sends them a
SIGKILL, because the remove of the device can't be stopped.
The patch re-uses the same code that is called from the hard-reset flow.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
To make the memory ioctl code more readable, this patch moves the
legacy/debug code path of mmu-disabled to a separate function, which is
called (if necessary) from the main memory ioctl function.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch removes the enum value of ASIC_AUTO_DETECT because we can use
the validity of the pdev variable to know whether we have a real device or
a simulator. For a real device, we detect the asic type from the device ID
while for a simulator, the simulator code calls create_hdev() with the
specified ASIC type.
Set ASIC_INVALID as the first option in the enum to make sure that no
other enum value will receive the value 0 (which indicates a non-existing
entry in the simulator array).
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch does some refactoring in goya.c to make code more reusable
between goya code and the goya simulator code (which is not upstreamed).
In addition, the patch removes some dead functions from goya.c which are
not used by the current upstream code
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch adds a better explanation about the sequence number that is
returned per CS. It also fixes the comment about queue numbering rules.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch adds the ASIC-specific function for GOYA to configure the
coresight components.
Most of the components have an enabled/disabled flag, depending on whether
the user wants to enable the component or disable it.
For some of the components, such as ETR and SPMU, the user can also
request to read values from them. Those values are needed for the user to
parse the trace data.
The ETR configuration is also checked for security purposes, to make sure
the trace data is written to the device's DRAM.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Habanalabs ASICs use the ARM coresight infrastructure to support debug,
tracing and profiling of neural networks topologies.
Because the coresight is configured using register writes and reads, and
some of the registers hold sensitive information (e.g. the address in
the device's DRAM where the trace data is written to), the user must go
through the kernel driver to configure this mechanism.
This patch implements the common code of the IOCTL and calls the
ASIC-specific function for the actual H/W configuration.
The IOCTL supports configuration of seven coresight components:
ETR, ETF, STM, FUNNEL, BMON, SPMU and TIMESTAMP
The user specifies which component he wishes to configure and provides a
pointer to a structure (located in its process space) that contains the
relevant configuration.
The common code copies the relevant data from the user-space to kernel
space and then calls the ASIC-specific function to do the H/W
configuration.
After the configuration is done, which is usually composed
of several IOCTL calls depending on what the user wanted to trace, the
user can start executing the topology. The trace data will be written to
the user's area in the device's DRAM.
After the tracing operation is complete, and user will call the IOCTL
again to disable the tracing operation. The user also need to read
values from registers for some of the components (e.g. the size of the
trace data in the device's DRAM). In that case, the user will provide a
pointer to an "output" structure in user-space, which the IOCTL code will
fill according the to selected component.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch removes an extra ; after the closing brackets of a while loop.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Unmapping ptes in the device MMU on Palladium can take a long time, which
can cause a kernel BUG of CPU soft lockup.
This patch minimize the chances for this bug by sleeping a little between
unmapping ptes.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
GIT does not like extra blank lines at the end of the file, so this patch
removes those lines.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
This patch adds a new opcode to INFO IOCTL that returns the device status.
This will allow users to query the device status in order to avoid sending
command submissions while device is in reset.
Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch allows the user to modify the TPC PLL clock relaxation value
on-the-fly in order to reduce power consumption.
To enable this, the patch removes the protection from the specific
register that controls this behavior.
Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
On init or context switch, set TPC clock relaxation counter
register to a golden value.
Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Hard-reset of our device should never fail, due to dangers of permanent
damage to the H/W.
This patch removes the last place in the reset path where the driver might
exit before doing the actual reset.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch refactors the code that is responsible to set the DMA mask for
the device.
Upon each change of the dma mask, the driver will save the new value that
was set. This is needed in order to make sure we don't try to increase the
mask a second time, in case we failed in the first time. This is
especially relevant for Power machines, as that may cause a change in
configuration of the TVT which will break the device.
Goya will first try to set the device's dma mask to 39 bits, so that the
memory that is allocated on the host machine for communication with the
device's cpu will be in a bus address which is lower then 39 bits. Later,
Goya will try to increase that mask to 48 bits, but only if setting the
mask to 39 bits was successful.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch adds shadow mapping to the MMU module. The shadow mapping
allows traversing the page table in host memory rather reading each PTE
from the device memory.
It brings better performance and avoids reading from invalid device
address upon PCI errors.
Only at the end of map/unmap flow, writings to the device are performed in
order to sync the H/W page tables with the shadow ones.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The addr/data32 debugfs nodes currently permit the access to only physical
addresses of a device. This patch extends it and allows accessing also
device's DRAM virtual addresses.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Print the name of a busy engine when checking if a device is idle.
The change is done mainly to help a user to pinpoint problems in his
topology's recipe.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch adds two comments in uapi/habanalabs.h:
- From which queue id the internal queues begin
- Invalid values that can be returned in the seq field from the CS IOCTL
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Remove pointers to ASIC-specific functions and instead call the functions
explicitly as they are not accessed from outside the ASIC-specific files.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Move duplicated PCI-related code from ASIC-specific files into the common
pci.c file.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
At the start of some IOCTLs we check if the device is disabled or in reset.
If it is, we return -EBUSY and print a message to kernel log.
Because these IOCTLs can be called at very high frequency, use ratelimit
to avoid spamming the kernel log. Also use the same type of message -
dev_warn - in all the relevant IOCTLs.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The Event Queue MSI/X ID is different per ASIC. This patch renames the
current define to have the GOYA_ prefix to mark it only for Goya. It also
moves it from the common armcp_if.h file to the ASIC specific goya_fw_if.h
file.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch moves the code that is responsible of the communication
vs. the F/W to a dedicated file. This will allow us to share the code
between different ASICs.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This will prevent unneeded include of header files, which may increase
compilation time.
Signed-off-by: Dotan Barak <dbarak@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The goya_non_fatal_events array actually contains all the possible events
the driver can receive from the F/W. Therefore, use a proper name
for the array.
The patch also adds missing event Ids to the goya_async_event_id enum.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch adds a definition of a new status in the device CPU boot stages
and add the handling of the new status.
Signed-off-by: Igor Grinberg <igrinberg@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The function comment of __register_chrdev_region()
is out of date, so update it based on the code.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
register_chrdev_region() carefully checks minor range
before calling __register_chrdev_region() but there is
another path from alloc_chrdev_region() which does not
check the range properly. So add a check for given minor
range in __register_chrdev_region().
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull KVM fixes from Paolo Bonzini:
"A collection of x86 and ARM bugfixes, and some improvements to
documentation.
On top of this, a cleanup of kvm_para.h headers, which were exported
by some architectures even though they not support KVM at all. This is
responsible for all the Kbuild changes in the diffstat"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
Documentation: kvm: clarify KVM_SET_USER_MEMORY_REGION
KVM: doc: Document the life cycle of a VM and its resources
KVM: selftests: complete IO before migrating guest state
KVM: selftests: disable stack protector for all KVM tests
KVM: selftests: explicitly disable PIE for tests
KVM: selftests: assert on exit reason in CR4/cpuid sync test
KVM: x86: update %rip after emulating IO
x86/kvm/hyper-v: avoid spurious pending stimer on vCPU init
kvm/x86: Move MSR_IA32_ARCH_CAPABILITIES to array emulated_msrs
KVM: x86: Emulate MSR_IA32_ARCH_CAPABILITIES on AMD hosts
kvm: don't redefine flags as something else
kvm: mmu: Used range based flushing in slot_handle_level_range
KVM: export <linux/kvm_para.h> and <asm/kvm_para.h> iif KVM is supported
KVM: x86: remove check on nr_mmu_pages in kvm_arch_commit_memory_region()
kvm: nVMX: Add a vmentry check for HOST_SYSENTER_ESP and HOST_SYSENTER_EIP fields
KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)
KVM: Reject device ioctls from processes other than the VM's creator
KVM: doc: Fix incorrect word ordering regarding supported use of APIs
KVM: x86: fix handling of role.cr4_pae and rename it to 'gpte_size'
KVM: nVMX: Do not inherit quadrant and invalid for the root shadow EPT
...
Pull x86 fixes from Thomas Gleixner:
"A pile of x86 updates:
- Prevent exceeding he valid physical address space in the /dev/mem
limit checks.
- Move all header content inside the header guard to prevent compile
failures.
- Fix the bogus __percpu annotation in this_cpu_has() which makes
sparse very noisy.
- Disable switch jump tables completely when retpolines are enabled.
- Prevent leaking the trampoline address"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/realmode: Make set_real_mode_mem() static inline
x86/cpufeature: Fix __percpu annotation in this_cpu_has()
x86/mm: Don't exceed the valid physical address space
x86/retpolines: Disable switch jump tables when retpolines are enabled
x86/realmode: Don't leak the trampoline kernel address
x86/boot: Fix incorrect ifdeffery scope
x86/resctrl: Remove unused variable
Pull perf tooling fixes from Thomas Gleixner:
"Core libraries:
- Fix max perf_event_attr.precise_ip detection.
- Fix parser error for uncore event alias
- Fixup ordering of kernel maps after obtaining the main kernel map
address.
Intel PT:
- Fix TSC slip where A TSC packet can slip past MTC packets so that
the timestamp appears to go backwards.
- Fixes for exported-sql-viewer GUI conversion to python3.
ARM coresight:
- Fix the build by adding a missing case value for enumeration value
introduced in newer library, that now is the required one.
tool headers:
- Syncronize kernel headers with the kernel, getting new io_uring and
pidfd_send_signal syscalls so that 'perf trace' can handle them"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf pmu: Fix parser error for uncore event alias
perf scripts python: exported-sql-viewer.py: Fix python3 support
perf scripts python: exported-sql-viewer.py: Fix never-ending loop
perf machine: Update kernel map address and re-order properly
tools headers uapi: Sync powerpc's asm/kvm.h copy with the kernel sources
tools headers: Update x86's syscall_64.tbl and uapi/asm-generic/unistd
tools headers uapi: Update drm/i915_drm.h
tools arch x86: Sync asm/cpufeatures.h with the kernel sources
tools headers uapi: Sync linux/fcntl.h to get the F_SEAL_FUTURE_WRITE addition
tools headers uapi: Sync asm-generic/mman-common.h and linux/mman.h
perf evsel: Fix max perf_event_attr.precise_ip detection
perf intel-pt: Fix TSC slip
perf cs-etm: Add missing case value
Pull CPU hotplug fixes from Thomas Gleixner:
"Two SMT/hotplug related fixes:
- Prevent crash when HOTPLUG_CPU is disabled and the CPU bringup
aborts. This is triggered with the 'nosmt' command line option, but
can happen by any abort condition. As the real unplug code is not
compiled in, prevent the fail by keeping the CPU in zombie state.
- Enforce HOTPLUG_CPU for SMP on x86 to avoid the above situation
completely. With 'nosmt' being a popular option it's required to
unplug the half brought up sibling CPUs (due to the MCE wreckage)
completely"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/smp: Enforce CONFIG_HOTPLUG_CPU when SMP=y
cpu/hotplug: Prevent crash when CPU bringup fails on CONFIG_HOTPLUG_CPU=n
Pull locking fixlet from Thomas Gleixner:
"Trivial update to the maintainers file"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Remove deleted file from futex file pattern
Pull core fixes from Thomas Gleixner:
"A small set of core updates:
- Make the watchdog respect the selected CPU mask again. That was
broken by the rework of the watchdog thread management and caused
inconsistent state and NMI watchdog being unstoppable.
- Ensure that the objtool build can find the libelf location.
- Remove dead kcore stub code"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
watchdog: Respect watchdog cpumask on CPU hotplug
objtool: Query pkg-config for libelf location
proc/kcore: Remove unused kclist_add_remap()
Pull powerpc fixes from Michael Ellerman:
"Three non-regression fixes.
- Our optimised memcmp could read past the end of one of the buffers
and potentially trigger a page fault leading to an oops.
- Some of our code to read energy management data on PowerVM had an
endian bug leading to bogus results.
- When reporting a machine check exception we incorrectly reported
TLB multihits as D-Cache multhits due to a missing entry in the
array of causes.
Thanks to: Chandan Rajendra, Gautham R. Shenoy, Mahesh Salgaonkar,
Segher Boessenkool, Vaidyanathan Srinivasan"
* tag 'powerpc-5.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/pseries/mce: Fix misleading print for TLB mutlihit
powerpc/pseries/energy: Use OF accessor functions to read ibm,drc-indexes
powerpc/64: Fix memcmp reading past the end of src/dest
Pull LED fixes from Jacek Anaszewski:
- fix refcnt leak on interface rename
- use memcpy in device_name_store() to avoid including garbage from a
previous, longer value in the device_name
- fix a potential NULL pointer dereference in case of_match_device()
cannot find a match
* tag 'led-fixes-for-5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
leds: trigger: netdev: use memcpy in device_name_store
leds: pca9532: fix a potential NULL pointer dereference
leds: trigger: netdev: fix refcnt leak on interface rename
Pull GPIO fixes from Linus Walleij:
"As you can see [in the git history] I was away on leave and Bartosz
kindly stepped in and collected a slew of fixes, I pulled them into my
tree in two sets and merged some two more fixes (fixing my own caused
bugs) on top.
Summary:
- Revert the extended use of gpio_set_config() and think about how we
can do this properly.
- Fix up the SPI CS GPIO handling so it now works properly on the SPI
bus children, as intended.
- Error paths and driver fixes"
* tag 'gpio-v5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: mockup: use simple_read_from_buffer() in debugfs read callback
gpio: of: Fix of_gpiochip_add() error path
gpio: of: Check for "spi-cs-high" in child instead of parent node
gpio: of: Check propname before applying "cs-gpios" quirks
gpio: mockup: fix debugfs read
Revert "gpio: use new gpio_set_config() helper in more places"
gpio: aspeed: fix a potential NULL pointer dereference
gpio: amd-fch: Fix bogus SPDX identifier
gpio: adnp: Fix testing wrong value in adnp_gpio_direction_input
gpio: exar: add a check for the return value of ida_simple_get fails
If userspace doesn't end the input with a newline (which can easily
happen if the write happens from a C program that does write(fd,
iface, strlen(iface))), we may end up including garbage from a
previous, longer value in the device_name. For example
# cat device_name
# printf 'eth12' > device_name
# cat device_name
eth12
# printf 'eth3' > device_name
# cat device_name
eth32
I highly doubt anybody is relying on this behaviour, so switch to
simply copying the bytes (we've already checked that size is <
IFNAMSIZ) and unconditionally zero-terminate it; of course, we also
still have to strip a trailing newline.
This is also preparation for future patches.
Fixes: 06f502f57d ("leds: trigger: Introduce a NETDEV trigger")
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
In case of_match_device cannot find a match, return -EINVAL to avoid
NULL pointer dereference.
Fixes: fa4191a609 ("leds: pca9532: Add device tree support")
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Pull staging driver fixes from Greg KH:
"Here are some small staging driver fixes for 5.1-rc3, and one driver
removal.
The biggest thing here is the removal of the mt7621-eth driver as a
"real" network driver was merged in 5.1-rc1 for this hardware, so this
old driver can now be removed.
Other than that, there are just a number of small fixes, all resolving
reported issues and some potential corner cases for error handling
paths.
All of these have been in linux-next with no reported issues"
* tag 'staging-5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: vt6655: Remove vif check from vnt_interrupt
staging: erofs: keep corrupted fs from crashing kernel in erofs_readdir()
staging: octeon-ethernet: fix incorrect PHY mode
staging: vc04_services: Fix an error code in vchiq_probe()
staging: erofs: fix error handling when failed to read compresssed data
staging: vt6655: Fix interrupt race condition on device start up.
staging: rtlwifi: Fix potential NULL pointer dereference of kzalloc
staging: rtl8712: uninitialized memory in read_bbreg_hdl()
staging: rtlwifi: rtl8822b: fix to avoid potential NULL pointer dereference
staging: rtl8188eu: Fix potential NULL pointer dereference of kcalloc
staging, mt7621-pci: fix build without pci support
staging: speakup_soft: Fix alternate speech with other synths
staging: axis-fifo: add CONFIG_OF dependency
staging: olpc_dcon_xo_1: add missing 'const' qualifier
staging: comedi: ni_mio_common: Fix divide-by-zero for DIO cmdtest
staging: erofs: fix to handle error path of erofs_vmap()
staging: mt7621-dts: update ethernet settings.
staging: remove mt7621-eth