Commit Graph

1089476 Commits

Author SHA1 Message Date
Catalin Marinas
201729d53a Merge branches 'for-next/sme', 'for-next/stacktrace', 'for-next/fault-in-subpage', 'for-next/misc', 'for-next/ftrace' and 'for-next/crashkernel', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
  perf/arm-cmn: Decode CAL devices properly in debugfs
  perf/arm-cmn: Fix filter_sel lookup
  perf/marvell_cn10k: Fix tad_pmu_event_init() to check pmu type first
  drivers/perf: hisi: Add Support for CPA PMU
  drivers/perf: hisi: Associate PMUs in SICL with CPUs online
  drivers/perf: arm_spe: Expose saturating counter to 16-bit
  perf/arm-cmn: Add CMN-700 support
  perf/arm-cmn: Refactor occupancy filter selector
  perf/arm-cmn: Add CMN-650 support
  dt-bindings: perf: arm-cmn: Add CMN-650 and CMN-700
  perf: check return value of armpmu_request_irq()
  perf: RISC-V: Remove non-kernel-doc ** comments

* for-next/sme: (30 commits)
  : Scalable Matrix Extensions support.
  arm64/sve: Move sve_free() into SVE code section
  arm64/sve: Make kernel FPU protection RT friendly
  arm64/sve: Delay freeing memory in fpsimd_flush_thread()
  arm64/sme: More sensibly define the size for the ZA register set
  arm64/sme: Fix NULL check after kzalloc
  arm64/sme: Add ID_AA64SMFR0_EL1 to __read_sysreg_by_encoding()
  arm64/sme: Provide Kconfig for SME
  KVM: arm64: Handle SME host state when running guests
  KVM: arm64: Trap SME usage in guest
  KVM: arm64: Hide SME system registers from guests
  arm64/sme: Save and restore streaming mode over EFI runtime calls
  arm64/sme: Disable streaming mode and ZA when flushing CPU state
  arm64/sme: Add ptrace support for ZA
  arm64/sme: Implement ptrace support for streaming mode SVE registers
  arm64/sme: Implement ZA signal handling
  arm64/sme: Implement streaming SVE signal handling
  arm64/sme: Disable ZA and streaming mode when handling signals
  arm64/sme: Implement traps and syscall handling for SME
  arm64/sme: Implement ZA context switching
  arm64/sme: Implement streaming SVE context switching
  ...

* for-next/stacktrace:
  : Stacktrace cleanups.
  arm64: stacktrace: align with common naming
  arm64: stacktrace: rename stackframe to unwind_state
  arm64: stacktrace: rename unwinder functions
  arm64: stacktrace: make struct stackframe private to stacktrace.c
  arm64: stacktrace: delete PCS comment
  arm64: stacktrace: remove NULL task check from unwind_frame()

* for-next/fault-in-subpage:
  : btrfs search_ioctl() live-lock fix using fault_in_subpage_writeable().
  btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults
  arm64: Add support for user sub-page fault probing
  mm: Add fault_in_subpage_writeable() to probe at sub-page granularity

* for-next/misc:
  : Miscellaneous patches.
  arm64: Kconfig.platforms: Add comments
  arm64: Kconfig: Fix indentation and add comments
  arm64: mm: avoid writable executable mappings in kexec/hibernate code
  arm64: lds: move special code sections out of kernel exec segment
  arm64/hugetlb: Implement arm64 specific huge_ptep_get()
  arm64/hugetlb: Use ptep_get() to get the pte value of a huge page
  arm64: mm: Make arch_faults_on_old_pte() check for migratability
  arm64: mte: Clean up user tag accessors
  arm64/hugetlb: Drop TLB flush from get_clear_flush()
  arm64: Declare non global symbols as static
  arm64: mm: Cleanup useless parameters in zone_sizes_init()
  arm64: fix types in copy_highpage()
  arm64: Set ARCH_NR_GPIO to 2048 for ARCH_APPLE
  arm64: cputype: Avoid overflow using MIDR_IMPLEMENTOR_MASK
  arm64: document the boot requirements for MTE
  arm64/mm: Compute PTRS_PER_[PMD|PUD] independently of PTRS_PER_PTE

* for-next/ftrace:
  : ftrace cleanups.
  arm64/ftrace: Make function graph use ftrace directly
  ftrace: cleanup ftrace_graph_caller enable and disable

* for-next/crashkernel:
  : Support for crashkernel reservations above ZONE_DMA.
  arm64: kdump: Do not allocate crash low memory if not needed
  docs: kdump: Update the crashkernel description for arm64
  of: Support more than one crash kernel regions for kexec -s
  of: fdt: Add memory for devices by DT property "linux,usable-memory-range"
  arm64: kdump: Reimplement crashkernel=X
  arm64: Use insert_resource() to simplify code
  kdump: return -ENOENT if required cmdline option does not exist
2022-05-20 18:50:35 +01:00
Geert Uytterhoeven
8e1f78a921 arm64/sve: Move sve_free() into SVE code section
If CONFIG_ARM64_SVE is not set:

    arch/arm64/kernel/fpsimd.c:294:13: warning: ‘sve_free’ defined but not used [-Wunused-function]

Fix this by moving sve_free() and __sve_free() into the existing section
protected by "#ifdef CONFIG_ARM64_SVE", now the last user outside that
section has been removed.

Fixes: a1259dd807 ("arm64/sve: Delay freeing memory in fpsimd_flush_thread()")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/cd633284683c24cb9469f8ff429915aedf67f868.1652798894.git.geert+renesas@glider.be
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-18 14:44:36 +01:00
Juerg Haefliger
aea3cb356c arm64: Kconfig.platforms: Add comments
Add trailing comments to endmenu statements for better readability.

Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Link: https://lore.kernel.org/r/20220517141648.331976-3-juergh@canonical.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-18 14:41:45 +01:00
Juerg Haefliger
3cb7e662a9 arm64: Kconfig: Fix indentation and add comments
The convention for indentation seems to be a single tab. Help text is
further indented by an additional two whitespaces. Fix the lines that
violate these rules.

While add it, add trailing comments to endif and endmenu statements for
better readability.

Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Link: https://lore.kernel.org/r/20220517141648.331976-2-juergh@canonical.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-18 14:41:45 +01:00
Ard Biesheuvel
01142791b0 arm64: mm: avoid writable executable mappings in kexec/hibernate code
The temporary mappings of the low-level kexec and hibernate helpers are
created with both writable and executable attributes, which is not
necessary here, and generally best avoided. So use read-only, executable
attributes instead.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220429131347.3621090-3-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-17 09:32:45 +01:00
Ard Biesheuvel
6ee3cf6a20 arm64: lds: move special code sections out of kernel exec segment
There are a few code sections that are emitted into the kernel's
executable .text segment simply because they contain code, but are
actually never executed via this mapping, so they can happily live in a
region that gets mapped without executable permissions, reducing the
risk of being gadgetized.

Note that the kexec and hibernate region contents are always copied into
a fresh page, and so there is no need to align them as long as the
overall size of each is below 4 KiB.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220429131347.3621090-2-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-17 09:32:38 +01:00
Baolin Wang
bc5dfb4fd7 arm64/hugetlb: Implement arm64 specific huge_ptep_get()
Now we use huge_ptep_get() to get the pte value of a hugetlb page,
however it will only return one specific pte value for the CONT-PTE
or CONT-PMD size hugetlb on ARM64 system, which can contain several
continuous pte or pmd entries with same page table attributes. And it
will not take into account the subpages' dirty or young bits of a
CONT-PTE/PMD size hugetlb page.

So the huge_ptep_get() is inconsistent with huge_ptep_get_and_clear(),
which already takes account the dirty or young bits for any subpages
in this CONT-PTE/PMD size hugetlb [1]. Meanwhile we can miss dirty or
young flags statistics for hugetlb pages with current huge_ptep_get(),
such as the gather_hugetlb_stats() function, and CONT-PTE/PMD hugetlb
monitoring with DAMON.

Thus define an ARM64 specific huge_ptep_get() implementation as well as
enabling __HAVE_ARCH_HUGE_PTEP_GET, that will take into account any
subpages' dirty or young bits for CONT-PTE/PMD size hugetlb page, for
those functions that want to check the dirty and young flags of a hugetlb
page.

[1] https://lore.kernel.org/linux-mm/85bd80b4-b4fd-0d3f-a2e5-149559f2f387@oracle.com/

Suggested-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/624109a80ac4bbdf1e462dfa0b49e9f7c31a7c0d.1652496622.git.baolin.wang@linux.alibaba.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16 23:39:07 +01:00
Baolin Wang
f0d9d79ec7 arm64/hugetlb: Use ptep_get() to get the pte value of a huge page
The original huge_ptep_get() on ARM64 is just a wrapper of ptep_get(),
which will not take into account any contig-PTEs dirty and access bits.
Meanwhile we will implement a new ARM64-specific huge_ptep_get()
interface in following patch, which will take into account any contig-PTEs
dirty and access bits. To keep the same efficient logic to get the pte
value, change to use ptep_get() as a preparation.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/5113ed6e103f995e1d0f0c9fda0373b761bbcad2.1652496622.git.baolin.wang@linux.alibaba.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16 23:39:07 +01:00
Zhen Lei
8f0f104e2a arm64: kdump: Do not allocate crash low memory if not needed
When "crashkernel=X,high" is specified, the specified "crashkernel=Y,low"
memory is not required in the following corner cases:
1. If both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, it means
   that the devices can access any memory.
2. If the system memory is small, the crash high memory may be allocated
   from the DMA zones. If that happens, there's no need to allocate
   another crash low memory because there's already one.

Add condition '(crash_base >= CRASH_ADDR_LOW_MAX)' to determine whether
the 'high' memory is allocated above DMA zones. Note: when both
CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the entire physical
memory is DMA accessible, CRASH_ADDR_LOW_MAX equals 'PHYS_MASK + 1'.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Baoquan He <bhe@redhat.com>
Link: https://lore.kernel.org/r/20220511032033.426-1-thunder.leizhen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16 20:01:38 +01:00
Sebastian Andrzej Siewior
696207d425 arm64/sve: Make kernel FPU protection RT friendly
Non RT kernels need to protect FPU against preemption and bottom half
processing. This is achieved by disabling bottom halves via
local_bh_disable() which implictly disables preemption.

On RT kernels this protection mechanism is not sufficient because
local_bh_disable() does not disable preemption. It serializes bottom half
related processing via a CPU local lock.

As bottom halves are running always in thread context on RT kernels
disabling preemption is the proper choice as it implicitly prevents bottom
half processing.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220505163207.85751-3-bigeasy@linutronix.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16 19:15:46 +01:00
Sebastian Andrzej Siewior
a1259dd807 arm64/sve: Delay freeing memory in fpsimd_flush_thread()
fpsimd_flush_thread() invokes kfree() via sve_free()+sme_free() within a
preempt disabled section which is not working on -RT.

Delay freeing of memory until preemption is enabled again.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220505163207.85751-2-bigeasy@linutronix.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16 19:15:45 +01:00
Valentin Schneider
c733812dd7 arm64: mm: Make arch_faults_on_old_pte() check for migratability
arch_faults_on_old_pte() relies on the calling context being
non-preemptible. CONFIG_PREEMPT_RT turns the PTE lock into a sleepable
spinlock, which doesn't disable preemption once acquired, triggering the
warning in arch_faults_on_old_pte().

It does however disable migration, ensuring the task remains on the same
CPU during the entirety of the critical section, making the read of
cpu_has_hw_af() safe and stable.

Make arch_faults_on_old_pte() check cant_migrate() instead of preemptible().

Cc: Valentin Schneider <vschneid@redhat.com>
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lore.kernel.org/r/20220127192437.1192957-1-valentin.schneider@arm.com
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220505163207.85751-4-bigeasy@linutronix.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16 19:15:18 +01:00
Robin Murphy
b4d6bb38f9 arm64: mte: Clean up user tag accessors
Invoking user_ldst to explicitly add a post-increment of 0 is silly.
Just use a normal USER() annotation and save the redundant instruction.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220420030418.3189040-6-tongtiangen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16 19:11:03 +01:00
Robin Murphy
c578121298 perf/arm-cmn: Decode CAL devices properly in debugfs
The debugfs code is lazy, and since it only keeps the bottom byte of
each connect_info register to save space, it also treats the whole thing
as the device_type since the other bits were reserved anyway. Upon
closer inspection, though, this is no longer true on newer IP versions,
so let's be good and decode the exact field properly. This should help
it not get confused when a Component Aggregation Layer is present (which
is already implied if Node IDs are found for both device addresses
represented by the next two lines of the table).

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/6a13a6128a28cfe2eec6d09cf372a167ec9c3b65.1652274773.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-12 13:44:56 +01:00
Anshuman Khandual
fb396bb459 arm64/hugetlb: Drop TLB flush from get_clear_flush()
This drops now redundant TLB flush in get_clear_flush() which is no longer
required after recent commit 697a1d44af ("tlb: hugetlb: Add more sizes to
tlb_remove_huge_tlb_entry"). It also renames this function i.e dropping off
'_flush' and replacing it with '__contig' as appropriate.

Cc: Will Deacon <will@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220510043930.2410985-1-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-11 13:53:19 +01:00
Linu Cherian
710c8d6c02 arm64: Declare non global symbols as static
Fix below sparse warnings introduced while adding errata.

arch/arm64/kernel/cpu_errata.c:218:25: sparse: warning: symbol
'cavium_erratum_23154_cpus' was not declared. Should it be static?

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220509043221.16361-1-lcherian@marvell.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-11 13:40:26 +01:00
Robin Murphy
3630b2a863 perf/arm-cmn: Fix filter_sel lookup
Carefully considering the bounds of an array is all well and good,
until you forget that that array also contains a NULL sentinel at
the end and dereference it. So close...

Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/bebba768156aa3c0757140457bdd0fec10819388.1652217788.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-11 10:20:42 +01:00
Tanmay Jagdale
33835e8dfb perf/marvell_cn10k: Fix tad_pmu_event_init() to check pmu type first
Make sure to check the pmu type first and then check event->attr.disabled.
Doing so would avoid reading the disabled attribute of an event that is
not handled by TAD PMU.

Signed-off-by: Tanmay Jagdale <tanmay@marvell.com>
Link: https://lore.kernel.org/r/20220510102657.487539-1-tanmay@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-10 12:14:27 +01:00
Zhen Lei
5832f1ae50 docs: kdump: Update the crashkernel description for arm64
Now arm64 has added support for "crashkernel=X,high" and
"crashkernel=Y,low". Unlike x86, crash low memory is not allocated if
"crashkernel=Y,low" is not specified.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Baoquan He <bhe@redhat.com>
Link: https://lore.kernel.org/r/20220506114402.365-7-thunder.leizhen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-07 20:00:39 +01:00
Zhen Lei
8af6b91f58 of: Support more than one crash kernel regions for kexec -s
When "crashkernel=X,high" is used, there may be two crash regions:
high=crashk_res and low=crashk_low_res. But now the syscall
kexec_file_load() only add crashk_res into "linux,usable-memory-range",
this may cause the second kernel to have no available dma memory.

Fix it like kexec-tools does for option -c, add both 'high' and 'low'
regions into the dtb.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Baoquan He <bhe@redhat.com>
Link: https://lore.kernel.org/r/20220506114402.365-6-thunder.leizhen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-07 19:57:35 +01:00
Chen Zhou
fb319e77a0 of: fdt: Add memory for devices by DT property "linux,usable-memory-range"
When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices and never mapped by the first kernel.
This memory range is advertised to crash dump kernel via DT property
under /chosen,
        linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>

We reused the DT property linux,usable-memory-range and made the low
memory region as the second range "BASE2 SIZE2", which keeps compatibility
with existing user-space and older kdump kernels.

Crash dump kernel reads this property at boot time and call memblock_add()
to add the low memory region after memblock_cap_memory_range() has been
called.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
Link: https://lore.kernel.org/r/20220506114402.365-5-thunder.leizhen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-07 19:54:33 +01:00
Chen Zhou
944a45abfa arm64: kdump: Reimplement crashkernel=X
There are following issues in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel in DMA zone, which
will fail when there is not enough low memory.
2. If reserving crashkernel above DMA zone, in this case, crash dump
kernel will fail to boot because there is no low memory available
for allocation.

To solve these issues, introduce crashkernel=X,[high,low].
The "crashkernel=X,high" is used to select a region above DMA zone, and
the "crashkernel=Y,low" is used to allocate specified size low memory.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20220506114402.365-4-thunder.leizhen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-07 19:54:33 +01:00
Zhen Lei
e6b394425c arm64: Use insert_resource() to simplify code
insert_resource() traverses the subtree layer by layer from the root node
until a proper location is found. Compared with request_resource(), the
parent node does not need to be determined in advance.

In addition, move the insertion of node 'crashk_res' into function
reserve_crashkernel() to make the associated code close together.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: John Donnelly  <john.p.donnelly@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
Link: https://lore.kernel.org/r/20220506114402.365-3-thunder.leizhen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-07 19:54:33 +01:00
Zhen Lei
2e5920bb07 kdump: return -ENOENT if required cmdline option does not exist
According to the current crashkernel=Y,low support in other ARCHes, it's
an optional command-line option. When it doesn't exist, kernel will try
to allocate minimum required memory below 4G automatically.

However, __parse_crashkernel() returns '-EINVAL' for all error cases. It
can't distinguish the nonexistent option from invalid option.

Change __parse_crashkernel() to return '-ENOENT' for the nonexistent option
case. With this change, crashkernel,low memory will take the default
value if crashkernel=,low is not specified; while crashkernel reservation
will fail and bail out if an invalid option is specified.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Baoquan He <bhe@redhat.com>
Link: https://lore.kernel.org/r/20220506114402.365-2-thunder.leizhen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-07 19:54:33 +01:00
Mark Brown
d158a0608e arm64/sme: More sensibly define the size for the ZA register set
Since the vector length configuration mechanism is identical between SVE
and SME we share large elements of the code including the definition for
the maximum vector length. Unfortunately when we were defining the ABI
for SVE we included not only the actual maximum vector length of 2048
bits but also the value possible if all the bits reserved in the
architecture for expansion of the LEN field were used, 16384 bits.

This starts creating problems if we try to allocate anything for the ZA
matrix based on the maximum possible vector length, as we do for the
regset used with ptrace during the process of generating a core dump.
While the maximum potential size for ZA with the current architecture is
a reasonably managable 64K with the higher reserved limit ZA would be
64M which leads to entirely reasonable complaints from the memory
management code when we try to allocate a buffer of that size. Avoid
these issues by defining the actual maximum vector length for the
architecture and using it for the SME regsets.

Also use the full ZA_PT_SIZE() with the header rather than just the
actual register payload when specifying the size, fixing support for the
largest vector lengths now that we have this new, lower define. With the
SVE maximum this did not cause problems due to the extra headroom we
had.

While we're at it add a comment clarifying why even though ZA is a
single register we tell the regset code that it is a multi-register
regset.

Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: https://lore.kernel.org/r/20220505221517.1642014-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-06 18:24:19 +01:00
Qi Liu
6b79738b6e drivers/perf: hisi: Add Support for CPA PMU
On HiSilicon Hip09 platform, there is a CPA (Coherency Protocol Agent) on
each SICL (Super IO Cluster) which implements packet format translation,
route parsing and traffic statistics.

CPA PMU has 8 PMU counters and interrupt is supported to handle counter
overflow. Let's support its driver under the framework of HiSilicon PMU
driver.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20220415102352.6665-3-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:14:31 +01:00
Qi Liu
807907dae9 drivers/perf: hisi: Associate PMUs in SICL with CPUs online
If a PMU is in a SICL (Super IO cluster), it is not appropriate to
associate this PMU with a CPU die. So we associate it with all CPUs
online, rather than CPUs in the nearest SCCL.

As the firmware of Hip09 platform hasn't been published yet, change
of PMU driver will not influence backwards compatibility between
driver and firmware.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20220415102352.6665-2-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:14:31 +01:00
Shaokun Zhang
47a9ed88a4 drivers/perf: arm_spe: Expose saturating counter to 16-bit
In order to acquire more accurate latency, Armv8.8[1] has defined the
CountSize field to 16-bit saturating counters when it's 0b0011.

Let's support this new feature and expose its to user under sysfs.

[1] https://developer.arm.com/documentation/ddi0487/latest

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20220429063307.63251-1-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:10:00 +01:00
Robin Murphy
23760a0144 perf/arm-cmn: Add CMN-700 support
Add the identifiers, events, and subtleties for CMN-700. Highlights
include yet more options for doubling up CHI channels, which finally
grows event IDs beyond 8 bits for XPs, and a new set of CML gateway
nodes adding support for CXL as well as CCIX, where the Link Agent is
now internal to the CMN mesh so we gain regular PMU events for that too.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/cf892baa0d0258ea6cd6544b15171be0069a083a.1650320598.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:07:25 +01:00
Robin Murphy
65adf71398 perf/arm-cmn: Refactor occupancy filter selector
So far, DNs and HN-Fs have each had one event ralated to occupancy
trackers which are filtered by a separate field. CMN-700 raises the
stakes by introducing two more sets of HN-F events with corresponding
additional filter fields. Prepare for this by refactoring our filter
selection and tracking logic to account for multiple filter types
coexisting on the same node. This need not affect the uAPI, which can
just continue to encode any per-event filter setting in the "occupid"
config field, even if it's technically not the most accurate name for
some of them.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/1aa47ba0455b144c416537f6b0e58dc93b467a00.1650320598.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:07:25 +01:00
Robin Murphy
8e504d93ac perf/arm-cmn: Add CMN-650 support
Add the identifiers and events for CMN-650, which slots into its
evolutionary position between CMN-600 and the 700-series products.
Imagine CMN-600 made bigger, and with most of the rough edges smoothed
off, but that then balanced out by some bonkers PMU functionality for
the new HN-P enhancement in CMN-650r2.

Most of the CXG events are actually common to newer revisions of CMN-600
too, so they're arguably a little late; oh well.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/b0adc5824db53f71a2b561c293e2120390106536.1650320598.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:07:25 +01:00
Robin Murphy
2b60a22b70 dt-bindings: perf: arm-cmn: Add CMN-650 and CMN-700
If you were to guess from the product names that CMN-650 and CMN-700 are
the next two evolutionary steps of Arm's enterprise-level interconnect
following on from CMN-600, you'd be pleasantly correct. Add them to the
DT binding.

CC: devicetree@vger.kernel.org
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/9b4dc0c82c91adff62b6f92eec5f61fb25b9db87.1650320598.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:07:25 +01:00
Ren Yu
4b5b712909 perf: check return value of armpmu_request_irq()
When the function armpmu_request_irq() failed, goto err

Signed-off-by: Ren Yu <renyu@nfschina.com>
Link: https://lore.kernel.org/r/20220425100436.4881-1-renyu@nfschina.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 15:04:48 +01:00
Palmer Dabbelt
c7a9dcea8e perf: RISC-V: Remove non-kernel-doc ** comments
This will presumably trip up some tools that try to parse the comments
as kernel doc when they're not.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 4905ec2fb7 ("RISC-V: Add sscofpmf extension support")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

--

These recently landed in for-next, but I'm trying to avoid rewriting
history as there's a lot in flight right now.

Reviewed-by: Atish Patra <atishp@rivosinc.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20220322220147.11407-1-palmer@rivosinc.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06 13:48:30 +01:00
Kefeng Wang
f41ef4c2ee arm64: mm: Cleanup useless parameters in zone_sizes_init()
Directly use max_pfn for max and no one use min, kill them.

Reviewed-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220411092455.1461-4-wangkefeng.wang@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-05 08:48:09 +01:00
Tong Tiangen
921d161f15 arm64: fix types in copy_highpage()
In copy_highpage() the `kto` and `kfrom` local variables are pointers to
struct page, but these are used to hold arbitrary pointers to kernel memory
. Each call to page_address() returns a void pointer to memory associated
with the relevant page, and copy_page() expects void pointers to this
memory.

This inconsistency was introduced in commit 2563776b41 ("arm64: mte:
Tags-aware copy_{user_,}highpage() implementations") and while this
doesn't appear to be harmful in practice it is clearly wrong.

Correct this by making `kto` and `kfrom` void pointers.

Fixes: 2563776b41 ("arm64: mte: Tags-aware copy_{user_,}highpage() implementations")
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220420030418.3189040-3-tongtiangen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-04 20:00:13 +01:00
Hector Martin
5028fbad2d arm64: Set ARCH_NR_GPIO to 2048 for ARCH_APPLE
We're already running into the 512 GPIO limit on t600[01] depending on
how many SMC GPIOs we allocate, and a 2-die version could double that.
Let's make it 2K to be safe for now.

Signed-off-by: Hector Martin <marcan@marcan.st>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220502091427.28416-1-marcan@marcan.st
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-04 16:33:48 +01:00
Michal Orzel
48e6f22e25 arm64: cputype: Avoid overflow using MIDR_IMPLEMENTOR_MASK
Value of macro MIDR_IMPLEMENTOR_MASK exceeds the range of integer
and can lead to overflow. Currently there is no issue as it is used
in expressions implicitly casting it to u32. To avoid possible
problems, fix the macro.

Signed-off-by: Michal Orzel <michal.orzel@arm.com>
Link: https://lore.kernel.org/r/20220426070603.56031-1-michal.orzel@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-04 11:52:15 +01:00
Chengming Zhou
c4a0ebf87c arm64/ftrace: Make function graph use ftrace directly
As we do in commit 0c0593b45c ("x86/ftrace: Make function graph
use ftrace directly"), we don't need special hook for graph tracer,
but instead we use graph_ops:func function to install return_hooker.

Since commit 3b23e4991f ("arm64: implement ftrace with regs") add
implementation for FTRACE_WITH_REGS on arm64, we can easily adopt
the same cleanup on arm64.

And this cleanup only changes the FTRACE_WITH_REGS implementation,
so the mcount-based implementation is unaffected.

While in theory it would be possible to make a similar cleanup for
!FTRACE_WITH_REGS, this will require rework of the core code, and
so for now we only change the FTRACE_WITH_REGS implementation.

Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Link: https://lore.kernel.org/r/20220420160006.17880-2-zhouchengming@bytedance.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-29 19:21:12 +01:00
Chengming Zhou
e999995c84 ftrace: cleanup ftrace_graph_caller enable and disable
The ftrace_[enable,disable]_ftrace_graph_caller() are used to do
special hooks for graph tracer, which are not needed on some ARCHs
that use graph_ops:func function to install return_hooker.

So introduce the weak version in ftrace core code to cleanup
in x86.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220420160006.17880-1-zhouchengming@bytedance.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-29 19:21:12 +01:00
Wan Jiabing
2e29b9971a arm64/sme: Fix NULL check after kzalloc
Fix following coccicheck error:
./arch/arm64/kernel/process.c:322:2-23: alloc with no test, possible model on line 326

Here should be dst->thread.sve_state.

Fixes: 8bd7f91c03 ("arm64/sme: Implement traps and syscall handling for SME")
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Reviwed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220426113054.630983-1-wanjiabing@vivo.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-29 16:03:47 +01:00
Mark Brown
8a58bcd00e arm64/sme: Add ID_AA64SMFR0_EL1 to __read_sysreg_by_encoding()
We need to explicitly enumerate all the ID registers which we rely on
for CPU capabilities in __read_sysreg_by_encoding(), ID_AA64SMFR0_EL1 was
missed from this list so we trip a BUG() in paths which rely on that
function such as CPU hotplug. Add the register.

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20220427130828.162615-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-27 16:41:53 +01:00
Peter Collingbourne
b6ba1a89f7 arm64: document the boot requirements for MTE
When booting the kernel we access system registers such as GCR_EL1
if MTE is supported. These accesses are defined to trap to EL3 if
SCR_EL3.ATA is disabled. Furthermore, tag accesses will not behave
as expected if SCR_EL3.ATA is not set, or if HCR_EL2.ATA is not set
and we were booted at EL1. Therefore, require that these bits are
enabled when appropriate.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://linux-review.googlesource.com/id/Iadcfd4dcd9ba3279b2813970b44d7485b0116709
Link: https://lore.kernel.org/r/20220422202912.292039-1-pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 17:15:04 +01:00
Catalin Marinas
18788e3464 btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults
Commit a48b73eca4 ("btrfs: fix potential deadlock in the search
ioctl") addressed a lockdep warning by pre-faulting the user pages and
attempting the copy_to_user_nofault() in an infinite loop. On
architectures like arm64 with MTE, an access may fault within a page at
a location different from what fault_in_writeable() probed. Since the
sk_offset is rewound to the previous struct btrfs_ioctl_search_header
boundary, there is no guaranteed forward progress and search_ioctl() may
live-lock.

Use fault_in_subpage_writeable() instead of fault_in_writeable() to
ensure the permission is checked at the right granularity (smaller than
PAGE_SIZE).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Fixes: a48b73eca4 ("btrfs: fix potential deadlock in the search ioctl")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David Sterba <dsterba@suse.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20220423100751.1870771-4-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 10:25:43 +01:00
Catalin Marinas
f3ba50a7a1 arm64: Add support for user sub-page fault probing
With MTE, even if the pte allows an access, a mismatched tag somewhere
within a page can still cause a fault. Select ARCH_HAS_SUBPAGE_FAULTS if
MTE is enabled and implement the probe_subpage_writeable() function.
Note that get_user() is sufficient for the writeable MTE check since the
same tag mismatch fault would be triggered by a read. The caller of
probe_subpage_writeable() will need to check the pte permissions
(put_user, GUP).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220423100751.1870771-3-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 10:25:43 +01:00
Catalin Marinas
da32b58172 mm: Add fault_in_subpage_writeable() to probe at sub-page granularity
On hardware with features like arm64 MTE or SPARC ADI, an access fault
can be triggered at sub-page granularity. Depending on how the
fault_in_writeable() function is used, the caller can get into a
live-lock by continuously retrying the fault-in on an address different
from the one where the uaccess failed.

In the majority of cases progress is ensured by the following
conditions:

1. copy_to_user_nofault() guarantees at least one byte access if the
   user address is not faulting.

2. The fault_in_writeable() loop is resumed from the first address that
   could not be accessed by copy_to_user_nofault().

If the loop iteration is restarted from an earlier (initial) point, the
loop is repeated with the same conditions and it would live-lock.

Introduce an arch-specific probe_subpage_writeable() and call it from
the newly added fault_in_subpage_writeable() function. The arch code
with sub-page faults will have to implement the specific probing
functionality.

Note that no other fault_in_subpage_*() functions are added since they
have no callers currently susceptible to a live-lock.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20220423100751.1870771-2-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 10:25:43 +01:00
Mark Brown
a1f4ccd25c arm64/sme: Provide Kconfig for SME
Now that basline support for the Scalable Matrix Extension (SME) is present
introduce the Kconfig option allowing it to be built. While the feature
registers don't impose a strong requirement for a system with SME to
support SVE at runtime the support for streaming mode SVE is mostly
shared with normal SVE so depend on SVE.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-28-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 18:51:23 +01:00
Mark Brown
861262ab86 KVM: arm64: Handle SME host state when running guests
While we don't currently support SME in guests we do currently support it
for the host system so we need to take care of SME's impact, including
the floating point register state, when running guests. Simiarly to SVE
we need to manage the traps in CPACR_RL1, what is new is the handling of
streaming mode and ZA.

Normally we defer any handling of the floating point register state until
the guest first uses it however if the system is in streaming mode FPSIMD
and SVE operations may generate SME traps which we would need to distinguish
from actual attempts by the guest to use SME. Rather than do this for the
time being if we are in streaming mode when entering the guest we force
the floating point state to be saved immediately and exit streaming mode,
meaning that the guest won't generate SME traps for supported operations.

We could handle ZA in the access trap similarly to the FPSIMD/SVE state
without the disruption caused by streaming mode but for simplicity
handle it the same way as streaming mode for now.

This will be revisited when we support SME for guests (hopefully before SME
hardware becomes available), for now it will only incur additional cost on
systems with SME and even there only if streaming mode or ZA are enabled.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220419112247.711548-27-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 18:51:23 +01:00
Mark Brown
51729fb1d0 KVM: arm64: Trap SME usage in guest
SME defines two new traps which need to be enabled for guests to ensure
that they can't use SME, one for the main SME operations which mirrors the
traps for SVE and another for access to TPIDR2 in SCTLR_EL2.

For VHE manage SMEN along with ZEN in activate_traps() and the FP state
management callbacks, along with SCTLR_EL2.EnTPIDR2.  There is no
existing dynamic management of SCTLR_EL2.

For nVHE manage TSM in activate_traps() along with the fine grained
traps for TPIDR2 and SMPRI.  There is no existing dynamic management of
fine grained traps.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220419112247.711548-26-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 18:51:22 +01:00
Mark Brown
90807748ca KVM: arm64: Hide SME system registers from guests
For the time being we do not support use of SME by KVM guests, support for
this will be enabled in future. In order to prevent any side effects or
side channels via the new system registers, including the EL0 read/write
register TPIDR2, explicitly undefine all the system registers added by
SME and mask out the SME bitfield in SYS_ID_AA64PFR1.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220419112247.711548-25-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 18:51:22 +01:00