linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-08 08:02:59 -04:00

Author	SHA1	Message	Date
Catalin Marinas	1ef21fcd6a	Revert "mm: add arch hook to validate mmap() prot flags" This reverts commit `cb1a393c40`. Since the arm64 WXN patch has been reverted, remove this hook as it would not have any users. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/ZfGESD3a91lxH367@arm.com	2024-03-13 10:59:38 +00:00
Catalin Marinas	69ebc01824	Revert "arm64: mm: add support for WXN memory translation attribute" This reverts commit `50e3ed0f93`. The SCTLR_EL1.WXN control forces execute-never when a page has write permissions. While the idea of hardening such write/exec combinations is good, with permissions indirection enabled (FEAT_PIE) this control becomes RES0. FEAT_PIE introduces a slightly different form of WXN which only has an effect when the base permission is RWX and the write is toggled by the permission overlay (FEAT_POE, not yet supported by the arm64 kernel). Revert the patch for now. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/ZfGESD3a91lxH367@arm.com	2024-03-13 10:53:20 +00:00
Catalin Marinas	f1bbc4e9cf	Revert "ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512" This reverts commit `0499a78369`. Enabling CPUMASK_OFFSTACK on arm64 triggers a warning in the dev_pm_opp_set_config() function followed by a failure to set the regulators and cpufreq-dt probing error. There is no apparent reason why this happens, so revert this commit until further investigation. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/c1f2902d-cefc-4122-9b86-d1d32911f590@samsung.com	2024-03-11 18:40:49 +00:00
Catalin Marinas	88f0912253	Merge branch 'for-next/stage1-lpa2' into for-next/core * for-next/stage1-lpa2: (48 commits) : Add support for LPA2 and WXN and stage 1 arm64/mm: Avoid ID mapping of kpti flag if it is no longer needed arm64/mm: Use generic __pud_free() helper in pud_free() implementation arm64: gitignore: ignore relacheck arm64: Use Signed/Unsigned enums for TGRAN{4,16,64} and VARange arm64: mm: Make PUD folding check in set_pud() a runtime check arm64: mm: add support for WXN memory translation attribute mm: add arch hook to validate mmap() prot flags arm64: defconfig: Enable LPA2 support arm64: Enable 52-bit virtual addressing for 4k and 16k granule configs arm64: kvm: avoid CONFIG_PGTABLE_LEVELS for runtime levels arm64: ptdump: Deal with translation levels folded at runtime arm64: ptdump: Disregard unaddressable VA space arm64: mm: Add support for folding PUDs at runtime arm64: kasan: Reduce minimum shadow alignment and enable 5 level paging arm64: mm: Add 5 level paging support to fixmap and swapper handling arm64: Enable LPA2 at boot if supported by the system arm64: mm: add LPA2 and 5 level paging support to G-to-nG conversion arm64: mm: Add definitions to support 5 levels of paging arm64: mm: Add LPA2 support to phys<->pte conversion routines arm64: mm: Wire up TCR.DS bit to PTE shareability fields ...	2024-03-07 19:05:29 +00:00
Catalin Marinas	0c5ade742e	Merge branches 'for-next/reorg-va-space', 'for-next/rust-for-arm64', 'for-next/misc', 'for-next/daif-cleanup', 'for-next/kselftest', 'for-next/documentation', 'for-next/sysreg' and 'for-next/dpisa', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: (39 commits) docs: perf: Fix build warning of hisi-pcie-pmu.rst perf: starfive: Only allow COMPILE_TEST for 64-bit architectures MAINTAINERS: Add entry for StarFive StarLink PMU docs: perf: Add description for StarFive's StarLink PMU dt-bindings: perf: starfive: Add JH8100 StarLink PMU perf: starfive: Add StarLink PMU support docs: perf: Update usage for target filter of hisi-pcie-pmu drivers/perf: hisi_pcie: Merge find_related_event() and get_event_idx() drivers/perf: hisi_pcie: Relax the check on related events drivers/perf: hisi_pcie: Check the target filter properly drivers/perf: hisi_pcie: Add more events for counting TLP bandwidth drivers/perf: hisi_pcie: Fix incorrect counting under metric mode drivers/perf: hisi_pcie: Introduce hisi_pcie_pmu_get_event_ctrl_val() drivers/perf: hisi_pcie: Rename hisi_pcie_pmu_{config,clear}_filter() drivers/perf: hisi: Enable HiSilicon Erratum 162700402 quirk for HIP09 perf/arm_cspmu: Add devicetree support dt-bindings/perf: Add Arm CoreSight PMU perf/arm_cspmu: Simplify counter reset perf/arm_cspmu: Simplify attribute groups perf/arm_cspmu: Simplify initialisation ... * for-next/reorg-va-space: : Reorganise the arm64 kernel VA space in preparation for LPA2 support : (52-bit VA/PA). arm64: kaslr: Adjust randomization range dynamically arm64: mm: Reclaim unused vmemmap region for vmalloc use arm64: vmemmap: Avoid base2 order of struct page size to dimension region arm64: ptdump: Discover start of vmemmap region at runtime arm64: ptdump: Allow all region boundaries to be defined at boot time arm64: mm: Move fixmap region above vmemmap region arm64: mm: Move PCI I/O emulation region above the vmemmap region * for-next/rust-for-arm64: : Enable Rust support for arm64 arm64: rust: Enable Rust support for AArch64 rust: Refactor the build target to allow the use of builtin targets * for-next/misc: : Miscellaneous arm64 patches ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512 arm64: Remove enable_daif macro arm64/hw_breakpoint: Directly use ESR_ELx_WNR for an watchpoint exception arm64: cpufeatures: Clean up temporary variable to simplify code arm64: Update setup_arch() comment on interrupt masking arm64: remove unnecessary ifdefs around is_compat_task() arm64: ftrace: Don't forbid CALL_OPS+CC_OPTIMIZE_FOR_SIZE with Clang arm64/sme: Ensure that all fields in SMCR_EL1 are set to known values arm64/sve: Ensure that all fields in ZCR_EL1 are set to known values arm64/sve: Document that __SVE_VQ_MAX is much larger than needed arm64: make member of struct pt_regs and it's offset macro in the same order arm64: remove unneeded BUILD_BUG_ON assertion arm64: kretprobes: acquire the regs via a BRK exception arm64: io: permit offset addressing arm64: errata: Don't enable workarounds for "rare" errata by default * for-next/daif-cleanup: : Clean up DAIF handling for EL0 returns arm64: Unmask Debug + SError in do_notify_resume() arm64: Move do_notify_resume() to entry-common.c arm64: Simplify do_notify_resume() DAIF masking * for-next/kselftest: : Miscellaneous arm64 kselftest patches kselftest/arm64: Test that ptrace takes effect in the target process * for-next/documentation: : arm64 documentation patches arm64/sme: Remove spurious 'is' in SME documentation arm64/fp: Clarify effect of setting an unsupported system VL arm64/sme: Fix cut'n'paste in ABI document arm64/sve: Remove bitrotted comment about syscall behaviour * for-next/sysreg: : sysreg updates arm64/sysreg: Update ID_AA64DFR0_EL1 register arm64/sysreg: Update ID_DFR0_EL1 register fields arm64/sysreg: Add register fields for ID_AA64DFR1_EL1 * for-next/dpisa: : Support for 2023 dpISA extensions kselftest/arm64: Add 2023 DPISA hwcap test coverage kselftest/arm64: Add basic FPMR test kselftest/arm64: Handle FPMR context in generic signal frame parser arm64/hwcap: Define hwcaps for 2023 DPISA features arm64/ptrace: Expose FPMR via ptrace arm64/signal: Add FPMR signal handling arm64/fpsimd: Support FEAT_FPMR arm64/fpsimd: Enable host kernel access to FPMR arm64/cpufeature: Hook new identification registers up to cpufeature	2024-03-07 19:04:55 +00:00
Christoph Lameter (Ampere)	0499a78369	ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512 Currently defconfig selects NR_CPUS=256, but some vendors (e.g. Ampere Computing) are planning to ship systems with 512 CPUs. So that all CPUs on these systems can be used with defconfig, we'd like to bump NR_CPUS to 512. Therefore this patch increases the default NR_CPUS from 256 to 512. As increasing NR_CPUS will increase the size of cpumasks, there's a fear that this might have a significant impact on stack usage due to code which places cpumasks on the stack. To mitigate that concern, we can select CPUMASK_OFFSTACK. As that doesn't seem to be a problem today with NR_CPUS=256, we only select this when NR_CPUS > 256. CPUMASK_OFFSTACK configures the cpumasks in the kernel to be dynamically allocated. This was used in the X86 architecture in the past to enable support for larger CPU configurations up to 8k cpus. With that is becomes possible to dynamically size the allocation of the cpu bitmaps depending on the quantity of processors detected on bootup. Memory used for cpumasks will increase if the kernel is run on a machine with more cores. Further increases may be needed if ARM processor vendors start supporting more processors. Given the current inflationary trends in core counts from multiple processor manufacturers this may occur. There are minor regressions for hackbench. The kernel data size for 512 cpus is smaller with offstack than with onstack. Benchmark results using hackbench average over 10 runs of hackbench -s 512 -l 2000 -g 15 -f 25 -P on Altra 80 Core Support for 256 CPUs on stack. Baseline 7.8564 sec Support for 512 CUs on stack. 7.8713 sec + 0.18% 512 CPUS offstack 7.8916 sec + 0.44% Kernel size comparison: text data filename Difference to onstack256 baseline 25755648 `9589248` vmlinuz-6.8.0-rc4-onstack256 25755648 9607680 vmlinuz-6.8.0-rc4-onstack512 +0.19% 25755648 9603584 vmlinuz-6.8.0-rc4-offstack512 +0.14% Tested-by: Eric Mackay <eric.mackay@oracle.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Christoph Lameter (Ampere) <cl@linux.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/37099a57-b655-3b3a-56d0-5f7fbd49d7db@gentwo.org [catalin.marinas@arm.com: use 'select' instead of duplicating 'config CPUMASK_OFFSTACK'] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-07 18:18:06 +00:00
Mark Brown	44d10c27bd	kselftest/arm64: Add 2023 DPISA hwcap test coverage Add the hwcaps added for the 2023 DPISA extensions to the hwcaps test program. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-9-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-07 17:14:55 +00:00
Mark Brown	7bcebadda0	kselftest/arm64: Add basic FPMR test Verify that a FPMR frame is generated on systems that support FPMR and not generated otherwise. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-8-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-07 17:14:54 +00:00
Mark Brown	f4dcccdda5	kselftest/arm64: Handle FPMR context in generic signal frame parser Teach the generic signal frame parsing code about the newly added FPMR frame, avoiding warnings every time one is generated. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-7-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-07 17:14:54 +00:00
Mark Brown	c1932cac79	arm64/hwcap: Define hwcaps for 2023 DPISA features The 2023 architecture extensions include a large number of floating point features, most of which simply add new instructions. Add hwcaps so that userspace can enumerate these features. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-6-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-07 17:14:54 +00:00
Mark Brown	4035c22ef7	arm64/ptrace: Expose FPMR via ptrace Add a new regset to expose FPMR via ptrace. It is not added to the FPSIMD registers since that structure is exposed elsewhere without any allowance for extension we don't add there. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-5-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-07 17:14:53 +00:00
Mark Brown	8c46def444	arm64/signal: Add FPMR signal handling Expose FPMR in the signal context on systems where it is supported. The kernel validates the exact size of the FPSIMD registers so we can't readily add it to fpsimd_context without disruption. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-4-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-07 17:14:53 +00:00
Mark Brown	203f2b95a8	arm64/fpsimd: Support FEAT_FPMR FEAT_FPMR defines a new EL0 accessible register FPMR use to configure the FP8 related features added to the architecture at the same time. Detect support for this register and context switch it for EL0 when present. Due to the sharing of responsibility for saving floating point state between the host kernel and KVM FP8 support is not yet implemented in KVM and a stub similar to that used for SVCR is provided for FPMR in order to avoid bisection issues. To make it easier to share host state with the hypervisor we store FPMR as a hardened usercopy field in uw (along with some padding). Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-3-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-07 17:14:53 +00:00
Mark Brown	b6c0b424cb	arm64/fpsimd: Enable host kernel access to FPMR FEAT_FPMR provides a new generally accessible architectural register FPMR. This is only accessible to EL0 and EL1 when HCRX_EL2.EnFPM is set to 1, do this when the host is running. The guest part will be done along with context switching the new register and exposing it via guest management. Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-2-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-07 17:14:52 +00:00
Mark Brown	cc9f69a3da	arm64/cpufeature: Hook new identification registers up to cpufeature The 2023 architecture extensions have defined several new ID registers, hook them up to the cpufeature code so we can add feature checks and hwcaps based on their contents. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-1-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-07 17:14:52 +00:00
Yicong Yang	b037e40a6a	docs: perf: Fix build warning of hisi-pcie-pmu.rst `make htmldocs SPHINXDIRS="admin-guide"` shows below warnings: Documentation/admin-guide/perf/hisi-pcie-pmu.rst:48: ERROR: Unexpected indentation. Documentation/admin-guide/perf/hisi-pcie-pmu.rst:49: WARNING: Block quote ends without a blank line; unexpected unindent. Fix this. Closes: https://lore.kernel.org/lkml/20231011172250.5a6498e5@canb.auug.org.au/ Fixes: `89a032923d` ("docs: perf: Update usage for target filter of hisi-pcie-pmu") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240305122517.12179-1-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-05 13:42:30 +00:00
Will Deacon	f0dbc6d0de	perf: starfive: Only allow COMPILE_TEST for 64-bit architectures The kbuild robot exploded while wasting its time building the Starfive PMU driver for the 32-bit PA-RISC and Hexagon architectures. Adjust the Kconfig dependencies so that COMPILE_TEST is only applicable for 64-bit architectures (which implement writeq()). Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Will Deacon <will@kernel.org>	2024-03-05 10:56:49 +00:00
Ji Sheng Teoh	b9f71ab215	MAINTAINERS: Add entry for StarFive StarLink PMU Add maintainer entry for StarFive StarLink PMU driver, and mark it as "Maintained" Signed-off-by: Ji Sheng Teoh <jisheng.teoh@starfivetech.com> Link: https://lore.kernel.org/r/20240229072720.3987876-5-jisheng.teoh@starfivetech.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:19:48 +00:00
Ji Sheng Teoh	49925c1c5a	docs: perf: Add description for StarFive's StarLink PMU StarFive StarLink PMU support monitoring L3 memory system PMU events. Add documentation to describe StarFive StarLink PMU support and it's usage. Signed-off-by: Ji Sheng Teoh <jisheng.teoh@starfivetech.com> Link: https://lore.kernel.org/r/20240229072720.3987876-4-jisheng.teoh@starfivetech.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:19:48 +00:00
Ji Sheng Teoh	66461b43b0	dt-bindings: perf: starfive: Add JH8100 StarLink PMU Add device tree binding for StarFive's JH8100 StarLink PMU (Performance Monitor Unit). Signed-off-by: Ji Sheng Teoh <jisheng.teoh@starfivetech.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240229072720.3987876-3-jisheng.teoh@starfivetech.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:19:48 +00:00
Ji Sheng Teoh	c2b24812f7	perf: starfive: Add StarLink PMU support This patch adds support for StarFive's StarLink PMU (Performance Monitor Unit). StarLink PMU integrates one or more CPU cores with a shared L3 memory system. The PMU supports overflow interrupt, up to 16 programmable 64bit event counters, and an independent 64bit cycle counter. StarLink PMU is accessed via MMIO. Example Perf stat output: [root@user]# perf stat -a -e /starfive_starlink_pmu/cycles/ \ -e /starfive_starlink_pmu/read_miss/ \ -e /starfive_starlink_pmu/read_hit/ \ -e /starfive_starlink_pmu/release_request/ \ -e /starfive_starlink_pmu/write_hit/ \ -e /starfive_starlink_pmu/write_miss/ \ -e /starfive_starlink_pmu/write_request/ \ -e /starfive_starlink_pmu/writeback/ \ -e /starfive_starlink_pmu/read_request/ \ -- openssl speed rsa2048 Doing 2048 bits private rsa's for 10s: 5 2048 bits private RSA's in 2.84s Doing 2048 bits public rsa's for 10s: 169 2048 bits public RSA's in 2.42s version: 3.0.11 built on: Tue Sep 19 13:02:31 2023 UTC options: bn(64,64) CPUINFO: N/A sign verify sign/s verify/s rsa 2048 bits 0.568000s 0.014320s 1.8 69.8 ///////// Performance counter stats for 'system wide': 649991998 starfive_starlink_pmu/cycles/ 1009690 starfive_starlink_pmu/read_miss/ 1079750 starfive_starlink_pmu/read_hit/ 2089405 starfive_starlink_pmu/release_request/ 129 starfive_starlink_pmu/write_hit/ 70 starfive_starlink_pmu/write_miss/ 194 starfive_starlink_pmu/write_request/ 150080 starfive_starlink_pmu/writeback/ 2089423 starfive_starlink_pmu/read_request/ 27.062755678 seconds time elapsed Signed-off-by: Ji Sheng Teoh <jisheng.teoh@starfivetech.com> Link: https://lore.kernel.org/r/20240229072720.3987876-2-jisheng.teoh@starfivetech.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:19:48 +00:00
Junhao He	89a032923d	docs: perf: Update usage for target filter of hisi-pcie-pmu One of the "port" and "bdf" target filter interface must be set, and the related events should preferably used in the same group. Update the usage in the documentation. Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-9-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:18:29 +00:00
Junhao He	7da377059e	drivers/perf: hisi_pcie: Merge find_related_event() and get_event_idx() The function xxx_find_related_event() scan all working events to find related events. During this process, we also can find the idle counters. If not found related events, return the first idle counter to simplify the code. Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-8-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:18:29 +00:00
Junhao He	2fbf96ed88	drivers/perf: hisi_pcie: Relax the check on related events If we use two events with the same filter and related event type (see the following example), the driver check whether they are related events and are in the same group, otherwise the function hisi_pcie_pmu_find_related_event() return -EINVAL, then the 2nd event cannot count but the 1st event is running, although the PCIe PMU has other idle counters. In this case, The perf event scheduler will make the two events to multiplex a counter, if the user use the formula (1st event_value / 2nd event_value) to calculate the bandwidth, he/she won't get the correct value, because they are not counting at the same period. This patch tries to fix this by making the related events to use different idle counters if they are not in the same event group. And finally, I'm going to say. The related events are best used in the same group [1]. There are two ways to know if they are related events. a) By event name, such as the latency events "xxx_latency, xxx_cnt" or bandwidth events "xxx_flux, xxx_time". b) By event type, such as "event=0xXXXX, event=0x1XXXX". Use group to count the related events: [1] -e "{pmu_name/xxx_latency,port=1/,pmu_name/xxx_cnt,port=1/}" example: 1st event: hisi_pcie0_core1/event=0x804,port=1 2nd event: hisi_pcie0_core1/event=0x10804,port=1 test cmd: perf stat -e hisi_pcie0_core1/event=0x804,port=1/ \ -e hisi_pcie0_core1/event=0x10804,port=1/ before patch: 25,281 hisi_pcie0_core1/event=0x804,port=1/ (49.91%) 470,598 hisi_pcie0_core1/event=0x10804,port=1/ (50.09%) after patch: 24,147 hisi_pcie0_core1/event=0x804,port=1/ 474,558 hisi_pcie0_core1/event=0x10804,port=1/ Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com> Link: https://lore.kernel.org/r/20240223103359.18669-7-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:18:29 +00:00
Junhao He	2f864fee08	drivers/perf: hisi_pcie: Check the target filter properly The PMU can monitor traffic of certain target Root Port or downstream target Endpoint. User can specify the target filter by the "port" or "bdf" option respectively. The PMU can only monitor the Root Port or Endpoint on the same PCIe core so the value of "port" or "bdf" should be valid and will be checked by the driver. Currently at least and only one of "port" and "bdf" option must be set. If "port" filter is not set or is set explicitly to zero (default), driver will regard the user specifies a "bdf" option since "port" option is a bitmask of the target Root Ports and zero is not a valid value. If user not explicitly set "port" or "bdf" filter, the driver uses "bdf" default value (zero) to set target filter, but driver will skip the check of bdf=0, although it's a valid value (meaning 0000:000:00.0). Then the user just gets zero. Therefore, we need to check if both "port" and "bdf" are invalid, then return failure and report warning. Testing: before the patch: 0 hisi_pcie0_core1/rx_mrd_flux/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0/ 24,124 hisi_pcie0_core1/rx_mrd_flux,port=1/ 0 hisi_pcie0_core1/rx_mrd_flux,bdf=0/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/ 24,132 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/ 24,138 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/ 24,126 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/ after the patch: <not supported> hisi_pcie0_core1/rx_mrd_flux/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0/ 24,153 hisi_pcie0_core1/rx_mrd_flux,port=1/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/ 24,117 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/ 24,120 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/ 24,123 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/ Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-6-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:18:29 +00:00
Yicong Yang	00ca69b856	drivers/perf: hisi_pcie: Add more events for counting TLP bandwidth A typical PCIe transaction is consisted of various TLP packets in both direction. For counting bandwidth only memory read events are exported currently. Add memory write and completion counting events of both direction to complete the bandwidth counting. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:18:28 +00:00
Yicong Yang	b6693ad68e	drivers/perf: hisi_pcie: Fix incorrect counting under metric mode The metric counting shows incorrect results if the events in the metric group using the same event but different filter options. This is because we only judge the event code to decide whether the event in the metric group should share the same hardware counter, but ignore the settings of the filter. For example, on a platform of 2 ports 0x1 and 0x2 but only port 0x1 has a downstream PCIe NVME device. The metric counting shows both ports have the same counts because we misassign these two events to one same hardware counter: [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}' Performance counter stats for 'system wide': 7907484924 hisi_pcie0_core1/event=0x0104,port=0x2/ 7907484924 hisi_pcie0_core1/event=0x0104,port=0x1/ 10.153863691 seconds time elapsed Fix this by using the whole config rather than the event only to judge whether two events are the same and should share the same hardware counter. With this patch, the metric counting in the above case tends to be corrected: [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}' Performance counter stats for 'system wide': 0 hisi_pcie0_core1/event=0x0104,port=0x2/ 8123122077 hisi_pcie0_core1/event=0x0104,port=0x1/ 10.152875631 seconds time elapsed Fixes: `8404b0fbc7` ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:18:28 +00:00
Yicong Yang	4d473461e0	drivers/perf: hisi_pcie: Introduce hisi_pcie_pmu_get_event_ctrl_val() Factor out retrieving of the register value for the corresponding event from hisi_pcie_config_event_ctrl() into a new function hisi_pcie_pmu_get_event_ctrl_val() allowing future reuse. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:18:28 +00:00
Yicong Yang	54a9e47eeb	drivers/perf: hisi_pcie: Rename hisi_pcie_pmu_{config,clear}_filter() hisi_pcie_pmu_{config,clear}_filter() are config/clear HISI_PCIE_EVENT_CTRL register which contains not only the filter but also the event code. The function names are bit misleading. Rename it to hisi_pcie_pmu_{config,clear}_event_ctrl() to reflects their functions more accurately. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:18:28 +00:00
Junhao He	e10b6976f6	drivers/perf: hisi: Enable HiSilicon Erratum 162700402 quirk for HIP09 HiSilicon UC PMU v2 suffers the erratum 162700402 that the PMU counter cannot be set due to the lack of clock under power saving mode. This will lead to error or inaccurate counts. The clock can be enabled by the PMU global enabling control. This patch tries to fix this by set the UC PMU enable before set event period to turn on the clock, and then restore the UC PMU configuration. The counter register can hold its value without a clock. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240227125231.53127-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-03-04 14:16:10 +00:00
Jinjie Ruan	527db67a4d	arm64: Remove enable_daif macro Since commit `bb8e93a287` ("arm64: entry: convert SError handlers to C"), the enable_daif assembler macro is no longer used anywhere, so remove it. Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240229132802.1682026-2-ruanjinjie@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-01 17:41:37 +00:00
Anshuman Khandual	9d6b6789c8	arm64/hw_breakpoint: Directly use ESR_ELx_WNR for an watchpoint exception Let's use existing ISS encoding for an watchpoint exception i.e ESR_ELx_WNR This represents an instruction's either writing to or reading from a memory location during an watchpoint exception. While here this drops non-standard macro AARCH64_ESR_ACCESS_MASK. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240229083431.356578-1-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-01 17:36:51 +00:00
Liao Chang	622442666d	arm64: cpufeatures: Clean up temporary variable to simplify code Clean up one temporary variable to simplifiy code in capability detection. Signed-off-by: Liao Chang <liaochang1@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240229105208.456704-1-liaochang1@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-01 17:35:53 +00:00
Ard Biesheuvel	27f2b9fcdd	arm64/mm: Avoid ID mapping of kpti flag if it is no longer needed arm64_use_ng_mappings will be set to 'true' by the early boot code if it decides to use non-global (nG) attributes for all kernel mappings, typically when enabling KASLR on a system that does not implement E0PD. In this case, the G-to-nG update routines are never called, and so there is no reason to create the writable mapping of the associated status flag in the ID map. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240301104046.1234309-6-ardb+git@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-01 15:25:45 +00:00
Ard Biesheuvel	3137db4c66	arm64/mm: Use generic __pud_free() helper in pud_free() implementation Commit `0dd4f60a2c` ("arm64: mm: Add support for folding PUDs at runtime") implements specialized PUD alloc/free helpers to allow the decision whether or not to fold PUDs to be made at runtime when the number of paging levels is 4 or higher. Its implementation of pud_free() is based on the generic version that existed when the patch was first written, but in the meantime, the freeing of a PUD has become a bit more involved, and so instead of simply freeing the page, we should invoke the generic __pud_free() that encapsulates whatever needs doing at this point. This fixes a reported warning emitted by the page flags self-diagnostics. Reported-by: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20240301104046.1234309-5-ardb+git@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-03-01 15:25:45 +00:00
Ryo Takakura	6d1ce806e1	arm64: Update setup_arch() comment on interrupt masking DAIF_PROCCTX_NOIRQ contains the FIQ bit. Update the comment as only asynchronous aborts are unmasked and FIQ is still masked. Signed-off-by: Ryo Takakura <takakura@valinux.co.jp> Link: https://lore.kernel.org/r/20240228022836.1756-1-takakura@valinux.co.jp Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-28 18:01:24 +00:00
Leonardo Bras	1984c80546	arm64: remove unnecessary ifdefs around is_compat_task() Currently some parts of the codebase will test for CONFIG_COMPAT before testing is_compat_task(). is_compat_task() is a inlined function only present on CONFIG_COMPAT. On the other hand, for !CONFIG_COMPAT, we have in linux/compat.h: #define is_compat_task() (0) Since we have this define available in every usage of is_compat_task() for !CONFIG_COMPAT, it's unnecessary to keep the ifdefs, since the compiler is smart enough to optimize-out those snippets on CONFIG_COMPAT=n This requires some regset code as well as a few other defines to be made available on !CONFIG_COMPAT, so some symbols can get resolved before getting optimized-out. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240109034651.478462-2-leobras@redhat.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-28 18:01:23 +00:00
Stephen Boyd	a743f26d03	arm64: ftrace: Don't forbid CALL_OPS+CC_OPTIMIZE_FOR_SIZE with Clang Per commit `b3f11af9b2` ("arm64: ftrace: forbid CALL_OPS with CC_OPTIMIZE_FOR_SIZE"), GCC is silently ignoring `-falign-functions=N` when passed `-Os`, causing functions to be improperly aligned. This doesn't seem to be a problem with Clang though, where enabling CALL_OPS with CC_OPTIMIZE_FOR_SIZE doesn't spit out any warnings at boot about misaligned patch-sites. Only forbid CALL_OPS if GCC is used and we're optimizing for size so that CALL_OPS can be used with clang optimizing for size. Cc: Jason Ling <jasonling@chromium.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Cc: llvm@lists.linux.dev Fixes: `b3f11af9b2` ("arm64: ftrace: forbid CALL_OPS with CC_OPTIMIZE_FOR_SIZE") Signed-off-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20240223064032.3463229-1-swboyd@chromium.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-28 15:16:37 +00:00
Bartosz Golaszewski	2758269149	arm64: gitignore: ignore relacheck Add the generated executable for relacheck to the list of ignored files. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Link: https://lore.kernel.org/r/20240222210441.33142-1-brgl@bgdev.pl Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-22 21:57:52 +00:00
Mark Brown	93576e3498	arm64/sme: Ensure that all fields in SMCR_EL1 are set to known values At present nothing in our CPU initialisation code ever sets unknown fields in SMCR_EL1 to known values, all updates to SMCR_EL1 are read/modify/write sequences. All the unknown fields are RES0, explicitly initialise them as such to avoid future surprises. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240213-arm64-fp-init-vec-cr-v1-2-7e7c2d584f26@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-22 19:39:34 +00:00
Mark Brown	2f0090549b	arm64/sve: Ensure that all fields in ZCR_EL1 are set to known values At present nothing in our CPU initialisation code ever sets unknown fields in ZCR_EL1 to known values, all updates to ZCR_EL1 are read/modify/write sequences for LEN. All the unknown fields are RES0, explicitly initialise them as such to avoid future surprises. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240213-arm64-fp-init-vec-cr-v1-1-7e7c2d584f26@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-22 19:39:34 +00:00
Mark Brown	21eb468e9f	arm64/sve: Document that __SVE_VQ_MAX is much larger than needed __SVE_VQ_MAX is defined without comment as 512 but the actual architectural maximum is 16, a substantial difference which might not be obvious to readers especially given the several different units used for specifying vector sizes in various contexts and the fact that it's often used via macros. In an effort to minimise surprises for users who might assume the value is the architectural maximum and use it to do things like size allocations add a comment noting the difference, and add a note for SVE_VQ_MAX to aid discoverability. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Dave Martin <Dave.Martin@arm.com> Link: https://lore.kernel.org/r/20240209-arm64-sve-vl-max-comment-v2-1-111b283469ee@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-22 19:32:47 +00:00
Kemeng Shi	58a0484eaf	arm64: make member of struct pt_regs and it's offset macro in the same order In struct pt_regs, member pstate is after member pc. Move offset macro of pstate after offset macro of pc to improve readability a little. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240130175504.106364-1-shikemeng@huaweicloud.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-22 19:07:49 +00:00
Dawei Li	bce79b0c80	arm64: remove unneeded BUILD_BUG_ON assertion Since commit `c02433dd6d` ("arm64: split thread_info from task stack"), CONFIG_THREAD_INFO_IN_TASK is enabled unconditionally for arm64. So remove this always-true assertion from arch_dup_task_struct. Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240202040211.3118918-1-dawei.li@shingroup.cn Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-22 11:02:51 +00:00
Anshuman Khandual	358fee2917	arm64/sysreg: Update ID_AA64DFR0_EL1 register This updates ID_AA64DFR0_EL1.PMSVer and ID_AA64DFR0_EL1.DebugVer register fields as per the definitions based on DDI0601 2023-12. Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240220034829.3098373-1-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-21 18:06:54 +00:00
Anshuman Khandual	7accfaad89	arm64/sysreg: Update ID_DFR0_EL1 register fields This updates ID_DFR0_EL1.PerfMon and ID_DFR0_EL1.CopDbg register fields as per the definitions based on DDI0601 2023-12. Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240220025343.3093955-1-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-21 18:06:44 +00:00
Anshuman Khandual	fdd867fe9b	arm64/sysreg: Add register fields for ID_AA64DFR1_EL1 This adds register fields for ID_AA64DFR1_EL1 as per the definitions based on DDI0601 2023-12. Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240220023203.3091229-1-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-21 18:06:33 +00:00
Mark Brown	e47c18c3b2	arm64/sme: Remove spurious 'is' in SME documentation Just a typographical error. Reported-by: Edmund Grimley-Evans <edmund.grimley-evans@arm.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240124-arm64-sve-sme-doc-v2-4-fe3964fb3c19@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-21 18:02:46 +00:00
Mark Brown	3fd97cf323	arm64/fp: Clarify effect of setting an unsupported system VL The documentation for system vector length configuration does not cover all cases where unsupported values are written, tighten it up. Reported-by: Edmund Grimley-Evans <edmund.grimley-evans@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Link: https://lore.kernel.org/r/20240124-arm64-sve-sme-doc-v2-3-fe3964fb3c19@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-21 18:02:46 +00:00
Mark Brown	ae35792764	arm64/sme: Fix cut'n'paste in ABI document The ABI for SME is very like that for SVE so bits of the ABI were copied but not adequately search and replaced, fix that. Reported-by: Edmund Grimley-Evans <edmund.grimley-evans@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Link: https://lore.kernel.org/r/20240124-arm64-sve-sme-doc-v2-2-fe3964fb3c19@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-02-21 18:02:46 +00:00

1 2 3 4 5 ...

1249738 Commits