1185432 Commits

Author SHA1 Message Date
Naveen N Rao
25ea739ea1 powerpc: Fail build if using recordmcount with binutils v2.37
binutils v2.37 drops unused section symbols, which prevents recordmcount
from capturing mcount locations in sections that have no non-weak
symbols. This results in a build failure with a message such as:
	Cannot find symbol for section 12: .text.perf_callchain_kernel.
	kernel/events/callchain.o: failed

The change to binutils was reverted for v2.38, so this behavior is
specific to binutils v2.37:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=c09c8b42021180eee9495bd50d8b35e683d3901b

Objtool is able to cope with such sections, so this issue is specific to
recordmcount.

Fail the build and print a warning if binutils v2.37 is detected and if
we are using recordmcount.

Cc: stable@vger.kernel.org
Suggested-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230530061436.56925-1-naveen@kernel.org
2023-06-27 16:19:06 +10:00
Gaurav Batra
d61cd13e73 powerpc/iommu: TCEs are incorrectly manipulated with DLPAR add/remove of memory
When memory is dynamically added/removed, iommu_mem_notifier() is invoked. This
routine traverses through all the DMA windows (DDW only, not default windows)
to add/remove "direct" TCE mappings. The routines for this purpose are
tce_setrange_multi_pSeriesLP() and tce_clearrange_multi_pSeriesLP().

Both these routines are designed for Direct mapped DMA windows only.

The issue is that there could be some DMA windows in the list which are not
"direct" mapped. Calling these routines will either:

1) remove some dynamically mapped TCEs, or
2) try to add TCEs which are out of bounds, so that the HCALL returns H_PARAMETER

Here are the side effects when these routines are incorrectly invoked for
"dynamically" mapped DMA windows.

tce_setrange_multi_pSeriesLP()

This adds direct mapped TCEs. Now, this could invoke HCALL to add TCEs with
an out-of-bounds range. In this scenario, HCALL will return H_PARAMETER and
DLPAR ADD of memory will fail.

tce_clearrange_multi_pSeriesLP()

This will remove a range of TCEs. The TCE range that is calculated, depending on
the memory range being added, could in fact be mapping some other memory
address (for dynamic DMA window scenario). This will wipe out those TCEs.

The solution is for iommu_mem_notifier() to only invoke these routines for
"direct" mapped DMA windows.
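
A minimal userspace sketch of the filtering described above, not the kernel code itself. The struct, the `direct` flag layout, and both function names are hypothetical stand-ins; only the idea (skip windows that are not direct mapped) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry in the DMA window list. */
struct dma_window {
    bool direct;      /* true if the window is direct mapped */
    int  tce_updates; /* counts how often the TCE range routine ran */
};

/* Stand-in for the set/clear range routines, which assume a direct window. */
static void update_direct_tces(struct dma_window *win)
{
    assert(win->direct); /* only valid for direct mapped windows */
    win->tce_updates++;
}

/* Sketch of the notifier walk: leave dynamically mapped windows untouched. */
static int notify_memory_change(struct dma_window *windows, size_t count)
{
    int touched = 0;

    for (size_t i = 0; i < count; i++) {
        if (!windows[i].direct)
            continue; /* dynamically mapped: its TCEs belong to other mappings */
        update_direct_tces(&windows[i]);
        touched++;
    }
    return touched;
}
```

With one dynamic window between two direct ones, only the two direct windows would be updated.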

Signed-off-by: Gaurav Batra <gbatra@linux.vnet.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
[mpe: Initialise direct at allocation time in ddw_list_new_entry()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230613171641.15641-1-gbatra@linux.vnet.ibm.com
2023-06-26 14:57:31 +10:00
Timothy Pearson
bfd8d98921 powerpc/iommu: Only build sPAPR access functions on pSeries and PowerNV

A build failure with CONFIG_HAVE_PCI=y set without PSERIES or POWERNV
set was caught by the random configuration checker.  Guard the sPAPR
specific IOMMU functions on CONFIG_PPC_PSERIES || CONFIG_PPC_POWERNV.
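
The guard can be pictured roughly as below. This is only a sketch of the preprocessor pattern: `spapr_tce_supported()` is a hypothetical name, and in a real kernel build the CONFIG_ symbols are supplied by Kconfig rather than defined by hand.

```c
/*
 * Sketch: build the real implementation only when one of the two
 * platforms is enabled, and fall back to a stub otherwise.
 */
#if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_POWERNV)
static int spapr_tce_supported(void)
{
    return 1; /* platform-specific code would live here */
}
#else
static inline int spapr_tce_supported(void)
{
    return 0; /* stub keeps callers compiling on other platforms */
}
#endif
```

Compiled without either symbol defined, callers see the stub and get 0.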

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/2015925968.3546872.1685990936823.JavaMail.zimbra@raptorengineeringinc.com
2023-06-21 15:13:57 +10:00
Rohan McLure
331e2cad6d powerpc: powernv: Annotate data races in opal events
The kopald thread handles opal events as they appear, but by polling a
static bit-vector in last_outstanding_events. Annotate these data races
accordingly. We are not at risk of missing events, but use of READ_ONCE,
WRITE_ONCE will assist readers in seeing that kopald only consumes the
events it is aware of when it is scheduled. Also removes extraneous
KCSAN warnings.
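
As an illustration of the annotation pattern, here is a userspace sketch. The two macros are simplified stand-ins for the kernel's READ_ONCE/WRITE_ONCE (they rely on the GCC/Clang `__typeof__` extension), and the two helper functions are hypothetical; only the variable name comes from the commit message.

```c
#include <stdint.h>

/* Simplified stand-ins; the kernel's real macros are more involved. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

static uint64_t last_outstanding_events;

/* Producer side: publish a new event mask with a single marked store. */
static void publish_events(uint64_t events)
{
    WRITE_ONCE(last_outstanding_events, events);
}

/* Consumer side: take one snapshot, then act only on that snapshot. */
static uint64_t consume_events(uint64_t mask)
{
    uint64_t snapshot = READ_ONCE(last_outstanding_events);

    return snapshot & mask;
}
```

The point of the single snapshot is that the consumer handles exactly the events it saw, even if the mask changes afterwards.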

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230510033117.1395895-10-rmclure@linux.ibm.com
2023-06-21 15:13:57 +10:00
Rohan McLure
86dacd967b powerpc: Mark writes registering ipi to host cpu through kvm and polling
Mark writes to hypervisor ipi state so that KCSAN recognises the
asynchronous issue of kvmppc_{set,clear}_host_ipi as intended, with
atomic writes. Mark asynchronous polls to this variable in
kvm_ppc_read_one_intr().

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230510033117.1395895-9-rmclure@linux.ibm.com
2023-06-21 15:13:57 +10:00
Rohan McLure
8608f14b49 powerpc: Annotate accesses to ipi message flags
IPI message flags are observed and consequently consumed in the
smp_ipi_demux_relaxed function, which handles these message sources
until it observes no more arriving. Mark the checked loop guard with
READ_ONCE, to signal to KCSAN that the read is known to be volatile, and
that non-determinism is expected. Mark write for message source in
smp_muxed_ipi_set_message().

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230510033117.1395895-8-rmclure@linux.ibm.com
2023-06-21 15:13:57 +10:00
Rohan McLure
b0c5b4f1ee powerpc: powernv: Fix KCSAN datarace warnings on idle_state contention
The idle_state entry in the PACA on PowerNV features a bit which is
atomically tested and set through ldarx/stdcx. to be used as a spinlock.
This lock then guards access to other bit fields of idle_state. KCSAN
cannot differentiate between any of these bitfield accesses as they all
are implemented by 8-byte store/load instructions, thus cores contending
on the bit-lock appear to data race with modifications to idle_state.

Separate the bit-lock entry from the data guarded by the lock to avoid
the possibility of data races being detected by KCSAN.
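
A rough userspace analogue of the separation, using C11 atomics instead of ldarx/stdcx. and a hypothetical struct layout: the lock lives in its own object, so accesses to the guarded word never touch the same memory as the contended lock.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: lock and guarded data no longer share one word. */
struct idle_state_sketch {
    atomic_flag lock;  /* the bit-lock, now a separate field */
    uint64_t    state; /* bit fields guarded by the lock */
};

static void idle_state_lock(struct idle_state_sketch *s)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;
}

static void idle_state_unlock(struct idle_state_sketch *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Set bits in the guarded word and return the new value. */
static uint64_t idle_state_set_bits(struct idle_state_sketch *s, uint64_t bits)
{
    uint64_t value;

    idle_state_lock(s);
    s->state |= bits;
    value = s->state;
    idle_state_unlock(s);
    return value;
}
```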

Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230510033117.1395895-7-rmclure@linux.ibm.com
2023-06-21 15:13:57 +10:00
Rohan McLure
be286b8637 powerpc: Mark [h]srr_valid accesses in check_return_regs_valid
Checks to see if the [H]SRR registers have been clobbered by (soft)
NMI interrupts imply the possibility for a data race on the
[h]srr_valid entries in the PACA. Annotate accesses to these fields with
READ_ONCE, removing the need for the barrier.

The diagnostic can use plain-access reads and writes, but annotate with
data_race.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230510033117.1395895-5-rmclure@linux.ibm.com
2023-06-21 15:13:57 +10:00
Rohan McLure
6f3136326e powerpc: qspinlock: Enforce qnode writes prior to publishing to queue
Annotate the release barrier and memory clobber (in effect, producing a
compiler barrier) in the publish_tail_cpu call. These barriers have the
effect of ensuring that qnode attributes are all written to prior to
publishing the node to the waitqueue.

Even while the initial write to the 'locked' attribute is guaranteed to
terminate prior to the node being visible, KCSAN still complains that
the write is reorderable by the compiler. Issue a kcsan_release() to
inform KCSAN of the release barrier contained in publish_tail_cpu().
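
The ordering requirement can be sketched with C11 atomics. This is an analogy, not the powerpc implementation: the struct, the `tail` variable, and `publish_tail()` are hypothetical, and the initial field values are assumptions.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical queue node; field names loosely follow the commit message. */
struct qnode_sketch {
    int locked;
    struct qnode_sketch *next;
};

static _Atomic(struct qnode_sketch *) tail;

/* Initialise the node, then publish it; returns the previous tail. */
static struct qnode_sketch *publish_tail(struct qnode_sketch *node)
{
    node->locked = 0;
    node->next = NULL;
    /*
     * Release ordering: the plain writes above are ordered before the
     * node becomes reachable through 'tail'.
     */
    return atomic_exchange_explicit(&tail, node, memory_order_release);
}
```

A reader that loads `tail` with acquire ordering would then see a fully initialised node.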

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230510033117.1395895-3-rmclure@linux.ibm.com
2023-06-21 15:13:57 +10:00
Rohan McLure
03d44ee80e powerpc: qspinlock: Mark accesses to qnode lock checks
The powerpc implementation of qspinlocks will both poll and spin on the
bitlock guarding a qnode. Mark these accesses with READ_ONCE to convey
to KCSAN that polling is intentional here.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230510033117.1395895-2-rmclure@linux.ibm.com
2023-06-21 15:13:57 +10:00
Joel Stanley
98e61df570 powerpc/powernv/pci: Remove last IODA1 defines
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230613045202.294451-4-joel@jms.id.au
2023-06-21 15:13:57 +10:00
Joel Stanley
326b3f8c6e powerpc/powernv/pci: Remove MVE code
With IODA1 support gone the OPAL calls to set MVE are dead code. Remove
them.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230613045202.294451-3-joel@jms.id.au
2023-06-21 15:13:57 +10:00
Joel Stanley
5ac129cdb5 powerpc/powernv/pci: Remove ioda1 support
The final "VPL" Power7 boxes that were used for powernv bringup have
been scrapped, meaning there are no machines with ioda1 left.

This patch removes the obvious unused code.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230613045202.294451-2-joel@jms.id.au
2023-06-21 15:13:56 +10:00
Rob Herring
d65305bfa6 powerpc: 52xx: Make immr_id DT match tables static
In some builds, the mpc52xx_pm_prepare()/lite5200_pm_prepare() functions
generate stack size warnings. The addition of 'struct resource' in commit
2500763dd3 ("powerpc: Use of_address_to_resource()") grew the stack size
and is blamed for the warnings. However, the real issue is there's no
reason the 'struct of_device_id immr_ids' DT match tables need to be on
the stack as they are constant. Declare them as static to move them off
the stack.
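
A small sketch of the pattern, using a trimmed-down hypothetical struct in place of struct of_device_id and illustrative compatible strings: declaring the constant table `static const` gives it one copy in read-only data instead of rebuilding it in the stack frame on each call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct of_device_id. */
struct match_entry {
    const char *compatible;
};

static int is_immr_node(const char *compatible)
{
    /* static const: lives in read-only data, takes no stack space. */
    static const struct match_entry immr_ids[] = {
        { "fsl,mpc5200-immr" },  /* illustrative entries */
        { "fsl,mpc5200b-immr" },
        { NULL }                 /* sentinel terminates the table */
    };

    for (const struct match_entry *m = immr_ids; m->compatible; m++) {
        if (strcmp(m->compatible, compatible) == 0)
            return 1;
    }
    return 0;
}
```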

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202306130405.uTv5yOZD-lkp@intel.com/
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230614171724.2403982-1-robh@kernel.org
2023-06-21 15:13:56 +10:00
Rob Herring
ef8e341075 powerpc: mpc512x: Remove open coded "ranges" parsing
"ranges" is a standard property, and we have common helper functions
for parsing it, so let's use the for_each_of_range() iterator.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230609183232.1767050-1-robh@kernel.org
2023-06-21 15:13:56 +10:00
Rob Herring
be0f9ca024 powerpc: fsl_soc: Use of_range_to_resource() for "ranges" parsing
"ranges" is a standard property with common parsing functions. Users
shouldn't be implementing their own parsing of it. Refactor the FSL RapidIO
"ranges" parsing to use of_range_to_resource() instead.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230609183238.1767186-1-robh@kernel.org
2023-06-21 15:13:56 +10:00
Rob Herring
f892ac774b powerpc: fsl: Use of_property_read_reg() to parse "reg"
Use the recently added of_property_read_reg() helper to get the
untranslated "reg" address value.

Signed-off-by: Rob Herring <robh@kernel.org>
[mpe: Add required include of of_address.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230609183151.1766261-1-robh@kernel.org
2023-06-21 15:13:46 +10:00
Rob Herring
c4ae1799a5 powerpc: fsl_rio: Use of_range_to_resource() for "ranges" parsing
"ranges" is a standard property with common parsing functions. Users
shouldn't be implementing their own parsing of it. Refactor the FSL RapidIO
"ranges" parsing to use of_range_to_resource() instead.

One change is the original code would look for "#size-cells" and
"#address-cells" in the parent node if not found in the port child
nodes. That is non-standard behavior and not necessary AFAICT. In 2011
in commit 54986964c1 ("powerpc/85xx: Update SRIO device tree nodes")
there was an ABI break. The upstream .dts files have been correct since
at least that point.

Signed-off-by: Rob Herring <robh@kernel.org>
[mpe: Remove now unused "cell" variable]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230609183244.1767325-1-robh@kernel.org
2023-06-21 15:10:38 +10:00
Rob Herring
6f3bdbbeaf macintosh: Use of_property_read_reg() to parse "reg"
Use the recently added of_property_read_reg() helper to get the
untranslated "reg" address value.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230609182926.1763589-1-robh@kernel.org
2023-06-21 14:08:54 +10:00
Rob Herring
93cfa6fb9f macintosh: Use of_address_to_resource()
Replace open coded reading of "reg" and of_translate_address() calls with
single call to of_address_to_resource().

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230319163226.226583-1-robh@kernel.org
2023-06-21 14:08:54 +10:00
Rob Herring
bc1cf75027 powerpc: powermac: Use of_get_cpu_hwid() to read CPU node 'reg'
Replace open coded reading of CPU nodes' "reg" properties with
of_get_cpu_hwid() dedicated for this purpose.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230319145931.65499-1-robh@kernel.org
2023-06-21 14:08:54 +10:00
Paul Gortmaker
b751ed04bc powerpc: drop MPC85xx_CDS platform support
The MPC8541/8548/8555 Configurable Development System (CDS) were the
vehicle used to provide evaluation of the 1st e500-v2 CPUs around 2007.

Similar to the earlier MPC83xx-MDS systems we removed, the "brains"
exist on a PCI-X card, but additional connectors exist to the right of
the PCI-X slot, two structural metal pins are used to provide stability
in a vertical ATX mounting, and the CPU is now on a daughter-card vs. a
clamped down BGA.

Given the extra complexity and risk of connector damage, the 8548CDS
I had access to came pre-assembled in a basic white Antec case common
for that era, and I'm inclined to assume that was the default.

Power was typical "Pentium4" 2005 ATX - the main 20 pin connector went
to the PCI ATX form factor backplane, and the 4 pin black/yellow went
to the CPU card.

Like previous evaluation boards, they attempted to provide break-out
connectors for as many features as possible, and that made for a fairly
complex looking system.

In any case, these are over 15 years old, and fairly complex systems,
originally made for a small group of industry related people, and made
for use where quiet fan operation wasn't important.  Given that, it
makes sense to remove support from them in 2023.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230620043300.197546-3-paul.gortmaker@windriver.com
2023-06-21 14:08:53 +10:00
Paul Gortmaker
384e338a91 powerpc: drop MPC8540_ADS and MPC8560_ADS platform support
Based on the revision history in the manual(s), these e500-v1
platforms were first available around 2002.

Like a lot of evaluation boards, they attempted to provide break-out
connectors for all possible features, and that combined with four
PCI-X slots (and the age/era) meant for a considerably large board.

As I recall it, from a Linux point of view, the biggest difference
between 8540 and 8560 was in the UART implementation, and that is
reflected in a diff of the defconfigs.

In any case, these are over 20 years old, and by today's standards
only have a small amount of DDR1 memory, and were not widely available.

Given that, it makes sense to remove support from them in 2023.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230620043300.197546-2-paul.gortmaker@windriver.com
2023-06-21 14:08:53 +10:00
Nayna Jain
e66effaf61 security/integrity: fix pointer to ESL data and its size on pseries
On PowerVM guest, variable data is prefixed with 8 bytes of timestamp.
Extract ESL by stripping off the timestamp before passing to ESL parser.

Fixes: 4b3e71e9a3 ("integrity/powerpc: Support loading keys from PLPKS")
Cc: stable@vger.kernel.org # v6.3
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230608120444.382527-1-nayna@linux.ibm.com
2023-06-21 14:08:53 +10:00
Aneesh Kumar K.V
c8eebc4a99 powerpc/mm/dax: Fix the condition when checking if altmap vmemmap can cross-boundary
Without this fix, the last subsection vmemmap can end up in memory even if
the namespace is created with -M mem and has sufficient space in the altmap
area.

Fixes: cf387d9644 ("libnvdimm/altmap: Track namespace boundaries in altmap")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616110826.344417-6-aneesh.kumar@linux.ibm.com
2023-06-21 14:08:53 +10:00
Aneesh Kumar K.V
d933557b85 powerpc/book3s64/mm: Use PAGE_KERNEL instead of opencoding
No functional change in this patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616110826.344417-5-aneesh.kumar@linux.ibm.com
2023-06-21 14:08:53 +10:00
Aneesh Kumar K.V
0da90af431 powerpc/book3s64/mm: Fix DirectMap stats in /proc/meminfo
On memory unplug reduce DirectMap page count correctly.
root@ubuntu-guest:# grep Direct /proc/meminfo
DirectMap4k:           0 kB
DirectMap64k:           0 kB
DirectMap2M:    115343360 kB
DirectMap1G:           0 kB

Before fix:
root@ubuntu-guest:# ndctl disable-namespace all
disabled 1 namespace
root@ubuntu-guest:# grep Direct /proc/meminfo
DirectMap4k:           0 kB
DirectMap64k:           0 kB
DirectMap2M:    115343360 kB
DirectMap1G:           0 kB

After fix:
root@ubuntu-guest:# ndctl disable-namespace all
disabled 1 namespace
root@ubuntu-guest:# grep Direct /proc/meminfo
DirectMap4k:           0 kB
DirectMap64k:           0 kB
DirectMap2M:    104857600 kB
DirectMap1G:           0 kB

Fixes: a2dc009afa ("powerpc/mm/book3s/radix: Add mapping statistics")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616110826.344417-4-aneesh.kumar@linux.ibm.com
2023-06-21 14:08:53 +10:00
Aneesh Kumar K.V
040ec6202b powerpc/mm/book3s64: Use pmdp_ptep helper instead of typecasting.
No functional change in this patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616110826.344417-2-aneesh.kumar@linux.ibm.com
2023-06-20 21:50:57 +10:00
Aditya Gupta
b684c09f09 powerpc: update ppc_save_regs to save current r1 in pt_regs
ppc_save_regs() skips one stack frame while saving the CPU register states.
Instead of saving current R1, it pulls the previous stack frame pointer.

When vmcores caused by a direct panic call (such as `echo c >
/proc/sysrq-trigger`) are debugged with gdb, gdb fails to show the
backtrace correctly. On further analysis, it was found that this was because
of a mismatch between r1 and NIP.

GDB uses NIP to get current function symbol and uses corresponding debug
info of that function to unwind previous frames, but due to the
mismatching r1 and NIP, the unwinding does not work, and it fails to
unwind to the 2nd frame and hence does not show the backtrace.

GDB backtrace with vmcore of kernel without this patch:

---------
(gdb) bt
 #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>,
    newregs=0xc000000004f8f8d8) at ./arch/powerpc/include/asm/kexec.h:69
 #1  __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974
 #2  0x0000000000000063 in ?? ()
 #3  0xc000000003579320 in ?? ()
---------

Further analysis revealed that the mismatch occurred because
"ppc_save_regs" was saving the previous stack's SP instead of the current
r1. This patch fixes this by storing current r1 in the saved pt_regs.

GDB backtrace with vmcore of patched kernel:

--------
(gdb) bt
 #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=0x0, newregs=0xc00000000670b8d8)
    at ./arch/powerpc/include/asm/kexec.h:69
 #1  __crash_kexec (regs=regs@entry=0x0) at kernel/kexec_core.c:974
 #2  0xc000000000168918 in panic (fmt=fmt@entry=0xc000000001654a60 "sysrq triggered crash\n")
    at kernel/panic.c:358
 #3  0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155
 #4  0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false)
    at drivers/tty/sysrq.c:602
 #5  0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>,
    count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163
 #6  0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>,
    buf=<optimized out>, file=<optimized out>, pde=0xc00000000362cb40) at fs/proc/inode.c:340
 #7  proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>,
    ppos=<optimized out>) at fs/proc/inode.c:352
 #8  0xc0000000005b3bbc in vfs_write (file=file@entry=0xc000000006aa6b00,
    buf=buf@entry=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>,
    count=count@entry=2, pos=pos@entry=0xc00000000670bda0) at fs/read_write.c:582
 #9  0xc0000000005b4264 in ksys_write (fd=<optimized out>,
    buf=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=2)
    at fs/read_write.c:637
 #10 0xc00000000002ea2c in system_call_exception (regs=0xc00000000670be80, r0=<optimized out>)
    at arch/powerpc/kernel/syscall.c:171
 #11 0xc00000000000c270 in system_call_vectored_common ()
    at arch/powerpc/kernel/interrupt_64.S:192
--------

Nick adds:
  So this now saves regs as though it was an interrupt taken in the
  caller, at the instruction after the call to ppc_save_regs, whereas
  previously the NIP was there, but R1 came from the caller's caller and
  that mismatch is what causes gdb's dwarf unwinder to go haywire.

Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Fixes: d16a58f885 ("powerpc: Improve ppc_save_regs()")
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230615091047.90433-1-adityag@linux.ibm.com
2023-06-19 17:37:14 +10:00
Naveen N Rao
d24da1f855 powerpc/ftrace: Disable ftrace on ppc32 if using clang
Ftrace on ppc32 expects a three instruction sequence at the beginning of
each function when specifying -pg:
	mflr	r0
	stw	r0,4(r1)
	bl	_mcount

This is the case with all supported versions of gcc. Clang however emits
a branch to _mcount after the function prologue, similar to the pre
-mprofile-kernel ABI on ppc64. This is not supported.

Disable ftrace on ppc32 if using clang for now. This can be re-enabled
later if clang picks up support for -fpatchable-function-entry on ppc32.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/llvm/llvm-project/issues/63220
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230609034501.407971-1-naveen@kernel.org
2023-06-19 17:37:13 +10:00
Colin Ian King
f4f913c980 powerpc/powernv/sriov: perform null check on iov before dereferencing iov
Currently pointer iov is being dereferenced before the null check of iov
which can lead to null pointer dereference errors. Fix this by moving the
iov null check before the dereferencing.
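
The reordering amounts to the following pattern. The struct and function here are hypothetical reductions of the real code; the -1 return value is an assumption chosen for the sketch.

```c
#include <stddef.h>

/* Hypothetical reduction of the real structure to the one field used here. */
struct iov_sketch {
    int num_vfs;
};

static int get_num_vfs(const struct iov_sketch *iov)
{
    int num_vfs;

    if (!iov)
        return -1; /* check first ... */

    num_vfs = iov->num_vfs; /* ... dereference only afterwards */
    return num_vfs;
}
```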

Detected using cppcheck static analysis:
linux/arch/powerpc/platforms/powernv/pci-sriov.c:597:12: warning: Either
the condition '!iov' is redundant or there is possible null pointer
dereference: iov. [nullPointerRedundantCheck]
 num_vfs = iov->num_vfs;
           ^

Fixes: 052da31d45 ("powerpc/powernv/sriov: De-indent setup and teardown")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230608095849.1147969-1-colin.i.king@gmail.com
2023-06-19 17:37:13 +10:00
Benjamin Gray
a16e472c35 selftests/powerpc/dexcr: Add DEXCR status utility lsdexcr
Add a utility 'lsdexcr' to print the current DEXCR status. Useful for
quickly checking the status such as when debugging test failures or
verifying the new default DEXCR does what you want (for userspace at
least). Example output:

    # ./lsdexcr
       uDEXCR: 04000000 (NPHIE)
       HDEXCR: 00000000
    Effective: 04000000 (NPHIE)

            SBHE   (0): clear  	(Speculative branch hint enable)
          IBRTPD   (3): clear  	(Indirect branch recurrent target ...)
           SRAPD   (4): clear  	(Subroutine return address ...)
           NPHIE * (5): set  	(Non-privileged hash instruction enable)
            PHIE   (6): clear  	(Privileged hash instruction enable)

    DEXCR[NPHIE] enabled: hashst/hashchk working

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616034846.311705-12-bgray@linux.ibm.com
2023-06-19 17:36:28 +10:00
Benjamin Gray
bdb07f35a5 selftests/powerpc/dexcr: Add hashst/hashchk test
Test the kernel DEXCR[NPHIE] interface and hashchk exception handling.

Introduces with it a DEXCR utils library for common DEXCR operations.

Volatile is used to prevent the compiler optimising away the signal
tests.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616034846.311705-11-bgray@linux.ibm.com
2023-06-19 17:36:28 +10:00
Benjamin Gray
b9125c9aa0 selftests/powerpc: Add more utility macros
Adds _MSG assertion variants to provide more context behind why a
failure occurred. Also include unistd.h for _exit() and stdio.h for
fprintf(), and move ARRAY_SIZE macro to utils.h.

The _MSG variants and ARRAY_SIZE will be used by the following
DEXCR selftests.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616034846.311705-10-bgray@linux.ibm.com
2023-06-19 17:36:27 +10:00
Benjamin Gray
65d6c884bf Documentation: Document PowerPC kernel DEXCR interface
Describe the DEXCR and document how to configure it.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616034846.311705-9-bgray@linux.ibm.com
2023-06-19 17:36:27 +10:00
Benjamin Gray
97228ca375 powerpc/ptrace: Expose HASHKEYR register to ptrace
The HASHKEYR register contains a secret per-process key to enable unique
hashes per process. In general it should not be exposed to userspace
at all and a regular process has no need to know its key.

However, checkpoint restore in userspace (CRIU) functionality requires
that a process be able to set the HASHKEYR of another process, otherwise
existing hashes on the stack would be invalidated by a new random key.

Exposing HASHKEYR in this way also makes it appear in core dumps, which
is a security concern. Multiple threads may share a key, for example
just after a fork() call, where the kernel cannot know if the child is
going to return back along the parent's stack. If such a thread is
coerced into making a core dump, then the HASHKEYR value will be
readable and able to be used against all other threads sharing that key,
effectively undoing any protection offered by hashst/hashchk.

Therefore we expose HASHKEYR to ptrace when CONFIG_CHECKPOINT_RESTORE is
enabled, providing a choice of increased security or migratable ROP
protected processes. This is similar to how ARM exposes its PAC keys.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616034846.311705-8-bgray@linux.ibm.com
2023-06-19 17:36:27 +10:00
Benjamin Gray
884ad5c52d powerpc/ptrace: Expose DEXCR and HDEXCR registers to ptrace
The DEXCR register is of interest when ptracing processes. Currently it
is static, but eventually will be dynamically controllable by a process.
If a process can control its own, then it is useful for it to be
ptrace-able too (e.g., for checkpoint-restore functionality).

It is also relevant to core dumps (the NPHIE aspect in particular),
which use the ptrace mechanism (or is it the other way around?) to
decide what to dump. The HDEXCR is useful here too, as the NPHIE aspect
may be set in the HDEXCR without being set in the DEXCR. Although the
HDEXCR is per-cpu and we don't track it in the task struct (it's useless
in normal operation), it would be difficult to imagine why a hypervisor
would set it to different values within a guest. A hypervisor cannot
safely set NPHIE differently at least, as that would break programs.

Expose a read-only view of the userspace DEXCR and HDEXCR to ptrace.
The HDEXCR is always read-only, and is useful for diagnosing the core
dumps (as the HDEXCR may set NPHIE without the DEXCR setting it).

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
[mpe: Use lower_32_bits() rather than open coding]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616034846.311705-7-bgray@linux.ibm.com
2023-06-19 17:36:26 +10:00
Benjamin Gray
be98fcf7c1 powerpc/dexcr: Support userspace ROP protection
The ISA 3.1B hashst and hashchk instructions use a per-cpu SPR HASHKEYR
to hold a key used in the hash calculation. This key should be different
for each process to make it harder for a malicious process to recreate
valid hash values for a victim process.

Add support for storing a per-thread hash key, and setting/clearing
HASHKEYR appropriately.
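
Conceptually, the per-thread key is loaded into the per-CPU register when a thread is switched in. The sketch below models the SPR as a plain variable; the struct, the variable, and `switch_to_sketch()` are hypothetical and stand in for the real context-switch path.

```c
#include <stdint.h>

/* Stand-in for the per-CPU HASHKEYR SPR; real code would write the SPR. */
static uint64_t hashkeyr_spr;

/* Hypothetical per-thread state holding that thread's hash key. */
struct thread_sketch {
    uint64_t hashkeyr;
};

/* On switching threads, install the incoming thread's key on this CPU. */
static void switch_to_sketch(const struct thread_sketch *next)
{
    hashkeyr_spr = next->hashkeyr;
}
```

Because the register is per-CPU while the key is per-thread, a thread keeps producing consistent hashes no matter which CPU it runs on.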

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616034846.311705-6-bgray@linux.ibm.com
2023-06-19 17:36:26 +10:00
Benjamin Gray
5bcba4e6c1 powerpc/dexcr: Handle hashchk exception
Recognise and pass the appropriate signal to the user program when a
hashchk instruction triggers. This is independent of allowing
configuration of DEXCR[NPHIE], as a hypervisor can enforce this aspect
regardless of the kernel.

The signal mirrors how ARM reports their similar check failure. For
example, their FPAC handler in arch/arm64/kernel/traps.c do_el0_fpac()
does this. When we fail to read the instruction that caused the fault
we send a segfault, similar to how emulate_math() does it.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616034846.311705-5-bgray@linux.ibm.com
2023-06-19 17:36:26 +10:00
Benjamin Gray
0ffd60b782 powerpc/dexcr: Add initial Dynamic Execution Control Register (DEXCR) support
ISA 3.1B introduces the Dynamic Execution Control Register (DEXCR). It
is a per-cpu register that allows control over various CPU behaviours
including branch hint usage, indirect branch speculation, and
hashst/hashchk support.

Add some definitions and basic support for the DEXCR in the kernel.
Right now it just

  * Initialises the DEXCR and HASHKEYR to a fixed value when a CPU
    onlines.
  * Clears them in reset_sprs().
  * Detects when the NPHIE aspect is supported (the others don't get
    looked at in this series, so there's no need to waste a CPU_FTR
    on them).

We initialise the HASHKEYR to ensure that all cores have the same key,
so an HV enforced NPHIE + swapping cores doesn't randomly crash a
process using hash instructions. The stores to HASHKEYR are
unconditional because the ISA makes no mention of the SPR being missing
if support for doing the hashes isn't present. So all that would happen
is the HASHKEYR value gets ignored. This helps slightly if NPHIE
detection fails; e.g., we currently only detect it on pseries.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
[mpe: Use simple values for DEXCR constants]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616034846.311705-4-bgray@linux.ibm.com
2023-06-19 17:36:25 +10:00
Benjamin Gray
81e30a5412 powerpc/ptrace: Add missing <linux/regset.h> include
ptrace-decl.h uses user_regset_get2_fn (among other things) from
regset.h. While all current users of ptrace-decl.h include regset.h
before it anyway, it adds an implicit ordering dependency and breaks
source tooling that tries to inspect ptrace-decl.h by itself.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616034846.311705-3-bgray@linux.ibm.com
2023-06-19 17:36:25 +10:00
Benjamin Gray
7eec97b32e powerpc/book3s: Add missing <linux/sched.h> include
The functions here use struct task_struct fields, so need to import
the full definition from <linux/sched.h>. The <asm/current.h> header
that defines current only forward declares struct task_struct.

Failing to include this <linux/sched.h> header leads to a compilation
error when a translation unit does not also include <linux/sched.h>
indirectly.
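
The distinction between a forward declaration and a full definition can
be shown with a small standalone C sketch. struct task_model and both
helpers are hypothetical stand-ins, not kernel code.

```c
#include <stddef.h>

/* Forward declaration only: enough to pass pointers around. */
struct task_model;

/* Compiles with the incomplete type because no field is accessed. */
static const struct task_model *pass_through(const struct task_model *t)
{
	return t;
}

/* Full definition: from here on, field access is allowed. */
struct task_model {
	int pid;
};

/* Would fail to compile if placed above the full definition. */
static int task_model_pid(const struct task_model *t)
{
	return t ? t->pid : -1;
}
```

Moving task_model_pid() above the struct definition reproduces the kind
of error described: the compiler rejects member access on an incomplete
type.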

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230616034846.311705-2-bgray@linux.ibm.com
2023-06-19 17:36:25 +10:00
Nicholas Piggin
8ad57add77 powerpc/build: vdso linker warning for orphan sections
Add --orphan-handling for vdsos, and adjust vdso linker scripts to deal
with orphan sections.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230609051002.3342-1-npiggin@gmail.com
2023-06-15 14:04:19 +10:00
Nicholas Piggin
b4bda59b47 powerpc/64s: Fix VAS mm use after free
The refcount on mm is dropped before the coprocessor is detached.

Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Fixes: 7bc6f71bdf ("powerpc/vas: Define and use common vas_window struct")
Fixes: b22f2d88e4 ("powerpc/pseries/vas: Integrate API with open/close windows")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230607101024.14559-1-npiggin@gmail.com
2023-06-15 14:04:19 +10:00
Nicholas Piggin
27be245633 powerpc/64: Rename entry_64.S to prom_entry_64.S
This file contains only the enter_prom implementation now.
Trim includes and update header comment while we're here.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230606132447.315714-7-npiggin@gmail.com
2023-06-15 14:04:19 +10:00
Nicholas Piggin
afc6386815 powerpc: merge 32-bit and 64-bit _switch implementation
The _switch stack frame setup is substantially the same, as are the
comments. The differences in how the stack and current are switched,
and in how other hardware and software housekeeping is done, are moved
into macros.

Generated code should be unchanged.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Tweak include order to fix compile errors on some configs]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230606132447.315714-6-npiggin@gmail.com
2023-06-15 14:03:55 +10:00
Nicholas Piggin
6958ad05d5 powerpc/32: Rearrange _switch to prepare for 32/64 merge
Change the order of some operations and change some register numbers in
preparation to merge 32-bit and 64-bit switch.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230606132447.315714-5-npiggin@gmail.com
2023-06-14 12:46:42 +10:00
Nicholas Piggin
fc8562c9b6 powerpc/32: Remove sync from _switch
64-bit has removed the sync from _switch since commit 9145effd62
("powerpc/64: Drop explicit hwsync in context switch"). The same
logic there should apply to 32-bit. Remove the sync and replace with
a placeholder comment (32 and 64 will be merged with a later change).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230606132447.315714-4-npiggin@gmail.com
2023-06-14 12:46:42 +10:00
Nicholas Piggin
0eb8088b5a powerpc/64: Rearrange 64-bit _switch to prepare for 32/64 merge
Move some 64-bit specifics out from the function epilogue and rearrange
this to be a bit neater, use 32-bit mem ops for CR save/restore, and
change some register numbers.

This is preparation to consolidate 32-bit and 64-bit switch code.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230606132447.315714-3-npiggin@gmail.com
2023-06-14 12:46:42 +10:00
Nicholas Piggin
d6b87c3eb6 powerpc/64s: move stack SLB pinning out of line from _switch
The large hunk of SLB pinning in _switch asm code makes it more
difficult to see everything else that's going on. It is a less important
path now, so icache and fetch footprint overhead can be avoided.

Move context switch stack SLB pinning out of line.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230606132447.315714-2-npiggin@gmail.com
2023-06-14 12:46:42 +10:00