Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git (synced 2026-05-08 14:02:37 -04:00)
Merge branch 'virtio-mem' into features

David Hildenbrand says:

====================
virtio-mem: s390 support

Let's finally add s390 support for virtio-mem; my last RFC was sent
4 years ago, and a lot changed in the meantime.

The latest QEMU series is available at [1], which contains some more
details and a usage example on s390 (last patch).

There is not too much in here: The biggest part is querying a new diag(500)
STORAGE_LIMIT hypercall to obtain the proper "max_physmem_end".

The last three patches are not strictly required but certainly nice-to-have.

Note that -- in contrast to standby memory -- virtio-mem memory must be
configured to be automatically onlined as soon as hotplugged. The easiest
approach is using the "memhp_default_state=" kernel parameter or by using
proper udev rules. More details can be found at [2].

I have reviving+upstreaming a systemd service to handle configuring
that on my todo list, but for some reason I keep getting distracted ...

I tested various things, including:
 * Various memory hotplug/hotunplug combinations
 * Device hotplug/hotunplug
 * /proc/iomem output
 * reboot
 * kexec
 * kdump: make sure we properly enter the "kdump mode" in the virtio-mem
   driver

kdump support for virtio-mem memory on s390 will be sent out separately.

v2 -> v3
* "s390/kdump: make is_kdump_kernel() consistently return "true" in kdump
   environments only"
 -> Sent out separately [3]
* "s390/physmem_info: query diag500(STORAGE LIMIT) to support QEMU/KVM memory
   devices"
 -> No query function for diag500 for now.
 -> Update comment above setup_ident_map_size().
 -> Optimize/rewrite diag500_storage_limit() [Heiko]
 -> Change handling in detect_physmem_online_ranges [Alexander]
 -> Improve documentation.
* "s390/sparsemem: provide memory_add_physaddr_to_nid() with CONFIG_NUMA"
 -> Added after testing on systems with CONFIG_NUMA=y

v1 -> v2:
* Document the new diag500 subfunction
* Use "s390" instead of "s390x" consistently

[1] https://lkml.kernel.org/r/20241008105455.2302628-1-david@redhat.com
[2] https://virtio-mem.gitlab.io/user-guide/user-guide-linux.html
[3] https://lkml.kernel.org/r/20241023090651.1115507-1-david@redhat.com
====================

Link: https://lore.kernel.org/r/20241025141453.1210600-1-david@redhat.com/
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

@@ -35,20 +35,24 @@ DIAGNOSE function codes not specific to KVM, please refer to the
 documentation for the s390 hypervisors defining them.
 
 
-DIAGNOSE function code 'X'500' - KVM virtio functions
------------------------------------------------------
+DIAGNOSE function code 'X'500' - KVM functions
+----------------------------------------------
 
-If the function code specifies 0x500, various virtio-related functions
-are performed.
+If the function code specifies 0x500, various KVM-specific functions
+are performed, including virtio functions.
 
-General register 1 contains the virtio subfunction code. Supported
-virtio subfunctions depend on KVM's userspace. Generally, userspace
-provides either s390-virtio (subcodes 0-2) or virtio-ccw (subcode 3).
+General register 1 contains the subfunction code. Supported subfunctions
+depend on KVM's userspace. Regarding virtio subfunctions, generally
+userspace provides either s390-virtio (subcodes 0-2) or virtio-ccw
+(subcode 3).
 
 Upon completion of the DIAGNOSE instruction, general register 2 contains
 the function's return code, which is either a return code or a subcode
 specific value.
 
+If the specified subfunction is not supported, a SPECIFICATION exception
+will be triggered.
+
 Subcode 0 - s390-virtio notification and early console printk
     Handled by userspace.
 
@@ -76,6 +80,23 @@ Subcode 3 - virtio-ccw notification
 
     See also the virtio standard for a discussion of this hypercall.
 
+Subcode 4 - storage-limit
+    Handled by userspace.
+
+    After completion of the DIAGNOSE call, general register 2 will
+    contain the storage limit: the maximum physical address that might be
+    used for storage throughout the lifetime of the VM.
+
+    The storage limit does not indicate currently usable storage, it may
+    include holes, standby storage and areas reserved for other means, such
+    as memory hotplug or virtio-mem devices. Other interfaces for detecting
+    actually usable storage, such as SCLP, must be used in conjunction with
+    this subfunction.
+
+    Note that the storage limit can be larger, but never smaller than the
+    maximum storage address indicated by SCLP via the "maximum storage
+    increment" and the "increment size".
+
 
 DIAGNOSE function code 'X'501 - KVM breakpoint
 ----------------------------------------------
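
The rules stated for subcode 4 can be sketched in portable C. This is a minimal user-space illustration, not kernel code. The helper names (`sclp_max_storage_end`, `storage_limit_to_end`, `storage_limit_is_plausible`) are hypothetical. It also assumes that the SCLP maximum is the product of the "maximum storage increment" and the "increment size" in bytes. Only two rules come from the documentation hunk above: the limit is an inclusive address, and it is never smaller than the SCLP maximum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: exclusive end of the storage described by SCLP,
 * assuming it is "maximum storage increment" times "increment size" (bytes).
 */
static uint64_t sclp_max_storage_end(uint64_t max_increment, uint64_t increment_size)
{
	return max_increment * increment_size;
}

/* The storage limit is an inclusive address; convert it to an exclusive end. */
static uint64_t storage_limit_to_end(uint64_t storage_limit)
{
	assert(storage_limit != UINT64_MAX); /* would wrap around to 0 */
	return storage_limit + 1;
}

/*
 * Sanity check modelled on the documented rule: the limit may be larger,
 * but never smaller, than the maximum storage address indicated by SCLP.
 * A limit of 0 is treated as "no limit reported" in this sketch.
 */
static bool storage_limit_is_plausible(uint64_t storage_limit, uint64_t sclp_end)
{
	if (storage_limit == 0)
		return false;
	return sclp_end == 0 || storage_limit >= sclp_end - 1;
}
```

With 16 increments of 256 MiB, the SCLP end is 4 GiB. A storage limit of 8 GiB - 1 passes the check and converts to an exclusive end of 8 GiB, while a limit of 2 GiB - 1 fails it.
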
@@ -109,6 +109,42 @@ static int diag260(void)
 	return 0;
 }
 
+#define DIAG500_SC_STOR_LIMIT 4
+
+static int diag500_storage_limit(unsigned long *max_physmem_end)
+{
+	unsigned long storage_limit;
+	unsigned long reg1, reg2;
+	psw_t old;
+
+	asm volatile(
+		" mvc 0(16,%[psw_old]),0(%[psw_pgm])\n"
+		" epsw %[reg1],%[reg2]\n"
+		" st %[reg1],0(%[psw_pgm])\n"
+		" st %[reg2],4(%[psw_pgm])\n"
+		" larl %[reg1],1f\n"
+		" stg %[reg1],8(%[psw_pgm])\n"
+		" lghi 1,%[subcode]\n"
+		" lghi 2,0\n"
+		" diag 2,4,0x500\n"
+		"1: mvc 0(16,%[psw_pgm]),0(%[psw_old])\n"
+		" lgr %[slimit],2\n"
+		: [reg1] "=&d" (reg1),
+		  [reg2] "=&a" (reg2),
+		  [slimit] "=d" (storage_limit),
+		  "=Q" (get_lowcore()->program_new_psw),
+		  "=Q" (old)
+		: [psw_old] "a" (&old),
+		  [psw_pgm] "a" (&get_lowcore()->program_new_psw),
+		  [subcode] "i" (DIAG500_SC_STOR_LIMIT)
+		: "memory", "1", "2");
+	if (!storage_limit)
+		return -EINVAL;
+	/* Convert inclusive end to exclusive end */
+	*max_physmem_end = storage_limit + 1;
+	return 0;
+}
+
 static int tprot(unsigned long addr)
 {
 	unsigned long reg1, reg2;
@@ -157,7 +193,9 @@ unsigned long detect_max_physmem_end(void)
 {
 	unsigned long max_physmem_end = 0;
 
-	if (!sclp_early_get_memsize(&max_physmem_end)) {
+	if (!diag500_storage_limit(&max_physmem_end)) {
+		physmem_info.info_source = MEM_DETECT_DIAG500_STOR_LIMIT;
+	} else if (!sclp_early_get_memsize(&max_physmem_end)) {
 		physmem_info.info_source = MEM_DETECT_SCLP_READ_INFO;
 	} else {
 		max_physmem_end = search_mem_end();
@@ -170,6 +208,13 @@ void detect_physmem_online_ranges(unsigned long max_physmem_end)
 {
 	if (!sclp_early_read_storage_info()) {
 		physmem_info.info_source = MEM_DETECT_SCLP_STOR_INFO;
+	} else if (physmem_info.info_source == MEM_DETECT_DIAG500_STOR_LIMIT) {
+		unsigned long online_end;
+
+		if (!sclp_early_get_memsize(&online_end)) {
+			physmem_info.info_source = MEM_DETECT_SCLP_READ_INFO;
+			add_physmem_online_range(0, online_end);
+		}
 	} else if (!diag260()) {
 		physmem_info.info_source = MEM_DETECT_DIAG260;
 	} else if (max_physmem_end) {
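
The detection order introduced in detect_max_physmem_end() (diag500 storage limit first, then SCLP, then a search) is a "first successful probe wins" chain. The sketch below models that pattern in user-space C. It is a rough illustration only: the `probe` structure, `detect_end()` and the stand-in probe functions are hypothetical and do not exist in the kernel, and the real code calls its helpers directly rather than through a table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's enum physmem_info_source. */
enum info_source {
	SRC_NONE = 0,
	SRC_DIAG500_STOR_LIMIT,
	SRC_SCLP_READ_INFO,
	SRC_BIN_SEARCH,
};

/* A probe returns 0 and fills *end on success, nonzero if unavailable. */
typedef int (*probe_fn)(uint64_t *end);

struct probe {
	probe_fn fn;
	enum info_source source;
};

/* Try each probe in order; the first one that succeeds decides the source. */
static enum info_source detect_end(const struct probe *probes, size_t n, uint64_t *end)
{
	assert(probes != NULL && end != NULL);
	for (size_t i = 0; i < n; i++) {
		if (probes[i].fn(end) == 0)
			return probes[i].source;
	}
	return SRC_NONE;
}

/* Stand-in probes for the example only (assumed values, not real hardware). */
static int probe_unavailable(uint64_t *end) { (void)end; return -1; }
static int probe_4g(uint64_t *end) { *end = UINT64_C(4) << 30; return 0; }
static int probe_8g(uint64_t *end) { *end = UINT64_C(8) << 30; return 0; }
```

If the first table entry pairs `probe_8g` with `SRC_DIAG500_STOR_LIMIT` and the second pairs `probe_4g` with `SRC_SCLP_READ_INFO`, `detect_end()` reports the diag500 source and an end of 8 GiB. Swapping the first probe for `probe_unavailable` falls through to the SCLP entry.
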
@@ -182,12 +182,15 @@ static void kaslr_adjust_got(unsigned long offset)
  * Merge information from several sources into a single ident_map_size value.
  * "ident_map_size" represents the upper limit of physical memory we may ever
  * reach. It might not be all online memory, but also include standby (offline)
- * memory. "ident_map_size" could be lower then actual standby or even online
+ * memory or memory areas reserved for other means (e.g., memory devices such as
+ * virtio-mem).
+ *
+ * "ident_map_size" could be lower then actual standby/reserved or even online
  * memory present, due to limiting factors. We should never go above this limit.
  * It is the size of our identity mapping.
  *
  * Consider the following factors:
- * 1. max_physmem_end - end of physical memory online or standby.
+ * 1. max_physmem_end - end of physical memory online, standby or reserved.
  *    Always >= end of the last online memory range (get_physmem_online_end()).
  * 2. CONFIG_MAX_PHYSMEM_BITS - the maximum size of physical memory the
  *    kernel is able to support.
@@ -9,6 +9,7 @@ enum physmem_info_source {
 	MEM_DETECT_NONE = 0,
 	MEM_DETECT_SCLP_STOR_INFO,
 	MEM_DETECT_DIAG260,
+	MEM_DETECT_DIAG500_STOR_LIMIT,
 	MEM_DETECT_SCLP_READ_INFO,
 	MEM_DETECT_BIN_SEARCH
 };
@@ -107,6 +108,8 @@ static inline const char *get_physmem_info_source(void)
 		return "sclp storage info";
 	case MEM_DETECT_DIAG260:
 		return "diag260";
+	case MEM_DETECT_DIAG500_STOR_LIMIT:
+		return "diag500 storage limit";
 	case MEM_DETECT_SCLP_READ_INFO:
 		return "sclp read info";
 	case MEM_DETECT_BIN_SEARCH:
@@ -2,7 +2,15 @@
 #ifndef _ASM_S390_SPARSEMEM_H
 #define _ASM_S390_SPARSEMEM_H
 
-#define SECTION_SIZE_BITS 28
+#define SECTION_SIZE_BITS 27
 #define MAX_PHYSMEM_BITS CONFIG_MAX_PHYSMEM_BITS
 
+#ifdef CONFIG_NUMA
+static inline int memory_add_physaddr_to_nid(u64 addr)
+{
+	return 0;
+}
+#define memory_add_physaddr_to_nid memory_add_physaddr_to_nid
+#endif /* CONFIG_NUMA */
+
 #endif /* _ASM_S390_SPARSEMEM_H */
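
The effect of changing SECTION_SIZE_BITS from 28 to 27 is plain arithmetic: the section size is 2^bits bytes, so it drops from 256 MiB to 128 MiB. The helpers below are a small user-space sketch with hypothetical names (`section_size_bytes`, `sections_needed`). They are not kernel functions and say nothing about why the value was changed.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of one sparsemem section for a given SECTION_SIZE_BITS value. */
static uint64_t section_size_bytes(unsigned int section_size_bits)
{
	assert(section_size_bits < 64); /* the shift below must stay in range */
	return UINT64_C(1) << section_size_bits;
}

/* Number of sections needed to cover mem_bytes, rounding up. */
static uint64_t sections_needed(uint64_t mem_bytes, unsigned int section_size_bits)
{
	uint64_t size = section_size_bytes(section_size_bits);

	return mem_bytes / size + (mem_bytes % size != 0);
}
```

For 1 GiB of memory, this gives 4 sections at 28 bits and 8 sections at 27 bits.
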
@@ -122,7 +122,7 @@ config VIRTIO_BALLOON
 
 config VIRTIO_MEM
 	tristate "Virtio mem driver"
-	depends on X86_64 || ARM64 || RISCV
+	depends on X86_64 || ARM64 || RISCV || S390
 	depends on VIRTIO
 	depends on MEMORY_HOTPLUG
 	depends on MEMORY_HOTREMOVE
@@ -132,11 +132,11 @@ config VIRTIO_MEM
 	  This driver provides access to virtio-mem paravirtualized memory
 	  devices, allowing to hotplug and hotunplug memory.
 
-	  This driver currently only supports x86-64 and arm64. Although it
-	  should compile on other architectures that implement memory
-	  hot(un)plug, architecture-specific and/or common
-	  code changes may be required for virtio-mem, kdump and kexec to work as
-	  expected.
+	  This driver currently supports x86-64, arm64, riscv and s390.
+	  Although it should compile on other architectures that implement
+	  memory hot(un)plug, architecture-specific and/or common
+	  code changes may be required for virtio-mem, kdump and kexec to
+	  work as expected.
 
 	  If unsure, say M.
 
@@ -1905,7 +1905,7 @@ config STRICT_DEVMEM
 	bool "Filter access to /dev/mem"
 	depends on MMU && DEVMEM
 	depends on ARCH_HAS_DEVMEM_IS_ALLOWED || GENERIC_LIB_DEVMEM_IS_ALLOWED
-	default y if PPC || X86 || ARM64
+	default y if PPC || X86 || ARM64 || S390
 	help
 	  If this option is disabled, you allow userspace (root) access to all
 	  of memory, including kernel and userspace memory. Accidental