Commit Graph

766877 Commits

Author SHA1 Message Date
Paolo Bonzini
85eae57bbb Merge tag 'kvm-s390-next-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: Features for 4.19

- initial version for host large page support. Must be enabled with
  module parameter hpage=1 and will conflict with the nested=1
  parameter.
- enable etoken facility for guests
- Fixes
2018-08-02 13:57:29 +02:00
Paolo Bonzini
3a1174cd3e Merge tag 'kvm-ppc-next-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
PPC KVM update for 4.19.

This update adds no new features; it just has some minor code cleanups
and bug fixes, including a fix to allow us to create KVM_MAX_VCPUS
vCPUs on POWER9 in all CPU threading modes.
2018-08-02 13:57:26 +02:00
Janosch Frank
2375846193 Merge tag 'hlp_stage1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvms390/next
KVM: s390: initial host large page support

- must be enabled via module parameter hpage=1
- cannot be used together with nested
- does support migration
- does support hugetlbfs
- no THP yet
2018-07-30 23:20:48 +02:00
Janosch Frank
a449938297 KVM: s390: Add huge page enablement control
General KVM huge page support on s390 has to be enabled via the
kvm.hpage module parameter. Either nested or hpage can be enabled, as
we currently do not support vSIE for huge backed guests. Once the vSIE
support is added we will either drop the parameter or enable it as
default.

For a guest the feature has to be enabled through the new
KVM_CAP_S390_HPAGE_1M capability and the hpage module
parameter. Enabling it means that cmm can't be enabled for the vm and
disables pfmf and storage key interpretation.

This is due to the fact that in some cases, in upcoming patches, we
have to split huge pages in the guest mapping to be able to set more
granular memory protection on 4k pages. These split pages have fake
page tables that are not visible to the Linux memory management which
subsequently will not manage its PGSTEs, while the SIE will. Disabling
these features lets us manage PGSTE data in a consistent matter and
solve that problem.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2018-07-30 23:13:38 +02:00
Janosch Frank
a9e00d8349 s390/mm: Add huge page gmap linking support
Let's allow huge pmd linking when enabled through the
KVM_CAP_S390_HPAGE_1M capability. Also we can now restrict gmap
invalidation and notification to the cases where the capability has
been activated and save some cycles when that's not the case.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2018-07-30 23:13:38 +02:00
Dominik Dingel
7d735b9ae8 s390/mm: hugetlb pages within a gmap can not be freed
Guests backed by huge pages could theoretically free unused pages via
the diagnose 10 instruction. We currently don't allow that, so we
don't have to refault it once it's needed again.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2018-07-30 23:13:38 +02:00
Janosch Frank
57cb198cfd KVM: s390: Beautify skey enable check
Let's introduce an explicit check if skeys have already been enabled
for the vcpu, so we don't have to check the mm context if we don't have
the storage key facility.

This lets us check for enablement without having to take the mm
semaphore and thus speedup skey emulation.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-07-30 17:05:52 +02:00
Janosch Frank
bd096f6443 KVM: s390: Add skey emulation fault handling
When doing skey emulation for huge guests, we now need to fault in
pmds, as we don't have PGSTES anymore to store them when we do not
have valid table entries.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2018-07-30 11:20:18 +01:00
Janosch Frank
637ff9efe5 s390/mm: Add huge pmd storage key handling
Storage keys for guests with huge page mappings have to be managed in
hardware. There are no PGSTEs for PMDs that we could use to retain the
guests's logical view of the key.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2018-07-30 11:20:18 +01:00
Janosch Frank
3afdfca698 s390/mm: Clear skeys for newly mapped huge guest pmds
Similarly to the pte skey handling, where we set the storage key to
the default key for each newly mapped pte, we have to also do that for
huge pmds.

With the PG_arch_1 flag we keep track if the area has already been
cleared of its skeys.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-07-30 11:20:18 +01:00
Dominik Dingel
964c2c05c9 s390/mm: Clear huge page storage keys on enable_skey
When a guest starts using storage keys, we trap and set a default one
for its whole valid address space. With this patch we are now able to
do that for large pages.

To speed up the storage key insertion, we use
__storage_key_init_range, which in-turn will use sske_frame to set
multiple storage keys with one instruction. As it has been previously
used for debuging we have to get rid of the default key check and make
it quiescing.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
[replaced page_set_storage_key loop with __storage_key_init_range]
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2018-07-30 11:20:18 +01:00
Janosch Frank
0959e16867 s390/mm: Add huge page dirty sync support
To do dirty loging with huge pages, we protect huge pmds in the
gmap. When they are written to, we unprotect them and mark them dirty.

We introduce the function gmap_test_and_clear_dirty_pmd which handles
dirty sync for huge pages.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
2018-07-30 11:20:18 +01:00
Janosch Frank
6a3762778d s390/mm: Add gmap pmd invalidation and clearing
If the host invalidates a pmd, we also have to invalidate the
corresponding gmap pmds, as well as flush them from the TLB. This is
necessary, as we don't share the pmd tables between host and guest as
we do with ptes.

The clearing part of these three new functions sets a guest pmd entry
to _SEGMENT_ENTRY_EMPTY, so the guest will fault on it and we will
re-link it.

Flushing the gmap is not necessary in the host's lazy local and csp
cases. Both purge the TLB completely.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
2018-07-30 11:20:17 +01:00
Janosch Frank
7c4b13a7c0 s390/mm: Add gmap pmd notification bit setting
Like for ptes, we also need invalidation notification for pmds, to
make sure the guest lowcore pages are always accessible and later
addition of shadowed pmds.

With PMDs we do not have PGSTEs or some other bits we could use in the
host PMD. Instead we pick one of the free bits in the gmap PMD. Every
time a host pmd will be invalidated, we will check if the respective
gmap PMD has the bit set and in that case fire up the notifier.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2018-07-30 11:20:17 +01:00
Janosch Frank
58b7e200d2 s390/mm: Add gmap pmd linking
Let's allow pmds to be linked into gmap for the upcoming s390 KVM huge
page support.

Before this patch we copied the full userspace pmd entry. This is not
correct, as it contains SW defined bits that might be interpreted
differently in the GMAP context. Now we only copy over all hardware
relevant information leaving out the software bits.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2018-07-30 11:20:17 +01:00
Janosch Frank
2c46e974dd s390/mm: Abstract gmap notify bit setting
Currently we use the software PGSTE bits PGSTE_IN_BIT and
PGSTE_VSIE_BIT to notify before an invalidation occurs on a prefix
page or a VSIE page respectively. Both bits are pgste specific, but
are used when protecting a memory range.

Let's introduce abstract GMAP_NOTIFY_* bits that will be realized into
the respective bits when gmap DAT table entries are protected.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2018-07-30 11:20:17 +01:00
Janosch Frank
5a045bb9c4 s390/mm: Make gmap_protect_range more modular
This patch reworks the gmap_protect_range logic and extracts the pte
handling into an own function. Also we do now walk to the pmd and make
it accessible in the function for later use. This way we can add huge
page handling logic more easily.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-07-30 11:20:17 +01:00
Paul Mackerras
b5c6f7607b KVM: PPC: Book3S HV: Read kvm->arch.emul_smt_mode under kvm->lock
Commit 1e175d2 ("KVM: PPC: Book3S HV: Pack VCORE IDs to access full
VCPU ID space", 2018-07-25) added code that uses kvm->arch.emul_smt_mode
before any VCPUs are created.  However, userspace can change
kvm->arch.emul_smt_mode at any time up until the first VCPU is created.
Hence it is (theoretically) possible for the check in
kvmppc_core_vcpu_create_hv() to race with another userspace thread
changing kvm->arch.emul_smt_mode.

This fixes it by moving the test that uses kvm->arch.emul_smt_mode into
the block where kvm->lock is held.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-07-26 15:38:41 +10:00
Paul Mackerras
1ebe6b81eb KVM: PPC: Book3S HV: Allow creating max number of VCPUs on POWER9
Commit 1e175d2 ("KVM: PPC: Book3S HV: Pack VCORE IDs to access full
VCPU ID space", 2018-07-25) allowed use of VCPU IDs up to
KVM_MAX_VCPU_ID on POWER9 in all guest SMT modes and guest emulated
hardware SMT modes.  However, with the current definition of
KVM_MAX_VCPU_ID, a guest SMT mode of 1 and an emulated SMT mode of 8,
it is only possible to create KVM_MAX_VCPUS / 2 VCPUS, because
threads_per_subcore is 4 on POWER9 CPUs.  (Using an emulated SMT mode
of 8 is useful when migrating VMs to or from POWER8 hosts.)

This increases KVM_MAX_VCPU_ID to 8 * KVM_MAX_VCPUS when HV KVM is
configured in, so that a full complement of KVM_MAX_VCPUS VCPUs can
be created on POWER9 in all guest SMT modes and emulated hardware
SMT modes.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-07-26 14:53:54 +10:00
Sam Bobroff
1e175d2e07 KVM: PPC: Book3S HV: Pack VCORE IDs to access full VCPU ID space
It is not currently possible to create the full number of possible
VCPUs (KVM_MAX_VCPUS) on Power9 with KVM-HV when the guest uses fewer
threads per core than its core stride (or "VSMT mode"). This is
because the VCORE ID and XIVE offsets grow beyond KVM_MAX_VCPUS
even though the VCPU ID is less than KVM_MAX_VCPU_ID.

To address this, "pack" the VCORE ID and XIVE offsets by using
knowledge of the way the VCPU IDs will be used when there are fewer
guest threads per core than the core stride. The primary thread of
each core will always be used first. Then, if the guest uses more than
one thread per core, these secondary threads will sequentially follow
the primary in each core.

So, the only way an ID above KVM_MAX_VCPUS can be seen, is if the
VCPUs are being spaced apart, so at least half of each core is empty,
and IDs between KVM_MAX_VCPUS and (KVM_MAX_VCPUS * 2) can be mapped
into the second half of each core (4..7, in an 8-thread core).

Similarly, if IDs above KVM_MAX_VCPUS * 2 are seen, at least 3/4 of
each core is being left empty, and we can map down into the second and
third quarters of each core (2, 3 and 5, 6 in an 8-thread core).

Lastly, if IDs above KVM_MAX_VCPUS * 4 are seen, only the primary
threads are being used and 7/8 of the core is empty, allowing use of
the 1, 5, 3 and 7 thread slots.

(Strides less than 8 are handled similarly.)

This allows the VCORE ID or offset to be calculated quickly from the
VCPU ID or XIVE server numbers, without access to the VCPU structure.

[paulus@ozlabs.org - tidied up comment a little, changed some WARN_ONCE
 to pr_devel, wrapped line, fixed id check.]

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-07-26 13:23:52 +10:00
Christian Borntraeger
a3da7b4a3b KVM: s390: add etoken support for guests
We want to provide facility 156 (etoken facility) to our
guests. This includes migration support (via sync regs) and
VSIE changes. The tokens are being reset on clear reset. This
has to be implemented by userspace (via sync regs).

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
2018-07-19 12:59:36 +02:00
Nicholas Mc Guire
0abb75b7a1 KVM: PPC: Book3S HV: Fix constant size warning
The constants are 64bit but not explicitly declared UL resulting
in sparse warnings. Fix this by declaring the constants UL.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-07-18 15:14:45 +10:00
Nicholas Mc Guire
51eaa08f02 KVM: PPC: Book3S HV: Add of_node_put() in success path
The call to of_find_compatible_node() is returning a pointer with
incremented refcount so it must be explicitly decremented after the
last use. As here it is only being used for checking of node presence
but the result is not actually used in the success path it can be
dropped immediately.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Fixes: commit f725758b89 ("KVM: PPC: Book3S HV: Use OPAL XICS emulation on POWER9")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-07-18 15:14:45 +10:00
Alexey Kardashevskiy
76346cd93a KVM: PPC: Book3S: Fix matching of hardware and emulated TCE tables
When attaching a hardware table to LIOBN in KVM, we match table parameters
such as page size, table offset and table size. However the tables are
created via very different paths - VFIO and KVM - and the VFIO path goes
through the platform code which has minimum TCE page size requirement
(which is 4K but since we allocate memory by pages and cannot avoid
alignment anyway, we align to 64k pages for powernv_defconfig).

So when we match the tables, one might be bigger that the other which
means the hardware table cannot get attached to LIOBN and DMA mapping
fails.

This removes the table size alignment from the guest visible table.
This does not affect the memory allocation which is still aligned -
kvmppc_tce_pages() takes care of this.

This relaxes the check we do when attaching tables to allow the hardware
table be bigger than the guest visible table.

Ideally we want the KVM table to cover the same space as the hardware
table does but since the hardware table may use multiple levels, and
all levels must use the same table size (IODA2 design), the area it can
actually cover might get very different from the window size which
the guest requested, even though the guest won't map it all.

Fixes: ca1fc489cf "KVM: PPC: Book3S: Allow backing bigger guest IOMMU pages with smaller physical pages"
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-07-18 15:14:45 +10:00
Simon Guo
4eeb85568e KVM: PPC: Remove mmio_vsx_tx_sx_enabled in KVM MMIO emulation
Originally PPC KVM MMIO emulation uses only 0~31#(5 bits) for VSR
reg number, and use mmio_vsx_tx_sx_enabled field together for
0~63# VSR regs.

Currently PPC KVM MMIO emulation is reimplemented with analyse_instr()
assistance.  analyse_instr() returns 0~63 for VSR register number, so
it is not necessary to use additional mmio_vsx_tx_sx_enabled field
any more.

This patch extends related reg bits (expand io_gpr to u16 from u8
and use 6 bits for VSR reg#), so that mmio_vsx_tx_sx_enabled can
be removed.

Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-07-18 15:14:45 +10:00
Christian Borntraeger
63747bf73c KVM: s390/vsie: avoid sparse warning
This is a non-functional change that avoids
arch/s390/kvm/vsie.c:839:25: warning: context imbalance in 'do_vsie_run' - unexpected unlock

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-07-16 17:28:56 +02:00
Claudio Imbrenda
afdad61615 KVM: s390: Fix storage attributes migration with memory slots
This is a fix for several issues that were found in the original code
for storage attributes migration.

Now no bitmap is allocated to keep track of dirty storage attributes;
the extra bits of the per-memslot bitmap that are always present anyway
are now used for this purpose.

The code has also been refactored a little to improve readability.

Fixes: 190df4a212 ("KVM: s390: CMMA tracking, ESSA emulation, migration mode")
Fixes: 4036e3874a ("KVM: s390: ioctls to get and set guest storage attributes")
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Message-Id: <1525106005-13931-3-git-send-email-imbrenda@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-07-13 09:48:57 +02:00
Claudio Imbrenda
03133347b4 KVM: s390: a utility function for migration
Introduce a utility function that will be used later on for storage
attributes migration, and use it in kvm_main.c to replace existing code
that does the same thing.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Message-Id: <1525106005-13931-2-git-send-email-imbrenda@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-07-13 09:48:57 +02:00
Janosch Frank
0230cae75d KVM: s390: Replace clear_user with kvm_clear_guest
kvm_clear_guest also does the dirty tracking for us, which we want to
have.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-07-13 09:48:57 +02:00
Linus Torvalds
6f0d349d92 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix netpoll OOPS in r8169, from Ville Syrjälä.

 2) Fix bpf instruction alignment on powerpc et al., from Eric Dumazet.

 3) Don't ignore IFLA_MTU attribute when creating new ipvlan links. From
    Xin Long.

 4) Fix use after free in AF_PACKET, from Eric Dumazet.

 5) Mis-matched RTNL unlock in xen-netfront, from Ross Lagerwall.

 6) Fix VSOCK loopback on big-endian, from Claudio Imbrenda.

 7) Missing RX buffer offset correction when computing DMA addresses in
    mvneta driver, from Antoine Tenart.

 8) Fix crashes in DCCP's ccid3_hc_rx_send_feedback, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
  sfc: make function efx_rps_hash_bucket static
  strparser: Corrected typo in documentation.
  qmi_wwan: add support for the Dell Wireless 5821e module
  cxgb4: when disabling dcb set txq dcb priority to 0
  net_sched: remove a bogus warning in hfsc
  net: dccp: switch rx_tstamp_last_feedback to monotonic clock
  net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
  net: Remove depends on HAS_DMA in case of platform dependency
  MAINTAINERS: Add file patterns for dsa device tree bindings
  net: mscc: make sparse happy
  net: mvneta: fix the Rx desc DMA address in the Rx path
  Documentation: e1000: Fix docs build error
  Documentation: e100: Fix docs build error
  Documentation: e1000: Use correct heading adornment
  Documentation: e100: Use correct heading adornment
  ipv6: mcast: fix unsolicited report interval after receiving querys
  vhost_net: validate sock before trying to put its fd
  VSOCK: fix loopback on big-endian systems
  net: ethernet: ti: davinci_cpdma: make function cpdma_desc_pool_create static
  xen-netfront: Update features after registering netdev
  ...
2018-06-25 15:58:17 +08:00
Colin Ian King
829eb05365 sfc: make function efx_rps_hash_bucket static
The function efx_rps_hash_bucket is local to the source and
does not need to be in global scope, so make it static.

Cleans up sparse warning:
symbol 'efx_rps_hash_bucket' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-24 23:08:25 +09:00
Linus Torvalds
7daf201d7f Linux 4.18-rc2 v4.18-rc2 2018-06-24 20:54:29 +08:00
Linus Torvalds
c81b995f00 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "A pile of perf updates:

  Kernel side:

   - Remove an incorrect warning in uprobe_init_insn() when
     insn_get_length() fails. The error return code is handled at the
     call site.

   - Move the inline keyword to the right place in the perf ringbuffer
     code to address a W=1 build warning.

  Tooling:

  perf stat:

   - Fix metric column header display alignment

   - Improve error messages for default attributes, providing better
     output for error in command line.

   - Add --interval-clear option, to provide a 'watch' like printing

  perf script:

   - Show hw-cache events too

  perf c2c:

   - Fix data dependency problem in layout of 'struct c2c_hist_entry'

  Core:

   - Do not blindly assume that 'struct perf_evsel' can be obtained via
     a straight forward container_of() as there are call sites which
     hand in a plain 'struct hist' which is not part of a container.

   - Fix error index in the PMU event parser, so that error messages can
     point to the problematic token"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Move the inline keyword at the beginning of the function declaration
  uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
  perf script: Show hw-cache events
  perf c2c: Keep struct hist_entry at the end of struct c2c_hist_entry
  perf stat: Add event parsing error handling to add_default_attributes
  perf stat: Allow to specify specific metric column len
  perf stat: Fix metric column header display alignment
  perf stat: Use only color_fprintf call in print_metric_only
  perf stat: Add --interval-clear option
  perf tools: Fix error index for pmu event parser
  perf hists: Reimplement hists__has_callchains()
  perf hists browser gtk: Use hist_entry__has_callchains()
  perf hists: Make hist_entry__has_callchains() work with 'perf c2c'
  perf hists: Save the callchain_size in struct hist_entry
2018-06-24 20:29:15 +08:00
Linus Torvalds
2ce413ec16 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull rseq fixes from Thomas Gleixer:
 "A pile of rseq related fixups:

   - Prevent infinite recursion when delivering SIGSEGV

   - Remove the abort of rseq critical section on fork() as syscalls
     inside rseq critical sections are explicitely forbidden. So no
     point in doing the abort on the child.

   - Align the rseq structure on 32 bytes in the ARM selftest code.

   - Fix file permissions of the test script"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rseq: Avoid infinite recursion when delivering SIGSEGV
  rseq/cleanup: Do not abort rseq c.s. in child on fork()
  rseq/selftests/arm: Align 'struct rseq_cs' on 32 bytes
  rseq/selftests: Make run_param_test.sh executable
2018-06-24 20:18:19 +08:00
Linus Torvalds
64dd76559d Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Thomas Gleixner:
 "Two fixlets for the EFI maze:

   - Properly zero variables to prevent an early boot hang on EFI mixed
     mode systems

   - Fix the fallout of merging the 32bit and 64bit variants of EFI PCI
     related code which ended up chosing the 32bit variant of the actual
     EFi call invocation which leads to failures on 64bit"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi/x86: Fix incorrect invocation of PciIo->Attributes()
  efi/libstub/tpm: Initialize efi_physical_addr_t vars to zero for mixed mode
2018-06-24 20:16:17 +08:00
Linus Torvalds
d3a6749c33 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
 "Two tiny fixes:

   - Add the missing machine_real_restart() to objtools noreturn list so
     it stops complaining

   - Fix a trivial comment typo"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kernel.h: Fix a typo in comment
  objtool: Add machine_real_restart() to the noreturn list
2018-06-24 20:06:42 +08:00
Linus Torvalds
d4e860eaf0 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for x86:

   - Make Xen PV guest deal with speculative store bypass correctly

   - Address more fallout from the 5-Level pagetable handling. Undo an
     __initdata annotation to avoid section mismatch and malfunction
     when post init code would touch the freed variable.

   - Handle exception fixup in math_error() before calling notify_die().
     The reverse call order incorrectly triggers notify_die() listeners
     for soemthing which is handled correctly at the site which issues
     the floating point instruction.

   - Fix an off by one in the LLC topology calculation on AMD

   - Handle non standard memory block sizes gracefully un UV platforms

   - Plug a memory leak in the microcode loader

   - Sanitize the purgatory build magic

   - Add the x86 specific device tree bindings directory to the x86
     MAINTAINER file patterns"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix 'no5lvl' handling
  Revert "x86/mm: Mark __pgtable_l5_enabled __initdata"
  x86/CPU/AMD: Fix LLC ID bit-shift calculation
  MAINTAINERS: Add file patterns for x86 device tree bindings
  x86/microcode/intel: Fix memleak in save_microcode_patch()
  x86/platform/UV: Add kernel parameter to set memory block size
  x86/platform/UV: Use new set memory block size function
  x86/platform/UV: Add adjustable set memory block size function
  x86/build: Remove unnecessary preparation for purgatory
  Revert "kexec/purgatory: Add clean-up for purgatory directory"
  x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
  x86: Call fixup_exception() before notify_die() in math_error()
2018-06-24 19:59:52 +08:00
Linus Torvalds
177d363e72 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 pti fixes from Thomas Gleixner:
 "Two small updates for the speculative distractions:

   - Make it more clear to the compiler that array_index_mask_nospec()
     is not subject for optimizations. It's not perfect, but ...

   - Don't report XEN PV guests as vulnerable because their mitigation
     state depends on the hypervisor. Report unknown and refer to the
     hypervisor requirement"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()
  x86/pti: Don't report XenPV as vulnerable
2018-06-24 19:48:30 +08:00
Linus Torvalds
2da2ca24a3 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
 "A set of fixes and updates for the locking code:

   - Prevent lockdep from updating irq state within its own code and
     thereby confusing itself.

   - Buid fix for older GCCs which mistreat anonymous unions

   - Add a missing lockdep annotation in down_read_non_onwer() which
     causes up_read_non_owner() to emit a lockdep splat

   - Remove the custom alpha dec_and_lock() implementation which is
     incorrect in terms of ordering and use the generic one.

  The remaining two commits are not strictly fixes. They provide irqsave
  variants of atomic_dec_and_lock() and refcount_dec_and_lock(). These
  are required to merge the relevant updates and cleanups into different
  maintainer trees for 4.19, so routing them into mainline without
  actual users is the sanest approach.

  They should have been in -rc1, but last weekend I took the liberty to
  just avoid computers in order to regain some mental sanity"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/qspinlock: Fix build for anonymous union in older GCC compilers
  locking/lockdep: Do not record IRQ state within lockdep code
  locking/rwsem: Fix up_read_non_owner() warning with DEBUG_RWSEMS
  locking/refcounts: Implement refcount_dec_and_lock_irqsave()
  atomic: Add irqsave variant of atomic_dec_and_lock()
  alpha: Remove custom dec_and_lock() implementation
2018-06-24 19:36:16 +08:00
Linus Torvalds
a43de48993 Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull ras fixes from Thomas Gleixner:
 "A set of fixes for RAS/MCE:

   - Improve the error message when the kernel cannot recover from a MCE
     so the maximum amount of information gets provided.

   - Individually check MCE recovery features on SkyLake CPUs instead of
     assuming none when the CAPID0 register does not advertise the
     general ability for recovery.

   - Prevent MCE to output inconsistent messages which first show an
     error location and then claim that the source is unknown.

   - Prevent overwriting MCi_STATUS in the attempt to gather more
     information when a fatal MCE has alreay been detected. This leads
     to empty status values in the printout and failing to react
     promptly on the fatal event"

* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Fix incorrect "Machine check from unknown source" message
  x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()
  x86/mce: Check for alternate indication of machine check recovery on Skylake
  x86/mce: Improve error message when kernel cannot recover
2018-06-24 19:22:19 +08:00
Linus Torvalds
6242258b6b Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "A small set of fixes for time(r) related issues:

   - Fix a long standing conversion issue in jiffies_to_msecs() for odd
     HZ values like 1024 or 1200 which resulted in returning 0 for small
     jiffies values due to rounding down.

   - Use the proper CONFIG symbol in the new Y2038 safe compat code for
     posix-timers. Not yet a visible breakage, but this will immediately
     trigger when the architecture support for the new interfaces is
     merged.

   - Return an error code in the STM32 clocksource driver on failure
     instead of success.

   - Remove the redundant and stale irq disabled check in the posix cpu
     timer code. The check is at the wrong place anyway and lockdep
     already covers it via the sighand lock locking coverage"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time: Make sure jiffies_to_msecs() preserves non-zero time periods
  posix-timers: Fix nanosleep_copyout() for CONFIG_COMPAT_32BIT_TIME
  clocksource/drivers/stm32: Fix error return code
  posix-cpu-timers: Remove lockdep_assert_irqs_disabled()
2018-06-24 19:16:42 +08:00
Linus Torvalds
78fea6334f Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A set of fixes mostly for the ARM/GIC world:

   - Fix the MSI affinity handling in the ls-scfg irq chip driver so it
     updates and uses the effective affinity mask correctly

   - Prevent binding LPIs to offline CPUs and respect the Cavium erratum
     which requires that LPIs which belong to an offline NUMA node are
     not bound to a CPU on a different NUMA node.

   - Free only the amount of allocated interrupts in the GIC-V2M driver
     instead of trying to free log2(nrirqs).

   - Prevent emitting SYNC and VSYNC targetting non existing interrupt
     collections in the GIC-V3 ITS driver

   - Ensure that the GIV-V3 interrupt redistributor is correctly
     reprogrammed on CPU hotplug

   - Remove a stale unused helper function"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqdesc: Delete irq_desc_get_msi_desc()
  irqchip/gic-v3-its: Fix reprogramming of redistributors on CPU hotplug
  irqchip/gic-v3-its: Only emit VSYNC if targetting a valid collection
  irqchip/gic-v3-its: Only emit SYNC if targetting a valid collection
  irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node
  irqchip/gic-v2m: Fix SPI release on error path
  irqchip/ls-scfg-msi: Fix MSI affinity handling
  genirq/debugfs: Add missing IRQCHIP_SUPPORTS_LEVEL_MSI debug
2018-06-24 19:01:18 +08:00
Linus Torvalds
e0bc833d10 Merge tag 'mips_fixes_4.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
 "A few MIPS fixes for 4.18:

   - a GPIO device name fix for a regression in v4.15-rc1.

   - an errata workaround for the BCM5300X platform.

   - a fix to ftrace function graph tracing, broken for a long time with
     the fix applying cleanly back as far as v3.17.

   - addition of read barriers to in{b,w,l,q}() functions, matching
     behavior of other architectures & mirroring the equivalent addition
     to read{b,w,l,q} in v4.17-rc2.

  Plus changes to wire up new syscalls introduced in the 4.18 cycle:

   - Restartable sequences support is added, including MIPS support in
     the selftests.

   - io_pgetevents is wired up"

* tag 'mips_fixes_4.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: Wire up io_pgetevents syscall
  rseq/selftests: Implement MIPS support
  MIPS: Wire up the restartable sequences (rseq) syscall
  MIPS: Add syscall detection for restartable sequences
  MIPS: Add support for restartable sequences
  MIPS: io: Add barrier after register read in inX()
  mips: ftrace: fix static function graph tracing
  MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum
  MIPS: pb44: Fix i2c-gpio GPIO descriptor table
2018-06-24 17:19:42 +08:00
Vakul Garg
3531456aba strparser: Corrected typo in documentation.
Replaced strp_pause() with strp_unpause() to correct a seemingly copy
paste documentation mistake.

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-24 16:40:20 +09:00
Ard Biesheuvel
2e6eb40ca5 efi/x86: Fix incorrect invocation of PciIo->Attributes()
The following commit:

  2c3625cb9f ("efi/x86: Fold __setup_efi_pci32() and __setup_efi_pci64() into one function")

... merged the two versions of __setup_efi_pciXX(), without taking into
account that the 32-bit version used a rather dodgy trick to pass an
immediate 0 constant as argument for a uint64_t parameter.

The issue is caused by the fact that on x86, UEFI protocol method calls
are redirected via struct efi_config::call(), which is a variadic function,
and so the compiler has to infer the types of the parameters from the
arguments rather than from the prototype.

As the 32-bit x86 calling convention passes arguments via the stack,
passing the unqualified constant 0 twice is the same as passing 0ULL,
which is why the 32-bit code in __setup_efi_pci32() contained the
following call:

  status = efi_early->call(pci->attributes, pci,
                           EfiPciIoAttributeOperationGet, 0, 0,
                           &attributes);

to invoke this UEFI protocol method:

  typedef
  EFI_STATUS
  (EFIAPI *EFI_PCI_IO_PROTOCOL_ATTRIBUTES) (
    IN  EFI_PCI_IO_PROTOCOL                     *This,
    IN  EFI_PCI_IO_PROTOCOL_ATTRIBUTE_OPERATION Operation,
    IN  UINT64                                  Attributes,
    OUT UINT64                                  *Result OPTIONAL
    );

After the merge, we inadvertently ended up with this version for both
32-bit and 64-bit builds, breaking the latter.

So replace the two zeroes with the explicitly typed constant 0ULL,
which works as expected on both 32-bit and 64-bit builds.

Wilfried tested the 64-bit build, and I checked the generated assembly
of a 32-bit build with and without this patch, and they are identical.

Reported-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
Tested-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hdegoede@redhat.com
Cc: linux-efi@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-24 09:05:58 +02:00
Aleksander Morgado
e7e197edd0 qmi_wwan: add support for the Dell Wireless 5821e module
This module exposes two USB configurations: a QMI+AT capable setup on
USB config #1 and a MBIM capable setup on USB config #2.

By default the kernel will choose the MBIM capable configuration as
long as the cdc_mbim driver is available. This patch adds support for
the QMI port in the secondary configuration.

Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-24 11:44:28 +09:00
Ganesh Goudar
5ce36338a3 cxgb4: when disabling dcb set txq dcb priority to 0
When we are disabling DCB, store "0" in txq->dcb_prio
since that's used for future TX Work Request "OVLAN_IDX"
values. Setting non zero priority upon disabling DCB
would halt the traffic.

Reported-by: AMG Zollner Robert <robert@cloudmedia.eu>
CC: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-24 11:43:16 +09:00
Linus Torvalds
77072ca59f Merge tag 'for-linus-20180623' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Further timeout fixes. We aren't quite there yet, so expect another
   round of fixes for that to completely close some of the IRQ vs
   completion races. (Christoph/Bart)

 - Set of NVMe fixes from the usual suspects, mostly error handling

 - Two off-by-one fixes (Dan)

 - Another bdi race fix (Jan)

 - Fix nbd reconfigure with NBD_DISCONNECT_ON_CLOSE (Doron)

* tag 'for-linus-20180623' of git://git.kernel.dk/linux-block:
  blk-mq: Fix timeout handling in case the timeout handler returns BLK_EH_DONE
  bdi: Fix another oops in wb_workfn()
  lightnvm: Remove depends on HAS_DMA in case of platform dependency
  nvme-pci: limit max IO size and segments to avoid high order allocations
  nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl
  nvme-fc: release io queues to allow fast fail
  nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag.
  block: sed-opal: Fix a couple off by one bugs
  blk-mq-debugfs: Off by one in blk_mq_rq_state_name()
  nvmet: reset keep alive timer in controller enable
  nvme-rdma: don't override opts->queue_size
  nvme-rdma: Fix command completion race at error recovery
  nvme-rdma: fix possible free of a non-allocated async event buffer
  nvme-rdma: fix possible double free condition when failing to create a controller
  Revert "block: Add warning for bi_next not NULL in bio_endio()"
  block: fix timeout changes for legacy request drivers
2018-06-24 06:33:54 +08:00
Linus Torvalds
2dd3f7c904 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:

 - Fix use after free in chtls

 - Fix RBP breakage in sha3

 - Fix use after free in hwrng_unregister

 - Fix overread in morus640

 - Move sleep out of kernel_neon in arm64/aes-blk

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  hwrng: core - Always drop the RNG in hwrng_unregister()
  crypto: morus640 - Fix out-of-bounds access
  crypto: don't optimize keccakf()
  crypto: arm64/aes-blk - fix and move skcipher_walk_done out of kernel_neon_begin, _end
  crypto: chtls - use after free in chtls_pt_recvmsg()
2018-06-24 06:31:54 +08:00
Linus Torvalds
b13fbe7713 Merge tag 'linux-kselftest-4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:

 - fix new sparc64 adi driver test compile errors on non-sparc systems

 - fix config fragment for sync framework for improved test coverage

 - fix several tests to return correct Kselftest skip code

* tag 'linux-kselftest-4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: sparc64: Add missing SPDX License Identifiers
  selftests: sparc64: delete RUN_TESTS and EMIT_TESTS overrides
  selftests: sparc64: Fix to do nothing on non-sparc64
  selftests: sync: add config fragment for testing sync framework
  selftests: vm: return Kselftest Skip code for skipped tests
  selftests: zram: return Kselftest Skip code for skipped tests
  selftests: user: return Kselftest Skip code for skipped tests
  selftests: sysctl: return Kselftest Skip code for skipped tests
  selftests: static_keys: return Kselftest Skip code for skipped tests
  selftests: pstore: return Kselftest Skip code for skipped tests
2018-06-24 06:26:19 +08:00