Commit Graph

393635 Commits

Author SHA1 Message Date
Stefan Hajnoczi
ddd3d4081f vhost: return bool from *_access_ok() functions
Currently vhost *_access_ok() functions return int.  This is error-prone
because there are two popular conventions:

1. 0 means failure, 1 means success
2. -errno means failure, 0 means success

Although vhost mostly uses #1, it does not do so consistently.
umem_access_ok() uses #2.

This patch changes the return type from int to bool so that false means
failure and true means success.  This eliminates a potential source of
errors.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11 10:54:06 -04:00
Stefan Hajnoczi
d14d2b7809 vhost: fix vhost_vq_access_ok() log check
Commit d65026c6c6 ("vhost: validate log
when IOTLB is enabled") introduced a regression.  The logic was
originally:

  if (vq->iotlb)
      return 1;
  return A && B;

After the patch the short-circuit logic for A was inverted:

  if (A || vq->iotlb)
      return A;
  return B;

This patch fixes the regression by rewriting the checks in the obvious
way, no longer returning A when vq->iotlb is non-NULL (which is hard to
understand).

Reported-by: syzbot+65a84dde0214b0387ccd@syzkaller.appspotmail.com
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11 10:54:06 -04:00
Eric Auger
7ced6c98c7 vhost: Fix vhost_copy_to_user()
vhost_copy_to_user is used to copy vring used elements to userspace.
We should use VHOST_ADDR_USED instead of VHOST_ADDR_DESC.

Fixes: f889491380 ("vhost: introduce O(1) vq metadata cache")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11 10:52:34 -04:00
Igor Russkikh
9a11aff25f net: aquantia: oops when shutdown on already stopped device
In case netdev is closed at the moment of pci shutdown, aq_nic_stop
gets called second time. napi_disable in that case hangs indefinitely.
In other case, if device was never opened at all, we get oops because
of null pointer access.

We should invoke aq_nic_stop conditionally, only if device is running
at the moment of shutdown.

Reported-by: David Arcari <darcari@redhat.com>
Fixes: 90869ddfef ("net: aquantia: Implement pci shutdown callback")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11 10:41:36 -04:00
Igor Russkikh
cce96d1883 net: aquantia: Regression on reset with 1.x firmware
On ASUS XG-C100C with 1.5.44 firmware a special mode called "dirty wake"
is active. With this mode when motherboard gets powered (but no poweron
happens yet), NIC automatically enables powersave link and watches
for WOL packet.
This normally allows to powerup the PC after AC power failures.

Not all motherboards or bios settings gives power to PCI slots,
so this mode is not enabled on all the hardware.

4.16 linux driver introduced full hardware reset sequence
This is required since before that we had no NIC hardware
reset implemented and there were side effects of "not clean start".

But this full reset is incompatible with "dirty wake" WOL feature
it keeps the PHY link in a special mode forever. As a consequence,
driver sees no link and no traffic.

To fix this we forcibly change FW state to idle state before doing
the full reset. This makes FW to restore link state.

Fixes: c8c82eb net: aquantia: Introduce global AQC hardware reset sequence
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11 10:41:36 -04:00
Bassem Boubaker
53765341ee cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN
The Cinterion AHS8 is a 3G device with one embedded WWAN interface
using cdc_ether as a driver.

The modem is controlled via AT commands through the exposed TTYs.

AT+CGDCONT write command can be used to activate or deactivate a WWAN
connection for a PDP context defined with the same command. UE
supports one WWAN adapter.

Signed-off-by: Bassem Boubaker <bassem.boubaker@actia.fr>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11 10:36:38 -04:00
Tejaswi Tanikella
3f01ddb962 slip: Check if rstate is initialized before uncompressing
On receiving a packet the state index points to the rstate which must be
used to fill up IP and TCP headers. But if the state index points to a
rstate which is unitialized, i.e. filled with zeros, it gets stuck in an
infinite loop inside ip_fast_csum trying to compute the ip checsum of a
header with zero length.

89.666953:   <2> [<ffffff9dd3e94d38>] slhc_uncompress+0x464/0x468
89.666965:   <2> [<ffffff9dd3e87d88>] ppp_receive_nonmp_frame+0x3b4/0x65c
89.666978:   <2> [<ffffff9dd3e89dd4>] ppp_receive_frame+0x64/0x7e0
89.666991:   <2> [<ffffff9dd3e8a708>] ppp_input+0x104/0x198
89.667005:   <2> [<ffffff9dd3e93868>] pppopns_recv_core+0x238/0x370
89.667027:   <2> [<ffffff9dd4428fc8>] __sk_receive_skb+0xdc/0x250
89.667040:   <2> [<ffffff9dd3e939e4>] pppopns_recv+0x44/0x60
89.667053:   <2> [<ffffff9dd4426848>] __sock_queue_rcv_skb+0x16c/0x24c
89.667065:   <2> [<ffffff9dd4426954>] sock_queue_rcv_skb+0x2c/0x38
89.667085:   <2> [<ffffff9dd44f7358>] raw_rcv+0x124/0x154
89.667098:   <2> [<ffffff9dd44f7568>] raw_local_deliver+0x1e0/0x22c
89.667117:   <2> [<ffffff9dd44c8ba0>] ip_local_deliver_finish+0x70/0x24c
89.667131:   <2> [<ffffff9dd44c92f4>] ip_local_deliver+0x100/0x10c

./scripts/faddr2line vmlinux slhc_uncompress+0x464/0x468 output:
 ip_fast_csum at arch/arm64/include/asm/checksum.h:40
 (inlined by) slhc_uncompress at drivers/net/slip/slhc.c:615

Adding a variable to indicate if the current rstate is initialized. If
such a packet arrives, move to toss state.

Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11 10:33:46 -04:00
Phil Elwell
fed56079e7 lan78xx: Avoid spurious kevent 4 "error"
lan78xx_defer_event generates an error message whenever the work item
is already scheduled. lan78xx_open defers three events -
EVENT_STAT_UPDATE, EVENT_DEV_OPEN and EVENT_LINK_RESET. Being aware
of the likelihood (or certainty) of an error message, the DEV_OPEN
event is added to the set of pending events directly, relying on
the subsequent deferral of the EVENT_LINK_RESET call to schedule the
work.  Take the same precaution with EVENT_STAT_UPDATE to avoid a
totally unnecessary error message.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11 10:32:43 -04:00
Phil Elwell
4bfc33807a lan78xx: Correctly indicate invalid OTP
lan78xx_read_otp tries to return -EINVAL in the event of invalid OTP
content, but the value gets overwritten before it is returned and the
read goes ahead anyway. Make the read conditional as it should be
and preserve the error code.

Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11 10:30:42 -04:00
Linus Torvalds
c18bb396d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) The sockmap code has to free socket memory on close if there is
    corked data, from John Fastabend.

 2) Tunnel names coming from userspace need to be length validated. From
    Eric Dumazet.

 3) arp_filter() has to take VRFs properly into account, from Miguel
    Fadon Perlines.

 4) Fix oops in error path of tcf_bpf_init(), from Davide Caratti.

 5) Missing idr_remove() in u32_delete_key(), from Cong Wang.

 6) More syzbot stuff. Several use of uninitialized value fixes all
    over, from Eric Dumazet.

 7) Do not leak kernel memory to userspace in sctp, also from Eric
    Dumazet.

 8) Discard frames from unused ports in DSA, from Andrew Lunn.

 9) Fix DMA mapping and reset/failover problems in ibmvnic, from Thomas
    Falcon.

10) Do not access dp83640 PHY registers prematurely after reset, from
    Esben Haabendal.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
  vhost-net: set packet weight of tx polling to 2 * vq size
  net: thunderx: rework mac addresses list to u64 array
  inetpeer: fix uninit-value in inet_getpeer
  dp83640: Ensure against premature access to PHY registers after reset
  devlink: convert occ_get op to separate registration
  ARM: dts: ls1021a: Specify TBIPA register address
  net/fsl_pq_mdio: Allow explicit speficition of TBIPA address
  ibmvnic: Do not reset CRQ for Mobility driver resets
  ibmvnic: Fix failover case for non-redundant configuration
  ibmvnic: Fix reset scheduler error handling
  ibmvnic: Zero used TX descriptor counter on reset
  ibmvnic: Fix DMA mapping mistakes
  tipc: use the right skb in tipc_sk_fill_sock_diag()
  sctp: sctp_sockaddr_af must check minimal addr length for AF_INET6
  net: dsa: Discard frames from unused ports
  sctp: do not leak kernel memory to user space
  soreuseport: initialise timewait reuseport field
  ipv4: fix uninit-value in ip_route_output_key_hash_rcu()
  dccp: initialize ireq->ir_mark
  net: fix uninit-value in __hw_addr_add_ex()
  ...
2018-04-09 17:04:10 -07:00
Linus Torvalds
d8312a3f61 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
 "ARM:
   - VHE optimizations

   - EL2 address space randomization

   - speculative execution mitigations ("variant 3a", aka execution past
     invalid privilege register access)

   - bugfixes and cleanups

  PPC:
   - improvements for the radix page fault handler for HV KVM on POWER9

  s390:
   - more kvm stat counters

   - virtio gpu plumbing

   - documentation

   - facilities improvements

  x86:
   - support for VMware magic I/O port and pseudo-PMCs

   - AMD pause loop exiting

   - support for AMD core performance extensions

   - support for synchronous register access

   - expose nVMX capabilities to userspace

   - support for Hyper-V signaling via eventfd

   - use Enlightened VMCS when running on Hyper-V

   - allow userspace to disable MWAIT/HLT/PAUSE vmexits

   - usual roundup of optimizations and nested virtualization bugfixes

  Generic:
   - API selftest infrastructure (though the only tests are for x86 as
     of now)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (174 commits)
  kvm: x86: fix a prototype warning
  kvm: selftests: add sync_regs_test
  kvm: selftests: add API testing infrastructure
  kvm: x86: fix a compile warning
  KVM: X86: Add Force Emulation Prefix for "emulate the next instruction"
  KVM: X86: Introduce handle_ud()
  KVM: vmx: unify adjacent #ifdefs
  x86: kvm: hide the unused 'cpu' variable
  KVM: VMX: remove bogus WARN_ON in handle_ept_misconfig
  Revert "KVM: X86: Fix SMRAM accessing even if VM is shutdown"
  kvm: Add emulation for movups/movupd
  KVM: VMX: raise internal error for exception during invalid protected mode state
  KVM: nVMX: Optimization: Dont set KVM_REQ_EVENT when VMExit with nested_run_pending
  KVM: nVMX: Require immediate-exit when event reinjected to L2 and L1 event pending
  KVM: x86: Fix misleading comments on handling pending exceptions
  KVM: x86: Rename interrupt.pending to interrupt.injected
  KVM: VMX: No need to clear pending NMI/interrupt on inject realmode interrupt
  x86/kvm: use Enlightened VMCS when running on Hyper-V
  x86/hyper-v: detect nested features
  x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits
  ...
2018-04-09 11:42:31 -07:00
Linus Torvalds
7886e8aa7f Merge branch 'for-linus-sa1100' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM SA1100 updates from Russell King:
 "We have support for arbitary MMIO registers providing platform GPIOs,
  which allows us to abstract some of the SA11x0 CF support.

  This set of updates makes that change"

* 'for-linus-sa1100' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: sa1100/simpad: switch simpad CF to use gpiod APIs
  ARM: sa1100/shannon: convert to generic CF sockets
  ARM: sa1100/nanoengine: convert to generic CF sockets
  ARM: sa1100/h3xxx: switch h3xxx PCMCIA to use gpiod APIs
  ARM: sa1100/cerf: convert to generic CF sockets
  ARM: sa1100/assabet: convert to generic CF sockets
  ARM: sa1100: provide infrastructure to support generic CF sockets
  pcmcia: sa1100: provide generic CF support
2018-04-09 09:26:36 -07:00
Linus Torvalds
becdce1c66 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:

 - Improvements for the spectre defense:
    * The spectre related code is consolidated to a single file
      nospec-branch.c
    * Automatic enable/disable for the spectre v2 defenses (expoline vs.
      nobp)
    * Syslog messages for specve v2 are added
    * Enable CONFIG_GENERIC_CPU_VULNERABILITIES and define the attribute
      functions for spectre v1 and v2

 - Add helper macros for assembler alternatives and use them to shorten
   the code in entry.S.

 - Add support for persistent configuration data via the SCLP Store Data
   interface. The H/W interface requires a page table that uses 4K pages
   only, the code to setup such an address space is added as well.

 - Enable virtio GPU emulation in QEMU. To do this the depends
   statements for a few common Kconfig options are modified.

 - Add support for format-3 channel path descriptors and add a binary
   sysfs interface to export the associated utility strings.

 - Add a sysfs attribute to control the IFCC handling in case of
   constant channel errors.

 - The vfio-ccw changes from Cornelia.

 - Bug fixes and cleanups.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (40 commits)
  s390/kvm: improve stack frame constants in entry.S
  s390/lpp: use assembler alternatives for the LPP instruction
  s390/entry.S: use assembler alternatives
  s390: add assembler macros for CPU alternatives
  s390: add sysfs attributes for spectre
  s390: report spectre mitigation via syslog
  s390: add automatic detection of the spectre defense
  s390: move nobp parameter functions to nospec-branch.c
  s390/cio: add util_string sysfs attribute
  s390/chsc: query utility strings via fmt3 channel path descriptor
  s390/cio: rename struct channel_path_desc
  s390/cio: fix unbind of io_subchannel_driver
  s390/qdio: split up CCQ handling for EQBS / SQBS
  s390/qdio: don't retry EQBS after CCQ 96
  s390/qdio: restrict buffer merging to eligible devices
  s390/qdio: don't merge ERROR output buffers
  s390/qdio: simplify math in get_*_buffer_frontier()
  s390/decompressor: trim uncompressed image head during the build
  s390/crypto: Fix kernel crash on aes_s390 module remove.
  s390/defkeymap: fix global init to zero
  ...
2018-04-09 09:04:10 -07:00
haibinzhang(张海斌)
a2ac99905f vhost-net: set packet weight of tx polling to 2 * vq size
handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy
polling udp packets with small length(e.g. 1byte udp payload), because setting
VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet length.

Ping-Latencies shown below were tested between two Virtual Machines using
netperf (UDP_STREAM, len=1), and then another machine pinged the client:

vq size=256
Packet-Weight   Ping-Latencies(millisecond)
                   min      avg       max
Origin           3.319   18.489    57.303
64               1.643    2.021     2.552
128              1.825    2.600     3.224
256              1.997    2.710     4.295
512              1.860    3.171     4.631
1024             2.002    4.173     9.056
2048             2.257    5.650     9.688
4096             2.093    8.508    15.943

vq size=512
Packet-Weight   Ping-Latencies(millisecond)
                   min      avg       max
Origin           6.537   29.177    66.245
64               2.798    3.614     4.403
128              2.861    3.820     4.775
256              3.008    4.018     4.807
512              3.254    4.523     5.824
1024             3.079    5.335     7.747
2048             3.944    8.201    12.762
4096             4.158   11.057    19.985

Seems pretty consistent, a small dip at 2 VQ sizes.
Ring size is a hint from device about a burst size it can tolerate. Based on
benchmarks, set the weight to 2 * vq size.

To evaluate this change, another tests were done using netperf(RR, TX) between
two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, and vq size was
tweaked through qemu. Results shown below does not show obvious changes.

vq size=256 TCP_RR                vq size=512 TCP_RR
size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
   1/       1/  -7%/        -2%      1/       1/   0%/        -2%
   1/       4/  +1%/         0%      1/       4/  +1%/         0%
   1/       8/  +1%/        -2%      1/       8/   0%/        +1%
  64/       1/  -6%/         0%     64/       1/  +7%/        +3%
  64/       4/   0%/        +2%     64/       4/  -1%/        +1%
  64/       8/   0%/         0%     64/       8/  -1%/        -2%
 256/       1/  -3%/        -4%    256/       1/  -4%/        -2%
 256/       4/  +3%/        +4%    256/       4/  +1%/        +2%
 256/       8/  +2%/         0%    256/       8/  +1%/        -1%

vq size=256 UDP_RR                vq size=512 UDP_RR
size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
   1/       1/  -5%/        +1%      1/       1/  -3%/        -2%
   1/       4/  +4%/        +1%      1/       4/  -2%/        +2%
   1/       8/  -1%/        -1%      1/       8/  -1%/         0%
  64/       1/  -2%/        -3%     64/       1/  +1%/        +1%
  64/       4/  -5%/        -1%     64/       4/  +2%/         0%
  64/       8/   0%/        -1%     64/       8/  -2%/        +1%
 256/       1/  +7%/        +1%    256/       1/  -7%/         0%
 256/       4/  +1%/        +1%    256/       4/  -3%/        -4%
 256/       8/  +2%/        +2%    256/       8/  +1%/        +1%

vq size=256 TCP_STREAM            vq size=512 TCP_STREAM
size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
  64/       1/   0%/        -3%     64/       1/   0%/         0%
  64/       4/  +3%/        -1%     64/       4/  -2%/        +4%
  64/       8/  +9%/        -4%     64/       8/  -1%/        +2%
 256/       1/  +1%/        -4%    256/       1/  +1%/        +1%
 256/       4/  -1%/        -1%    256/       4/  -3%/         0%
 256/       8/  +7%/        +5%    256/       8/  -3%/         0%
 512/       1/  +1%/         0%    512/       1/  -1%/        -1%
 512/       4/  +1%/        -1%    512/       4/   0%/         0%
 512/       8/  +7%/        -5%    512/       8/  +6%/        -1%
1024/       1/   0%/        -1%   1024/       1/   0%/        +1%
1024/       4/  +3%/         0%   1024/       4/  +1%/         0%
1024/       8/  +8%/        +5%   1024/       8/  -1%/         0%
2048/       1/  +2%/        +2%   2048/       1/  -1%/         0%
2048/       4/  +1%/         0%   2048/       4/   0%/        -1%
2048/       8/  -2%/         0%   2048/       8/   5%/        -1%
4096/       1/  -2%/         0%   4096/       1/  -2%/         0%
4096/       4/  +2%/         0%   4096/       4/   0%/         0%
4096/       8/  +9%/        -2%   4096/       8/  -5%/        -1%

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Haibin Zhang <haibinzhang@tencent.com>
Signed-off-by: Yunfang Tai <yunfangtai@tencent.com>
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-09 11:01:37 -04:00
Vadim Lomovtsev
9b5c4dfb2a net: thunderx: rework mac addresses list to u64 array
It is too expensive to pass u64 values via linked list, instead
allocate array for them by overall number of mac addresses from netdev.

This eventually removes multiple kmalloc() calls, aviod memory
fragmentation and allow to put single null check on kmalloc
return value in order to prevent a potential null pointer dereference.

Addresses-Coverity-ID: 1467429 ("Dereference null return value")
Fixes: 37c3347eb2 ("net: thunderx: add ndo_set_rx_mode callback implementation for VF")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-09 10:59:38 -04:00
Esben Haabendal
76327a35ca dp83640: Ensure against premature access to PHY registers after reset
The datasheet specifies a 3uS pause after performing a software
reset. The default implementation of genphy_soft_reset() does not
provide this, so implement soft_reset with the needed pause.

Signed-off-by: Esben Haabendal <eha@deif.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08 19:58:52 -04:00
Jiri Pirko
fc56be47da devlink: convert occ_get op to separate registration
This resolves race during initialization where the resources with
ops are registered before driver and the structures used by occ_get
op is initialized. So keep occ_get callbacks registered only when
all structs are initialized.

The example flows, as it is in mlxsw:
1) driver load/asic probe:
   mlxsw_core
      -> mlxsw_sp_resources_register
        -> mlxsw_sp_kvdl_resources_register
          -> devlink_resource_register IDX
   mlxsw_spectrum
      -> mlxsw_sp_kvdl_init
        -> mlxsw_sp_kvdl_parts_init
          -> mlxsw_sp_kvdl_part_init
            -> devlink_resource_size_get IDX (to get the current setup
                                              size from devlink)
        -> devlink_resource_occ_get_register IDX (register current
                                                  occupancy getter)
2) reload triggered by devlink command:
  -> mlxsw_devlink_core_bus_device_reload
    -> mlxsw_sp_fini
      -> mlxsw_sp_kvdl_fini
	-> devlink_resource_occ_get_unregister IDX
    (struct mlxsw_sp *mlxsw_sp is freed at this point, call to occ get
     which is using mlxsw_sp would cause use-after free)
    -> mlxsw_sp_init
      -> mlxsw_sp_kvdl_init
        -> mlxsw_sp_kvdl_parts_init
          -> mlxsw_sp_kvdl_part_init
            -> devlink_resource_size_get IDX (to get the current setup
                                              size from devlink)
        -> devlink_resource_occ_get_register IDX (register current
                                                  occupancy getter)

Fixes: d9f9b9a4d0 ("devlink: Add support for resource abstraction")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08 12:45:57 -04:00
Esben Haabendal
21481189e8 net/fsl_pq_mdio: Allow explicit speficition of TBIPA address
This introduces a simpler and generic method for for finding (and mapping)
the TBIPA register.

Instead of relying of complicated logic for finding the TBIPA register
address based on the MDIO or MII register block base
address, which even in some cases relies on undocumented shadow registers,
a second "reg" entry for the mdio bus devicetree node specifies the TBIPA
register.

Backwards compatibility is kept, as the existing logic is applied when
only a single "reg" mapping is specified.

Signed-off-by: Esben Haabendal <eha@deif.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08 12:44:49 -04:00
Nathan Fontenot
30f796258c ibmvnic: Do not reset CRQ for Mobility driver resets
When resetting the ibmvnic driver after a partition migration occurs
there is no requirement to do a reset of the main CRQ. The current
driver code does the required re-enable of the main CRQ, then does
a reset of the main CRQ later.

What we should be doing for a driver reset after a migration is to
re-enable the main CRQ, release all the sub-CRQs, and then allocate
new sub-CRQs after capability negotiation.

This patch updates the handling of mobility resets to do the proper
work and not reset the main CRQ. To do this the initialization/reset
of the main CRQ had to be moved out of the ibmvnic_init routine
and in to the ibmvnic_probe and do_reset routines.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08 12:39:47 -04:00
Thomas Falcon
5a18e1e0c1 ibmvnic: Fix failover case for non-redundant configuration
There is a failover case for a non-redundant pseries VNIC
configuration that was not being handled properly. The current
implementation assumes that the driver will always have a redandant
device to communicate with following a failover notification. There
are cases, however, when a non-redundant configuration can receive
a failover request. If that happens, the driver should wait until
it receives a signal that the device is ready for operation.

The driver is agnostic of its backing hardware configuration,
so this fix necessarily affects all device failover management.
The driver needs to wait until it receives a signal that the device
is ready for resetting. A flag is introduced to track this intermediary
state where the driver is waiting for an active device.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08 12:39:47 -04:00
Thomas Falcon
af894d2398 ibmvnic: Fix reset scheduler error handling
In some cases, if the driver is waiting for a reset following
a device parameter change, failure to schedule a reset can result
in a hang since a completion signal is never sent.

If the device configuration is being altered by a tool such
as ethtool or ifconfig, it could cause the console to hang
if the reset request does not get scheduled. Add some additional
error handling code to exit the wait_for_completion if there is
one in progress.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08 12:39:47 -04:00
Thomas Falcon
41f714672f ibmvnic: Zero used TX descriptor counter on reset
The counter that tracks used TX descriptors pending completion
needs to be zeroed as part of a device reset. This change fixes
a bug causing transmit queues to be stopped unnecessarily and in
some cases a transmit queue stall and timeout reset. If the counter
is not reset, the remaining descriptors will not be "removed",
effectively reducing queue capacity. If the queue is over half full,
it will cause the queue to stall if stopped.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08 12:39:47 -04:00
Thomas Falcon
37e40fa8f6 ibmvnic: Fix DMA mapping mistakes
Fix some mistakes caught by the DMA debugger. The first change
fixes a unnecessary unmap that should have been removed in an
earlier update. The next hunk fixes another bad unmap by zeroing
the bit checked to determine that an unmap is needed. The final
change fixes some buffers that are unmapped with the wrong
direction specified.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08 12:39:47 -04:00
Linus Torvalds
4b3f1a1515 Merge branch 'next-tpm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull TPM updates from James Morris:
 "This release contains only bug fixes. There are no new major features
  added"

* 'next-tpm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  tpm: fix intermittent failure with self tests
  tpm: add retry logic
  tpm: self test failure should not cause suspend to fail
  tpm2: add longer timeouts for creation commands.
  tpm_crb: use __le64 annotated variable for response buffer address
  tpm: fix buffer type in tpm_transmit_cmd
  tpm: tpm-interface: fix tpm_transmit/_cmd kdoc
  tpm: cmd_ready command can be issued only after granting locality
2018-04-07 16:46:56 -07:00
Linus Torvalds
90fda63fa1 treewide: fix up files incorrectly marked executable
Joe Perches noted that we have a few source files that for some
inexplicable reason (read: I'm too lazy to even go look at the history)
are marked executable:

  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
  drivers/net/ethernet/cadence/macb_ptp.c

A simple git command line to show executable C/asm/header files is this:

    git ls-files -s '*.[chsS]' | grep '^100755'

and then you can fix them up with scripting by just feeding that output
into:

    | cut -f2 | xargs chmod -x

and commit it.

Which is exactly what this commit does.

Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-07 13:31:23 -07:00
Linus Torvalds
0d5b1bd332 Merge branch 'i2c/for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:

 -I2C core now reports proper OF style module alias. I'd like to repeat
  the note from the commit msg here (Thanks, Javier!):

      NOTE: This patch may break out-of-tree drivers that were relying
            on this behavior, and only had an I2C device ID table even
            when the device was registered via OF.

            There are no remaining drivers in mainline that do this, but
            out-of-tree drivers have to be fixed and define a proper OF
            device ID table to have module auto-loading working.

 - new driver for the SynQuacer I2C controller

 - major refactoring of the QUP driver

 - the piix4 driver now uses request_muxed_region which should fix a
   long standing resource conflict with the sp5100_tco watchdog

 - a bunch of small core & driver improvements

* 'i2c/for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (53 commits)
  i2c: add support for Socionext SynQuacer I2C controller
  dt-bindings: i2c: add binding for Socionext SynQuacer I2C
  i2c: Update i2c_trace_msg static key to modern api
  i2c: fix parameter of trace_i2c_result
  i2c: imx: avoid taking clk_prepare mutex in PM callbacks
  i2c: imx: use clk notifier for rate changes
  i2c: make i2c_check_addr_validity() static
  i2c: rcar: fix mask value of prohibited bit
  dt-bindings: i2c: document R8A77965 bindings
  i2c: pca-platform: drop gpio from platform data
  i2c: pca-platform: use device_property_read_u32
  i2c: pca-platform: unconditionally use devm_gpiod_get_optional
  sh: sh7785lcr: add GPIO lookup table for i2c controller reset
  i2c: qup: reorganization of driver code to remove polling for qup v2
  i2c: qup: reorganization of driver code to remove polling for qup v1
  i2c: qup: send NACK for last read sub transfers
  i2c: qup: fix buffer overflow for multiple msg of maximum xfer len
  i2c: qup: change completion timeout according to transfer length
  i2c: qup: use the complete transfer length to choose DMA mode
  i2c: qup: proper error handling for i2c error in BAM mode
  ...
2018-04-07 12:36:18 -07:00
Linus Torvalds
49a695ba72 Merge tag 'powerpc-4.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
 "Notable changes:

   - Support for 4PB user address space on 64-bit, opt-in via mmap().

   - Removal of POWER4 support, which was accidentally broken in 2016
     and no one noticed, and blocked use of some modern instructions.

   - Workarounds so that the hypervisor can enable Transactional Memory
     on Power9.

   - A series to disable the DAWR (Data Address Watchpoint Register) on
     Power9.

   - More information displayed in the meltdown/spectre_v1/v2 sysfs
     files.

   - A vpermxor (Power8 Altivec) implementation for the raid6 Q
     Syndrome.

   - A big series to make the allocation of our pacas (per cpu area),
     kernel page tables, and per-cpu stacks NUMA aware when using the
     Radix MMU on Power9.

  And as usual many fixes, reworks and cleanups.

  Thanks to: Aaro Koskinen, Alexandre Belloni, Alexey Kardashevskiy,
  Alistair Popple, Andy Shevchenko, Aneesh Kumar K.V, Anshuman Khandual,
  Balbir Singh, Benjamin Herrenschmidt, Christophe Leroy, Christophe
  Lombard, Cyril Bur, Daniel Axtens, Dave Young, Finn Thain, Frederic
  Barrat, Gustavo Romero, Horia Geantă, Jonathan Neuschäfer, Kees Cook,
  Larry Finger, Laurent Dufour, Laurent Vivier, Logan Gunthorpe,
  Madhavan Srinivasan, Mark Greer, Mark Hairgrove, Markus Elfring,
  Mathieu Malaterre, Matt Brown, Matt Evans, Mauricio Faria de Oliveira,
  Michael Neuling, Naveen N. Rao, Nicholas Piggin, Paul Mackerras,
  Philippe Bergheaud, Ram Pai, Rob Herring, Sam Bobroff, Segher
  Boessenkool, Simon Guo, Simon Horman, Stewart Smith, Sukadev
  Bhattiprolu, Suraj Jitindar Singh, Thiago Jung Bauermann, Vaibhav
  Jain, Vaidyanathan Srinivasan, Vasant Hegde, Wei Yongjun"

* tag 'powerpc-4.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (207 commits)
  powerpc/64s/idle: Fix restore of AMOR on POWER9 after deep sleep
  powerpc/64s: Fix POWER9 DD2.2 and above in cputable features
  powerpc/64s: Fix pkey support in dt_cpu_ftrs, add CPU_FTR_PKEY bit
  powerpc/64s: Fix dt_cpu_ftrs to have restore_cpu clear unwanted LPCR bits
  Revert "powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead"
  powerpc: iomap.c: introduce io{read|write}64_{lo_hi|hi_lo}
  powerpc: io.h: move iomap.h include so that it can use readq/writeq defs
  cxl: Fix possible deadlock when processing page faults from cxllib
  powerpc/hw_breakpoint: Only disable hw breakpoint if cpu supports it
  powerpc/mm/radix: Update command line parsing for disable_radix
  powerpc/mm/radix: Parse disable_radix commandline correctly.
  powerpc/mm/hugetlb: initialize the pagetable cache correctly for hugetlb
  powerpc/mm/radix: Update pte fragment count from 16 to 256 on radix
  powerpc/mm/keys: Update documentation and remove unnecessary check
  powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead
  powerpc/64s/idle: Consolidate power9_offline_stop()/power9_idle_stop()
  powerpc/powernv: Always stop secondaries before reboot/shutdown
  powerpc: hard disable irqs in smp_send_stop loop
  powerpc: use NMI IPI for smp_send_stop
  powerpc/powernv: Fix SMT4 forcing idle code
  ...
2018-04-07 12:08:19 -07:00
Linus Torvalds
3612605a5a Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull general security layer updates from James Morris:

 - Convert security hooks from list to hlist, a nice cleanup, saving
   about 50% of space, from Sargun Dhillon.

 - Only pass the cred, not the secid, to kill_pid_info_as_cred and
   security_task_kill (as the secid can be determined from the cred),
   from Stephen Smalley.

 - Close a potential race in kernel_read_file(), by making the file
   unwritable before calling the LSM check (vs after), from Kees Cook.

* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  security: convert security hooks to use hlist
  exec: Set file unwritable before LSM check
  usb, signal, security: only pass the cred, not the secid, to kill_pid_info_as_cred and security_task_kill
2018-04-07 11:11:41 -07:00
Linus Torvalds
f605ba97fb Merge tag 'vfio-v4.17-rc1' of git://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:

 - Adopt iommu_unmap_fast() interface to type1 backend
   (Suravee Suthikulpanit)

 - mdev sample driver fixup (Shunyong Yang)

 - More efficient PFN mapping handling in type1 backend
   (Jason Cai)

 - VFIO device ioeventfd interface (Alex Williamson)

 - Tag new vfio-platform sub-maintainer (Alex Williamson)

* tag 'vfio-v4.17-rc1' of git://github.com/awilliam/linux-vfio:
  MAINTAINERS: vfio/platform: Update sub-maintainer
  vfio/pci: Add ioeventfd support
  vfio/pci: Use endian neutral helpers
  vfio/pci: Pull BAR mapping setup from read-write path
  vfio/type1: Improve memory pinning process for raw PFN mapping
  vfio-mdev/samples: change RDI interrupt condition
  vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
2018-04-06 19:44:27 -07:00
Linus Torvalds
016c6f25d1 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull fw_cfg, vhost updates from Michael Tsirkin:
 "This cleans up the qemu fw cfg device driver.

  On top of this, vmcore is dumped there on crash to help debugging
  with kASLR enabled.

  Also included are some fixes in vhost"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost: add vsock compat ioctl
  vhost: fix vhost ioctl signature to build with clang
  fw_cfg: write vmcoreinfo details
  crash: export paddr_vmcoreinfo_note()
  fw_cfg: add DMA register
  fw_cfg: add a public uapi header
  fw_cfg: handle fw_cfg_read_blob() error
  fw_cfg: remove inline from fw_cfg_read_blob()
  fw_cfg: fix sparse warnings around FW_CFG_FILE_DIR read
  fw_cfg: fix sparse warning reading FW_CFG_ID
  fw_cfg: fix sparse warnings with fw_cfg_file
  fw_cfg: fix sparse warnings in fw_cfg_sel_endianness()
  ptr_ring: fix build
2018-04-06 19:21:41 -07:00
Linus Torvalds
3c0d551e02 Merge tag 'pci-v4.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:

 - move pci_uevent_ers() out of pci.h (Michael Ellerman)

 - skip ASPM common clock warning if BIOS already configured it (Sinan
   Kaya)

 - fix ASPM Coverity warning about threshold_ns (Gustavo A. R. Silva)

 - remove last user of pci_get_bus_and_slot() and the function itself
   (Sinan Kaya)

 - add decoding for 16 GT/s link speed (Jay Fang)

 - add interfaces to get max link speed and width (Tal Gilboa)

 - add pcie_bandwidth_capable() to compute max supported link bandwidth
   (Tal Gilboa)

 - add pcie_bandwidth_available() to compute bandwidth available to
   device (Tal Gilboa)

 - add pcie_print_link_status() to log link speed and whether it's
   limited (Tal Gilboa)

 - use PCI core interfaces to report when device performance may be
   limited by its slot instead of doing it in each driver (Tal Gilboa)

 - fix possible cpqphp NULL pointer dereference (Shawn Lin)

 - rescan more of the hierarchy on ACPI hotplug to fix Thunderbolt/xHCI
   hotplug (Mika Westerberg)

 - add support for PCI I/O port space that's neither directly accessible
   via CPU in/out instructions nor directly mapped into CPU physical
   memory space. This is fairly intrusive and includes minor changes to
   interfaces used for I/O space on most platforms (Zhichang Yuan, John
   Garry)

 - add support for HiSilicon Hip06/Hip07 LPC I/O space (Zhichang Yuan,
   John Garry)

 - use PCI_EXP_DEVCTL2_COMP_TIMEOUT in rapidio/tsi721 (Bjorn Helgaas)

 - remove possible NULL pointer dereference in of_pci_bus_find_domain_nr()
   (Shawn Lin)

 - report quirk timings with dev_info (Bjorn Helgaas)

 - report quirks that take longer than 10ms (Bjorn Helgaas)

 - add and use Altera Vendor ID (Johannes Thumshirn)

 - tidy Makefiles and comments (Bjorn Helgaas)

 - don't set up INTx if MSI or MSI-X is enabled to align cris, frv,
   ia64, and mn10300 with x86 (Bjorn Helgaas)

 - move pcieport_if.h to drivers/pci/pcie/ to encapsulate it (Frederick
   Lawler)

 - merge pcieport_if.h into portdrv.h (Bjorn Helgaas)

 - move workaround for BIOS PME issue from portdrv to PCI core (Bjorn
   Helgaas)

 - completely disable portdrv with "pcie_ports=compat" (Bjorn Helgaas)

 - remove portdrv link order dependency (Bjorn Helgaas)

 - remove support for unused VC portdrv service (Bjorn Helgaas)

 - simplify portdrv feature permission checking (Bjorn Helgaas)

 - remove "pcie_hp=nomsi" parameter (use "pci=nomsi" instead) (Bjorn
   Helgaas)

 - remove unnecessary "pcie_ports=auto" parameter (Bjorn Helgaas)

 - use cached AER capability offset (Frederick Lawler)

 - don't enable DPC if BIOS hasn't granted AER control (Mika Westerberg)

 - rename pcie-dpc.c to dpc.c (Bjorn Helgaas)

 - use generic pci_mmap_resource_range() instead of powerpc and xtensa
   arch-specific versions (David Woodhouse)

 - support arbitrary PCI host bridge offsets on sparc (Yinghai Lu)

 - remove System and Video ROM reservations on sparc (Bjorn Helgaas)

 - probe for device reset support during enumeration instead of runtime
   (Bjorn Helgaas)

 - add ACS quirk for Ampere (née APM) root ports (Feng Kan)

 - add function 1 DMA alias quirk for Marvell 88SE9220 (Thomas
   Vincent-Cross)

 - protect device restore with device lock (Sinan Kaya)

 - handle failure of FLR gracefully (Sinan Kaya)

 - handle CRS (config retry status) after device resets (Sinan Kaya)

 - skip various config reads for SR-IOV VFs as an optimization
   (KarimAllah Ahmed)

 - consolidate VPD code in vpd.c (Bjorn Helgaas)

 - add Tegra dependency on PCI_MSI_IRQ_DOMAIN (Arnd Bergmann)

 - add DT support for R-Car r8a7743 (Biju Das)

 - fix a PCI_EJECT vs PCI_BUS_RELATIONS race condition in Hyper-V host
   bridge driver that causes a general protection fault (Dexuan Cui)

 - fix Hyper-V host bridge hang in MSI setup on 1-vCPU VMs with SR-IOV
   (Dexuan Cui)

 - fix Hyper-V host bridge hang when ejecting a VF before setting up MSI
   (Dexuan Cui)

 - make several structures static (Fengguang Wu)

 - increase number of MSI IRQs supported by Synopsys DesignWare bridges
   from 32 to 256 (Gustavo Pimentel)

 - implemented multiplexed IRQ domain API and remove obsolete MSI IRQ
   API from DesignWare drivers (Gustavo Pimentel)

 - add Tegra power management support (Manikanta Maddireddy)

 - add Tegra loadable module support (Manikanta Maddireddy)

 - handle 64-bit BARs correctly in endpoint support (Niklas Cassel)

 - support optional regulator for HiSilicon STB (Shawn Guo)

 - use regulator bulk API for Qualcomm apq8064 (Srinivas Kandagatla)

 - support power supplies for Qualcomm msm8996 (Srinivas Kandagatla)

* tag 'pci-v4.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (123 commits)
  MAINTAINERS: Add John Garry as maintainer for HiSilicon LPC driver
  HISI LPC: Add ACPI support
  ACPI / scan: Do not enumerate Indirect IO host children
  ACPI / scan: Rename acpi_is_serial_bus_slave() for more general use
  HISI LPC: Support the LPC host on Hip06/Hip07 with DT bindings
  of: Add missing I/O range exception for indirect-IO devices
  PCI: Apply the new generic I/O management on PCI IO hosts
  PCI: Add fwnode handler as input param of pci_register_io_range()
  PCI: Remove __weak tag from pci_register_io_range()
  MAINTAINERS: Add missing /drivers/pci/cadence directory entry
  fm10k: Report PCIe link properties with pcie_print_link_status()
  net/mlx5e: Use pcie_bandwidth_available() to compute bandwidth
  net/mlx5: Report PCIe link properties with pcie_print_link_status()
  net/mlx4_core: Report PCIe link properties with pcie_print_link_status()
  PCI: Add pcie_print_link_status() to log link speed and whether it's limited
  PCI: Add pcie_bandwidth_available() to compute bandwidth available to device
  misc: pci_endpoint_test: Handle 64-bit BARs properly
  PCI: designware-ep: Make dw_pcie_ep_reset_bar() handle 64-bit BARs properly
  PCI: endpoint: Make sure that BAR_5 does not have 64-bit flag set when clearing
  PCI: endpoint: Make epc->ops->clear_bar()/pci_epc_clear_bar() take struct *epf_bar
  ...
2018-04-06 18:31:06 -07:00
Linus Torvalds
19fd08b85b Merge tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
 "Doug and I are at a conference next week so if another PR is sent I
  expect it to only be bug fixes. Parav noted yesterday that there are
  some fringe case behavior changes in his work that he would like to
  fix, and I see that Intel has a number of rc looking patches for HFI1
  they posted yesterday.

  Parav is again the biggest contributor by patch count with his ongoing
  work to enable container support in the RDMA stack, followed by Leon
  doing syzkaller inspired cleanups, though most of the actual fixing
  went to RC.

  There is one uncomfortable series here fixing the user ABI to actually
  work as intended in 32 bit mode. There are lots of notes in the commit
  messages, but the basic summary is we don't think there is an actual
  32 bit kernel user of drivers/infiniband for several good reasons.

  However we are seeing people want to use a 32 bit user space with 64
  bit kernel, which didn't completely work today. So in fixing it we
  required a 32 bit rxe user to upgrade their userspace. rxe users are
  still already quite rare and we think a 32 bit one is non-existing.

   - Fix RDMA uapi headers to actually compile in userspace and be more
     complete

   - Three shared with netdev pull requests from Mellanox:

      * 7 patches, mostly to net with 1 IB related one at the back).
        This series addresses an IRQ performance issue (patch 1),
        cleanups related to the fix for the IRQ performance problem
        (patches 2-6), and then extends the fragmented completion queue
        support that already exists in the net side of the driver to the
        ib side of the driver (patch 7).

      * Mostly IB, with 5 patches to net that are needed to support the
        remaining 10 patches to the IB subsystem. This series extends
        the current 'representor' framework when the mlx5 driver is in
        switchdev mode from being a netdev only construct to being a
        netdev/IB dev construct. The IB dev is limited to raw Eth queue
        pairs only, but by having an IB dev of this type attached to the
        representor for a switchdev port, it enables DPDK to work on the
        switchdev device.

      * All net related, but needed as infrastructure for the rdma
        driver

   - Updates for the hns, i40iw, bnxt_re, cxgb3, cxgb4, hns drivers

   - SRP performance updates

   - IB uverbs write path cleanup patch series from Leon

   - Add RDMA_CM support to ib_srpt. This is disabled by default. Users
     need to set the port for ib_srpt to listen on in configfs in order
     for it to be enabled
     (/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port)

   - TSO and Scatter FCS support in mlx4

   - Refactor of modify_qp routine to resolve problems seen while
     working on new code that is forthcoming

   - More refactoring and updates of RDMA CM for containers support from
     Parav

   - mlx5 'fine grained packet pacing', 'ipsec offload' and 'device
     memory' user API features

   - Infrastructure updates for the new IOCTL interface, based on
     increased usage

   - ABI compatibility bug fixes to fully support 32 bit userspace on 64
     bit kernel as was originally intended. See the commit messages for
     extensive details

   - Syzkaller bugs and code cleanups motivated by them"

* tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (199 commits)
  IB/rxe: Fix for oops in rxe_register_device on ppc64le arch
  IB/mlx5: Device memory mr registration support
  net/mlx5: Mkey creation command adjustments
  IB/mlx5: Device memory support in mlx5_ib
  net/mlx5: Query device memory capabilities
  IB/uverbs: Add device memory registration ioctl support
  IB/uverbs: Add alloc/free dm uverbs ioctl support
  IB/uverbs: Add device memory capabilities reporting
  IB/uverbs: Expose device memory capabilities to user
  RDMA/qedr: Fix wmb usage in qedr
  IB/rxe: Removed GID add/del dummy routines
  RDMA/qedr: Zero stack memory before copying to user space
  IB/mlx5: Add ability to hash by IPSEC_SPI when creating a TIR
  IB/mlx5: Add information for querying IPsec capabilities
  IB/mlx5: Add IPsec support for egress and ingress
  {net,IB}/mlx5: Add ipsec helper
  IB/mlx5: Add modify_flow_action_esp verb
  IB/mlx5: Add implementation for create and destroy action_xfrm
  IB/uverbs: Introduce ESP steering match filter
  IB/uverbs: Add modify ESP flow_action
  ...
2018-04-06 17:35:43 -07:00
Linus Torvalds
28da7be5eb Merge tag 'mailbox-v4.17' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:

 - New Hi3660 mailbox driver

 - Fix TEGRA Kconfig warning

 - Broadcom: use dma_pool_zalloc instead of dma_pool_alloc+memset

* tag 'mailbox-v4.17' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox: Add support for Hi3660 mailbox
  dt-bindings: mailbox: Introduce Hi3660 controller binding
  mailbox: tegra: relax TEGRA_HSP_MBOX Kconfig dependencies
  maillbox: bcm-flexrm-mailbox: Use dma_pool_zalloc()
2018-04-06 17:20:14 -07:00
Linus Torvalds
3b54765cca Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - a few misc things

 - ocfs2 updates

 - the v9fs maintainers have been missing for a long time. I've taken
   over v9fs patch slinging.

 - most of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (116 commits)
  mm,oom_reaper: check for MMF_OOM_SKIP before complaining
  mm/ksm: fix interaction with THP
  mm/memblock.c: cast constant ULLONG_MAX to phys_addr_t
  headers: untangle kmemleak.h from mm.h
  include/linux/mmdebug.h: make VM_WARN* non-rvals
  mm/page_isolation.c: make start_isolate_page_range() fail if already isolated
  mm: change return type to vm_fault_t
  mm, oom: remove 3% bonus for CAP_SYS_ADMIN processes
  mm, page_alloc: wakeup kcompactd even if kswapd cannot free more memory
  kernel/fork.c: detect early free of a live mm
  mm: make counting of list_lru_one::nr_items lockless
  mm/swap_state.c: make bool enable_vma_readahead and swap_vma_readahead() static
  block_invalidatepage(): only release page if the full page was invalidated
  mm: kernel-doc: add missing parameter descriptions
  mm/swap.c: remove @cold parameter description for release_pages()
  mm/nommu: remove description of alloc_vm_area
  zram: drop max_zpage_size and use zs_huge_class_size()
  zsmalloc: introduce zs_huge_class_size()
  mm: fix races between swapoff and flush dcache
  fs/direct-io.c: minor cleanups in do_blockdev_direct_IO
  ...
2018-04-06 14:19:26 -07:00
Linus Torvalds
3fd14cdcc0 Merge tag 'mtd/for-4.17' of git://git.infradead.org/linux-mtd
Pull MTD updates from Boris Brezillon:
 "MTD Core:
   - Remove support for asynchronous erase (not implemented by any of
     the existing drivers anyway)
   - Remove Cyrille from the list of SPI NOR and MTD maintainers
   - Fix kernel doc headers
   - Allow users to define the partitions parsers they want to test
     through a DT property (compatible of the partitions subnode)
   - Remove the bfin-async-flash driver (the only architecture using it
     has been removed)
   - Fix pagetest test
   - Add extra checks in mtd_erase()
   - Simplify the MTD partition creation logic and get rid of
     mtd_add_device_partitions()

  MTD Drivers:
   - Add endianness information to the physmap DT binding
   - Add Eon EN29LV400A IDs to JEDEC probe logic
   - Use %*ph where appropriate

  SPI NOR Drivers:
   - Make fsl-quaspi assign different names to MTD devices connected to
     the same QSPI controller
   - Remove an unneeded driver.bus assigned in the fsl-qspi driver

  NAND Core:
   - Prepare arrival of the SPI NAND subsystem by implementing a generic
     (interface-agnostic) layer to ease manipulation of NAND devices
   - Move onenand code base to the drivers/mtd/nand/ dir
   - Rework timing mode selection
   - Provide a generic way for NAND chip drivers to flag a specific
     GET/SET FEATURE operation as supported/unsupported
   - Stop embedding ONFI/JEDEC param page in nand_chip

  NAND Drivers:
   - Rework/cleanup of the mxc driver
   - Various cleanups in the vf610 driver
   - Migrate the fsmc and vf610 to ->exec_op()
   - Get rid of the pxa driver (replaced by marvell_nand)
   - Support ->setup_data_interface() in the GPMI driver
   - Fix probe error path in several drivers
   - Remove support for unused hw_syndrome mode in sunxi_nand
   - Various minor improvements"

* tag 'mtd/for-4.17' of git://git.infradead.org/linux-mtd: (89 commits)
  dt-bindings: fsl-quadspi: Add the example of two SPI NOR
  mtd: fsl-quadspi: Distinguish the mtd device names
  mtd: nand: Fix some function description mismatches in core.c
  mtd: fsl-quadspi: Remove unneeded driver.bus assignment
  mtd: rawnand: marvell: Rename ->ecc_clk into ->core_clk
  mtd: rawnand: s3c2410: enhance the probe function error path
  mtd: rawnand: tango: fix probe function error path
  mtd: rawnand: sh_flctl: fix the probe function error path
  mtd: rawnand: omap2: fix the probe function error path
  mtd: rawnand: mxc: fix probe function error path
  mtd: rawnand: denali: fix probe function error path
  mtd: rawnand: davinci: fix probe function error path
  mtd: rawnand: cafe: fix probe function error path
  mtd: rawnand: brcmnand: fix probe function error path
  mtd: rawnand: sunxi: Stop supporting ECC_HW_SYNDROME mode
  mtd: rawnand: marvell: Fix clock resource by adding a register clock
  mtd: ftl: Use DIV_ROUND_UP()
  mtd: Fix some function description mismatches in mtdcore.c
  mtd: physmap_of: update struct map_info's swap as per map requirement
  dt-bindings: mtd-physmap: Add endianness supports
  ...
2018-04-06 12:15:41 -07:00
Linus Torvalds
83c7c18b16 Merge tag 'for-4.17/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:

 - DM core passthrough ioctl fix to retain reference to DM table, and
   that table's block devices, while issuing the ioctl to one of those
   block devices.

 - DM core passthrough ioctl fix to _not_ override the fmode_t used to
   issue the ioctl. Overriding by using the fmode_t that the block
   device was originally open with during DM table load is a liability.

 - Add DM core support for secure erase forwarding and update the DM
   linear and DM striped targets to support them.

 - A DM core 4.16 stable fix to allow abnormal IO (e.g. discard, write
   same, write zeroes) for targets that make use of the non-splitting IO
   variant (as is done for multipath or thinp when layered directly on
   NVMe).

 - Allow DM targets to return a payload in response to a DM message that
   they are sent. This is useful for DM targets that would like to
   provide statistics data in response to DM messages.

 - Update DM bufio to support non-power-of-2 block sizes. Numerous other
   related changes prepare the DM bufio code for this support.

 - Fix DM crypt to use a bounded amount of memory across the entire
   system. This is to avoid OOM that can otherwise occur in response to
   certain pathological IO workloads (e.g. discarding a large DM crypt
   device).

 - Add a 'check_at_most_once' feature to the DM verity target to allow
   verity to be used on mobile devices that have very limited resources.

 - Fix the DM integrity target to fail early if a keyed algorithm (e.g.
   HMAC) is to be used but the key isn't set.

 - Add non-power-of-2 support to the DM unstripe target.

 - Eliminate the use of a Variable Length Array in the DM stripe target.

 - Update the DM log-writes target to record metadata (REQ_META flag).

 - DM raid fixes for its nosync status and some variable range issues.

* tag 'for-4.17/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (28 commits)
  dm: remove fmode_t argument from .prepare_ioctl hook
  dm: hold DM table for duration of ioctl rather than use blkdev_get
  dm raid: fix parse_raid_params() variable range issue
  dm verity: make verity_for_io_block static
  dm verity: add 'check_at_most_once' option to only validate hashes once
  dm bufio: don't embed a bio in the dm_buffer structure
  dm bufio: support non-power-of-two block sizes
  dm bufio: use slab cache for dm_buffer structure allocations
  dm bufio: reorder fields in dm_buffer structure
  dm bufio: relax alignment constraint on slab cache
  dm bufio: remove code that merges slab caches
  dm bufio: get rid of slab cache name allocations
  dm bufio: move dm-bufio.h to include/linux/
  dm bufio: delete outdated comment
  dm: add support for secure erase forwarding
  dm: backfill abnormal IO support to non-splitting IO submission
  dm raid: fix nosync status
  dm mpath: use DM_MAPIO_SUBMITTED instead of magic number 0 in process_queued_bios()
  dm stripe: get rid of a Variable Length Array (VLA)
  dm log writes: record metadata flag for better flags record
  ...
2018-04-06 11:50:19 -07:00
Linus Torvalds
9022ca6b11 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "Assorted stuff, including Christoph's I_DIRTY patches"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: move I_DIRTY_INODE to fs.h
  ubifs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) call
  ntfs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) call
  gfs2: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) calls
  fs: fold open_check_o_direct into do_dentry_open
  vfs: Replace stray non-ASCII homoglyph characters with their ASCII equivalents
  vfs: make sure struct filename->iname is word-aligned
  get rid of pointless includes of fs_struct.h
  [poll] annotate SAA6588_CMD_POLL users
2018-04-06 11:07:08 -07:00
Esben Haabendal
dd9a122ae9 net: phy: marvell: Enable interrupt function on LED2 pin
The LED2[2]/INTn pin on Marvell 88E1318S as well as 88E1510/12/14/18 needs
to be configured to be usable as interrupt not only when WOL is enabled,
but whenever we rely on interrupts from the PHY.

Signed-off-by: Esben Haabendal <eha@deif.com>
Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-06 13:36:57 -04:00
Russell King
64b2f129c3 ARM: sa1100/simpad: switch simpad CF to use gpiod APIs
Switch simpad's CF implementation to use the gpiod APIs.  The inverted
detection is handled using gpiolib's native inversion abilities.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-04-06 15:53:22 +01:00
Russell King
b51af86559 ARM: sa1100/shannon: convert to generic CF sockets
Convert shannon to use the generic CF socket support.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-04-06 15:53:22 +01:00
Russell King
80c799dbf8 ARM: sa1100/nanoengine: convert to generic CF sockets
Convert nanoengine to use the generic CF socket support.
Makefile fix from Arnd Bergmann <arnd@arndb.de>.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-04-06 15:53:00 +01:00
Anirudh Venkataramanan
cba5957d7e ice: Bug fixes in ethtool code
1) Return correct size from ice_get_regs_len.
2) Fix incorrect use of ARRAY_SIZE in ice_get_regs.

Fixes: fcea6f3da5 (ice: Add stats and ethtool support)
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-06 07:00:09 -07:00
Wei Yongjun
63bb4e1ebd ice: Fix error return code in ice_init_hw()
Fix to return error code ICE_ERR_NO_MEMORY from the alloc error
handling case instead of 0, as done elsewhere in this function.

Fixes: dc49c77236 ("ice: Get MAC/PHY/link info and scheduler topology")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-06 07:00:09 -07:00
Randy Dunlap
514c603249 headers: untangle kmemleak.h from mm.h
Currently <linux/slab.h> #includes <linux/kmemleak.h> for no obvious
reason.  It looks like it's only a convenience, so remove kmemleak.h
from slab.h and add <linux/kmemleak.h> to any users of kmemleak_* that
don't already #include it.  Also remove <linux/kmemleak.h> from source
files that do not use it.

This is tested on i386 allmodconfig and x86_64 allmodconfig.  It would
be good to run it through the 0day bot for other $ARCHes.  I have
neither the horsepower nor the storage space for the other $ARCHes.

Update: This patch has been extensively build-tested by both the 0day
bot & kisskb/ozlabs build farms.  Both of them reported 2 build failures
for which patches are included here (in v2).

[ slab.h is the second most used header file after module.h; kernel.h is
  right there with slab.h. There could be some minor error in the
  counting due to some #includes having comments after them and I didn't
  combine all of those. ]

[akpm@linux-foundation.org: security/keys/big_key.c needs vmalloc.h, per sfr]
Link: http://lkml.kernel.org/r/e4309f98-3749-93e1-4bb7-d9501a39d015@infradead.org
Link: http://kisskb.ellerman.id.au/kisskb/head/13396/
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>	[2 build failures]
Reported-by: Fengguang Wu <fengguang.wu@intel.com>	[2 build failures]
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-05 21:36:27 -07:00
Sergey Senozhatsky
60f5921a9a zram: drop max_zpage_size and use zs_huge_class_size()
Remove ZRAM's enforced "huge object" value and use zsmalloc huge-class
watermark instead, which makes more sense.

TEST
- I used a 1G zram device, LZO compression back-end, original
  data set size was 444MB. Looking at zsmalloc classes stats the
  test ended up to be pretty fair.

BASE ZRAM/ZSMALLOC
=====================
zram mm_stat

498978816 191482495 199831552        0 199831552    15634        0

zsmalloc classes

 class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
   151  2448           0            0          1240       1240        744                3        0
   168  2720           0            0          4200       4200       2800                2        0
   190  3072           0            0         10100      10100       7575                3        0
   202  3264           0            0           380        380        304                4        0
   254  4096           0            0         10620      10620      10620                1        0

 Total                 7           46        106982     106187      48787                         0

PATCHED ZRAM/ZSMALLOC
=====================

zram mm_stat

498978816 182579184 194248704        0 194248704    15628        0

zsmalloc classes

 class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
   151  2448           0            0          1240       1240        744                3        0
   168  2720           0            0          4200       4200       2800                2        0
   190  3072           0            0         10100      10100       7575                3        0
   202  3264           0            0          7180       7180       5744                4        0
   254  4096           0            0          3820       3820       3820                1        0

 Total                 8           45        106959     106193      47424                         0

As we can see, we reduced the number of objects stored in class-4096,
because a huge number of objects which we previously forcibly stored in
class-4096 now stored in non-huge class-3264.  This results in lower
memory consumption:

- zsmalloc now uses 47424 physical pages, which is less than 48787 pages
  zsmalloc used before.

- objects that we store in class-3264 share zspages.  That's why overall
  the number of pages that both class-4096 and class-3264 consumed went
  down from 10924 to 9564.

[sergey.senozhatsky.work@gmail.com: add pool param to zs_huge_class_size()]
  Link: http://lkml.kernel.org/r/20180314081833.1096-3-sergey.senozhatsky@gmail.com
Link: http://lkml.kernel.org/r/20180306070639.7389-3-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-05 21:36:26 -07:00
Dan Williams
c1d53b92b9 device-dax: implement ->pagesize() for smaps to report MMUPageSize
Given that device-dax is making similar page mapping size guarantees as
hugetlbfs, emit the size in smaps and any other kernel path that
requests the mapping size of a vma.

Link: http://lkml.kernel.org/r/151996255287.27922.18397777516059080245.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-05 21:36:26 -07:00
Pavel Tatashin
d0dc12e86b mm/memory_hotplug: optimize memory hotplug
During memory hotplugging we traverse struct pages three times:

1. memset(0) in sparse_add_one_section()
2. loop in __add_section() to set do: set_page_node(page, nid); and
   SetPageReserved(page);
3. loop in memmap_init_zone() to call __init_single_pfn()

This patch removes the first two loops, and leaves only loop 3.  All
struct pages are initialized in one place, the same as it is done during
boot.

The benefits:

 - We improve memory hotplug performance because we are not evicting the
   cache several times and also reduce loop branching overhead.

 - Remove condition from hotpath in __init_single_pfn(), that was added
   in order to fix the problem that was reported by Bharata in the above
   email thread, thus also improve performance during normal boot.

 - Make memory hotplug more similar to the boot memory initialization
   path because we zero and initialize struct pages only in one
   function.

 - Simplifies memory hotplug struct page initialization code, and thus
   enables future improvements, such as multi-threading the
   initialization of struct pages in order to improve hotplug
   performance even further on larger machines.

[pasha.tatashin@oracle.com: v5]
  Link: http://lkml.kernel.org/r/20180228030308.1116-7-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180215165920.8570-7-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-05 21:36:25 -07:00
Pavel Tatashin
fc44f7f923 mm/memory_hotplug: don't read nid from struct page during hotplug
During memory hotplugging the probe routine will leave struct pages
uninitialized, the same as it is currently done during boot.  Therefore,
we do not want to access the inside of struct pages before
__init_single_page() is called during onlining.

Because during hotplug we know that pages in one memory block belong to
the same numa node, we can skip the checking.  We should keep checking
for the boot case.

[pasha.tatashin@oracle.com: s/register_new_memory()/hotplug_memory_register()]
  Link: http://lkml.kernel.org/r/20180228030308.1116-6-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180215165920.8570-6-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-05 21:36:25 -07:00
Pavel Tatashin
b77eab7079 mm/memory_hotplug: optimize probe routine
When memory is hotplugged pages_correctly_reserved() is called to verify
that the added memory is present, this routine traverses through every
struct page and verifies that PageReserved() is set.  This is a slow
operation especially if a large amount of memory is added.

Instead of checking every page, it is enough to simply check that the
section is present, has mapping (struct page array is allocated), and
the mapping is online.

In addition, we should not excpect that probe routine sets flags in
struct page, as the struct pages have not yet been initialized.  The
initialization should be done in __init_single_page(), the same as
during boot.

Link: http://lkml.kernel.org/r/20180215165920.8570-5-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-05 21:36:25 -07:00
Linus Torvalds
38c23685b2 Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Arnd Bergmann:
 "The main addition this time around is the new ARM "SCMI" framework,
  which is the latest in a series of standards coming from ARM to do
  power management in a platform independent way.

  This has been through many review cycles, and it relies on a rather
  interesting way of using the mailbox subsystem, but in the end I
  agreed that Sudeep's version was the best we could do after all.

  Other changes include:

   - the ARM CCN driver is moved out of drivers/bus into drivers/perf,
     which makes more sense. Similarly, the performance monitoring
     portion of the CCI driver are moved the same way and cleaned up a
     little more.

   - a series of updates to the SCPI framework

   - support for the Mediatek mt7623a SoC in drivers/soc

   - support for additional NVIDIA Tegra hardware in drivers/soc

   - a new reset driver for Socionext Uniphier

   - lesser bug fixes in drivers/soc, drivers/tee, drivers/memory, and
     drivers/firmware and drivers/reset across platforms"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (87 commits)
  reset: uniphier: add ethernet reset control support for PXs3
  reset: stm32mp1: Enable stm32mp1 reset driver
  dt-bindings: reset: add STM32MP1 resets
  reset: uniphier: add Pro4/Pro5/PXs2 audio systems reset control
  reset: imx7: add 'depends on HAS_IOMEM' to fix unmet dependency
  reset: modify the way reset lookup works for board files
  reset: add support for non-DT systems
  clk: scmi: use devm_of_clk_add_hw_provider() API and drop scmi_clocks_remove
  firmware: arm_scmi: prevent accessing rate_discrete uninitialized
  hwmon: (scmi) return -EINVAL when sensor information is unavailable
  amlogic: meson-gx-socinfo: Update soc ids
  soc/tegra: pmc: Use the new reset APIs to manage reset controllers
  soc: mediatek: update power domain data of MT2712
  dt-bindings: soc: update MT2712 power dt-bindings
  cpufreq: scmi: add thermal dependency
  soc: mediatek: fix the mistaken pointer accessed when subdomains are added
  soc: mediatek: add SCPSYS power domain driver for MediaTek MT7623A SoC
  soc: mediatek: avoid hardcoded value with bus_prot_mask
  dt-bindings: soc: add header files required for MT7623A SCPSYS dt-binding
  dt-bindings: soc: add SCPSYS binding for MT7623 and MT7623A SoC
  ...
2018-04-05 21:29:35 -07:00