Commit Graph

115849 Commits

Author SHA1 Message Date
Mark Brown
a21da94f61 Merge branch 'regulator-5.5' into regulator-next 2019-11-22 19:56:20 +00:00
Mark Brown
8c4d2a0bfb Merge branch 'asoc-5.5' into asoc-next 2019-11-22 19:56:02 +00:00
Chuck Lever
a264abad51 SUNRPC: Capture completion of all RPC tasks
RPC tasks on the backchannel never invoke xprt_complete_rqst(), so
there is no way to report their tk_status at completion. Also, any
RPC task that exits via rpc_exit_task() before it is replied to will
also disappear without a trace.

Introduce a trace point that is symmetrical with rpc_task_begin that
captures the termination status of each RPC task.

Sample trace output for callback requests initiated on the server:
   kworker/u8:12-448   [003]   127.025240: rpc_task_end:         task:50@3 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpc_exit_task
   kworker/u8:12-448   [002]   127.567310: rpc_task_end:         task:51@3 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpc_exit_task
   kworker/u8:12-448   [001]   130.506817: rpc_task_end:         task:52@3 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpc_exit_task

Odd, though, that I never see trace_rpc_task_complete, either in the
forward or backchannel. Should it be removed?

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-22 19:09:38 +01:00
Petr Machata
d1746d1e80 net: flow_dissector: Wrap unionized VLAN fields in a struct
In commit a82055af59 ("netfilter: nft_payload: add VLAN offload
support"), VLAN fields in struct flow_dissector_key_vlan were unionized
with the intention of introducing another field that covered the whole TCI
header. However without a wrapping struct the subfields end up sharing the
same bits. As a result, "tc filter add ... flower vlan_id 14" specifies not
only vlan_id, but also vlan_priority.

Fix by wrapping the individual VLAN fields in a struct.

Fixes: a82055af59 ("netfilter: nft_payload: add VLAN offload support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22 09:44:14 -08:00
Florian Westphal
677bf08cfd udp: drop skb extensions before marking skb stateless
Once udp stack has set the UDP_SKB_IS_STATELESS flag, later skb free
assumes all skb head state has been dropped already.

This will leak the extension memory in case the skb has extensions other
than the ipsec secpath, e.g. bridge nf data.

To fix this, set the UDP_SKB_IS_STATELESS flag only if we don't have
extensions or if the extension space can be free'd.

Fixes: 895b5c9f20 ("netfilter: drop bridge nf reset from nf_reset")
Cc: Paolo Abeni <pabeni@redhat.com>
Reported-by: Byron Stanoszek <gandalf@winds.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22 09:28:46 -08:00
Danit Goldberg
bfcb3c5d14 IB/core: Add interfaces to get VF node and port GUIDs
Provide ability to get node and port GUIDs of VFs to be symmetrical
to already existing set option.

Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-11-22 18:17:24 +02:00
Danit Goldberg
30aad41721 net/core: Add support for getting VF GUIDs
Introduce a new ndo: ndo_get_vf_guid, to get from the net
device the port and node GUID.

New applications can choose to use this interface to show
GUIDs with iproute2 with commands such as:

- ip link show ib4
ib4: <BROADCAST,MULTICAST> mtu 4092 qdisc noop state DOWN mode DEFAULT group default qlen 256
link/infiniband 00:00:0a:2d:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:44:36:8d brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
vf 0     link/infiniband 00:00:0a:2d:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:44:36:8d brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff,
spoof checking off, NODE_GUID 22:44:33:00:33:11:00:33, PORT_GUID 10:21:33:12:00:11:22:10, link-state disable, trust off, query_rss off

Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-11-22 18:16:27 +02:00
Takashi Iwai
1e5ddb6ba7 ASoC: component: Add sync_stop PCM ops
Add the support of the new PCM sync_stop ops in ASoC component.
It's optional and can be NULL unless you need the sync operation.

Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20191121190709.29121-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-11-22 14:38:01 +01:00
Toke Høiland-Jørgensen
7a89233ac5 mac80211: Use Airtime-based Queue Limits (AQL) on packet dequeue
The previous commit added the ability to throttle stations when they queue
too much airtime in the hardware. This commit enables the functionality by
calculating the expected airtime usage of each packet that is dequeued from
the TXQs in mac80211, and accounting that as pending airtime.

The estimated airtime for each skb is stored in the tx_info, so we can
subtract the same amount from the running total when the skb is freed or
recycled. The throttling mechanism relies on this accounting to be
accurate (i.e., that we are not freeing skbs without subtracting any
airtime they were accounted for), so we put the subtraction into
ieee80211_report_used_skb(). As an optimisation, we also subtract the
airtime on regular TX completion, zeroing out the value stored in the
packet afterwards, to avoid having to do an expensive lookup of the station
from the packet data on every packet.

This patch does *not* include any mechanism to wake a throttled TXQ again,
on the assumption that this will happen anyway as a side effect of whatever
freed the skb (most commonly a TX completion).

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20191119060610.76681-5-kyan@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-11-22 13:36:25 +01:00
Kan Yan
3ace10f5b5 mac80211: Implement Airtime-based Queue Limit (AQL)
In order for the Fq_CoDel algorithm integrated in mac80211 layer to operate
effectively to control excessive queueing latency, the CoDel algorithm
requires an accurate measure of how long packets stays in the queue, AKA
sojourn time. The sojourn time measured at the mac80211 layer doesn't
include queueing latency in the lower layer (firmware/hardware) and CoDel
expects lower layer to have a short queue. However, most 802.11ac chipsets
offload tasks such TX aggregation to firmware or hardware, thus have a deep
lower layer queue.

Without a mechanism to control the lower layer queue size, packets only
stay in mac80211 layer transiently before being sent to firmware queue.
As a result, the sojourn time measured by CoDel in the mac80211 layer is
almost always lower than the CoDel latency target, hence CoDel does little
to control the latency, even when the lower layer queue causes excessive
latency.

The Byte Queue Limits (BQL) mechanism is commonly used to address the
similar issue with wired network interface. However, this method cannot be
applied directly to the wireless network interface. "Bytes" is not a
suitable measure of queue depth in the wireless network, as the data rate
can vary dramatically from station to station in the same network, from a
few Mbps to over Gbps.

This patch implements an Airtime-based Queue Limit (AQL) to make CoDel work
effectively with wireless drivers that utilized firmware/hardware
offloading. AQL allows each txq to release just enough packets to the lower
layer to form 1-2 large aggregations to keep hardware fully utilized and
retains the rest of the frames in mac80211 layer to be controlled by the
CoDel algorithm.

Signed-off-by: Kan Yan <kyan@google.com>
[ Toke: Keep API to set pending airtime internal, fix nits in commit msg ]
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20191119060610.76681-4-kyan@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-11-22 13:36:25 +01:00
Toke Høiland-Jørgensen
db3e1c40cf mac80211: Import airtime calculation code from mt76
Felix recently added code to calculate airtime of packets to the mt76
driver. Import this into mac80211 so we can use it for airtime queue limit
calculations.

The airtime.c file is copied verbatim from the mt76 driver, and adjusted to
be usable in mac80211. This involves:

- Switching to mac80211 data structures.
- Adding support for 160 MHz channels and HE mode.
- Moving the symbol and duration calculations around a bit to avoid
  rounding with the higher rates and longer symbol times used for HE rates.

The per-rate TX rate calculation is also split out to its own function so
it can be used directly for the AQL calculations later.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20191119060610.76681-3-kyan@google.com
[fix HE_GROUP_IDX() to use 3 * bw, since there are 3 _gi values]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-11-22 13:36:25 +01:00
Dexuan Cui
b96f86534f x86/hyperv: Implement hv_is_hibernation_supported()
The API will be used by the hv_balloon and hv_vmbus drivers.

Balloon up/down and hot-add of memory must not be active if the user
wants the Linux VM to support hibernation, because they are incompatible
with hibernation according to Hyper-V team, e.g. upon suspend the
balloon VSP doesn't save any info about the ballooned-out pages (if any);
so, after Linux resumes, Linux balloon VSC expects that the VSP will
return the pages if Linux is under memory pressure, but the VSP will
never do that, since the VSP thinks it never stole the pages from the VM.

So, if the user wants Linux VM to support hibernation, Linux must forbid
balloon up/down and hot-add, and the only functionality of the balloon VSC
driver is reporting the VM's memory pressure to the host.

Ideally, when Linux detects that the user wants it to support hibernation,
the balloon VSC should tell the VSP that it does not support ballooning
and hot-add. However, the current version of the VSP requires the VSC
should support these capabilities, otherwise the capability negotiation
fails and the VSC can not load at all, so with the later changes to the
VSC driver, Linux VM still reports to the VSP that the VSC supports these
capabilities, but the VSC ignores the VSP's requests of balloon up/down
and hot add, and reports an error to the VSP, when applicable. BTW, in
the future the balloon VSP driver will allow the VSC to not support the
capabilities of balloon up/down and hot add.

The ACPI S4 state is not a must for hibernation to work, because Linux is
able to hibernate as long as the system can shut down. However in practice
we decide to artificially use the presence of the virtual ACPI S4 state as
an indicator of the user's intent of using hibernation, because Linux VM
must find a way to know if the user wants to use the hibernation feature
or not.

By default, Hyper-V does not enable the virtual ACPI S4 state; on recent
Hyper-V hosts (e.g. RS5, 19H1), the administrator is able to enable the
state for a VM by WMI commands.

Once all the vmbus and VSC patches for the hibernation feature are
accepted, an extra patch will be submitted to forbid hibernation if the
virtual ACPI S4 state is absent, i.e. hv_is_hibernation_supported() is
false.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-21 20:10:45 -05:00
Branden Bonaby
af9ca6f9bb drivers: hv: vmbus: Introduce latency testing
Introduce user specified latency in the packet reception path
By exposing the test parameters as part of the debugfs channel
attributes. We will control the testing state via these attributes.

Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-21 20:10:44 -05:00
Andrea Parri
2d4f49b3e1 Drivers: hv: vmbus: Enable VMBus protocol versions 4.1, 5.1 and 5.2
Hyper-V has added VMBus protocol versions 5.1 and 5.2 in recent release
versions.  Allow Linux guests to negotiate these new protocol versions
on versions of Hyper-V that support them.  While on this, also allow
guests to negotiate the VMBus protocol version 4.1 (which was missing).

Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-21 20:10:43 -05:00
Andrea Parri
bedc61a922 Drivers: hv: vmbus: Introduce table of VMBus protocol versions
The technique used to get the next VMBus version seems increasisly
clumsy as the number of VMBus versions increases.  Performance is
not a concern since this is only done once during system boot; it's
just that we'll end up with more lines of code than is really needed.

As an alternative, introduce a table with the version numbers listed
in order (from the most recent to the oldest).  vmbus_connect() loops
through the versions listed in the table until it gets an accepted
connection or gets to the end of the table (invalid version).

Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-21 20:10:43 -05:00
Paolo Abeni
02b2494161 ipv4: use dst hint for ipv4 list receive
This is alike the previous change, with some additional ipv4 specific
quirk. Even when using the route hint we still have to do perform
additional per packet checks about source address validity: a new
helper is added to wrap them.

Hints are explicitly disabled if the destination is a local broadcast,
that keeps the code simple and local broadcast are a slower path anyway.

UDP flood performances vs recvmmsg() receiver:

vanilla		patched		delta
Kpps		Kpps		%
1683		1871		+11

In the worst case scenario - each packet has a different
destination address - the performance delta is within noise
range.

v3 -> v4:
 - re-enable hints for forward

v2 -> v3:
 - really fix build (sic) and hint usage check
 - use fib4_has_custom_rules() helpers (David A.)
 - add ip_extract_route_hint() helper (Edward C.)
 - use prev skb as hint instead of copying data (Willem)

v1 -> v2:
 - fix build issue with !CONFIG_IP_MULTIPLE_TABLES

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 14:45:55 -08:00
Paolo Abeni
c43c3d76c0 ipv4: move fib4_has_custom_rules() helper to public header
So that we can use it in the next patch.
Additionally constify the helper argument.

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 14:45:55 -08:00
Paolo Abeni
b9b33e7c24 ipv6: keep track of routes using src
Use a per namespace counter, increment it on successful creation
of any route using the source address, decrement it on deletion
of such routes.

This allows us to check easily if the routing decision in the
current namespace depends on the packet source. Will be used
by the next patch.

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 14:45:55 -08:00
Paolo Abeni
1f8ac57030 ipv6: add fib6_has_custom_rules() helper
It wraps the namespace field with the same name, to easily
access it regardless of build options.

Suggested-by: David Ahern <dsahern@gmail.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 14:45:55 -08:00
Yangbo Lu
400928bf92 net: mscc: ocelot: convert to use ocelot_port_add_txtstamp_skb()
Convert to use ocelot_port_add_txtstamp_skb() for adding skbs which
require TX timestamp into list. Export it so that DSA Felix driver
could reuse it too.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 14:39:02 -08:00
Yangbo Lu
e23a7b3e8d net: mscc: ocelot: convert to use ocelot_get_txtstamp()
The method getting TX timestamp by reading timestamp FIFO and
matching skbs list is common for DSA Felix driver too.
So move code out of ocelot_board.c, convert to use
ocelot_get_txtstamp() function and export it.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 14:39:02 -08:00
Yangbo Lu
f145922ddc net: mscc: ocelot: export ocelot_hwstamp_get/set functions
Export ocelot_hwstamp_get/set functions so that DSA driver
is able to reuse them.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 14:39:02 -08:00
Jiri Olsa
7599a896f2 audit: Move audit_log_task declaration under CONFIG_AUDITSYSCALL
The 0-DAY found that audit_log_task is not declared under
CONFIG_AUDITSYSCALL which causes compilation error when
it is not defined:

    kernel/bpf/syscall.o: In function `bpf_audit_prog.isra.30':
 >> syscall.c:(.text+0x860): undefined reference to `audit_log_task'

Adding the audit_log_task declaration and stub within
CONFIG_AUDITSYSCALL ifdef.

Fixes: 91e6015b08 ("bpf: Emit audit messages upon successful prog load and unload")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 12:14:49 -08:00
Xin Long
79b1011cb3 net: sched: allow flower to match erspan options
This patch is to allow matching options in erspan.

The options can be described in the form:
VER:INDEX:DIR:HWID/VER:INDEX_MASK:DIR_MASK:HWID_MASK.
When ver is set to 1, index will be applied while dir
and hwid will be ignored, and when ver is set to 2,
dir and hwid will be used while index will be ignored.

Different from geneve, only one option can be set. And
also, geneve options, vxlan options or erspan options
can't be set at the same time.

  # ip link add name erspan1 type erspan external
  # tc qdisc add dev erspan1 ingress
  # tc filter add dev erspan1 protocol ip parent ffff: \
      flower \
        enc_src_ip 10.0.99.192 \
        enc_dst_ip 10.0.99.193 \
        enc_key_id 11 \
        erspan_opts 1:12:0:0/1:ffff:0:0 \
        ip_proto udp \
        action mirred egress redirect dev eth0

v1->v2:
  - improve some err msgs of extack.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 11:44:06 -08:00
Xin Long
d8f9dfae49 net: sched: allow flower to match vxlan options
This patch is to allow matching gbp option in vxlan.

The options can be described in the form GBP/GBP_MASK,
where GBP is represented as a 32bit hexadecimal value.
Different from geneve, only one option can be set. And
also, geneve options and vxlan options can't be set at
the same time.

  # ip link add name vxlan0 type vxlan dstport 0 external
  # tc qdisc add dev vxlan0 ingress
  # tc filter add dev vxlan0 protocol ip parent ffff: \
      flower \
        enc_src_ip 10.0.99.192 \
        enc_dst_ip 10.0.99.193 \
        enc_key_id 11 \
        vxlan_opts 01020304/ffffffff \
        ip_proto udp \
        action mirred egress redirect dev eth0

v1->v2:
  - add .strict_start_type for enc_opts_policy as Jakub noticed.
  - use Duplicate instead of Wrong in err msg for extack as Jakub
    suggested.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 11:44:06 -08:00
Xin Long
e20d4ff2ac net: sched: add erspan option support to act_tunnel_key
This patch is to allow setting erspan options using the
act_tunnel_key action. Different from geneve options,
only one option can be set. And also, geneve options,
vxlan options or erspan options can't be set at the
same time.

Options are expressed as ver:index:dir:hwid, when ver
is set to 1, index will be applied while dir and hwid
will be ignored, and when ver is set to 2, dir and
hwid will be used while index will be ignored.

  # ip link add name erspan1 type erspan external
  # tc qdisc add dev eth0 ingress
  # tc filter add dev eth0 protocol ip parent ffff: \
           flower indev eth0 \
              ip_proto udp \
              action tunnel_key \
                  set src_ip 10.0.99.192 \
                  dst_ip 10.0.99.193 \
                  dst_port 6081 \
                  id 11 \
  		erspan_opts 1:2:0:0 \
          action mirred egress redirect dev erspan1

v1->v2:
  - do the validation when dst is not yet allocated as Jakub suggested.
  - use Duplicate instead of Wrong in err msg for extack.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 11:44:06 -08:00
Xin Long
fca3f91cc3 net: sched: add vxlan option support to act_tunnel_key
This patch is to allow setting vxlan options using the
act_tunnel_key action. Different from geneve options,
only one option can be set. And also, geneve options
and vxlan options can't be set at the same time.

gbp is the only param for vxlan options:

  # ip link add name vxlan0 type vxlan dstport 0 external
  # tc qdisc add dev eth0 ingress
  # tc filter add dev eth0 protocol ip parent ffff: \
           flower indev eth0 \
              ip_proto udp \
              action tunnel_key \
                  set src_ip 10.0.99.192 \
                  dst_ip 10.0.99.193 \
                  dst_port 6081 \
                  id 11 \
  		  vxlan_opts 01020304 \
          action mirred egress redirect dev vxlan0

v1->v2:
  - add .strict_start_type for enc_opts_policy as Jakub noticed.
  - use Duplicate instead of Wrong in err msg for extack as Jakub
    suggested.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 11:44:06 -08:00
Jens Axboe
866ca95da5 Merge branch 'nvme-5.5' of git://git.infradead.org/nvme into for-5.5/drivers-post
Pull NVMe changes from Keith:

"- The only new feature is the optional hwmon support for nvme (Guenter
   and Akinobu)

 - A universal work-around for controllers reading discard payloads
   beyond the range boundary (Eduard)

 - Chaitanya graciously agreed to share the target driver maintenance"

* 'nvme-5.5' of git://git.infradead.org/nvme:
  nvme: hwmon: add quirk to avoid changing temperature threshold
  nvme: hwmon: provide temperature min and max values for each sensor
  nvmet: add another maintainer
  nvme: Discard workaround for non-conformant devices
  nvme: Add hardware monitoring support
2019-11-21 10:53:47 -07:00
Akinobu Mita
52deba0f02 nvme: hwmon: provide temperature min and max values for each sensor
According to the NVMe specification, the over temperature threshold and
under temperature threshold features shall be implemented for Composite
Temperature if a non-zero WCTEMP field value is reported in the Identify
Controller data structure.  The features are also implemented for all
implemented temperature sensors (i.e., all Temperature Sensor fields that
report a non-zero value).

This provides the over temperature threshold and under temperature
threshold for each sensor as temperature min and max values of hwmon
sysfs attributes.

The WCTEMP is already provided as a temperature max value for Composite
Temperature, but this change isn't incompatible.  Because the default
value of the over temperature threshold for Composite Temperature is
the WCTEMP.

Now the alarm attribute for Composite Temperature indicates one of the
temperature is outside of a temperature threshold.  Because there is only
a single bit in Critical Warning field that indicates a temperature is
outside of a threshold.

Example output from the "sensors" command:

nvme-pci-0100
Adapter: PCI adapter
Composite:    +33.9°C  (low  = -273.1°C, high = +69.8°C)
                       (crit = +79.8°C)
Sensor 1:     +34.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +31.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 5:     +47.9°C  (low  = -273.1°C, high = +65261.8°C)

This also adds helper macros for kelvin from/to milli Celsius conversion,
and replaces the repeated code in hwmon.c.

Cc: Keith Busch <kbusch@kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Jean Delvare <jdelvare@suse.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
2019-11-22 02:21:08 +09:00
Nicolas Saenz Julienne
a7ba70f178 dma-mapping: treat dev->bus_dma_mask as a DMA limit
Using a mask to represent bus DMA constraints has a set of limitations.
The biggest one being it can only hold a power of two (minus one). The
DMA mapping code is already aware of this and treats dev->bus_dma_mask
as a limit. This quirk is already used by some architectures although
still rare.

With the introduction of the Raspberry Pi 4 we've found a new contender
for the use of bus DMA limits, as its PCIe bus can only address the
lower 3GB of memory (of a total of 4GB). This is impossible to represent
with a mask. To make things worse the device-tree code rounds non power
of two bus DMA limits to the next power of two, which is unacceptable in
this case.

In the light of this, rename dev->bus_dma_mask to dev->bus_dma_limit all
over the tree and treat it as such. Note that dev->bus_dma_limit should
contain the higher accessible DMA address.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-11-21 18:14:35 +01:00
Christoph Hellwig
d7293f79ca Merge branch 'for-next/zone-dma' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into dma-mapping-for-next
Pull in a stable branch from the arm64 tree that adds the zone_dma_bits
variable to avoid creating hard to resolve conflicts with that addition.
2019-11-21 18:13:03 +01:00
Konstantin Khlebnikov
b686631865 block: add iostat counters for flush requests
Requests that triggers flushing volatile writeback cache to disk (barriers)
have significant effect to overall performance.

Block layer has sophisticated engine for combining several flush requests
into one. But there is no statistics for actual flushes executed by disk.
Requests which trigger flushes usually are barriers - zero-size writes.

This patch adds two iostat counters into /sys/class/block/$dev/stat and
/proc/diskstats - count of completed flush requests and their total time.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-21 09:06:47 -07:00
Bjorn Helgaas
bbdb2f5ecd PCI: Add #defines for Enter Compliance, Transmit Margin
Add definitions for the Enter Compliance and Transmit Margin fields of the
PCIe Link Control 2 register.

Link: https://lore.kernel.org/r/20191112173503.176611-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-21 07:52:34 -06:00
Krzysztof Wilczynski
bbd8810d39 PCI: Remove unused includes and superfluous struct declaration
Remove <linux/pci.h> and <linux/msi.h> from being included directly as part
of the include/linux/of_pci.h, and remove superfluous declaration of struct
of_phandle_args.

Move users of include <linux/of_pci.h> to include <linux/pci.h> and
<linux/msi.h> directly rather than rely on both being included transitively
through <linux/of_pci.h>.

Link: https://lore.kernel.org/r/20190903113059.2901-1-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring <robh@kernel.org>
2019-11-21 07:49:29 -06:00
Kusanagi Kouichi
4250b04703 debugfs: Fix !DEBUG_FS debugfs_create_automount
If DEBUG_FS=n, compile fails with the following error:

kernel/trace/trace.c: In function 'tracing_init_dentry':
kernel/trace/trace.c:8658:9: error: passing argument 3 of 'debugfs_create_automount' from incompatible pointer type [-Werror=incompatible-pointer-types]
 8658 |         trace_automount, NULL);
      |         ^~~~~~~~~~~~~~~
      |         |
      |         struct vfsmount * (*)(struct dentry *, void *)
In file included from kernel/trace/trace.c:24:
./include/linux/debugfs.h:206:25: note: expected 'struct vfsmount * (*)(void *)' but argument is of type 'struct vfsmount * (*)(struct dentry *, void *)'
  206 |      struct vfsmount *(*f)(void *),
      |      ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~

Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Link: https://lore.kernel.org/r/20191121102021787.MLMY.25002.ppp.dion.ne.jp@dmta0003.auone-net.jp
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-21 14:28:32 +01:00
Paolo Bonzini
46f4f0aabc Merge branch 'kvm-tsx-ctrl' into HEAD
Conflicts:
	arch/x86/kvm/vmx/vmx.c
2019-11-21 12:03:40 +01:00
Enric Balletbo i Serra
5ec966dbcc Merge tag 'tag-ib-chrome-mfd-iio-input-5.5' into chrome-platform-5.5
IB between mfd, iio, input and chrome platform for cros-ec-sensorhub

Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2019-11-21 11:52:51 +01:00
Enrico Granata
3300fdd630 platform/chrome: cros_ec: handle MKBP more events flag
The ChromeOS EC has support for signaling to the host that a single IRQ
can serve multiple MKBP (Matrix KeyBoard Protocol) events.

Doing this serves an optimization purpose, as it minimizes the number of
round-trips into the interrupt handling machinery, and it proves
beneficial to sensor timestamping as it keeps the desired synchronization
of event times between the two processors.

This patch adds kernel support for this EC feature, allowing the ec_irq
to loop until all events have been served.

Signed-off-by: Enrico Granata <egranata@chromium.org>
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2019-11-21 11:23:15 +01:00
Gwendal Grignou
05a3c420ea platform/chrome: cros-ec: Record event timestamp in the hard irq
To improve sensor timestamp precision, given EC and AP are in different
time domains, the AP needs to try to record the exact moment an event
was signalled to the AP by the EC as soon as possible after it happens.

First thing in the hard irq is the best place for this.

Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@kernel.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2019-11-21 11:23:14 +01:00
Gwendal Grignou
d60ac88a62 mfd / platform / iio: cros_ec: Register sensor through sensorhub
Remove the duplicated code in MFD, since MFD just registers cros-ec-sensorhub
if at least one sensor is present.

Change IIO cros-ec driver to get the pointer to the cros-ec-dev through
cros-ec-sensorhub.

Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2019-11-21 11:23:14 +01:00
Gwendal Grignou
5306747118 iio / platform: cros_ec: Add cros-ec-sensorhub driver
Similar to HID sensor stack, the new driver sits between cros-ec-dev
and the IIO device drivers:

The EC based IIO device topology would be:

iio:device1 ->
   ...0/0000:00:1f.0/PNP0C09:00/GOOG0004:00/cros-ec-dev.6.auto/
                                            cros-ec-sensorhub.7.auto/
                                            cros-ec-accel.15.auto/
                                            iio:device1

It will be expanded to control EC sensor FIFO.

Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[Fix "unknown type name 'uint32_t'" type errors]
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2019-11-21 11:23:14 +01:00
Gwendal Grignou
a16b2e2819 mfd / platform: cros_ec: Add sensor_count and make check_features public
Add a new function to return the number of MEMS sensors available in a
ChromeOS Embedded Controller. It uses MOTIONSENSE_CMD_DUMP if available
or a specific memory map ACPI registers to find out.

Also, make check_features public as it can be useful for other drivers
to know what the Embedded Controller supports.

Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2019-11-21 11:23:13 +01:00
Gwendal Grignou
c9b465683a platform/chrome: cros_ec: Put docs with the code
To avoid doc rot, put function documentations with code, not header.
Use kernel-doc style comments for exported functions.

Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2019-11-21 11:23:13 +01:00
Moti Haimovski
52c01b0137 habanalabs: expose reset counters via existing INFO IOCTL
Expose both soft and hard reset counts via INFO IOCTL.
This will allow system management applications to easily check
if the device has undergone reset.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21 11:35:47 +02:00
Oded Gabbay
5d1012576d habanalabs: export uapi defines to user-space
The two defines that control the maximum size of a command buffer and the
maximum number of JOBS per CS need to be exported to the user as they are
part of the API towards user-space.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-11-21 11:35:46 +02:00
Oded Gabbay
91edbf2cf8 habanalabs: expose card name in INFO IOCTL
To enable userspace processes, e.g. management utilities, to display the
card name to the user, add the card name property to the HW_IP
structure that is copied to the user in the INFO IOCTL.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21 11:35:45 +02:00
Oded Gabbay
62c1e124a9 habanalabs: add opcode to INFO IOCTL to return clock rate
Add a new opcode to the INFO IOCTL to allow the user application to
retrieve the ASIC's current and maximum clock rate. The rate is
returned in MHz.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2019-11-21 11:35:45 +02:00
Tomer Tayar
f435614ff5 habanalabs: Fix typos
s/paerser/parser/
s/requeusted/requested/
s/an JOB/a JOB/

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-11-21 11:35:45 +02:00
Paolo Bonzini
14edff8831 Merge tag 'kvmarm-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm updates for Linux 5.5:

- Allow non-ISV data aborts to be reported to userspace
- Allow injection of data aborts from userspace
- Expose stolen time to guests
- GICv4 performance improvements
- vgic ITS emulation fixes
- Simplify FWB handling
- Enable halt pool counters
- Make the emulated timer PREEMPT_RT compliant

Conflicts:
	include/uapi/linux/kvm.h
2019-11-21 09:58:35 +01:00
Frederic Weisbecker
74722bb223 sched/vtime: Bring up complete kcpustat accessor
Many callsites want to fetch the values of system, user, user_nice, guest
or guest_nice kcpustat fields altogether or at least a pair of these.

In that case calling kcpustat_field() for each requested field brings
unecessary overhead when we could fetch all of them in a row.

So provide kcpustat_cpu_fetch() that fetches the whole kcpustat array
in a vtime safe way under the same RCU and seqcount block.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
Link: https://lkml.kernel.org/r/20191121024430.19938-3-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-21 07:33:24 +01:00