Change from mlx5e_tx_reporter_* to mlx5e_reporter_tx_*. In the following
patches in the set rx reporter is added, the new naming convention is
more uniformed.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
in ip_mc_inc_group, memory allocation flag, not mcast mode, is expected
by __ip_mc_inc_group
similar issue in __ip_mc_join_group, both mcase mode and gfp_t are needed
here, so use ____ip_mc_inc_group(...)
Fixes: 9fb20801da ("net: Fix ip_mc_{dec,inc}_group allocation context")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NCSI spec indicates that if the data does not end on a 32 bit
boundary, one to three padding bytes equal to 0x00 shall be present to
align the checksum field to a 32-bit boundary.
Signed-off-by: Terry S. Duncan <terry.s.duncan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot says:
====================
net: dsa: enable and disable all ports
The DSA stack currently calls the .port_enable and .port_disable switch
callbacks for slave ports only. However, it is useful to call them for all
port types. For example this allows some drivers to delay the optimization
of power consumption after the switch is setup. This can also help reducing
the setup code of drivers a bit.
The first DSA core patches enable and disable all ports of a switch, regardless
their type. The last mv88e6xxx patches remove redundant code from the driver
setup and the said callbacks, now that they handle SERDES power for all ports.
Changes in v2: do not guard .port_disable for broadcom switches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that mv88e6xxx_serdes_power is only called after driver setup,
we can wrap the SERDES IRQ code directly within it for clarity.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SERDES is powered on for CPU and DSA ports and powered down for unused
ports at setup time. But now that DSA calls mv88e6xxx_port_enable
and mv88e6xxx_port_disable for all ports, the SERDES power can now
be handled after setup inconditionally for all ports.
Using the port enable and disable callbacks also have the benefit to
handle the SERDES IRQ for non user ports as well.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When disabling a port, that is not for the driver to decide what to
do with the STP state. This is already handled by the DSA layer.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call the .port_enable and .port_disable functions for all ports,
not only the user ports, so that drivers may optimize the power
consumption of all ports after a successful setup.
Unused ports are now disabled on setup. CPU and DSA ports are now
enabled on setup and disabled on teardown. User ports were already
enabled at slave creation and disabled at slave destruction.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The .port_enable and .port_disable operations are currently only
called for user ports, hence assuming they have a slave device. In
preparation for using these operations for other port types as well,
simply guard all implementations against non user ports and return
directly in such case.
Note that bcm_sf2_sw_suspend() currently calls bcm_sf2_port_disable()
(and thus b53_disable_port()) against the user and CPU ports, so do
not guards those functions. They will be called for unused ports in
the future, but that was expected by those drivers anyway.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is currently difficult to read the different steps involved in the
setup and teardown of ports in the DSA code. Keep it simple with a
single switch statement for each port type: UNUSED, CPU, DSA, or USER.
Also no need to call devlink_port_unregister from within dsa_port_setup
as this step is inconditionally handled by dsa_port_teardown on error.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch restructures how VFs are configured, and resources allocated.
Instead of freeing resources that were never allocated, and resetting
empty VFs that have never been created - the new flow will just allocate
resources for number of requested VFs based on the availability.
During VFs initialization process, global interrupt is disabled, and
rearmed after getting MSIX vectors for VFs. This allows immediate mailbox
communications, instead of delaying it till later and VFs.
PF communications resulted to using polling instead of actual interrupt.
The issue manifested when creating higher number of VFs (128 VFs) per PF.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we divide budget by the number of Rx queues per Rx ring
container in ice_napi_poll even if there is only 1. This is an
unnecessary divide for the normal case of 1 Rx ring per Rx ring
container. Fix this by using an unlikely() call in the case where we
actually need to divide.
Also, we will always set budget_per_ring even if there are no Rx rings
in the Rx ring container so we don't need to initialize it to 0.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently in ice_get_tx_pending we try to read a Tx ring's tail. This is
then compared with the software based head (next_to_clean) to determine
if we have pending work. This will never work because reading of the Tx
ring's tail is no longer supported. Fix this by using the software based
tail (next_to_use) to determine if there is pending work.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently, we are only explicitly setting SOCK_NOSPACE on a write timeout
for non-blocking sockets. Epoll() edge-trigger mode relies on SOCK_NOSPACE
being set when -EAGAIN is returned to ensure that EPOLLOUT is raised.
Expand the setting of SOCK_NOSPACE to non-blocking sockets as well that can
use SO_SNDTIMEO to adjust their write timeout. This mirrors the behavior
that Eric Dumazet introduced for tcp sockets.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Ursula Braun <ubraun@linux.ibm.com>
Cc: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the tx bottom function from NAPI to a new tasklet. Then, for
multi-cores, the bottom functions of tx and rx may be run at same
time with different cores. This is used to improve performance.
On x86, Tx/Rx 943/943 Mbits/sec -> 945/944.
For arm platform, Tx/Rx: 917/917 Mbits/sec -> 933/933.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull HID fixes from Jiri Kosina:
- a few regression fixes for wacom driver (including fix for my earlier
mismerge) from Aaron Armstrong Skomra and Jason Gerecke
- revert of a few Logitech device ID additions which turn out to not
work perfectly with the hidpp driver at the moment; proper support is
now scheduled for 5.4. Fixes from Benjamin Tissoires
- scheduling-in-atomic fix for cp2112 driver, from Benjamin Tissoires
- new device ID to intel-ish, from Even Xu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: wacom: correct misreported EKR ring values
HID: cp2112: prevent sleeping function called from invalid context
HID: intel-ish-hid: ipc: add EHL device id
HID: wacom: Correct distance scale for 2nd-gen Intuos devices
HID: logitech-hidpp: remove support for the G700 over USB
Revert "HID: logitech-hidpp: add USB PID for a few more supported mice"
HID: wacom: add back changes dropped in merge commit
ODP depends on the several device capabilities, among them is the ability
to send UMR WQEs with that modify atomic and entity size of the MR.
Therefore, only if all conditions to send such a UMR WQE are met then
driver can report that ODP is supported. Use this check of conditions
in all places where driver needs to know about ODP support.
Also, implicit ODP support depends on ability of driver to send UMR WQEs
for an indirect mkey. Therefore, verify that all conditions to do so are
met when reporting support.
Fixes: c8d75a980f ("IB/mlx5: Respect new UMR capabilities")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-7-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
task_active_pid_ns() is wrong API to check PID namespace because it
posses some restrictions and return PID namespace where the process
was allocated. It created mismatches with current namespace, which
can be different.
Rewrite whole rdma_is_visible_in_pid_ns() logic to provide reliable
results without any relation to allocated PID namespace.
Fixes: 8be565e65f ("RDMA/nldev: Factor out the PID namespace check")
Fixes: 6a6c306a09 ("RDMA/restrack: Make is_visible_in_pid_ns() as an API")
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-4-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order. A stale TID RDMA data packet
could lead to TidErr if the TID entries have been released by duplicate
data packets generated from retries, and subsequently erroneously force
the qp into error state in the current implementation.
Since the payload has already been dropped by hardware, the packet can
be simply dropped and it is no longer necessary to put the qp into
error state.
Fixes: 9905bf06e8 ("IB/hfi1: Add functions to receive TID RDMA READ response")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192058.105923.72324.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
In a congested fabric with adaptive routing enabled, traces show that
the sender could receive stale TID RDMA NAK packets that contain newer
KDETH PSNs and older Verbs PSNs. If not dropped, these packets could
cause the incorrect rewinding of the software flows and the incorrect
completion of TID RDMA WRITE requests, and eventually leading to memory
corruption and kernel crash.
The current code drops stale TID RDMA ACK/NAK packets solely based
on KDETH PSNs, which may lead to erroneous processing. This patch
fixes the issue by also checking the Verbs PSN. Addition checks are
added before rewinding the TID RDMA WRITE DATA packets.
Fixes: 9e93e967f7 ("IB/hfi1: Add a function to receive TID RDMA ACK packet")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192033.105923.44192.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
We need to check if we have CQEs pending before starting a poll loop,
as those could be the events we will be spinning for (and hence we'll
find none). This can happen if a CQE triggers an error, or if it is
found by eg an IRQ before we get a chance to find it through polling.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
One of the components in LiteON CL1 device has limitations that
can be encountered based upon boundary race conditions using the
nvme bus specific suspend to idle flow.
When this situation occurs the drive doesn't resume properly from
suspend-to-idle.
LiteON has confirmed this problem and fixed in the next firmware
version. As this firmware is already in the field, avoid running
nvme specific suspend to idle flow.
Fixes: d916b1be94 ("nvme-pci: use host managed power state for suspend")
Link: http://lists.infradead.org/pipermail/linux-nvme/2019-July/thread.html
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Charles Hyde <charles.hyde@dellteam.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit 1b1031ca63 ("nvme: validate cntlid during controller initialisation")
introduced a validation for controllers with duplicate cntlid that runs
on nvme_init_subsystem(). The problem is that the validation relies on
ctrl->cntlid, and this value is assigned (from id_ctrl value) after the
call for nvme_init_subsystem() in nvme_init_identify() for non-fabrics
scenario. That leads to ctrl->cntlid always being 0 in case we have a
physical set of controllers in the same subsystem.
This patch fixes that by loading the discovered cntlid id_ctrl value into
ctrl->cntlid before the subsystem initialization, only for the non-fabrics
case. The patch was tested with emulated nvme devices (qemu) having two
controllers in a single subsystem. Without the patch, we couldn't make
it work failing in the duplicate check; when running with the patch, we
could see the subsystem holding both controllers.
For the fabrics case we see ctrl->cntlid has a more intricate relation
with the admin connect, so we didn't change that.
Fixes: 1b1031ca63 ("nvme: validate cntlid during controller initialisation")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
nvme_state_set_live() making a path available triggers requeue_work
in order to resubmit requests that ended up on requeue_list when no
paths were available.
This requeue_work may race with concurrent nvme_ns_head_make_request()
that do not observe the live path yet.
Such concurrent requests may by made by either:
- New IO submission.
- Requeue_work triggered by nvme_failover_req() or another ana_work.
A race may cause requeue_work capture the state of requeue_list before
more requests get onto the list. These requests will stay on the list
forever unless requeue_work is triggered again.
In order to prevent such race, nvme_state_set_live() should
synchronize_srcu(&head->srcu) before triggering the requeue_work and
prevent nvme_ns_head_make_request referencing an old snapshot of the
path list.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If a request issue ends up being punted to async context to avoid
blocking, we can get into a situation where the original application
enters the poll loop for that very request before it has been issued.
This should not be an issue, except that the polling will hold the
io_uring uring_ctx mutex for the duration of the poll. When the async
worker has actually issued the request, it needs to acquire this mutex
to add the request to the poll issued list. Since the application
polling is already holding this mutex, the workqueue sleeps on the
mutex forever, and the application thus never gets a chance to poll for
the very request it was interested in.
Fix this by ensuring that the polling drops the uring_ctx occasionally
if it's not making any progress.
Reported-by: Jeffrey M. Birnbaum <jmbnyc@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In the case of X86_PAE, unsigned long is u32, but the physical address type
should be u64. Due to the bug here, the netvsc driver can not load
successfully, and sometimes the VM can panic due to memory corruption (the
hypervisor writes data to the wrong location).
Fixes: 6ba34171bc ("Drivers: hv: vmbus: Remove use of slow_virt_to_phys()")
Cc: stable@vger.kernel.org
Cc: Michael Kelley <mikelley@microsoft.com>
Reported-and-tested-by: Juliana Rodrigueiro <juliana.rodrigueiro@intra2net.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
When building hv_kvp_daemon GCC-8.3 complains:
hv_kvp_daemon.c: In function ‘kvp_get_ip_info.constprop’:
hv_kvp_daemon.c:812:30: warning: ‘ip_buffer’ may be used uninitialized in this function [-Wmaybe-uninitialized]
struct hv_kvp_ipaddr_value *ip_buffer;
this seems to be a false positive: we only use ip_buffer when
op == KVP_OP_GET_IP_INFO and it is only unset when op == KVP_OP_ENUMERATE.
Silence the warning by initializing ip_buffer to NULL.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Simplify the ring buffer handling with the in-place API.
Also avoid the dynamic allocation and the memory leak in the channel
callback function.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
This field is no longer used after the commit
63ed4e0c67 ("Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource code")
, because it's replaced by the global variable
"struct ms_hyperv_tsc_page *tsc_pg;" (now, the variable is in
drivers/clocksource/hyperv_timer.c).
Fixes: 63ed4e0c67 ("Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource code")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
If the HW revision of Qu devices we found is QuZ, then we need to
switch the configuration accordingly in order to use the correct FW.
Add a block of ifs in order do that.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
We have a too generic condition that switches from Qu configurations
to QnJ configurations. We need to exclude some configurations so that
they are not erroneously switched. Add the ax201 configuration to the
list of exclusions.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The MAC context configuration always allowed multicast data frames
to pass to the driver for all MAC context types, and in the
case of station MAC context both when associated and when not
associated.
One of the outcomes of this configuration is having the FW forward
encrypted multicast frames to the driver with Rx status indicating
that the frame was not decrypted (as expected, since no keys were
configured yet) which in turn results with unnecessary error
messages.
Change this behavior to allow multicast data frames only when they
are actually expected, e.g., station MAC context is associated etc.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
To do not brake HW restart we should keep initialization vectors data.
I assumed that on start the data is already initialized to zeros, but
that not true on some scenarios and we should clear it. So add
additional flag to check if we are under HW restart and clear IV's
data if we are not.
Patch fixes AP mode regression.
Reported-and-tested-by: Emil Karlson <jekarl@iki.fi>
Fixes: 710e6cc159 ("rt2800: do not nullify initialization vector data")
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>