The current irq_comm.c file contains pieces of code that are generic
across different irqchip implementations, as well as code that is
fully IOAPIC specific.
Split the generic bits out into irqchip.c.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
The prototype has been stale for a while, I can't spot any real function
define behind it. Let's just remove it.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
We have a capability enquire system that allows user space to ask kvm
whether a feature is available.
The point behind this system is that we can have different kernel
configurations with different capabilities and user space can adjust
accordingly.
Because features can always be non existent, we can drop any #ifdefs
on CAP defines that could be used generically, like the irq routing
bits. These can be easily reused for non-IOAPIC systems as well.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Quite a bit of code in KVM has been conditionalized on availability of
IOAPIC emulation. However, most of it is generically applicable to
platforms that don't have an IOPIC, but a different type of irq chip.
Make code that only relies on IRQ routing, not an APIC itself, on
CONFIG_HAVE_KVM_IRQ_ROUTING, so that we can reuse it later.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
The concept of routing interrupt lines to an irqchip is nothing
that is IOAPIC specific. Every irqchip has a maximum number of pins
that can be linked to irq lines.
So let's add a new define that allows us to reuse generic code for
non-IOAPIC platforms.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
We expose this parameter for a future caller.
It will be used to extract the endtime from the gss-proxy upcall mechanism,
in order to set the rsc cache expiration time.
Signed-off-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
In the gss-proxy case we don't want to have to reconnect at random--we
want to connect only on gss-proxy startup when we can steal gss-proxy's
context to do the connect in the right namespace.
So, provide a flag that allows the rpc_create caller to turn off the
idle timeout.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Merging Trond's nfs-for-next branch, mainly to get
b7993cebb8 "SUNRPC: Allow rpc_create() to
request that TCP slots be unlimited", which a small piece of the
gss-proxy work depends on.
On my SMP platform which is made of 5 cores in 2 clusters, I
have the nr_busy_cpu field of sched_group_power struct that is
not null when the platform is fully idle - which makes the
scheduler unhappy.
The root cause is:
During the boot sequence, some CPUs reach the idle loop and set
their NOHZ_IDLE flag while waiting for others CPUs to boot. But
the nr_busy_cpus field is initialized later with the assumption
that all CPUs are in the busy state whereas some CPUs have
already set their NOHZ_IDLE flag.
More generally, the NOHZ_IDLE flag must be initialized when new
sched_domains are created in order to ensure that NOHZ_IDLE and
nr_busy_cpus are aligned.
This condition can be ensured by adding a synchronize_rcu()
between the destruction of old sched_domains and the creation of
new ones so the NOHZ_IDLE flag will not be updated with old
sched_domain once it has been initialized. But this solution
introduces a additionnal latency in the rebuild sequence that is
called during cpu hotplug.
As suggested by Frederic Weisbecker, another solution is to have
the same rcu lifecycle for both NOHZ_IDLE and sched_domain
struct. A new nohz_idle field is added to sched_domain so both
status and sched_domain will share the same RCU lifecycle and
will be always synchronized. In addition, there is no more need
to protect nohz_idle against concurrent access as it is only
modified by 2 exclusive functions called by local cpu.
This solution has been prefered to the creation of a new struct
with an extra pointer indirection for sched_domain.
The synchronization is done at the cost of :
- An additional indirection and a rcu_dereference for accessing nohz_idle.
- We use only the nohz_idle field of the top sched_domain.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linaro-kernel@lists.linaro.org
Cc: peterz@infradead.org
Cc: fweisbec@gmail.com
Cc: pjt@google.com
Cc: rostedt@goodmis.org
Cc: efault@gmx.de
Link: http://lkml.kernel.org/r/1366729142-14662-1-git-send-email-vincent.guittot@linaro.org
[ Fixed !NO_HZ build bug. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We are currently out of free bits in AT_HWCAP. With POWER8, we have
several hardware features that we need to advertise.
Tested on POWER and x86.
Signed-off-by: Michael Neuling <michael@neuling.org>
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
drm_mode_equal_no_clocks() is like drm_mode_equal() except it doesn't
compare the clock or vrefresh values. drm_mode_equal() is now
implemented by first doing the clock checks, and then calling
drm_mode_equal_no_clocks().
v2: Add missing EXPORT_SYMBOL()
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
No need to zero initialize .vrefresh in DRM_MODE() since it's using
desgignated initializers.
This will also avoid some duplicate initialization warnings later.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Now jbd_alloc_handle is only called by new_handle. So this commit
uses kmem_cache_zalloc instead of kmem_cache_alloc/memset.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
timeriomem_rng only supports a single device instance. This patch
enables multiple timeriomem_rng devices to coexist as well as adds
some additional error checking.
Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
videobuf_queue_dma_contig_init_cached() is not used anywhere.
Drop support for it, cleaning up the code a little bit.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add basic struct iscsit_transport API template to allow iscsi-target for
running with external transport modules using existing iscsi_target_core.h
code.
For all external modules, this calls try_module_get() and module_put()
to obtain + release an external iscsit_transport module reference count.
Also include the iscsi-target symbols necessary in iscsi_transport.h to
allow external transport modules to function.
v3 changes:
- Add iscsit_build_reject export for ISTATE_SEND_REJECT usage
v2 changes:
- Drop unnecessary export of iscsit_get_transport + iscsit_put_transport (roland)
- Add ->iscsit_queue_data_in() to remove extra context switch on RDMA_WRITE
- Add ->iscsit_queue_status() to remove extra context switch on IB_SEND status
- Add ->iscsit_get_dataout() to remove extra context switch on RDMA_READ
- Drop ->iscsit_free_cmd()
- Drop ->iscsit_unmap_cmd()
- Rename iscsit_create_transport() -> iscsit_register_transport() (andy)
- Rename iscsit_destroy_transport() -> iscsit_unregister_transport() (andy)
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
iblock_execute_unmap() and fd_execute_unmap share a lot of code.
Add sbc_execute_unmap() helper to remove duplicated code for
iblock_execute_unmap() and fd_execute_unmap().
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
All genl callbacks are serialized by genl-mutex. This can become
bottleneck in multi threaded case.
Following patch adds an parameter to genl_family so that a
particular family can get concurrent netlink callback without
genl_lock held.
New rw-sem is used to protect genl callback from genl family unregister.
in case of parallel_ops genl-family read-lock is taken for callbacks and
write lock is taken for register or unregistration for any family.
In case of locked genl family semaphore and gel-mutex is locked for
any openration.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, there is no way to find out which timestamp is reported in
tpacket{,2,3}_hdr's tp_sec, tp_{n,u}sec members. It can be one of
SOF_TIMESTAMPING_SYS_HARDWARE, SOF_TIMESTAMPING_RAW_HARDWARE,
SOF_TIMESTAMPING_SOFTWARE, or a fallback variant late call from the
PF_PACKET code in software.
Therefore, report in the tp_status member of the ring buffer which
timestamp has been reported for RX and TX path. This should not break
anything for the following reasons: i) in RX ring path, the user needs
to test for tp_status & TP_STATUS_USER, and later for other flags as
well such as TP_STATUS_VLAN_VALID et al, so adding other flags will
do no harm; ii) in TX ring path, time stamps with PACKET_TIMESTAMP
socketoption are not available resp. had no effect except that the
application setting this is buggy. Next to TP_STATUS_AVAILABLE, the
user also should check for other flags such as TP_STATUS_WRONG_FORMAT
to reclaim frames to the application. Thus, in case TX ts are turned
off (default case), nothing happens to the application logic, and in
case we want to use this new feature, we now can also check which of
the ts source is reported in the status field as provided in the docs.
Reported-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This makes it more readable and clearer what bits are still free to
use. The compiler reduces this to a constant for us anyway.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
This series contains updates to ixgbe, igb and pci.
The ixgbe changes contains a fix to a possible divide by zero by bailing
out of the ixgbe_update_itr() function if the last interrupt timeslice is
zero. In addition, support is added for the new OCP x520 adapter as well
as LX support for 82599 devices. Jacob provides a patch to change
variable wol_supported to wol_enabled to better reflect what the code
is actually doing (i.e. checking if WoL is enabled).
Alex adds SRIOV helper function to pci that will determine if a PF
has any VFs that are currently assigned to a guest.
The remaining 8 patches are against igb and contain the following changes:
* implement SERDES loopback configuration for i210 devices by unsetting
sigdetect bit, so as to fix Ethtool loopback test failure
* add support for the SMBI semaphore for I210/I211 devices
* implement the new generic pci_vfs_assigned helper function (Alex's PCI
helper function)
* display warning when link speed is downgraded due to Smartspeed
* ensure that VLAN hardware filtering remains enabled when the device is
in promiscuous mode and VT mode simultaneously
* cleanup dead code in igb
* bump the driver version
v2: updated the PCI patch to add SRIOV helper function to remove extern
from the declaration of pci_vfs_assigned in pci.h and return 0 if
SR-IOV is disabled which is inline with other PCI SR-IOV functions
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
The following patchset contains fixes for recently applied
Netfilter/IPVS updates to the net-next tree, most relevantly
they are:
* Fix sparse warnings introduced in the RCU conversion, from
Julian Anastasov.
* Fix wrong endianness in the size field of IPVS sync messages,
from Simon Horman.
* Fix missing if checking in nf_xfrm_me_harder, from Dan Carpenter.
* Fix off by one access in the IPVS SCTP tracking code, again from
Dan Carpenter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This function is meant to add a helper function that will determine if a PF
has any VFs that are currently assigned to a guest. We currently have been
implementing this function per driver, and going forward I would like to avoid
that by making this function generic and using this helper.
v2: Removed extern from declaration of pci_vfs_assigned in pci.h and return
0 if SR-IOV is disabled with is inline with other PCI SRIOV functions.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Re-arrange some of code which fills DMFS HW structures so we can use
it from within the core driver and from the IB driver too, e.g when
verbs DMFS structures are transformed into mlx4 hardware structs.
Also, add struct mlx4_flow_handle struct which will be of use by the
DMFS verbs flow in the IB driver.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Some of struct mlx4_net_trans_rule_hw_ctrl fields were packed into u32
and accessed through bit field operations. Expose and access them
directly as u8 or u16.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Change struct mlx4_net_trans_rule_hw_eth :: vlan_id name to vlan_tag
Change struct mlx4_net_trans_rule_hw_ib :: r_u_qpn name to l3_qpn
The patch doesn't introduce any functional change or API change
towards the firmware.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Align the names used by enum mlx4_net_trans_promisc_mode with the
actual firmware specification. The patch doesn't introduce any
functional change or API change towards the firmware.
Remove MLX4_FS_PROMISC_FUNCTION_PORT which isn't of use. Add new
enums MLX4_FS_{UC/MC}_SNIFFER as a preparation step for sniffer
support.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Move flow steering HW structures to be on the public mlx4 include
directory, as a pre-step for the mlx4 IB driver to use them too.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
The patch allows to enable/disable HW timestamping for incoming and/or
outgoing packets. It adds and initializes all structs and callbacks
needed by kernel TS API.
To enable/disable HW timestamping appropriate ioctl should be used.
Currently HWTSTAMP_FILTER_ALL/NONE and HWTSAMP_TX_ON/OFF only are
supported.
When enabling TS on receive flow - VLAN stripping will be disabled.
Also were made all relevant changes in RX/TX flows to consider TS request
and plant HW timestamps into relevant structures.
mlx4_ib was fixed to compile with new mlx4_cq_alloc() signature.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Read HCA frequency, read PCI clock bar and offset, map internal clock to
PCI bar.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As IOMMU groups are exposed to the user space by their numbers,
the user space can use them in various kernel APIs so the kernel
might need an API to find a group by its ID.
As an example, QEMU VFIO on PPC64 platform needs it to associate
a logical bus number (LIOBN) with a specific IOMMU group in order
to support in-kernel handling of DMA map/unmap requests.
The patch adds the iommu_group_get_by_id(id) function which performs
such search.
v2: fixed reference counting.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
* pci/gavin-msi-cleanup:
vfio-pci: Use cached MSI/MSI-X capabilities
vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
PCI: Remove "extern" from function declarations
PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h
PCI: Use msix_table_size() directly, drop multi_msix_capable()
PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros
PCI: Drop is_64bit_address() and is_mask_bit_support() macros
PCI: Drop msi_data_reg() macro
PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros
PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly
PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc
PCI: Clean up MSI/MSI-X capability #defines
PCI: Use cached MSI-X cap while enabling MSI-X
PCI: Use cached MSI cap while enabling MSI interrupts
PCI: Remove MSI/MSI-X cap check in pci_msi_check_device()
PCI: Cache MSI/MSI-X capability offsets in struct pci_dev
PCI: Use u8, not int, for PM capability offset
[SCSI] megaraid_sas: Use correct #define for MSI-X capability
To follow the prefix names used by the thermal functions,
this patch renames notify_thermal_framework to thermal_notify_framework.
Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
linux-v3.8-rc1 and later support for plug for blkdev_issue_discard with
commit 0cfbcafcae
(block: add plug for blkdev_issue_discard )
For example,
1) DISCARD rq-1 with size size 4GB
2) DISCARD rq-2 with size size 1GB
If these 2 discard requests get merged, final request size will be 5GB.
In this case, request's __data_len field may overflow as it can store
max 4GB(unsigned int).
This issue was observed while doing mkfs.f2fs on 5GB SD card:
https://lkml.org/lkml/2013/4/1/292
Info: sector size = 512
Info: total sectors = 11370496 (in 512bytes)
Info: zone aligned segment0 blkaddr: 512
[ 257.789764] blk_update_request: bio idx 0 >= vcnt 0
mkfs process gets stuck in D state and I see the following in the dmesg:
[ 257.789733] __end_that: dev mmcblk0: type=1, flags=122c8081
[ 257.789764] sector 4194304, nr/cnr 2981888/4294959104
[ 257.789764] bio df3840c0, biotail df3848c0, buffer (null), len
1526726656
[ 257.789764] blk_update_request: bio idx 0 >= vcnt 0
[ 257.794921] request botched: dev mmcblk0: type=1, flags=122c8081
[ 257.794921] sector 4194304, nr/cnr 2981888/4294959104
[ 257.794921] bio df3840c0, biotail df3848c0, buffer (null), len
1526726656
This patch fixes this issue.
Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fix typo in printk and comments within various drivers.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>