Even when GPU recovery is disabled we could run into a manually
triggered recovery.
v2: keep accidental removed comments
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We are going to need this for recoverable page fault handling and it
makes shadow handling during GPU reset much more easier.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
DC doesn't seem to have a fallback path either.
So when interrupts doesn't work any more we are pretty much busted no
matter what.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
With OD settings applied, the clock table will be updated accordingly.
We need to retrieve the new clock tables then.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
With the latest SMC fw, we are able to get the voltage value for
specific frequency point. So, we update the OD relates to take
absolute voltage instead of offset.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
vega20 should use umc_info v3_3 instead of v3_1. There are
serveral versions of umc_info for vega series. Compared to
various versions of these structures, vram_info strucure is
unified for vega series. The patch switch to query mem_type
from vram_info structure for all the vega series dGPU.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In function ‘radeon_process_i2c_ch’ a comparison of a u8 value against
255 is done. Since it is always false, change the signature of this
function to use an `int` instead, which match the type used in caller:
`radeon_atom_hw_i2c_xfer`.
Fix the following warning triggered with W=1:
CC [M] drivers/gpu/drm/radeon/atombios_i2c.o
drivers/gpu/drm/radeon/atombios_i2c.c: In function ‘radeon_process_i2c_ch’:
drivers/gpu/drm/radeon/atombios_i2c.c:71:11: warning: comparison is always false due to limited range of data type [-Wtype-limits]
if (num > ATOM_MAX_HW_I2C_READ) {
^
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
adev->gfx.rlc has the values from rlc_hdr already processed by
le32_to_cpu. Using the rlc_hdr values on big-endian machines causes
a kernel Oops due to writing well outside of the array (0x24000000
instead of 0x24).
Signed-off-by: A. Wilcox <AWilcox@Wilcox-Tech.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We can't get the mask for the root directory from the number of entries.
So add a new function to avoid that problem.
v2: fix typo in mask
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When a driver domain (e.g. dom0) is running out of maptrack entries it
can't map any more foreign domain pages. Instead of silently stalling
the affected domUs issue a rate limited warning in this case in order
to make it easier to detect that situation.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Borislav is effectivly maintaining parts of X86 already, make it official.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Sync syscall to DAX file needs to flush processor cache, but it
currently does not flush to existing DAX files. This is because
'ext2_da_aops' is set to address_space_operations of existing DAX
files, instead of 'ext2_dax_aops', since S_DAX flag is set after
ext2_set_aops() in the open path.
Similar to ext4, change ext2_iget() to initialize i_flags before
ext2_set_aops().
Fixes: fb094c9074 ("ext2, dax: introduce ext2_dax_aops")
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Suggested-by: Jan Kara <jack@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix the build on !_GNU_SOURCE libc systems such as Alpine Linux/musl
libc due to usage of strerror_r glibc variant on libbpf (Arnaldo Carvalho de Melo)
- Fix out-of-tree asciidoctor man page generation (Ben Hutchings)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Crypto stuff from Herbert:
"This push fixes a potential boot hang in ccp and an incorrect
CPU capability check in aegis/morus on x86."
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: x86/aegis,morus - Do not require OSXSAVE for SSE2
crypto: ccp - add timeout support in the SEV command
Steven writes:
"Vaibhav Nagarnaik found that modifying the ring buffer size could cause
a huge latency in the system because it does a while loop to free pages
without releasing the CPU (on non preempt kernels). In a case where there
are hundreds of thousands of pages to free it could actually cause a system
stall. A properly place cond_resched() solves this issue."
Darren writes:
"platform-drivers-x86 for v4.19-2
Free allocated ACPI buffers in two drivers.
The following is an automated git shortlog grouped by driver:
alienware-wmi:
- Correct a memory leak
dell-smbios-wmi:
- Correct a memory leak"
* tag 'platform-drivers-x86-v4.19-2' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: alienware-wmi: Correct a memory leak
platform/x86: dell-smbios-wmi: Correct a memory leak
Wei Wang says:
====================
ipv6: fix issues on accessing fib6_metrics
The latest fix on the memory leak of fib6_metrics still causes
use-after-free.
This patch series first revert the previous fix and propose a new fix
that is more inline with ipv4 logic and is tested to fix the
use-after-free issue reported.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When dst->_metrics and f6i->fib6_metrics share the same memory, both
take reference count on the dst_metrics structure. However, when dst is
destroyed, ip6_dst_destroy() only invokes dst_destroy_metrics_generic()
which does not take care of READONLY metrics and does not release refcnt.
This causes memory leak.
Similar to ipv4 logic, the fix is to properly release refcnt and free
the memory space pointed by dst->_metrics if refcnt becomes 0.
Fixes: 93531c6743 ("net/ipv6: separate handling of FIB entries from dst based routes")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a network interface is created prior to the SFP socket being
available, ethtool can request module information. This unfortunately
leads to an oops:
Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = (ptrval)
[00000008] *pgd=7c400831, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1480 Comm: ethtool Not tainted 4.19.0-rc3 #138
Hardware name: Broadcom Northstar Plus SoC
PC is at sfp_get_module_info+0x8/0x10
LR is at dev_ethtool+0x218c/0x2afc
Fix this by not filling in the network device's SFP bus pointer until
SFP is fully bound, thereby avoiding the core calling into the SFP bus
code.
Fixes: ce0aa27ff3 ("sfp: add sfp-bus to bridge between network devices and sfp cages")
Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
When no Tx IRQ is available, the txq_done() routine (called from
tx_done()) shouldn't be called from the polling function, as in such
case it is already called in the Tx path thanks to an hrtimer. This
mostly occurred when using PPv2.1, as the engine then do not have Tx
IRQs.
Fixes: edc660fa09 ("net: mvpp2: replace TX coalescing interrupts with hrtimer")
Reported-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun says:
====================
net/smc: fixes 2018-09-18
here are some fixes in different areas of the smc code for the net
tree.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Comparing an int to a size, which is unsigned, causes the int to become
unsigned, giving the wrong result. kernel_sendmsg can return a negative
error code.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a linkgroup is terminated abnormally already due to failing
LLC CONFIRM LINK or LLC ADD LINK, fallback to TCP is still possible.
In this case do not switch to state SMC_PEERABORTWAIT and do not set
sk_err.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For a failing smc_listen_rdma_finish() smc_listen_decline() is
called. If fallback is possible, the new socket is already enqueued
to be accepted in smc_listen_decline(). Avoid enqueuing a second time
afterwards in this case, otherwise the smc_create_lgr_pending lock
is released twice:
[ 373.463976] WARNING: bad unlock balance detected!
[ 373.463978] 4.18.0-rc7+ #123 Tainted: G O
[ 373.463979] -------------------------------------
[ 373.463980] kworker/1:1/30 is trying to release lock (smc_create_lgr_pending) at:
[ 373.463990] [<000003ff801205fc>] smc_listen_work+0x22c/0x5d0 [smc]
[ 373.463991] but there are no more locks to release!
[ 373.463991]
other info that might help us debug this:
[ 373.463993] 2 locks held by kworker/1:1/30:
[ 373.463994] #0: 00000000772cbaed ((wq_completion)"events"){+.+.}, at: process_one_work+0x1ec/0x6b0
[ 373.464000] #1: 000000003ad0894a ((work_completion)(&new_smc->smc_listen_work)){+.+.}, at: process_one_work+0x1ec/0x6b0
[ 373.464003]
stack backtrace:
[ 373.464005] CPU: 1 PID: 30 Comm: kworker/1:1 Kdump: loaded Tainted: G O 4.18.0-rc7uschi+ #123
[ 373.464007] Hardware name: IBM 2827 H43 738 (LPAR)
[ 373.464010] Workqueue: events smc_listen_work [smc]
[ 373.464011] Call Trace:
[ 373.464015] ([<0000000000114100>] show_stack+0x60/0xd8)
[ 373.464019] [<0000000000a8c9bc>] dump_stack+0x9c/0xd8
[ 373.464021] [<00000000001dcaf8>] print_unlock_imbalance_bug+0xf8/0x108
[ 373.464022] [<00000000001e045c>] lock_release+0x114/0x4f8
[ 373.464025] [<0000000000aa87fa>] __mutex_unlock_slowpath+0x4a/0x300
[ 373.464027] [<000003ff801205fc>] smc_listen_work+0x22c/0x5d0 [smc]
[ 373.464029] [<0000000000197a68>] process_one_work+0x2a8/0x6b0
[ 373.464030] [<0000000000197ec2>] worker_thread+0x52/0x410
[ 373.464033] [<000000000019fd0e>] kthread+0x15e/0x178
[ 373.464035] [<0000000000aaf58a>] kernel_thread_starter+0x6/0xc
[ 373.464052] [<0000000000aaf584>] kernel_thread_starter+0x0/0xc
[ 373.464054] INFO: lockdep is turned off.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In state SMC_INIT smc_poll() delegates polling to the internal
CLC socket. This means, once the connect worker has finished
its kernel_connect() step, the poll wake-up may occur. This is not
intended. The wake-up should occur from the wake up call in
smc_connect_work() after __smc_connect() has finished.
Thus in state SMC_INIT this patch now calls sock_poll_wait() on the
main SMC socket.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EtherAVB hardware requires 0 to be written to status register bits in
order to clear them, however, care must be taken not to:
1. Clear other bits, by writing zero to them
2. Write one to reserved bits
This patch corrects the ravb driver with respect to the second point above.
This is done by defining reserved bit masks for the affected registers and,
after auditing the code, ensure all sites that may write a one to a
reserved bit use are suitably masked.
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>