Commit Graph

1233180 Commits

Author SHA1 Message Date
Michal Schmidt
e1db8c2a01 ice: lag: in RCU, use atomic allocation
Sleeping is not allowed in RCU read-side critical sections.
Use atomic allocations under rcu_read_lock.

Fixes: 1e0f9881ef ("ice: Flesh out implementation of support for SRIOV on bonded interface")
Fixes: 41ccedf5ca ("ice: implement lag netdev event handler")
Fixes: 3579aa86fb ("ice: update reset path for SRIOV LAG support")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-06 16:42:41 -08:00
Dave Ertman
3e39da4fa1 ice: Fix SRIOV LAG disable on non-compliant aggregate
If an attribute of an aggregate interface disqualifies it from supporting
SRIOV, the driver will unwind the SRIOV support.  Currently the driver is
clearing the feature bit for all interfaces in the aggregate, but this is
not allowing the other interfaces to unwind successfully on driver unload.

Only clear the feature bit for the interface that is currently unwinding.

Fixes: bf65da2eb2 ("ice: enforce interface eligibility and add messaging for SRIOV LAG")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-06 16:42:18 -08:00
Andreas Schwab
e0c0a7c35f riscv: select ARCH_PROC_KCORE_TEXT
This adds a separate segment for kernel text in /proc/kcore, which has a
different address than the direct linear map.

Signed-off-by: Andreas Schwab <schwab@suse.de>
Link: https://lore.kernel.org/r/mvmh6m758ao.fsf@suse.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06 16:25:45 -08:00
Ivan Vecera
aa54d846f3 i40e: Fix devlink port unregistering
Ensure that the devlink port is unregistered after the net device
is unregistered.

Reproducer:
[root@host ~]# rmmod i40e
[ 4742.939386] i40e 0000:02:00.1: i40e_ptp_stop: removed PHC on enp2s0f1np1
[ 4743.059269] ------------[ cut here ]------------
[ 4743.063900] WARNING: CPU: 21 PID: 10766 at net/devlink/port.c:1078 devl_port_unregister+0x69/0x80
...

Fixes: 9e479d64dc ("i40e: Add initial devlink support")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-06 16:16:54 -08:00
Ivan Vecera
e96fe283c6 i40e: Do not call devlink_port_type_clear()
Do not call devlink_port_type_clear() prior to devlink port unregister
and let the devlink core take care of it.

Reproducer:
[root@host ~]# rmmod i40e
[ 4539.964699] i40e 0000:02:00.0: devlink port type for port 0 cleared without a software interface reference, device type not supported by the kernel?
[ 4540.319811] i40e 0000:02:00.1: devlink port type for port 1 cleared without a software interface reference, device type not supported by the kernel?

Fixes: 9e479d64dc ("i40e: Add initial devlink support")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-06 16:16:13 -08:00
Dmitry Torokhov
cdd5b5a976 Merge branch 'next' into for-linus
Prepare input updates for 6.7 merge window.
2023-11-06 15:42:08 -08:00
Linus Torvalds
be3ca57cfb Merge tag 'media/v6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:

 - the old V4L2 core videobuf kAPI was finally removed. All media
   drivers should now be using VB2 kAPI

 - new automotive driver: mgb4

 - new platform video driver: npcm-video

 - new sensor driver: mt9m114

 - new TI driver used in conjunction with Cadence CSI2RX IP to bridge
   TI-specific parts

 - ir-rx51 was removed and the N900 DT binding was moved to the
   pwm-ir-tx generic driver

 - drop atomisp-specific ov5693, using the upstream driver instead

 - the camss driver has gained RDI3 support for VFE 17x

 - the atomisp driver now detects ISP2400 or ISP2401 at run time. No
   need to set it up at build time anymore

 - lots of driver fixes, cleanups and improvements

* tag 'media/v6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (377 commits)
  media: nuvoton: VIDEO_NPCM_VCD_ECE should depend on ARCH_NPCM
  media: venus: Fix firmware path for resources
  media: venus: hfi_cmds: Replace one-element array with flex-array member and use __counted_by
  media: venus: hfi_parser: Add check to keep the number of codecs within range
  media: venus: hfi: add checks to handle capabilities from firmware
  media: venus: hfi: fix the check to handle session buffer requirement
  media: venus: hfi: add checks to perform sanity on queue pointers
  media: platform: cadence: select MIPI_DPHY dependency
  media: MAINTAINERS: Fix path for J721E CSI2RX bindings
  media: cec: meson: always include meson sub-directory in Makefile
  media: videobuf2: Fix IS_ERR checking in vb2_dc_put_userptr()
  media: platform: mtk-mdp3: fix uninitialized variable in mdp_path_config()
  media: mediatek: vcodec: using encoder device to alloc/free encoder memory
  media: imx-jpeg: notify source chagne event when the first picture parsed
  media: cx231xx: Use EP5_BUF_SIZE macro
  media: siano: Drop unnecessary error check for debugfs_create_dir/file()
  media: mediatek: vcodec: Handle invalid encoder vsi
  media: aspeed: Drop unnecessary error check for debugfs_create_file()
  Documentation: media: buffer.rst: fix V4L2_BUF_FLAG_PREPARED
  Documentation: media: gen-errors.rst: fix confusing ENOTTY description
  ...
2023-11-06 15:06:06 -08:00
Clément Léger
4cc0d8a3f1 riscv: kernel: Use correct SYM_DATA_*() macro for data
Some data were incorrectly annotated with SYM_FUNC_*() instead of
SYM_DATA_*() ones. Use the correct ones.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231024132655.730417-4-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06 09:42:48 -08:00
Clément Léger
76329c6939 riscv: Use SYM_*() assembly macros instead of deprecated ones
ENTRY()/END()/WEAK() macros are deprecated and we should make use of the
new SYM_*() macros [1] for better annotation of symbols. Replace the
deprecated ones with the new ones and fix wrong usage of END()/ENDPROC()
to correctly describe the symbols.

[1] https://docs.kernel.org/core-api/asm-annotations.html

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231024132655.730417-3-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06 09:42:47 -08:00
Clément Léger
b18f7296fb riscv: use ".L" local labels in assembly when applicable
For the sake of coherency, use local labels in assembly when
applicable. This also avoids kprobes being confused when applying a
kprobe since the size of function is computed by checking where the
next visible symbol is located. This might end up in computing some
function size to be way shorter than expected and thus failing to apply
kprobes to the specified offset.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231024132655.730417-2-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06 09:42:05 -08:00
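
The size computation described in the message above can be pictured with a small user-space sketch. It is only an illustration under stated assumptions: struct sym, sym_size() and probe_offset_ok() are hypothetical names, the symbol table is assumed to be sorted by address, and none of this is the kernel's kallsyms or kprobes code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol record; not the kernel's kallsyms layout. */
struct sym {
    const char *name;
    unsigned long addr;
};

/*
 * Estimate a function's size as the distance to the next visible symbol.
 * syms must be sorted by ascending address; end is the end of the region.
 */
static unsigned long sym_size(const struct sym *syms, size_t n, size_t i,
                              unsigned long end)
{
    assert(i < n);
    return (i + 1 < n ? syms[i + 1].addr : end) - syms[i].addr;
}

/* A probe at function+offset is accepted only inside the estimated size. */
static int probe_offset_ok(const struct sym *syms, size_t n, size_t i,
                           unsigned long end, unsigned long offset)
{
    return offset < sym_size(syms, n, i, end);
}
```

With a visible label in the middle of a function, the estimated size shrinks to the distance to that label, so an offset past it is rejected. When the label is local (".L" prefixed) and therefore absent from the table, the same offset is accepted.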
Geert Uytterhoeven
57a4542cb7 riscv: boot: Fix creation of loader.bin
When flashing loader.bin for K210 using kflash:

    [ERROR] This is an ELF file and cannot be programmed to flash directly: arch/riscv/boot/loader.bin

Before, loader.bin relied on "OBJCOPYFLAGS := -O binary" in the main
RISC-V Makefile to create a boot image with the right format.  With this
removed, the image is now created in the wrong (ELF) format.

Fix this by adding an explicit rule.

Fixes: 505b02957e ("riscv: Remove duplicate objcopy flag")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/1086025809583809538dfecaa899892218f44e7e.1698159066.git.geert+renesas@glider.be
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06 09:39:26 -08:00
Yuran Pereira
23816724fd kdb: Corrects comment for kdballocenv
This patch corrects the comment for the kdballocenv function.
The previous comment incorrectly described the function's
parameters and return values.

Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com>
Link: https://lore.kernel.org/r/DB3PR10MB6835B383B596133EDECEA98AE8ABA@DB3PR10MB6835.EURPRD10.PROD.OUTLOOK.COM
[daniel.thompson@linaro.org: fixed whitespace alignment in new lines]
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2023-11-06 17:13:55 +00:00
Palmer Dabbelt
9ba91d1356 Merge patch series "riscv: tlb flush improvements"
Alexandre Ghiti <alexghiti@rivosinc.com> says:

This series optimizes the tlb flushes on riscv which used to simply
flush the whole tlb regardless of the size of the range to flush or the
size of the stride.

Patch 3 introduces a threshold that is microarchitecture specific and
will very likely be modified by vendors, not sure though which mechanism
we'll use to do that (dt? alternatives? vendor initialization code?).

* b4-shazam-merge:
  riscv: Improve flush_tlb_kernel_range()
  riscv: Make __flush_tlb_range() loop over pte instead of flushing the whole tlb
  riscv: Improve flush_tlb_range() for hugetlb pages
  riscv: Improve tlb_flush()

Link: https://lore.kernel.org/r/20231030133027.19542-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06 07:20:54 -08:00
Alexandre Ghiti
5e22bfd520 riscv: Improve flush_tlb_kernel_range()
This function used to simply flush the whole tlb of all harts; be more
subtle and try to only flush the range.

The problem is that we can only use PAGE_SIZE as stride since we don't know
the size of the underlying mapping, so this function only improves things
if the size of the region to flush is < threshold * PAGE_SIZE.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/Five SMARC
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20231030133027.19542-5-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06 07:20:52 -08:00
Alexandre Ghiti
9d4e8d5fa7 riscv: Make __flush_tlb_range() loop over pte instead of flushing the whole tlb
Currently, when the range to flush covers more than one page (a 4K page or
a hugepage), __flush_tlb_range() flushes the whole tlb. Flushing the whole
tlb comes with a greater cost than flushing a single entry so we should
flush single entries up to a certain threshold so that:
threshold * cost of flushing a single entry < cost of flushing the whole
tlb.

Co-developed-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/Five SMARC
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20231030133027.19542-4-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06 07:20:51 -08:00
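
The threshold rule from the message above can be sketched in user-space C. This is a minimal illustration, assuming a hypothetical helper flush_per_entry() and a caller-supplied threshold; the real threshold is microarchitecture specific and the kernel code is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide between per-entry flushes and one full flush.
 * threshold is a hypothetical tunable: the number of single-entry flushes
 * whose total cost is assumed to stay below the cost of a full flush.
 */
static bool flush_per_entry(unsigned long size, unsigned long stride,
                            unsigned long threshold)
{
    assert(stride > 0);
    /* Round up so a partial trailing stride still counts as one entry. */
    unsigned long nr_entries = size / stride + (size % stride != 0);
    return nr_entries <= threshold;
}
```

For example, with a 4096-byte stride and an arbitrary threshold of 64, a 32 KiB range yields 8 entries and is flushed entry by entry, while a 65-page range falls back to a full flush.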
Alexandre Ghiti
c962a6e746 riscv: Improve flush_tlb_range() for hugetlb pages
flush_tlb_range() uses a fixed stride of PAGE_SIZE and in its current form,
when a hugetlb mapping needs to be flushed, flush_tlb_range() flushes the
whole tlb: so set a stride of the size of the hugetlb mapping in order to
only flush the hugetlb mapping. However, if the hugepage is a NAPOT region,
all PTEs that constitute this mapping must be invalidated, so the stride
size must actually be the size of the PTE.

Note that THPs are directly handled by flush_pmd_tlb_range().

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/Five SMARC
Link: https://lore.kernel.org/r/20231030133027.19542-3-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06 07:20:50 -08:00
Alexandre Ghiti
c5e9b2c2ae riscv: Improve tlb_flush()
For now, tlb_flush() simply calls flush_tlb_mm() which results in a
flush of the whole TLB. So let's use mmu_gather fields to provide a more
fine-grained flush of the TLB.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/Five SMARC
Link: https://lore.kernel.org/r/20231030133027.19542-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06 07:20:49 -08:00
David S. Miller
c1ed833e0b Merge branch 'smc-fixes'
D. Wythe says

====================
bug fixes for smc

This patch series includes the following bug fixes:

1. hung state
2. sock leak
3. potential panic

We have been testing these patches for some time, but
if you have any questions, please let us know.

--
v1:
Fix spelling errors and incorrect function names in descriptions

v2->v1:
Add fix tags for bugfix patch
====================

Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-06 10:01:08 +00:00
D. Wythe
aa96fbd6d7 net/smc: put sk reference if close work was canceled
Note that we always hold a reference to sock when attempting
to submit close_work. Therefore, if we have successfully
canceled close_work from pending, we MUST release that reference
to avoid potential leaks.

Fixes: 42bfba9eaa ("net/smc: immediate termination for SMCD link groups")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-06 10:01:07 +00:00
D. Wythe
c5bf605ba4 net/smc: allow cdc msg send rather than drop it with NULL sndbuf_desc
This patch re-fixes the issues mentioned by commit 22a825c541
("net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()").

Blocking the message send does solve those issues, but it also
prevents the peer from receiving the final message. Besides, logically,
whether sndbuf_desc is NULL or not has no impact on the processing
of cdc message sending.

Hence, this patch allows the cdc message to be sent but checks
sndbuf_desc with care in smc_cdc_tx_handler().

Fixes: 22a825c541 ("net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-06 10:01:07 +00:00
D. Wythe
5211c97294 net/smc: fix dangling sock under state SMC_APPFINCLOSEWAIT
Consider the following scenario:

				smc_cdc_rx_handler
__smc_release
				sock_set_flag
smc_close_active()
sock_set_flag

__set_bit(DEAD)			__set_bit(DONE)

Since __set_bit is not atomic, the DEAD or DONE flag might be lost.
If the DEAD flag is lost, the state SMC_CLOSED will never be reached
in smc_close_passive_work:

if (sock_flag(sk, SOCK_DEAD) &&
	smc_close_sent_any_close(conn)) {
	sk->sk_state = SMC_CLOSED;
} else {
	/* just shutdown, but not yet closed locally */
	sk->sk_state = SMC_APPFINCLOSEWAIT;
}

Replacing sock_set_flag or __set_bit with set_bit fixes this problem,
since set_bit is atomic.

Fixes: b38d732477 ("smc: socket closing and linkgroup cleanup")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-06 10:01:07 +00:00
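
The lost update described above can be replayed deterministically in user space. The sketch below is an assumption-laden illustration: FLAG_DEAD and FLAG_DONE are hypothetical bit values, the interleaving is written out by hand rather than produced by real threads, and C11 atomic_fetch_or() stands in for the kernel's set_bit().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit values standing in for the DEAD and DONE flags. */
enum { FLAG_DEAD = 1u << 0, FLAG_DONE = 1u << 1 };

/*
 * Replay the racy interleaving by hand: both writers load the old word
 * before either stores, so the second store overwrites the first.
 */
static unsigned int racy_interleaving(unsigned int flags)
{
    unsigned int seen_by_a = flags;   /* writer A loads */
    unsigned int seen_by_b = flags;   /* writer B loads */

    flags = seen_by_a | FLAG_DEAD;    /* writer A stores */
    flags = seen_by_b | FLAG_DONE;    /* writer B stores; DEAD is lost */
    return flags;
}

/* With an atomic read-modify-write, neither update can be lost. */
static unsigned int atomic_updates(unsigned int initial)
{
    atomic_uint flags;

    assert((initial & ~(unsigned int)(FLAG_DEAD | FLAG_DONE)) == 0);
    atomic_init(&flags, initial);
    atomic_fetch_or(&flags, FLAG_DEAD);
    atomic_fetch_or(&flags, FLAG_DONE);
    return atomic_load(&flags);
}
```

Starting from an empty flag word, the hand-written interleaving ends with only the DONE bit set, whereas the atomic variant ends with both bits set whatever the order of the two updates.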
Jakub Kicinski
d93f952857 nfsd: regenerate user space parsers after ynl-gen changes
Commit 8cea95b0bd ("tools: ynl-gen: handle do ops with no input attrs")
added support for some of the previously-skipped ops in nfsd.
Regenerate the user space parsers to fill them in.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-06 09:03:46 +00:00
Kuniyuki Iwashima
0a8e987dcc tcp: Fix SYN option room calculation for TCP-AO.
When building SYN packet in tcp_syn_options(), MSS, TS, WS, and
SACKPERM are used without checking the remaining bytes in the
options area.

To keep that logic as is, we limit the TCP-AO MAC length in
tcp_ao_parse_crypto().  Currently, the limit is calculated as below.

  MAX_TCP_OPTION_SPACE - TCPOLEN_TSTAMP_ALIGNED
                       - TCPOLEN_WSCALE_ALIGNED
                       - TCPOLEN_SACKPERM_ALIGNED

This looks confusing as (1) we pack SACKPERM into the leading
2-bytes of the aligned 12-bytes of TS and (2) TCPOLEN_MSS_ALIGNED
is not used.  Fortunately, the calculated limit is not wrong as
TCPOLEN_SACKPERM_ALIGNED and TCPOLEN_MSS_ALIGNED are the same value.

However, we should use the proper constant in the formula.

  MAX_TCP_OPTION_SPACE - TCPOLEN_MSS_ALIGNED
                       - TCPOLEN_TSTAMP_ALIGNED
                       - TCPOLEN_WSCALE_ALIGNED

Fixes: 4954f17dde ("net/tcp: Introduce TCP_AO setsockopt()s")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-06 08:59:54 +00:00
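
The two formulas above can be checked with plain arithmetic. The sketch below assumes the constant values found in the kernel's include/net/tcp.h (40, 4, 12, 4, 4); treat them as assumptions to verify against the tree you use. The function names are hypothetical.

```c
#include <assert.h>

/* Assumed values, mirroring include/net/tcp.h; verify against your tree. */
#define MAX_TCP_OPTION_SPACE      40
#define TCPOLEN_MSS_ALIGNED        4
#define TCPOLEN_TSTAMP_ALIGNED    12
#define TCPOLEN_WSCALE_ALIGNED     4
#define TCPOLEN_SACKPERM_ALIGNED   4

/* Limit as previously written: subtracts SACKPERM, which is packed into TS. */
static int syn_room_old(void)
{
    int room = MAX_TCP_OPTION_SPACE - TCPOLEN_TSTAMP_ALIGNED
               - TCPOLEN_WSCALE_ALIGNED - TCPOLEN_SACKPERM_ALIGNED;

    assert(room >= 0);
    return room;
}

/* Limit with the proper constant: subtracts MSS instead of SACKPERM. */
static int syn_room_new(void)
{
    int room = MAX_TCP_OPTION_SPACE - TCPOLEN_MSS_ALIGNED
               - TCPOLEN_TSTAMP_ALIGNED - TCPOLEN_WSCALE_ALIGNED;

    assert(room >= 0);
    return room;
}
```

Under these assumed values both functions return 20, which matches the statement that the old limit was not wrong, only written with a misleading constant.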
Geetha sowjanya
3423ca23e0 octeontx2-pf: Free pending and dropped SQEs
On interface down, the pending SQEs in the NIX get dropped
or drained out during SMQ flush. But the skbs pointed to by these
SQEs never get freed or updated to the stack, as the respective CQEs
never get added.
This patch fixes the issue by freeing all valid skbs in the SQ SG list.

Fixes: b1bc8457e9 ("octeontx2-pf: Cleanup all receive buffers in SG descriptor")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-06 08:57:15 +00:00
Jamal Hadi Salim
40cb2fdfed net, sched: Fix SKB_NOT_DROPPED_YET splat under debug config
Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
reproducer [2]. Problem seems to be that classifiers clear 'struct
tcf_result::drop_reason', thereby triggering the warning in
__kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).

Fixed by disambiguating a legit error from a verdict with a bogus drop_reason.

[1]
WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
Modules linked in:
CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
RIP: 0010:kfree_skb_reason+0x38/0x130
[...]
Call Trace:
 <IRQ>
 __netif_receive_skb_core.constprop.0+0x837/0xdb0
 __netif_receive_skb_one_core+0x3c/0x70
 process_backlog+0x95/0x130
 __napi_poll+0x25/0x1b0
 net_rx_action+0x29b/0x310
 __do_softirq+0xc0/0x29b
 do_softirq+0x43/0x60
 </IRQ>

[2]

ip link add name veth0 type veth peer name veth1
ip link set dev veth0 up
ip link set dev veth1 up
tc qdisc add dev veth1 clsact
tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1

Ido reported:

  [...] getting the following splat [1] with CONFIG_DEBUG_NET=y and this
  reproducer [2]. Problem seems to be that classifiers clear 'struct
  tcf_result::drop_reason', thereby triggering the warning in
  __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0). [...]

  [1]
  WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
  Modules linked in:
  CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
  RIP: 0010:kfree_skb_reason+0x38/0x130
  [...]
  Call Trace:
   <IRQ>
   __netif_receive_skb_core.constprop.0+0x837/0xdb0
   __netif_receive_skb_one_core+0x3c/0x70
   process_backlog+0x95/0x130
   __napi_poll+0x25/0x1b0
   net_rx_action+0x29b/0x310
   __do_softirq+0xc0/0x29b
   do_softirq+0x43/0x60
   </IRQ>

  [2]
  #!/bin/bash

  ip link add name veth0 type veth peer name veth1
  ip link set dev veth0 up
  ip link set dev veth1 up
  tc qdisc add dev veth1 clsact
  tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
  mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1

What happens is that inside most classifiers the tcf_result is copied over
from a filter template e.g. *res = f->res which then implicitly overrides
the prior SKB_DROP_REASON_TC_{INGRESS,EGRESS} default drop code which was
set via sch_handle_{ingress,egress}() for kfree_skb_reason().

Commit text above copied verbatim from Daniel. The general idea of the patch
is not very different from what Ido originally posted but instead done at the
cls_api codepath.

Fixes: 54a59aed39 ("net, sched: Make tc-related drop reason more flexible")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/netdev/ZTjY959R+AFXf3Xy@shredder
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-06 08:56:25 +00:00
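
The clobbering described above (a whole-struct copy such as *res = f->res wiping a previously set default) can be reproduced in a few lines of user-space C. Everything here is a hypothetical stand-in: struct result, struct filter and both classify_*() helpers are invented for illustration, and the "preserving" variant is only one conceivable remedy, not the change that was merged in the cls_api codepath.

```c
#include <assert.h>

/* Hypothetical stand-ins; 0 mirrors SKB_NOT_DROPPED_YET in the message. */
enum drop_reason { NOT_DROPPED_YET = 0, DROP_TC_INGRESS = 1 };

struct result {
    unsigned long class_id;
    enum drop_reason drop_reason;
};

struct filter {
    struct result res;   /* template; drop_reason is left at 0 */
};

/* Whole-struct copy: the caller's default drop_reason is overwritten. */
static void classify_clobbering(const struct filter *f, struct result *res)
{
    assert(f && res);
    *res = f->res;
}

/* Illustrative remedy (assumption): keep the caller's reason across the copy. */
static void classify_preserving(const struct filter *f, struct result *res)
{
    assert(f && res);
    enum drop_reason saved = res->drop_reason;

    *res = f->res;
    res->drop_reason = saved;
}
```

If the caller sets drop_reason to DROP_TC_INGRESS before classification, the first helper leaves it at NOT_DROPPED_YET (the value that triggers the warning), while the second keeps the caller's default and still copies class_id from the template.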
Linus Torvalds
d2f51b3516 Merge tag 'rtc-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
 "There is a new driver for the RTC of the Mstar SSD202D SoC. The
  rtc7301 driver gains support for byte addresses to support the
  USRobotics USR8200. Then we have many non user visible changes and
  typo fixes.

  Summary:

  Subsystem:
   - convert platform drivers to remove_new
   - prevent modpost warnings for unremovable platform drivers

  New driver:
   - Mstar SSD202D

  Drivers:
   - brcmstb-waketimer: support level alarm_irq
   - ep93xx: add DT support
   - rtc7301: support byte-addressed IO"

* tag 'rtc-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (28 commits)
  dt-bindings: rtc: Add Mstar SSD202D RTC
  rtc: Add support for the SSD202D RTC
  rtc: at91rm9200: annotate at91_rtc_remove with __exit again
  dt-bindings: rtc: microcrystal,rv3032: Document wakeup-source property
  dt-bindings: rtc: pcf8523: Convert to YAML
  dt-bindings: rtc: mcp795: move to trivial-rtc
  rtc: ep93xx: add DT support for Cirrus EP93xx
  dt-bindings: rtc: Add Cirrus EP93xx
  dt-bindings: rtc: pcf2123: convert to YAML
  rtc: efi: fixed typo in efi_procfs()
  rtc: omap: Use device_get_match_data()
  rtc: pcf85363: fix wrong mask/val parameters in regmap_update_bits call
  rtc: rtc7301: Support byte-addressed IO
  rtc: rtc7301: Rewrite bindings in schema
  rtc: sh: Convert to platform remove callback returning void
  rtc: pxa: Convert to platform remove callback returning void
  rtc: mv: Convert to platform remove callback returning void
  rtc: imxdi: Convert to platform remove callback returning void
  rtc: at91rm9200: Convert to platform remove callback returning void
  rtc: pcap: Drop no-op remove function
  ...
2023-11-05 18:49:40 -08:00
Linus Torvalds
7b2c9e41e7 Merge tag 'mailbox-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox
Pull mailbox updates from Jassi Brar:

 - imx: add support for TX Doorbell v2

 - mtk: implement runtime PM

 - zynqmp: add destination mailbox compatible

 - qcom:
    - add another clock provider for IPQ
    - add SM8650 compatible

 - misc: use preferred device_get_match_data()

* tag 'mailbox-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox:
  dt-bindings: mailbox: qcom-ipcc: document the SM8650 Inter-Processor Communication Controller
  mailbox: mtk-cmdq-mailbox: Implement Runtime PM with autosuspend
  mailbox: Use device_get_match_data()
  dt-bindings: zynqmp: add destination mailbox compatible
  dt-bindings: mailbox: qcom: add one more clock provider for IPQ mailbox
  mailbox: imx: support channel type tx doorbell v2
  dt-bindings: mailbox: fsl,mu: add new tx doorbell channel
2023-11-05 18:45:32 -08:00
Dave Airlie
9ccde17d46 Merge tag 'amd-drm-next-6.7-2023-11-03' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.7-2023-11-03:

amdgpu:
- Fix RAS support check
- RAS fixes
- MES fixes
- SMU13 fixes
- Contiguous memory allocation fix
- BACO fixes
- GPU reset fixes
- Min power limit fixes
- GFX11 fixes
- USB4/TB hotplug fixes
- ARM regression fix
- GFX9.4.3 fixes
- KASAN/KCSAN stack size check fixes
- SR-IOV fixes
- SMU14 fixes
- PSP13 fixes
- Display blend fixes
- Flexible array size fixes

amdkfd:
- GPUVM fix

radeon:
- Flexible array size fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231103173203.4912-1-alexander.deucher@amd.com
2023-11-06 11:25:14 +10:00
Dave Airlie
f056cb9681 Merge tag 'drm-misc-next-fixes-2023-11-02' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next-fixes for v6.7-rc1:

- dt binding fix for ssd132x
- Initialize ssd130x crtc_state to NULL.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/58f40043-bb8a-4716-bf07-89f6a9f56c4c@linux.intel.com
2023-11-06 11:24:30 +10:00
Andreas Gruenbacher
0cdc6f44e9 gfs2: don't withdraw if init_threads() got interrupted
In gfs2_fill_super(), when mounting a gfs2 filesystem is interrupted,
kthread_create() can return -EINTR.  When that happens, we roll back
what has already been done and abort the mount.

Since commit 62dd0f98a0 ("gfs2: Flag a withdraw if init_threads()
fails"), we are calling gfs2_withdraw_delayed() in gfs2_fill_super();
first via gfs2_make_fs_rw(), then directly.  But gfs2_withdraw_delayed()
only marks the filesystem as withdrawing and relies on a caller further
up the stack to do the actual withdraw, which doesn't exist in the
gfs2_fill_super() case.  Because the filesystem is marked as withdrawing
/ withdrawn, function gfs2_lm_unmount() doesn't release the dlm
lockspace, so when we try to mount that filesystem again, we get:

    gfs2: fsid=gohan:gohan0: Trying to join cluster "lock_dlm", "gohan:gohan0"
    gfs2: fsid=gohan:gohan0: dlm_new_lockspace error -17

Since commit b77b4a4815 ("gfs2: Rework freeze / thaw logic"), the
deadlock this gfs2_withdraw_delayed() call was supposed to work around
cannot occur anymore because freeze_go_callback() won't take the
sb->s_umount semaphore unconditionally anymore, so we can get rid of the
gfs2_withdraw_delayed() in gfs2_fill_super() entirely.

Reported-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Cc: stable@vger.kernel.org # v6.5+
2023-11-06 01:51:26 +01:00
Su Hui
bb25b97562 gfs2: remove dead code in add_to_queue
clang static analyzer complains that value stored to 'gh' is never read.
The code of this line is useless after commit 0b93bac227
("gfs2: Remove LM_FLAG_PRIORITY flag"). Remove this code to save space.

Signed-off-by: Su Hui <suhui@nfschina.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06 01:51:26 +01:00
Juntong Deng
bdcb8aa434 gfs2: Fix slab-use-after-free in gfs2_qd_dealloc
In gfs2_put_super(), whether withdrawn or not, the quota should
be cleaned up by gfs2_quota_cleanup().

Otherwise, struct gfs2_sbd will be freed before gfs2_qd_dealloc (rcu
callback) has run for all gfs2_quota_data objects, resulting in
use-after-free.

Also, gfs2_destroy_threads() and gfs2_quota_cleanup() are already called
by gfs2_make_fs_ro(), so in gfs2_put_super(), after calling
gfs2_make_fs_ro(), there is no need to call them again.

Reported-by: syzbot+29c47e9e51895928698c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=29c47e9e51895928698c
Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06 01:51:26 +01:00
Andreas Gruenbacher
074d7306a4 gfs2: Silence "suspicious RCU usage in gfs2_permission" warning
Commit 0abd1557e2 added rcu_dereference() for dereferencing ip->i_gl
in gfs2_permission.  This now causes lockdep to complain when
gfs2_permission is called in non-RCU context:

    WARNING: suspicious RCU usage in gfs2_permission

Switch to rcu_dereference_check() and check for the MAY_NOT_BLOCK flag
to shut up lockdep when we know that dereferencing ip->i_gl is safe.

Fixes: 0abd1557e2 ("gfs2: fix an oops in gfs2_permission")
Reported-by: syzbot+3e5130844b0c0e2b4948@syzkaller.appspotmail.com
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06 01:51:26 +01:00
Amir Goldstein
d6fc6c9363 gfs2: fs: derive f_fsid from s_uuid
gfs2 already has optional persistent uuid.

Use that uuid to report f_fsid in statfs(2), same as ext2/ext4/zonefs.

This allows gfs2 to be monitored by fanotify filesystem watch.
For example, with inotify-tools 4.23.8.0, the following command can be
used to watch changes over the entire filesystem:

  fsnotifywatch --filesystem /mnt/gfs2

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06 01:51:26 +01:00
Andreas Gruenbacher
0b2355fe91 gfs2: No longer use 'extern' in function declarations
For non-static function declarations, external linkage is implied and
the 'extern' keyword isn't needed.  Some static checkers complain about
the overuse of 'extern', so clean up all the function declarations.

In addition, remove 'extern' from the definition of
free_local_statfs_inodes(); it isn't needed there, either.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06 01:51:26 +01:00
Andreas Gruenbacher
062fb90389 gfs2: Rename gfs2_lookup_{ simple => meta }
Function gfs2_lookup_simple() is used for looking up inodes in the
metadata directory tree, so rename it to gfs2_lookup_meta() to more
closely match its purpose.  Clean the function up a little on the way.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06 01:51:26 +01:00
Andreas Gruenbacher
be7f6a6b0b gfs2: Convert gfs2_internal_read to folios
Change gfs2_internal_read() to use folios.  Convert sizes to size_t.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06 01:51:26 +01:00
Andreas Gruenbacher
7fa4964b35 gfs2: Convert stuffed_readpage to folios
Change stuffed_readpage() to take a folio instead of a page.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06 01:51:25 +01:00
Andreas Gruenbacher
d6d64dac1d gfs2: Minor gfs2_write_jdata_batch PAGE_SIZE cleanup
In gfs2_write_jdata_batch(), to compute the number of blocks, compute
the total size of the folio batch instead of the number of pages it
contains.  Not a functional change.

Note that we don't currently allow mounting filesystems with a block
size bigger than the page size.  We could change that after converting
the page cache to folios.  The page cache would then only contain
block-size or bigger folios, so rounding wouldn't become an issue here.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06 01:51:25 +01:00
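
The difference between counting pages and summing sizes can be sketched as follows. This is a hedged illustration only: EXAMPLE_PAGE_SIZE, blocks_from_pages() and blocks_from_sizes() are hypothetical, the folio sizes are passed in as a plain array, and the actual gfs2 code is not reproduced.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* Page-count based: assumes every batch entry is exactly one page. */
static size_t blocks_from_pages(size_t nr_pages, size_t block_size)
{
    assert(block_size > 0);
    return nr_pages * (EXAMPLE_PAGE_SIZE / block_size);
}

/* Size based: sums the byte size of each entry, whatever its size. */
static size_t blocks_from_sizes(const size_t *folio_sizes, size_t nr_folios,
                                size_t block_size)
{
    size_t total = 0;

    assert(block_size > 0);
    for (size_t i = 0; i < nr_folios; i++)
        total += folio_sizes[i];
    return total / block_size;
}
```

When every entry is one page, both helpers agree, which is consistent with the change not being functional today. They diverge only once an entry is larger than a page.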
Andreas Gruenbacher
4c7b3f7fb7 gfs2: Get rid of gfs2_alloc_blocks generation parameter
Get rid of the generation parameter of gfs2_alloc_blocks(): we only ever
set the generation of the current inode while creating it, so do so
directly.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06 01:51:25 +01:00
Jisheng Zhang
dbfbda3bd6 riscv: mm: update T-Head memory type definitions
Update the T-Head memory type definitions according to the C910 doc [1].
For NC and IO, the SH property isn't configurable and is hardcoded as SH,
so set SH for NOCACHE and IO.

Also set bit[61] (Bufferable) for NOCACHE according to table 6.1 in
the doc [1].

Link: https://github.com/T-head-Semi/openc910 [1]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Guo Ren <guoren@kernel.org>
Tested-by: Drew Fustini <dfustini@baylibre.com>
Link: https://lore.kernel.org/r/20230912072510.2510-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-05 14:17:32 -08:00
Palmer Dabbelt
7f00a97500 Merge patch series "riscv: vdso.lds.S: some improvement"
Jisheng Zhang <jszhang@kernel.org> says:

This series revives one of my RFC patches from last year [1] and tries
to improve the vdso layout a bit.

patch1 removes useless symbols
patch2 merges .data section of vdso into .rodata because they are
readonly
patch3 is the actual revived patch: it removes the hardcoded 0x800 .text
start addr. I rewrote the commit msg per Andrew's suggestions and moved
.note, .eh_frame_hdr, and .eh_frame between .rodata and .text to
keep the actual code well away from the non-instruction data.

* b4-shazam-merge:
  riscv: vdso.lds.S: remove hardcoded 0x800 .text start addr
  riscv: vdso.lds.S: merge .data section into .rodata section
  riscv: vdso.lds.S: drop __alt_start and __alt_end symbols

Link: https://lore.kernel.org/linux-riscv/20221123161805.1579-1-jszhang@kernel.org/ [1]
Link: https://lore.kernel.org/r/20230912072015.2424-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-05 14:15:17 -08:00
Jisheng Zhang
8f8c1ff879 riscv: vdso.lds.S: remove hardcoded 0x800 .text start addr
I believe the hardcoded 0x800 and the related comments come from the
long-standing VDSO_TEXT_OFFSET in the x86 vdso code, but commit 5b93049337
("x86 vDSO: generate vdso-syms.lds") and commit f6b46ebf90 ("x86
vDSO: new layout") removed the comment and hard coding for x86.

Like x86 and other architectures, riscv doesn't need the rigid layout
using VDSO_TEXT_OFFSET since it "no longer matters to the kernel",
so we can remove the hard coding now. Removing it yields a smaller
vdso.so and aligns riscv with other architectures.

Also, having enough separation between data and text is important for
the I-cache, so, as on x86, move .note, .eh_frame_hdr, and .eh_frame
between .rodata and .text.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Link: https://lore.kernel.org/r/20230912072015.2424-4-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-05 14:15:14 -08:00
Jisheng Zhang
49cfbdc21f riscv: vdso.lds.S: merge .data section into .rodata section
The .data section doesn't need to be separate from the .rodata section;
they are both read-only.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Link: https://lore.kernel.org/r/20230912072015.2424-3-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-05 14:15:13 -08:00
Jisheng Zhang
ddcc7d9bf5 riscv: vdso.lds.S: drop __alt_start and __alt_end symbols
These two symbols are not used, remove them.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Link: https://lore.kernel.org/r/20230912072015.2424-2-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-05 14:15:13 -08:00
Yunhui Cui
b8a03a6341 riscv: add userland instruction dump to RISC-V splats
Add userland instruction dump and rename dump_kernel_instr()
to dump_instr().

An example:
[    0.822439] Freeing unused kernel image (initmem) memory: 6916K
[    0.823817] Run /init as init process
[    0.839411] init[1]: unhandled signal 4 code 0x1 at 0x000000000005be18 in bb[10000+5fb000]
[    0.840751] CPU: 0 PID: 1 Comm: init Not tainted 5.14.0-rc4-00049-gbd644290aa72-dirty #187
[    0.841373] Hardware name:  , BIOS
[    0.841743] epc : 000000000005be18 ra : 0000000000079e74 sp : 0000003fffcafda0
[    0.842271]  gp : ffffffff816e9dc8 tp : 0000000000000000 t0 : 0000000000000000
[    0.842947]  t1 : 0000003fffc9fdf0 t2 : 0000000000000000 s0 : 0000000000000000
[    0.843434]  s1 : 0000000000000000 a0 : 0000003fffca0190 a1 : 0000003fffcafe18
[    0.843891]  a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
[    0.844357]  a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000
[    0.844803]  s2 : 0000000000000000 s3 : 0000000000000000 s4 : 0000000000000000
[    0.845253]  s5 : 0000000000000000 s6 : 0000000000000000 s7 : 0000000000000000
[    0.845722]  s8 : 0000000000000000 s9 : 0000000000000000 s10: 0000000000000000
[    0.846180]  s11: 0000000000d144e0 t3 : 0000000000000000 t4 : 0000000000000000
[    0.846616]  t5 : 0000000000000000 t6 : 0000000000000000
[    0.847204] status: 0000000200000020 badaddr: 00000000f0028053 cause: 0000000000000002
[    0.848219] Code: f06f ff5f 3823 fa11 0113 fb01 2e23 0201 0293 0000 (8053) f002
[    0.851016] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20230912021349.28302-1-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-05 14:14:13 -08:00
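The "Code:" line in the example above lists 16-bit parcels, with one parcel wrapped in parentheses; in that example the parenthesized parcel and the one after it (8053, f002) match the reported badaddr f0028053. The user-space sketch below produces a line in that shape. The function name, the parcel array, and the pc_index parameter are hypothetical, and this is not the kernel's dump_instr().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical formatter: write parcels as 4-digit hex words separated by
 * spaces, wrapping the parcel at pc_index in parentheses.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int sketch_format_code(char *buf, size_t len, const uint16_t *parcels,
			      size_t count, size_t pc_index)
{
	size_t off = 0;

	assert(buf != NULL && parcels != NULL);
	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (size_t i = 0; i < count; i++) {
		const char *sep = (i + 1 < count) ? " " : "";
		int n;

		if (i == pc_index)
			n = snprintf(buf + off, len - off, "(%04x)%s",
				     (unsigned int)parcels[i], sep);
		else
			n = snprintf(buf + off, len - off, "%04x%s",
				     (unsigned int)parcels[i], sep);
		if (n < 0 || (size_t)n >= len - off)
			return -1;	/* output would not fit */
		off += (size_t)n;
	}
	return (int)off;
}
```

Working in 16-bit parcels keeps the output readable whether the surrounding instructions are compressed or full-width.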
Nam Cao
8cb22bec14 riscv: kprobes: allow writing to x0
Instructions can write to x0, so we should simulate these instructions
normally.

Currently, the kernel hangs if an instruction that writes to x0 is
simulated.

Fixes: c22b0bcb1d ("riscv: Add kprobes supported")
Cc: stable@vger.kernel.org
Signed-off-by: Nam Cao <namcaov@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20230829182500.61875-1-namcaov@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-05 14:12:47 -08:00
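The idea behind the x0 fix above can be sketched in user space: x0 always reads as zero, so a write to it is discarded, but the simulated instruction should still count as handled. The register file struct and helper names below are assumptions for illustration and do not reproduce the kernel's kprobes code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_REGS 32u

/* Hypothetical register file; slot 0 is never used because x0 is hardwired to 0. */
struct sketch_regs {
	uint64_t x[SKETCH_NR_REGS];
};

/* Returns true when the write was simulated, false for an invalid index. */
static bool sketch_set_reg(struct sketch_regs *regs, unsigned int idx,
			   uint64_t val)
{
	assert(regs != NULL);
	if (idx >= SKETCH_NR_REGS)
		return false;
	if (idx == 0)
		return true;	/* value is dropped, but this is not an error */
	regs->x[idx] = val;
	return true;
}

static uint64_t sketch_get_reg(const struct sketch_regs *regs, unsigned int idx)
{
	assert(regs != NULL && idx < SKETCH_NR_REGS);
	return idx == 0 ? 0 : regs->x[idx];
}
```

Reporting failure for a write to x0 would make the caller treat a perfectly valid instruction as one it cannot simulate.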
Nam Cao
b701f9e726 riscv: provide riscv-specific is_trap_insn()
uprobes expects is_trap_insn() to return true for any trap instruction,
not just the one used for installing a uprobe. The current default
implementation only returns true for the 16-bit c.ebreak if the C extension
is enabled. This can confuse uprobes if a 32-bit ebreak generates a trap
exception from userspace: uprobes asks is_trap_insn(), which says there is
no trap, so uprobes assumes a probe was there before but has been removed,
and returns to the trap instruction. This causes an infinite loop of
entering and exiting the trap handler.

Instead of using the default implementation, implement this function
specifically for riscv with checks for both ebreak and c.ebreak.

Fixes: 74784081aa ("riscv: Add uprobes supported")
Signed-off-by: Nam Cao <namcaov@gmail.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20230829083614.117748-1-namcaov@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-05 14:12:28 -08:00
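A check covering both breakpoint encodings, as the is_trap_insn() commit above describes, might look like the user-space sketch below. The encodings 0x00100073 (ebreak) and 0x9002 (c.ebreak) come from the RISC-V ISA; the macro and function names are hypothetical, and the sketch is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_EBREAK	0x00100073u	/* 32-bit ebreak */
#define SKETCH_C_EBREAK	0x9002u		/* 16-bit c.ebreak */

/*
 * Hypothetical check: insn holds up to 32 bits fetched at the trap address.
 * Low two bits of 0b11 mean a 32-bit (or longer) encoding; anything else is
 * a 16-bit compressed instruction, so only the low half is compared.
 */
static bool sketch_is_trap_insn(uint32_t insn)
{
	if ((insn & 0x3u) == 0x3u)
		return insn == SKETCH_EBREAK;
	return (insn & 0xffffu) == SKETCH_C_EBREAK;
}
```

Masking to the low 16 bits in the compressed case matters because the upper half of the fetched word belongs to the following instruction.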
Alexander Gordeev
02e790ee30 s390/mm: make pte_free_tlb() similar to pXd_free_tlb()
Make pte_free_tlb() look similar to pXd_free_tlb() family
functions.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-11-05 22:34:58 +01:00
Alexander Gordeev
0031f1c7cf s390/mm: use compound page order to distinguish page tables
CRSTs always have a size of four pages, while 2KB-size page tables
always occupy a single page. Use that information to distinguish
page tables from CRSTs.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-11-05 22:34:58 +01:00
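The distinction used by the compound page order commit above can be sketched as follows: a four-page allocation has order 2 and a single page has order 0, so the order alone tells the two apart. The helper names and the SKETCH_CRST_ORDER constant are assumptions for illustration, not the s390 code.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_CRST_ORDER 2u	/* four pages = 2^2 pages */

/* Hypothetical helper: allocation order for a power-of-two page count. */
static unsigned int sketch_order_for_pages(unsigned int nr_pages)
{
	unsigned int order = 0;

	assert(nr_pages != 0 && (nr_pages & (nr_pages - 1)) == 0);
	while ((1u << order) < nr_pages)
		order++;
	return order;
}

/* A table backed by an order-2 allocation is treated as a CRST. */
static bool sketch_is_crst(unsigned int order)
{
	return order == SKETCH_CRST_ORDER;
}
```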