linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-01 06:04:48 -04:00

Author	SHA1	Message	Date
Thomas Weißschuh	c55eb03765	net/ipv6/addrconf: constify ctl_table arguments of utility functions The sysctl core is preparing to only expose instances of struct ctl_table as "const". This will also affect the ctl_table argument of sysctl handlers. As the function prototype of all sysctl handlers throughout the tree needs to stay consistent that change will be done in one commit. To reduce the size of that final commit, switch utility functions which are not bound by "typedef proc_handler" to "const struct ctl_table". No functional change. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20240527-sysctl-const-handler-net-v1-3-16523767d0b2@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-28 19:49:47 -07:00
Thomas Weißschuh	551814313f	net/ipv4/sysctl: constify ctl_table arguments of utility functions The sysctl core is preparing to only expose instances of struct ctl_table as "const". This will also affect the ctl_table argument of sysctl handlers. As the function prototype of all sysctl handlers throughout the tree needs to stay consistent that change will be done in one commit. To reduce the size of that final commit, switch utility functions which are not bound by "typedef proc_handler" to "const struct ctl_table". No functional change. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20240527-sysctl-const-handler-net-v1-2-16523767d0b2@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-28 19:49:47 -07:00
Thomas Weißschuh	874aa96d78	net/neighbour: constify ctl_table arguments of utility function The sysctl core is preparing to only expose instances of struct ctl_table as "const". This will also affect the ctl_table argument of sysctl handlers. As the function prototype of all sysctl handlers throughout the tree needs to stay consistent that change will be done in one commit. To reduce the size of that final commit, switch utility functions which are not bound by "typedef proc_handler" to "const struct ctl_table". No functional change. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20240527-sysctl-const-handler-net-v1-1-16523767d0b2@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-28 19:49:47 -07:00
Heiner Kallweit	982300c115	r8169: remove detection of chip version 11 (early RTL8168b) This early RTL8168b version was the first PCIe chip version, and it's quite quirky. Last sign of life is from more than 15 yrs ago. Let's remove detection of this chip version, we'll see whether anybody complains. If not, support for this chip version can be removed a few kernel versions later. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/875cdcf4-843c-420a-ad5d-417447b68572@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-28 19:47:42 -07:00
Heiner Kallweit	6994520a33	r8169: disable interrupt source RxOverflow Vendor driver calls this bit RxDescUnavail. All we do in the interrupt handler in this case is scheduling NAPI. If we should be out of RX descriptors, then NAPI is scheduled anyway. Therefore remove this interrupt source. Tested on RTL8168h. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Sunil Goutham <sgoutham@marvell.com> Link: https://lore.kernel.org/r/9b2054b2-0548-4f48-bf91-b646572093b4@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-28 19:47:24 -07:00
Jakub Kicinski	4b3529edbb	Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-05-28 We've added 23 non-merge commits during the last 11 day(s) which contain a total of 45 files changed, 696 insertions(+), 277 deletions(-). The main changes are: 1) Rename skb's mono_delivery_time to tstamp_type for extensibility and add SKB_CLOCK_TAI type support to bpf_skb_set_tstamp(), from Abhishek Chauhan. 2) Add netfilter CT zone ID and direction to bpf_ct_opts so that arbitrary CT zones can be used from XDP/tc BPF netfilter CT helper functions, from Brad Cowie. 3) Several tweaks to the instruction-set.rst IETF doc to address the Last Call review comments, from Dave Thaler. 4) Small batch of riscv64 BPF JIT optimizations in order to emit more compressed instructions to the JITed image for better icache efficiency, from Xiao Wang. 5) Sort bpftool C dump output from BTF, aiming to simplify vmlinux.h diffing and forcing more natural type definitions ordering, from Mykyta Yatsenko. 6) Use DEV_STATS_INC() macro in BPF redirect helpers to silence a syzbot/KCSAN race report for the tx_errors counter, from Jiang Yunshui. 7) Un-constify bpf_func_info in bpftool to fix compilation with LLVM 17+ which started treating const structs as constants and thus breaking full BTF program name resolution, from Ivan Babrou. 8) Fix up BPF program numbers in test_sockmap selftest in order to reduce some of the test-internal array sizes, from Geliang Tang. 9) Small cleanup in Makefile.btf script to use test-ge check for v1.25-only pahole, from Alan Maguire. 10) Fix bpftool's make dependencies for vmlinux.h in order to avoid needless rebuilds in some corner cases, from Artem Savkov. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits) bpf, net: Use DEV_STAT_INC() bpf, docs: Fix instruction.rst indentation bpf, docs: Clarify call local offset bpf, docs: Add table captions bpf, docs: clarify sign extension of 64-bit use of 32-bit imm bpf, docs: Use RFC 2119 language for ISA requirements bpf, docs: Move sentence about returning R0 to abi.rst bpf: constify member bpf_sysctl_kern:: Table riscv, bpf: Try RVC for reg move within BPF_CMPXCHG JIT riscv, bpf: Use STACK_ALIGN macro for size rounding up riscv, bpf: Optimize zextw insn with Zba extension selftests/bpf: Handle forwarding of UDP CLOCK_TAI packets net: Add additional bit to support clockid_t timestamp type net: Rename mono_delivery_time to tstamp_type for scalabilty selftests/bpf: Update tests for new ct zone opts for nf_conntrack kfuncs net: netfilter: Make ct zone opts configurable for bpf ct helpers selftests/bpf: Fix prog numbers in test_sockmap bpf: Remove unused variable "prev_state" bpftool: Un-const bpf_func_info to fix it for llvm 17 and newer bpf: Fix order of args in call to bpf_map_kvcalloc ... ==================== Link: https://lore.kernel.org/r/20240528105924.30905-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-28 07:27:29 -07:00
Dr. David Alan Gilbert	c30ff5f3ae	net: usb: remove unused structs 'usb_context' Both lan78xx and smsc75xx have a 'usb_context' struct which is unused, since their original commits. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20240526205922.176578-1-linux@treblig.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-28 15:24:34 +02:00
Dr. David Alan Gilbert	18ae4c093c	net: ethernet: 8390: ne2k-pci: remove unused struct 'ne2k_pci_card' 'ne2k_pci_card' is unused since 2.3.99-pre3 in March 2000. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-28 15:21:12 +02:00
Dr. David Alan Gilbert	ef7f9febb3	net: ethernet: mlx4: remove unused struct 'mlx4_port_config' 'mlx4_port_config was added by commit `ab9c17a009` ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet") but remained unused. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-28 15:21:12 +02:00
Dr. David Alan Gilbert	a09892f6e2	net: ethernet: liquidio: remove unused structs 'niclist' and 'oct_link_status_resp' are unused since the original commit `f21fb3ed36` ("Add support of Cavium Liquidio ethernet adapters"). Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-28 15:21:11 +02:00
Dr. David Alan Gilbert	b2ff269850	net: ethernet: starfire: remove unused structs 'short_rx_done_desc' and 'basic_rx_done_desc' are unused since commit `fdecea6668` (" [netdrvr starfire] Add GPL'd firmware, remove compat code"). Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-28 15:21:04 +02:00
Gou Hao	de31e96cf4	net/core: move the lockdep-init of sk_callback_lock to sk_init_common() In commit `cdfbabfb2f` ("net: Work around lockdep limitation in sockets that use sockets"), it introduces 'af_kern_callback_keys' to lockdep-init of sk_callback_lock according to 'sk_kern_sock', it modifies sock_init_data() only, and sk_clone_lock() calls sk_init_common() to initialize sk_callback_lock too, so the lockdep-init of sk_callback_lock should be moved to sk_init_common(). Signed-off-by: Gou Hao <gouhao@uniontech.com> Link: https://lore.kernel.org/r/20240526145718.9542-2-gouhao@uniontech.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-28 13:29:36 +02:00
Gou Hao	c65b652111	net/core: remove redundant sk_callback_lock initialization sk_callback_lock has already been initialized in sk_init_common(). Signed-off-by: Gou Hao <gouhao@uniontech.com> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240526145718.9542-1-gouhao@uniontech.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-28 13:29:36 +02:00
yunshui	d9cbd8343b	bpf, net: Use DEV_STAT_INC() syzbot/KCSAN reported that races happen when multiple CPUs updating dev->stats.tx_error concurrently. Adopt SMP safe DEV_STATS_INC() to update the dev->stats fields. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: yunshui <jiangyunshui@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240523033520.4029314-1-jiangyunshui@kylinos.cn	2024-05-28 12:04:11 +02:00
Dr. David Alan Gilbert	5233a55a52	mISDN: remove unused struct 'bf_ctx' 'bf_ctx' appears unused since the original commit `960366cf8d` ("Add mISDN DSP"). Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20240523155922.67329-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 16:48:00 -07:00
Dave Thaler	e245ef8a0b	bpf, docs: Fix instruction.rst indentation The table captions patch corrected indented most tables to work with the table directive for adding a caption but missed two of them. Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240526061815.22497-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-26 09:58:42 -07:00
Dave Thaler	f980f13e4e	bpf, docs: Clarify call local offset In the Jump instructions section it explains that the offset is "relative to the instruction following the jump instruction". But the program-local section confusingly said "referenced by offset from the call instruction, similar to JA". This patch updates that sentence with consistent wording, saying it's relative to the instruction following the call instruction. Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240525153332.21355-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:41:57 -07:00
Dave Thaler	6a6d8b6f00	bpf, docs: Add table captions As suggested by Ines Robles in his IETF GENART review at https://datatracker.ietf.org/doc/review-ietf-bpf-isa-02-genart-lc-robles-2024-05-16/ Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Link: https://lore.kernel.org/r/20240524164618.18894-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:39:31 -07:00
Dave Thaler	4e1215d9a1	bpf, docs: clarify sign extension of 64-bit use of 32-bit imm imm is defined as a 32-bit signed integer. {MOV, K, ALU64} says it does "dst = src" (where src is 'imm') and it does do dst = (s64)imm, which in that sense does sign extend imm. The MOVSX instruction is explained as sign extending, so added the example of {MOV, K, ALU64} to make this more clear. {JLE, K, JMP} says it does "PC += offset if dst <= src" (where src is 'imm', and the comparison is unsigned). This was apparently ambiguous to some readers as to whether the comparison was "dst <= (u64)(u32)imm" or "dst <= (u64)(s64)imm" so added an example to make this more clear. v1 -> v2: Address comments from Yonghong Signed-off-by: Dave Thaler <dthaler1968@googlemail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240520215255.10595-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:39:03 -07:00
Dave Thaler	a985fdca5e	bpf, docs: Use RFC 2119 language for ISA requirements Per IETF convention and discussion at LSF/MM/BPF, use MUST etc. keywords as requested by IETF Area Director review. Also as requested, indicate that documenting BTF is out of scope of this document and will be covered by a separate IETF specification. Added paragraph about the terminology that is required IETF boilerplate and must be worded exactly as such. Signed-off-by: Dave Thaler <dthaler1968@googlemail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240517165855.4688-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:38:35 -07:00
Dave Thaler	4652072e7b	bpf, docs: Move sentence about returning R0 to abi.rst As discussed at LSF/MM/BPF, the sentence about using R0 for returning values from calls is part of the calling convention and belongs in abi.rst. Any further additions or clarifications to this text are left for future patches on abi.rst. The current patch is simply to unblock progression of instruction-set.rst to a standard. In contrast, the restriction of register numbers to the range 0-10 is untouched, left in the instruction-set.rst definition of the src_reg and dst_reg fields. Signed-off-by: Dave Thaler <dthaler1968@googlemail.com> Link: https://lore.kernel.org/r/20240517153445.3914-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:37:49 -07:00
Thomas Weißschuh	2c1713a8f1	bpf: constify member bpf_sysctl_kern:: Table The sysctl core is preparing to only expose instances of struct ctl_table as "const". This will also affect the ctl_table argument of sysctl handlers, for which bpf_sysctl_kern::table is also used. As the function prototype of all sysctl handlers throughout the tree needs to stay consistent that change will be done in one commit. To reduce the size of that final commit, switch this utility type which is not bound by "typedef proc_handler" to "const struct ctl_table". No functional change. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Joel Granados <j.granados@samsung.com> Link: https://lore.kernel.org/bpf/20240518-sysctl-const-handler-bpf-v1-1-f0d7186743c1@weissschuh.net	2024-05-24 17:44:32 +02:00
Xiao Wang	99fa63d9ca	riscv, bpf: Try RVC for reg move within BPF_CMPXCHG JIT We could try to emit compressed insn for reg move operation during CMPXCHG JIT, the instruction compression has no impact on the jump offsets of following forward and backward jump instructions. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/bpf/20240519050507.2217791-1-xiao.w.wang@intel.com	2024-05-24 17:40:33 +02:00
Xiao Wang	e944fc8152	riscv, bpf: Use STACK_ALIGN macro for size rounding up Use the macro STACK_ALIGN that is defined in asm/processor.h for stack size rounding up, just like bpf_jit_comp32.c does. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/bpf/20240523031835.3977713-1-xiao.w.wang@intel.com	2024-05-24 17:16:56 +02:00
Xiao Wang	c12603e76e	riscv, bpf: Optimize zextw insn with Zba extension The Zba extension provides add.uw insn which can be used to implement zext.w with rs2 set as ZERO. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Pu Lehui <pulehui@huawei.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/bpf/20240516090430.493122-1-xiao.w.wang@intel.com	2024-05-24 16:53:12 +02:00
Martin KaFai Lau	ecec1887e2	Merge branch 'Replace mono_delivery_time with tstamp_type' Abhishek Chauhan says: ==================== Patch 1 :- This patch takes care of only renaming the mono delivery timestamp to tstamp_type with no change in functionality of existing available code in kernel also Starts assigning tstamp_type with either mono or real and introduces a new enum in the skbuff.h, again no change in functionality of the existing available code in kernel , just making the code scalable. Patch 2 :- Additional bit was added to support tai timestamp type to avoid tstamp drops in the forwarding path when testing TC-ETF. Patch is also updating bpf filter.c Some updates to bpf header files with introduction to BPF_SKB_CLOCK_TAI and documentation updates stating deprecation of BPF_SKB_TSTAMP_UNSPEC and BPF_SKB_TSTAMP_DELIVERY_MONO Patch 3:- Handles forwarding of UDP packets with TAI clock id tstamp_type type with supported changes for tc_redirect/tc_redirect_dtime to handle forwarding of UDP packets with TAI tstamp_type ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-23 14:14:50 -07:00
Abhishek Chauhan	c34e3ab2a7	selftests/bpf: Handle forwarding of UDP CLOCK_TAI packets With changes in the design to forward CLOCK_TAI in the skbuff framework, existing selftest framework needs modification to handle forwarding of UDP packets with CLOCK_TAI as clockid. Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240509211834.3235191-4-quic_abchauha@quicinc.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-23 14:14:43 -07:00
Abhishek Chauhan	1693c5db6a	net: Add additional bit to support clockid_t timestamp type tstamp_type is now set based on actual clockid_t compressed into 2 bits. To make the design scalable for future needs this commit bring in the change to extend the tstamp_type:1 to tstamp_type:2 to support other clockid_t timestamp. We now support CLOCK_TAI as part of tstamp_type as part of this commit with existing support CLOCK_MONOTONIC and CLOCK_REALTIME. Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240509211834.3235191-3-quic_abchauha@quicinc.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-23 14:14:36 -07:00
Abhishek Chauhan	4d25ca2d68	net: Rename mono_delivery_time to tstamp_type for scalabilty mono_delivery_time was added to check if skb->tstamp has delivery time in mono clock base (i.e. EDT) otherwise skb->tstamp has timestamp in ingress and delivery_time at egress. Renaming the bitfield from mono_delivery_time to tstamp_type is for extensibilty for other timestamps such as userspace timestamp (i.e. SO_TXTIME) set via sock opts. As we are renaming the mono_delivery_time to tstamp_type, it makes sense to start assigning tstamp_type based on enum defined in this commit. Earlier we used bool arg flag to check if the tstamp is mono in function skb_set_delivery_time, Now the signature of the functions accepts tstamp_type to distinguish between mono and real time. Also skb_set_delivery_type_by_clockid is a new function which accepts clockid to determine the tstamp_type. In future tstamp_type:1 can be extended to support userspace timestamp by increasing the bitfield. Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240509211834.3235191-2-quic_abchauha@quicinc.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-23 14:14:23 -07:00
Linus Torvalds	66ad4829dd	Merge tag 'net-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Quite smaller than usual. Notably it includes the fix for the unix regression from the past weeks. The TCP window fix will require some follow-up, already queued. Current release - regressions: - af_unix: fix garbage collection of embryos Previous releases - regressions: - af_unix: fix race between GC and receive path - ipv6: sr: fix missing sk_buff release in seg6_input_core - tcp: remove 64 KByte limit for initial tp->rcv_wnd value - eth: r8169: fix rx hangup - eth: lan966x: remove ptp traps in case the ptp is not enabled - eth: ixgbe: fix link breakage vs cisco switches - eth: ice: prevent ethtool from corrupting the channels Previous releases - always broken: - openvswitch: set the skbuff pkt_type for proper pmtud support - tcp: Fix shift-out-of-bounds in dctcp_update_alpha() Misc: - a bunch of selftests stabilization patches" * tag 'net-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (25 commits) r8169: Fix possible ring buffer corruption on fragmented Tx packets. idpf: Interpret .set_channels() input differently ice: Interpret .set_channels() input differently nfc: nci: Fix handling of zero-length payload packets in nci_rx_work() net: relax socket state check at accept time. tcp: remove 64 KByte limit for initial tp->rcv_wnd value net: ti: icssg_prueth: Fix NULL pointer dereference in prueth_probe() tls: fix missing memory barrier in tls_init net: fec: avoid lock evasion when reading pps_enable Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI" testing: net-drv: use stats64 for testing net: mana: Fix the extra HZ in mana_hwc_send_request net: lan966x: Remove ptp traps in case the ptp is not enabled. openvswitch: Set the skbuff pkt_type for proper pmtud support. selftest: af_unix: Make SCM_RIGHTS into OOB data. af_unix: Fix garbage collection of embryos carrying OOB with SCM_RIGHTS tcp: Fix shift-out-of-bounds in dctcp_update_alpha(). selftests/net: use tc rule to filter the na packet ipv6: sr: fix memleak in seg6_hmac_init_algo af_unix: Update unix_sk(sk)->oob_skb under sk_receive_queue lock. ...	2024-05-23 12:49:37 -07:00
Linus Torvalds	404001ddf3	Merge tag 'trace-fixes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: "Minor last minute fixes: - Fix a very tight race between the ring buffer readers and resizing the ring buffer - Correct some stale comments in the ring buffer code - Fix kernel-doc in the rv code - Add a MODULE_DESCRIPTION to preemptirq_delay_test" * tag 'trace-fixes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rv: Update rv_en(dis)able_monitor doc to match kernel-doc tracing: Add MODULE_DESCRIPTION() to preemptirq_delay_test ring-buffer: Fix a race between readers and resize checks ring-buffer: Correct stale comments related to non-consuming readers	2024-05-23 12:36:38 -07:00
Linus Torvalds	e82d2af501	Merge tag 'trace-tools-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing tool fix from Steven Rostedt: "Fix printf format warnings in latency-collector. Use the printf format string with %s to take a string instead of taking in a string directly" * tag 'trace-tools-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tools/latency-collector: Fix -Wformat-security compile warns	2024-05-23 12:32:15 -07:00
Linus Torvalds	d6a326d694	Merge tag 'trace-assign-str-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing cleanup from Steven Rostedt: "Remove second argument of __assign_str() The __assign_str() macro logic of the TRACE_EVENT() macro was optimized so that it no longer needs the second argument. The __assign_str() is always matched with __string() field that takes a field name and the source for that field: __string(field, source) The TRACE_EVENT() macro logic will save off the source value and then use that value to copy into the ring buffer via the __assign_str(). Before commit `c1fa617cae` ("tracing: Rework __assign_str() and __string() to not duplicate getting the string"), the __assign_str() needed the second argument which would perform the same logic as the __string() source parameter did. Not only would this add overhead, but it was error prone as if the __assign_str() source produced something different, it may not have allocated enough for the string in the ring buffer (as the __string() source was used to determine how much to allocate) Now that the __assign_str() just uses the same string that was used in __string() it no longer needs the source parameter. It can now be removed" * tag 'trace-assign-str-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/treewide: Remove second parameter of __assign_str()	2024-05-23 12:28:01 -07:00
Linus Torvalds	bca2a25d3b	Merge tag 'sparc-for-6.10-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc Pull sparc updates from Andreas Larsson: - Avoid on-stack cpumask variables in a number of places - Move struct termio to asm/termios.h, matching other architectures and allowing certain user space applications to build also for sparc - Fix missing prototype warnings for sparc64 - Fix version generation warnings for sparc32 - Fix bug where non-consecutive CPU IDs lead to some CPUs not starting - Simplification using swap and cleanup using NULL for pointer - Convert sparc parport and chmc drivers to use remove callbacks returning void * tag 'sparc-for-6.10-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc: sparc/leon: Remove on-stack cpumask var sparc/pci_msi: Remove on-stack cpumask var sparc/of: Remove on-stack cpumask var sparc/irq: Remove on-stack cpumask var sparc/srmmu: Remove on-stack cpumask var sparc: chmc: Convert to platform remove callback returning void sparc: parport: Convert to platform remove callback returning void sparc: Compare pointers to NULL instead of 0 sparc: Use swap() to fix Coccinelle warning sparc32: Fix version generation failed warnings sparc64: Fix number of online CPUs sparc64: Fix prototype warning for sched_clock sparc64: Fix prototype warnings in adi_64.c sparc64: Fix prototype warning for dma_4v_iotsb_bind sparc64: Fix prototype warning for uprobe_trap sparc64: Fix prototype warning for alloc_irqstack_bootmem sparc64: Fix prototype warning for vmemmap_free sparc64: Fix prototype warnings in traps_64.c sparc64: Fix prototype warning for init_vdso_image sparc: move struct termio to asm/termios.h	2024-05-23 12:22:20 -07:00
Linus Torvalds	2b7ced108e	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "The major fix here is for a filesystem corruption issue reported on Apple M1 as a result of buggy management of the floating point register state introduced in 6.8. I initially reverted one of the offending patches, but in the end Ard cooked a proper fix so there's a revert+reapply in the series. Aside from that, we've got some CPU errata workarounds and misc other fixes. - Fix broken FP register state tracking which resulted in filesystem corruption when dm-crypt is used - Workarounds for Arm CPU errata affecting the SSBS Spectre mitigation - Fix lockdep assertion in DMC620 memory controller PMU driver - Fix alignment of BUG table when CONFIG_DEBUG_BUGVERBOSE is disabled" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/fpsimd: Avoid erroneous elide of user state reload Reapply "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD" arm64: asm-bug: Add .align 2 to the end of __BUG_ENTRY perf/arm-dmc620: Fix lockdep assert in ->event_init() Revert "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD" arm64: errata: Add workaround for Arm errata 3194386 and 3312417 arm64: cputype: Add Neoverse-V3 definitions arm64: cputype: Add Cortex-X4 definitions arm64: barrier: Restore spec_bar() macro	2024-05-23 12:09:22 -07:00
Linus Torvalds	2ef32ad224	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: "Several new features here: - virtio-net is finally supported in vduse - virtio (balloon and mem) interaction with suspend is improved - vhost-scsi now handles signals better/faster And fixes, cleanups all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (48 commits) virtio-pci: Check if is_avq is NULL virtio: delete vq in vp_find_vqs_msix() when request_irq() fails MAINTAINERS: add Eugenio Pérez as reviewer vhost-vdpa: Remove usage of the deprecated ida_simple_xx() API vp_vdpa: don't allocate unused msix vectors sound: virtio: drop owner assignment fuse: virtio: drop owner assignment scsi: virtio: drop owner assignment rpmsg: virtio: drop owner assignment nvdimm: virtio_pmem: drop owner assignment wifi: mac80211_hwsim: drop owner assignment vsock/virtio: drop owner assignment net: 9p: virtio: drop owner assignment net: virtio: drop owner assignment net: caif: virtio: drop owner assignment misc: nsm: drop owner assignment iommu: virtio: drop owner assignment drm/virtio: drop owner assignment gpio: virtio: drop owner assignment firmware: arm_scmi: virtio: drop owner assignment ...	2024-05-23 12:04:36 -07:00
Shuah Khan	df73757cf8	tools/latency-collector: Fix -Wformat-security compile warns Fix the following -Wformat-security compile warnings adding missing format arguments: latency-collector.c: In function ‘show_available’: latency-collector.c:938:17: warning: format not a string literal and no format arguments [-Wformat-security] 938 \| warnx(no_tracer_msg); \| ^~~~~ latency-collector.c:943:17: warning: format not a string literal and no format arguments [-Wformat-security] 943 \| warnx(no_latency_tr_msg); \| ^~~~~ latency-collector.c: In function ‘find_default_tracer’: latency-collector.c:986:25: warning: format not a string literal and no format arguments [-Wformat-security] 986 \| errx(EXIT_FAILURE, no_tracer_msg); \| ^~~~ latency-collector.c: In function ‘scan_arguments’: latency-collector.c:1881:33: warning: format not a string literal and no format arguments [-Wformat-security] 1881 \| errx(EXIT_FAILURE, no_tracer_msg); \| ^~~~ Link: https://lore.kernel.org/linux-trace-kernel/20240404011009.32945-1-skhan@linuxfoundation.org Cc: stable@vger.kernel.org Fixes: `e23db805da` ("tracing/tools: Add the latency-collector to tools directory") Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-05-23 10:46:01 -04:00
Ken Milmore	c71e3a5cff	r8169: Fix possible ring buffer corruption on fragmented Tx packets. An issue was found on the RTL8125b when transmitting small fragmented packets, whereby invalid entries were inserted into the transmit ring buffer, subsequently leading to calls to dma_unmap_single() with a null address. This was caused by rtl8169_start_xmit() not noticing changes to nr_frags which may occur when small packets are padded (to work around hardware quirks) in rtl8169_tso_csum_v2(). To fix this, postpone inspecting nr_frags until after any padding has been applied. Fixes: `9020845fb5` ("r8169: improve rtl8169_start_xmit") Cc: stable@vger.kernel.org Signed-off-by: Ken Milmore <ken.milmore@gmail.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/27ead18b-c23d-4f49-a020-1fc482c5ac95@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 15:45:32 +02:00
Paolo Abeni	3d8597d8d7	Merge branch 'intel-interpret-set_channels-input-differently' Jacob Keller says: ==================== intel: Interpret .set_channels() input differently The ice and idpf drivers can trigger a crash with AF_XDP due to incorrect interpretation of the asymmetric Tx and Rx parameters in their .set_channels() implementations: 1. ethtool -l <IFNAME> -> combined: 40 2. Attach AF_XDP to queue 30 3. ethtool -L <IFNAME> rx 15 tx 15 combined number is not specified, so command becomes {rx_count = 15, tx_count = 15, combined_count = 40}. 4. ethnl_set_channels checks, if there are any AF_XDP of queues from the new (combined_count + rx_count) to the old one, so from 55 to 40, check does not trigger. 5. the driver interprets `rx 15 tx 15` as 15 combined channels and deletes the queue that AF_XDP is attached to. This is fundamentally a problem with interpreting a request for asymmetric queues as symmetric combined queues. Fix the ice and idpf drivers to stop interpreting such requests as a request for combined queues. Due to current driver design for both ice and idpf, it is not possible to support requests of the same count of Tx and Rx queues with independent interrupts, (i.e. ethtool -L <IFNAME> rx 15 tx 15) so such requests are now rejected. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> ==================== Link: https://lore.kernel.org/r/20240521-iwl-net-2024-05-14-set-channels-fixes-v2-0-7aa39e2e99f1@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 13:02:28 +02:00
Larysa Zaremba	5e7695e021	idpf: Interpret .set_channels() input differently Unlike ice, idpf does not check, if user has requested at least 1 combined channel. Instead, it relies on a check in the core code. Unfortunately, the check does not trigger for us because of the hacky .set_channels() interpretation logic that is not consistent with the core code. This naturally leads to user being able to trigger a crash with an invalid input. This is how: 1. ethtool -l <IFNAME> -> combined: 40 2. ethtool -L <IFNAME> rx 0 tx 0 combined number is not specified, so command becomes {rx_count = 0, tx_count = 0, combined_count = 40}. 3. ethnl_set_channels checks, if there is at least 1 RX and 1 TX channel, comparing (combined_count + rx_count) and (combined_count + tx_count) to zero. Obviously, (40 + 0) is greater than zero, so the core code deems the input OK. 4. idpf interprets `rx 0 tx 0` as 0 channels and tries to proceed with such configuration. The issue has to be solved fundamentally, as current logic is also known to cause AF_XDP problems in ice [0]. Interpret the command in a way that is more consistent with ethtool manual [1] (--show-channels and --set-channels) and new ice logic. Considering that in the idpf driver only the difference between RX and TX queues forms dedicated channels, change the correct way to set number of channels to: ethtool -L <IFNAME> combined 10 /* For symmetric queues / ethtool -L <IFNAME> combined 8 tx 2 rx 0 / For asymmetric queues */ [0] https://lore.kernel.org/netdev/20240418095857.2827-1-larysa.zaremba@intel.com/ [1] https://man7.org/linux/man-pages/man8/ethtool.8.html Fixes: `02cbfba1ad` ("idpf: add ethtool callbacks") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 13:02:26 +02:00
Larysa Zaremba	05d6f442f3	ice: Interpret .set_channels() input differently A bug occurs because a safety check guarding AF_XDP-related queues in ethnl_set_channels(), does not trigger. This happens, because kernel and ice driver interpret the ethtool command differently. How the bug occurs: 1. ethtool -l <IFNAME> -> combined: 40 2. Attach AF_XDP to queue 30 3. ethtool -L <IFNAME> rx 15 tx 15 combined number is not specified, so command becomes {rx_count = 15, tx_count = 15, combined_count = 40}. 4. ethnl_set_channels checks, if there are any AF_XDP of queues from the new (combined_count + rx_count) to the old one, so from 55 to 40, check does not trigger. 5. ice interprets `rx 15 tx 15` as 15 combined channels and deletes the queue that AF_XDP is attached to. Interpret the command in a way that is more consistent with ethtool manual [0] (--show-channels and --set-channels). Considering that in the ice driver only the difference between RX and TX queues forms dedicated channels, change the correct way to set number of channels to: ethtool -L <IFNAME> combined 10 /* For symmetric queues / ethtool -L <IFNAME> combined 8 tx 2 rx 0 / For asymmetric queues */ [0] https://man7.org/linux/man-pages/man8/ethtool.8.html Fixes: `87324e747f` ("ice: Implement ethtool ops for channels") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 13:02:26 +02:00
Ryosuke Yasuoka	6671e35249	nfc: nci: Fix handling of zero-length payload packets in nci_rx_work() When nci_rx_work() receives a zero-length payload packet, it should not discard the packet and exit the loop. Instead, it should continue processing subsequent packets. Fixes: `d24b03535e` ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet") Signed-off-by: Ryosuke Yasuoka <ryasuoka@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240521153444.535399-1-ryasuoka@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 12:39:44 +02:00
Paolo Abeni	26afda78cd	net: relax socket state check at accept time. Christoph reported the following splat: WARNING: CPU: 1 PID: 772 at net/ipv4/af_inet.c:761 __inet_accept+0x1f4/0x4a0 Modules linked in: CPU: 1 PID: 772 Comm: syz-executor510 Not tainted 6.9.0-rc7-g7da7119fe22b #56 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:__inet_accept+0x1f4/0x4a0 net/ipv4/af_inet.c:759 Code: 04 38 84 c0 0f 85 87 00 00 00 41 c7 04 24 03 00 00 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ec b7 da fd <0f> 0b e9 7f fe ff ff e8 e0 b7 da fd 0f 0b e9 fe fe ff ff 89 d9 80 RSP: 0018:ffffc90000c2fc58 EFLAGS: 00010293 RAX: ffffffff836bdd14 RBX: 0000000000000000 RCX: ffff888104668000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: dffffc0000000000 R08: ffffffff836bdb89 R09: fffff52000185f64 R10: dffffc0000000000 R11: fffff52000185f64 R12: dffffc0000000000 R13: 1ffff92000185f98 R14: ffff88810754d880 R15: ffff8881007b7800 FS: 000000001c772880(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb9fcf2e178 CR3: 00000001045d2002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> inet_accept+0x138/0x1d0 net/ipv4/af_inet.c:786 do_accept+0x435/0x620 net/socket.c:1929 __sys_accept4_file net/socket.c:1969 [inline] __sys_accept4+0x9b/0x110 net/socket.c:1999 __do_sys_accept net/socket.c:2016 [inline] __se_sys_accept net/socket.c:2013 [inline] __x64_sys_accept+0x7d/0x90 net/socket.c:2013 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x58/0x100 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x4315f9 Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 ab b4 fd ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffdb26d9c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002b RAX: ffffffffffffffda RBX: 0000000000400300 RCX: 00000000004315f9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004 RBP: 00000000006e1018 R08: 0000000000400300 R09: 0000000000400300 R10: 0000000000400300 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000040cdf0 R14: 000000000040ce80 R15: 0000000000000055 </TASK> The reproducer invokes shutdown() before entering the listener status. After commit `94062790ae` ("tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets"), the above causes the child to reach the accept syscall in FIN_WAIT1 status. Eric noted we can relax the existing assertion in __inet_accept() Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/490 Suggested-by: Eric Dumazet <edumazet@google.com> Fixes: `94062790ae` ("tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets") Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/23ab880a44d8cfd967e84de8b93dbf48848e3d8c.1716299669.git.pabeni@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 12:33:35 +02:00
Jason Xing	378979e94e	tcp: remove 64 KByte limit for initial tp->rcv_wnd value Recently, we had some servers upgraded to the latest kernel and noticed the indicator from the user side showed worse results than before. It is caused by the limitation of tp->rcv_wnd. In 2018 commit `a337531b94` ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB") limited the initial value of tp->rcv_wnd to 65535, most CDN teams would not benefit from this change because they cannot have a large window to receive a big packet, which will be slowed down especially in long RTT. Small rcv_wnd means slow transfer speed, to some extent. It's the side effect for the latency/time-sensitive users. To avoid future confusion, current change doesn't affect the initial receive window on the wire in a SYN or SYN+ACK packet which are set within 65535 bytes according to RFC 7323 also due to the limit in __tcp_transmit_skb(): th->window = htons(min(tp->rcv_wnd, 65535U)); In one word, __tcp_transmit_skb() already ensures that constraint is respected, no matter how large tp->rcv_wnd is. The change doesn't violate RFC. Let me provide one example if with or without the patch: Before: client --- SYN: rwindow=65535 ---> server client <--- SYN+ACK: rwindow=65535 ---- server client --- ACK: rwindow=65536 ---> server Note: for the last ACK, the calculation is 512 << 7. After: client --- SYN: rwindow=65535 ---> server client <--- SYN+ACK: rwindow=65535 ---- server client --- ACK: rwindow=175232 ---> server Note: I use the following command to make it work: ip route change default via [ip] dev eth0 metric 100 initrwnd 120 For the last ACK, the calculation is 1369 << 7. When we apply such a patch, having a large rcv_wnd if the user tweak this knob can help transfer data more rapidly and save some rtts. Fixes: `a337531b94` ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Link: https://lore.kernel.org/r/20240521134220.12510-1-kerneljasonxing@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 12:21:17 +02:00
Romain Gantois	b31c7e7808	net: ti: icssg_prueth: Fix NULL pointer dereference in prueth_probe() In the prueth_probe() function, if one of the calls to emac_phy_connect() fails due to of_phy_connect() returning NULL, then the subsequent call to phy_attached_info() will dereference a NULL pointer. Check the return code of emac_phy_connect and fail cleanly if there is an error. Fixes: `128d5874c0` ("net: ti: icssg-prueth: Add ICSSG ethernet driver") Cc: stable@vger.kernel.org Signed-off-by: Romain Gantois <romain.gantois@bootlin.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: MD Danish Anwar <danishanwar@ti.com> Link: https://lore.kernel.org/r/20240521-icssg-prueth-fix-v1-1-b4b17b1433e9@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 12:11:09 +02:00
Dae R. Jeong	91e61dd7a0	tls: fix missing memory barrier in tls_init In tls_init(), a write memory barrier is missing, and store-store reordering may cause NULL dereference in tls_{setsockopt,getsockopt}. CPU0 CPU1 ----- ----- // In tls_init() // In tls_ctx_create() ctx = kzalloc() ctx->sk_proto = READ_ONCE(sk->sk_prot) -(1) // In update_sk_prot() WRITE_ONCE(sk->sk_prot, tls_prots) -(2) // In sock_common_setsockopt() READ_ONCE(sk->sk_prot)->setsockopt() // In tls_{setsockopt,getsockopt}() ctx->sk_proto->setsockopt() -(3) In the above scenario, when (1) and (2) are reordered, (3) can observe the NULL value of ctx->sk_proto, causing NULL dereference. To fix it, we rely on rcu_assign_pointer() which implies the release barrier semantic. By moving rcu_assign_pointer() after ctx->sk_proto is initialized, we can ensure that ctx->sk_proto are visible when changing sk->sk_prot. Fixes: `d5bee7374b` ("net/tls: Annotate access to sk_prot with READ_ONCE/WRITE_ONCE") Signed-off-by: Yewon Choi <woni9911@gmail.com> Signed-off-by: Dae R. Jeong <threeearcat@gmail.com> Link: https://lore.kernel.org/netdev/ZU4OJG56g2V9z_H7@dragonet/T/ Link: https://lore.kernel.org/r/Zkx4vjSFp0mfpjQ2@libra05 Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 12:03:26 +02:00
Wei Fang	3b1c92f8e5	net: fec: avoid lock evasion when reading pps_enable The assignment of pps_enable is protected by tmreg_lock, but the read operation of pps_enable is not. So the Coverity tool reports a lock evasion warning which may cause data race to occur when running in a multithread environment. Although this issue is almost impossible to occur, we'd better fix it, at least it seems more logically reasonable, and it also prevents Coverity from continuing to issue warnings. Fixes: `278d240478` ("net: fec: ptp: Enable PPS output based on ptp clock") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://lore.kernel.org/r/20240521023800.17102-1-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 11:15:54 +02:00
Jacob Keller	b35b1c0b4e	Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI" This reverts commit `565736048b`. According to the commit, it implements a manual AN-37 for some "troublesome" Juniper MX5 switches. This appears to be a workaround for a particular switch. It has been reported that this causes a severe breakage for other switches, including a Cisco 3560CX-12PD-S. The code appears to be a workaround for a specific switch which fails to link in SFI mode. It expects to see AN-37 auto negotiation in order to link. The Cisco switch is not expecting AN-37 auto negotiation. When the device starts the manual AN-37, the Cisco switch decides that the port is confused and stops attempting to link with it. This persists until a power cycle. A simple driver unload and reload does not resolve the issue, even if loading with a version of the driver which lacks this workaround. The authors of the workaround commit have not responded with clarifications, and the result of the workaround is complete failure to connect with other switches. This appears to be a case where the driver can either "correctly" link with the Juniper MX5 switch, at the cost of bricking the link with the Cisco switch, or it can behave properly for the Cisco switch, but fail to link with the Junipir MX5 switch. I do not know enough about the standards involved to clearly determine whether either switch is at fault or behaving incorrectly. Nor do I know whether there exists some alternative fix which corrects behavior with both switches. Revert the workaround for the Juniper switch. Fixes: `565736048b` ("ixgbe: Manual AN-37 for troublesome link partners for X550 SFI") Link: https://lore.kernel.org/netdev/cbe874db-9ac9-42b8-afa0-88ea910e1e99@intel.com/T/ Link: https://forum.proxmox.com/threads/intel-x553-sfp-ixgbe-no-go-on-pve8.135129/#post-612291 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Cc: Jeff Daly <jeffd@silicom-usa.com> Cc: kernel.org-fo5k2w@ycharbi.fr Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240520-net-2024-05-20-revert-silicom-switch-workaround-v1-1-50f80f261c94@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 10:48:35 +02:00
Joe Damato	a61a459f58	testing: net-drv: use stats64 for testing Testing a network device that has large numbers of bytes/packets may overflow. Using stats64 when comparing fixes this problem. I tripped on this while iterating on a qstats patch for mlx5. See below for confirmation without my added code that this is a bug. Before this patch (with added debugging output): $ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py KTAP version 1 1..4 ok 1 stats.check_pause ok 2 stats.check_fec rstat: 481708634 qstat: 666201639514 key: tx-bytes not ok 3 stats.pkt_byte_sum ok 4 stats.qstat_by_ifindex Note the huge delta above ^^^ in the rtnl vs qstats. After this patch: $ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py KTAP version 1 1..4 ok 1 stats.check_pause ok 2 stats.check_fec ok 3 stats.pkt_byte_sum ok 4 stats.qstat_by_ifindex It looks like rtnl_fill_stats in net/core/rtnetlink.c will attempt to copy the 64bit stats into a 32bit structure which is probably why this behavior is occurring. To show this is happening, you can get the underlying stats that the stats.py test uses like this: $ ./cli.py --spec ../../../Documentation/netlink/specs/rt_link.yaml \ --do getlink --json '{"ifi-index": 7}' And examine the output (heavily snipped to show relevant fields): 'stats': { 'multicast': 3739197, 'rx-bytes': 1201525399, 'rx-packets': 56807158, 'tx-bytes': 492404458, 'tx-packets': 1200285371, 'stats64': { 'multicast': 3739197, 'rx-bytes': 35561263767, 'rx-packets': 56807158, 'tx-bytes': 666212335338, 'tx-packets': 1200285371, The stats.py test prior to this patch was using the 'stats' structure above, which matches the failure output on my system. Comparing side by side, rx-bytes and tx-bytes, and getting ethtool -S output: rx-bytes stats: 1201525399 rx-bytes stats64: 35561263767 rx-bytes ethtool: 36203402638 tx-bytes stats: 492404458 tx-bytes stats64: 666212335338 tx-bytes ethtool: 666215360113 Note that the above was taken from a system with an mlx5 NIC, which only exposes ndo_get_stats64. Based on the ethtool output and qstat output, it appears that stats.py should be updated to use the 'stats64' structure for accurate comparisons when packet/byte counters get very large. To confirm that this was not related to the qstats code I was iterating on, I booted a kernel without my driver changes and re-ran the test which shows the qstats are skipped (as they don't exist for mlx5): NETIF=eth0 tools/testing/selftests/drivers/net/stats.py KTAP version 1 1..4 ok 1 stats.check_pause ok 2 stats.check_fec ok 3 stats.pkt_byte_sum # SKIP qstats not supported by the device ok 4 stats.qstat_by_ifindex # SKIP No ifindex supports qstats But, fetching the stats using the CLI $ ./cli.py --spec ../../../Documentation/netlink/specs/rt_link.yaml \ --do getlink --json '{"ifi-index": 7}' Shows the same issue (heavily snipped for relevant fields only): 'stats': { 'multicast': 105489, 'rx-bytes': 530879526, 'rx-packets': 751415, 'tx-bytes': 2510191396, 'tx-packets': 27700323, 'stats64': { 'multicast': 105489, 'rx-bytes': 530879526, 'rx-packets': 751415, 'tx-bytes': 15395093284, 'tx-packets': 27700323, Comparing side by side with ethtool -S on the unmodified mlx5 driver: tx-bytes stats: 2510191396 tx-bytes stats64: 15395093284 tx-bytes ethtool: 17718435810 Fixes: `f0e6c86e4b` ("testing: net-drv: add a driver test for stats reporting") Signed-off-by: Joe Damato <jdamato@fastly.com> Link: https://lore.kernel.org/r/20240520235850.190041-1-jdamato@fastly.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-23 10:18:29 +02:00
Linus Torvalds	c760b3725e	Merge tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more non-mm updates from Andrew Morton: - A series ("kbuild: enable more warnings by default") from Arnd Bergmann which enables a number of additional build-time warnings. We fixed all the fallout which we could find, there may still be a few stragglers. - Samuel Holland has developed the series "Unified cross-architecture kernel-mode FPU API". This does a lot of consolidation of per-architecture kernel-mode FPU usage and enables the use of newer AMD GPUs on RISC-V. - Tao Su has fixed some selftests build warnings in the series "Selftests: Fix compilation warnings due to missing _GNU_SOURCE definition". - This pull also includes a nilfs2 fixup from Ryusuke Konishi. * tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits) nilfs2: make block erasure safe in nilfs_finish_roll_forward() selftests/harness: use 1024 in place of LINE_MAX Revert "selftests/harness: remove use of LINE_MAX" selftests/fpu: allow building on other architectures selftests/fpu: move FP code to a separate translation unit drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT drm/amd/display: only use hard-float, not altivec on powerpc riscv: add support for kernel-mode FPU x86: implement ARCH_HAS_KERNEL_FPU_SUPPORT powerpc: implement ARCH_HAS_KERNEL_FPU_SUPPORT LoongArch: implement ARCH_HAS_KERNEL_FPU_SUPPORT lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS arm64: crypto: use CC_FLAGS_FPU for NEON CFLAGS arm64: implement ARCH_HAS_KERNEL_FPU_SUPPORT ARM: crypto: use CC_FLAGS_FPU for NEON CFLAGS ARM: implement ARCH_HAS_KERNEL_FPU_SUPPORT arch: add ARCH_HAS_KERNEL_FPU_SUPPORT x86/fpu: fix asm/fpu/types.h include guard kbuild: enable -Wcast-function-type-strict unconditionally kbuild: enable -Wformat-truncation on clang ...	2024-05-22 18:59:29 -07:00

1 2 3 4 5 ...

1279075 Commits