linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-06 07:34:03 -04:00

Author	SHA1	Message	Date
Suman Ghosh	99c969a83d	octeontx2-pf: Add egress PFC support As of now all transmit queues transmit packets out of same scheduler queue hierarchy. Due to this PFC frames sent by peer are not handled properly, either all transmit queues are backpressured or none. To fix this when user enables PFC for a given priority map relavant transmit queue to a different scheduler queue hierarcy, so that backpressure is applied only to the traffic egressing out of that TXQ. Signed-off-by: Suman Ghosh <sumang@marvell.com> Link: https://lore.kernel.org/r/20220830120304.158060-1-sumang@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-01 09:37:47 +02:00
Zhengchao Shao	a102c8973d	net: sched: remove redundant NULL check in change hook function Currently, the change function can be called by two ways. The one way is that qdisc_change() will call it. Before calling change function, qdisc_change() ensures tca[TCA_OPTIONS] is not empty. The other way is that .init() will call it. The opt parameter is also checked before calling change function in .init(). Therefore, it's no need to check the input parameter opt in change function. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Link: https://lore.kernel.org/r/20220829071219.208646-1-shaozhengchao@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-01 08:06:45 +02:00
Jakub Kicinski	0b4f688d53	Revert "sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb" This reverts commit `90fabae8a2`. Patch was applied hastily, revert and let the v2 be reviewed. Fixes: `90fabae8a2` ("sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb") Link: https://lore.kernel.org/all/87wnao2ha3.fsf@toke.dk/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 20:02:28 -07:00
Jakub Kicinski	a3daac631e	Merge branch 'tcp-tcp-challenge-ack-fixes' Eric Dumazet says: ==================== tcp: tcp challenge ack fixes syzbot found a typical data-race addressed in the first patch. While we are at it, second patch makes the global rate limit per net-ns and disabled by default. ==================== Link: https://lore.kernel.org/r/20220830185656.268523-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 19:56:54 -07:00
Eric Dumazet	79e3602caa	tcp: make global challenge ack rate limitation per net-ns and default disabled Because per host rate limiting has been proven problematic (side channel attacks can be based on it), per host rate limiting of challenge acks ideally should be per netns and turned off by default. This is a long due followup of following commits: `083ae30828` ("tcp: enable per-socket rate limiting of all 'challenge acks'") `f2b2c582e8` ("tcp: mitigate ACK loops for connections as tcp_sock") `75ff39ccc1` ("tcp: make challenge acks less predictable") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jason Baron <jbaron@akamai.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 19:56:48 -07:00
Eric Dumazet	8c70521238	tcp: annotate data-race around challenge_timestamp challenge_timestamp can be read an written by concurrent threads. This was expected, but we need to annotate the race to avoid potential issues. Following patch moves challenge_timestamp and challenge_count to per-netns storage to provide better isolation. Fixes: `354e4aa391` ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 19:56:48 -07:00
Kurt Kanzenbach	52267ce25f	net: dsa: hellcreek: Print warning only once In case the source port cannot be decoded, print the warning only once. This still brings attention to the user and does not spam the logs at the same time. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220830163448.8921-1-kurt@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 19:54:04 -07:00
Richard Gobert	0e4d354762	net-next: Fix IP_UNICAST_IF option behavior for connected sockets The IP_UNICAST_IF socket option is used to set the outgoing interface for outbound packets. The IP_UNICAST_IF socket option was added as it was needed by the Wine project, since no other existing option (SO_BINDTODEVICE socket option, IP_PKTINFO socket option or the bind function) provided the needed characteristics needed by the IP_UNICAST_IF socket option. [1] The IP_UNICAST_IF socket option works well for unconnected sockets, that is, the interface specified by the IP_UNICAST_IF socket option is taken into consideration in the route lookup process when a packet is being sent. However, for connected sockets, the outbound interface is chosen when connecting the socket, and in the route lookup process which is done when a packet is being sent, the interface specified by the IP_UNICAST_IF socket option is being ignored. This inconsistent behavior was reported and discussed in an issue opened on systemd's GitHub project [2]. Also, a bug report was submitted in the kernel's bugzilla [3]. To understand the problem in more detail, we can look at what happens for UDP packets over IPv4 (The same analysis was done separately in the referenced systemd issue). When a UDP packet is sent the udp_sendmsg function gets called and the following happens: 1. The oif member of the struct ipcm_cookie ipc (which stores the output interface of the packet) is initialized by the ipcm_init_sk function to inet->sk.sk_bound_dev_if (the device set by the SO_BINDTODEVICE socket option). 2. If the IP_PKTINFO socket option was set, the oif member gets overridden by the call to the ip_cmsg_send function. 3. If no output interface was selected yet, the interface specified by the IP_UNICAST_IF socket option is used. 4. If the socket is connected and no destination address is specified in the send function, the struct ipcm_cookie ipc is not taken into consideration and the cached route, that was calculated in the connect function is being used. Thus, for a connected socket, the IP_UNICAST_IF sockopt isn't taken into consideration. This patch corrects the behavior of the IP_UNICAST_IF socket option for connect()ed sockets by taking into consideration the IP_UNICAST_IF sockopt when connecting the socket. In order to avoid reconnecting the socket, this option is still ignored when applied on an already connected socket until connect() is called again by the Richard Gobert. Change the __ip4_datagram_connect function, which is called during socket connection, to take into consideration the interface set by the IP_UNICAST_IF socket option, in a similar way to what is done in the udp_sendmsg function. [1] https://lore.kernel.org/netdev/1328685717.4736.4.camel@edumazet-laptop/T/ [2] https://github.com/systemd/systemd/issues/11935#issuecomment-618691018 [3] https://bugzilla.kernel.org/show_bug.cgi?id=210255 Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220829111554.GA1771@debian Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 19:51:06 -07:00
Nicolas Dichtel	eb55dc09b5	ip: fix triggering of 'icmp redirect' __mkroute_input() uses fib_validate_source() to trigger an icmp redirect. My understanding is that fib_validate_source() is used to know if the src address and the gateway address are on the same link. For that, fib_validate_source() returns 1 (same link) or 0 (not the same network). __mkroute_input() is the only user of these positive values, all other callers only look if the returned value is negative. Since the below patch, fib_validate_source() didn't return anymore 1 when both addresses are on the same network, because the route lookup returns RT_SCOPE_LINK instead of RT_SCOPE_HOST. But this is, in fact, right. Let's adapat the test to return 1 again when both addresses are on the same link. CC: stable@vger.kernel.org Fixes: `747c143072` ("ip: fix dflt addr selection for connected nexthop") Reported-by: kernel test robot <yujie.liu@intel.com> Reported-by: Heng Qi <hengqi@linux.alibaba.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220829100121.3821-1-nicolas.dichtel@6wind.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 19:50:36 -07:00
Jakub Kicinski	744ccd5c64	Merge branch 'net-sched-remove-unused-variables' Zhengchao Shao says: ==================== net: sched: remove unused variables The variable "other" is unused, remove it. ==================== Link: https://lore.kernel.org/r/20220830092255.281330-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 19:39:56 -07:00
Zhengchao Shao	4516c873e3	net: sched: gred/red: remove unused variables in struct red_stats The variable "other" in the struct red_stats is not used. Remove it. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 19:39:53 -07:00
Zhengchao Shao	38af11717b	net: sched: choke: remove unused variables in struct choke_sched_data The variable "other" in the struct choke_sched_data is not used. Remove it. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 19:39:53 -07:00
Robert Hancock	cb45a8bf46	net: axienet: Switch to 64-bit RX/TX statistics The RX and TX byte/packet statistics in this driver could be overflowed relatively quickly on a 32-bit platform. Switch these stats to use the u64_stats infrastructure to avoid this. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Link: https://lore.kernel.org/r/20220829233901.3429419-1-robert.hancock@calian.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 19:18:20 -07:00
Linus Walleij	a60511cf15	net/rds: Pass a pointer to virt_to_page() Functions that work on a pointer to virtual memory such as virt_to_pfn() and users of that function such as virt_to_page() are supposed to pass a pointer to virtual memory, ideally a (void ) or other pointer. However since many architectures implement virt_to_pfn() as a macro, this function becomes polymorphic and accepts both a (unsigned long) and a (void ). If we instead implement a proper virt_to_pfn(void *addr) function the following happens (occurred on arch/arm): net/rds/message.c:357:56: warning: passing argument 1 of 'virt_to_pfn' makes pointer from integer without a cast [-Wint-conversion] Fix this with an explicit cast. Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com> Cc: rds-devel@oss.oracle.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20220829132001.114858-1-linus.walleij@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 19:12:32 -07:00
Jann Horn	2555283eb4	mm/rmap: Fix anon_vma->degree ambiguity leading to double-reuse anon_vma->degree tracks the combined number of child anon_vmas and VMAs that use the anon_vma as their ->anon_vma. anon_vma_clone() then assumes that for any anon_vma attached to src->anon_vma_chain other than src->anon_vma, it is impossible for it to be a leaf node of the VMA tree, meaning that for such VMAs ->degree is elevated by 1 because of a child anon_vma, meaning that if ->degree equals 1 there are no VMAs that use the anon_vma as their ->anon_vma. This assumption is wrong because the ->degree optimization leads to leaf nodes being abandoned on anon_vma_clone() - an existing anon_vma is reused and no new parent-child relationship is created. So it is possible to reuse an anon_vma for one VMA while it is still tied to another VMA. This is an issue because is_mergeable_anon_vma() and its callers assume that if two VMAs have the same ->anon_vma, the list of anon_vmas attached to the VMAs is guaranteed to be the same. When this assumption is violated, vma_merge() can merge pages into a VMA that is not attached to the corresponding anon_vma, leading to dangling page->mapping pointers that will be dereferenced during rmap walks. Fix it by separately tracking the number of child anon_vmas and the number of VMAs using the anon_vma as their ->anon_vma. Fixes: `7a3ef208e6` ("mm: prevent endless growth of anon_vma hierarchy") Cc: stable@kernel.org Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-08-31 15:45:10 -07:00
Sven van Ashbrook	7305b78ae4	r8152: allow userland to disable multicast The rtl8152 driver does not disable multicasting when userspace asks it to. For example: $ ifconfig eth0 -multicast -allmulti $ tcpdump -p -i eth0 # will still capture multicast frames Fix by clearing the device multicast filter table when multicast and allmulti are both unset. Tested as follows: - Set multicast on eth0 network interface - verify that multicast packets are coming in: $ tcpdump -p -i eth0 - Clear multicast and allmulti on eth0 network interface - verify that no more multicast packets are coming in: $ tcpdump -p -i eth0 Signed-off-by: Sven van Ashbrook <svenva@chromium.org> Acked-by: Hayes Wang <hayeswang@realtek.com> Link: https://lore.kernel.org/r/20220830045923.net-next.v1.1.I4fee0ac057083d4f848caf0fa3a9fd466fc374a0@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 14:55:50 -07:00
Toke Høiland-Jørgensen	90fabae8a2	sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb When the GSO splitting feature of sch_cake is enabled, GSO superpackets will be broken up and the resulting segments enqueued in place of the original skb. In this case, CAKE calls consume_skb() on the original skb, but still returns NET_XMIT_SUCCESS. This can confuse parent qdiscs into assuming the original skb still exists, when it really has been freed. Fix this by adding the __NET_XMIT_STOLEN flag to the return value in this case. Fixes: `0c850344d3` ("sch_cake: Conditionally split GSO segments") Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18231 Link: https://lore.kernel.org/r/20220831092103.442868-1-toke@toke.dk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 14:20:08 -07:00
Wolfram Sang	f029c781dd	net: ethernet: move from strlcpy with unused retval to strscpy Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Petr Machata <petrm@nvidia.com> # For drivers/net/ethernet/mellanox/mlxsw Acked-by: Geoff Levand <geoff@infradead.org> # For ps3_gelic_net and spider_net_ethtool Acked-by: Tom Lendacky <thomas.lendacky@amd.com> # For drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c Acked-by: Marcin Wojtas <mw@semihalf.com> # For drivers/net/ethernet/marvell/mvpp2 Reviewed-by: Leon Romanovsky <leonro@nvidia.com> # For drivers/net/ethernet/mellanox/mlx{4\|5} Reviewed-by: Shay Agroskin <shayagr@amazon.com> # For drivers/net/ethernet/amazon/ena Acked-by: Krzysztof Hałasa <khalasa@piap.pl> # For IXP4xx Ethernet Link: https://lore.kernel.org/r/20220830201457.7984-3-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 14:11:26 -07:00
Wolfram Sang	fb3ceec187	net: move from strlcpy with unused retval to strscpy Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220830201457.7984-1-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 14:11:07 -07:00
Martin KaFai Lau	c9ae8c966f	Merge branch 'fixes for concurrent htab updates' Hou Tao says: ==================== From: Hou Tao <houtao1@huawei.com> Hi, The patchset aims to fix the issues found during investigating the syzkaller problem reported in [0]. It seems that the concurrent updates to the same hash-table bucket may fail as shown in patch 1. Patch 1 uses preempt_disable() to fix the problem for htab_use_raw_lock() case. For !htab_use_raw_lock() case, the problem is left to "BPF specific memory allocator" patchset [1] in which !htab_use_raw_lock() will be removed. Patch 2 fixes the out-of-bound memory read problem reported in [0]. The problem has the root cause as patch 1 and it is fixed by handling -EBUSY from htab_lock_bucket() correctly. Patch 3 add two cases for hash-table update: one for the reentrancy of bpf_map_update_elem(), and another one for concurrent updates of the same hash-table bucket. Comments are always welcome. Regards, Tao [0]: https://lore.kernel.org/bpf/CACkBjsbuxaR6cv0kXJoVnBfL9ZJXjjoUcMpw_Ogc313jSrg14A@mail.gmail.com/ [1]: https://lore.kernel.org/bpf/20220819214232.18784-1-alexei.starovoitov@gmail.com/ Change Log: v4: * rebased on bpf-next * add htab_update to DENYLIST.s390x v3: https://lore.kernel.org/bpf/20220829023709.1958204-1-houtao@huaweicloud.com/ * patch 1: update commit message and add Fixes tag * patch 2: add Fixes tag * patch 3: elaborate the description of test cases v2: https://lore.kernel.org/bpf/bd60ef93-1c6a-2db2-557d-b09b92ad22bd@huaweicloud.com/ * Note the fix is for CONFIG_PREEMPT case in commit message and add Reviewed-by tag for patch 1 * Drop patch "bpf: Allow normally concurrent map updates for !htab_use_raw_lock() case" v1: https://lore.kernel.org/bpf/20220821033223.2598791-1-houtao@huaweicloud.com/ ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-08-31 14:10:01 -07:00
Hou Tao	1c636b6277	selftests/bpf: Add test cases for htab update One test demonstrates the reentrancy of hash map update on the same bucket should fail, and another one shows concureently updates of the same hash map bucket should succeed and not fail due to the reentrancy checking for bucket lock. There is no trampoline support on s390x, so move htab_update to denylist. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220831042629.130006-4-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-08-31 14:10:01 -07:00
Hou Tao	66a7a92e4d	bpf: Propagate error from htab_lock_bucket() to userspace In __htab_map_lookup_and_delete_batch() if htab_lock_bucket() returns -EBUSY, it will go to next bucket. Going to next bucket may not only skip the elements in current bucket silently, but also incur out-of-bound memory access or expose kernel memory to userspace if current bucket_cnt is greater than bucket_size or zero. Fixing it by stopping batch operation and returning -EBUSY when htab_lock_bucket() fails, and the application can retry or skip the busy batch as needed. Fixes: `20b6cc34ea` ("bpf: Avoid hashtab deadlock with map_locked") Reported-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220831042629.130006-3-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-08-31 14:10:01 -07:00
Hou Tao	2775da2162	bpf: Disable preemption when increasing per-cpu map_locked Per-cpu htab->map_locked is used to prohibit the concurrent accesses from both NMI and non-NMI contexts. But since commit `74d862b682` ("sched: Make migrate_disable/enable() independent of RT"), migrate_disable() is also preemptible under CONFIG_PREEMPT case, so now map_locked also disallows concurrent updates from normal contexts (e.g. userspace processes) unexpectedly as shown below: process A process B htab_map_update_elem() htab_lock_bucket() migrate_disable() /* return 1 / __this_cpu_inc_return() / preempted by B / htab_map_update_elem() / the same bucket as A / htab_lock_bucket() migrate_disable() / return 2, so lock fails */ __this_cpu_inc_return() return -EBUSY A fix that seems feasible is using in_nmi() in htab_lock_bucket() and only checking the value of map_locked for nmi context. But it will re-introduce dead-lock on bucket lock if htab_lock_bucket() is re-entered through non-tracing program (e.g. fentry program). One cannot use preempt_disable() to fix this issue as htab_use_raw_lock being false causes the bucket lock to be a spin lock which can sleep and does not work with preempt_disable(). Therefore, use migrate_disable() when using the spinlock instead of preempt_disable() and defer fixing concurrent updates to when the kernel has its own BPF memory allocator. Fixes: `74d862b682` ("sched: Make migrate_disable/enable() independent of RT") Reviewed-by: Hao Luo <haoluo@google.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220831042629.130006-2-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-08-31 14:10:01 -07:00
Martin KaFai Lau	197072945a	selftest/bpf: Ensure no module loading in bpf_setsockopt(TCP_CONGESTION) This patch adds a test to ensure bpf_setsockopt(TCP_CONGESTION, "not_exist") will not trigger the kernel module autoload. Before the fix: [ 40.535829] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274 [...] [ 40.552134] tcp_ca_find_autoload.constprop.0+0xcb/0x200 [ 40.552689] tcp_set_congestion_control+0x99/0x7b0 [ 40.553203] do_tcp_setsockopt+0x3ed/0x2240 [...] [ 40.556041] __bpf_setsockopt+0x124/0x640 Signed-off-by: Martin KaFai Lau <martin.lau@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220830231953.792412-1-martin.lau@linux.dev	2022-08-31 22:22:29 +02:00
Martin KaFai Lau	84e5a0f208	bpf, net: Avoid loading module when calling bpf_setsockopt(TCP_CONGESTION) When bpf prog changes tcp-cc by calling bpf_setsockopt(TCP_CONGESTION), it should not try to load module which may be a blocking operation. This details was correct in the v1 [0] but missed by mistake in the later revision in commit `cb388e7ee3` ("bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and capable()"). This patch fixes it by checking the has_current_bpf_ctx(). [0] https://lore.kernel.org/bpf/20220727060921.2373314-1-kafai@fb.com/ Fixes: `cb388e7ee3` ("bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and capable()") Signed-off-by: Martin KaFai Lau <martin.lau@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220830231946.791504-1-martin.lau@linux.dev	2022-08-31 22:21:45 +02:00
Axel Rasmussen	5a3a599810	selftests: net: sort .gitignore file This is the result of `sort tools/testing/selftests/net/.gitignore`, but preserving the comment at the top. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Link: https://lore.kernel.org/r/20220829184748.1535580-1-axelrasmussen@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 12:47:57 -07:00
Randy Dunlap	404a5ad720	Documentation: networking: correct possessive "its" Change occurrences of "it's" that are possessive to "its" so that they don't read as "it is". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20220829235414.17110-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 12:36:08 -07:00
Heiner Kallweit	8af1a9afe1	net: phy: smsc: use device-managed clock API Simplify the code by using the device-managed clock API. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/b222be68-ba7e-999d-0a07-eca0ecedf74e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 12:19:18 -07:00
Cong Wang	8fc29ff391	kcm: fix strp_init() order and cleanup strp_init() is called just a few lines above this csk->sk_user_data check, it also initializes strp->work etc., therefore, it is unnecessary to call strp_done() to cancel the freshly initialized work. And if sk_user_data is already used by KCM, psock->strp should not be touched, particularly strp->work state, so we need to move strp_init() after the csk->sk_user_data check. This also makes a lockdep warning reported by syzbot go away. Reported-and-tested-by: syzbot+9fc084a4348493ef65d2@syzkaller.appspotmail.com Reported-by: syzbot+e696806ef96cdd2d87cd@syzkaller.appspotmail.com Fixes: `e557124023` ("kcm: Check if sk_user_data already set in kcm_attach") Fixes: `dff8baa261` ("kcm: Call strp_stop before strp_done in kcm_attach") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Link: https://lore.kernel.org/r/20220827181314.193710-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 12:16:44 -07:00
David Thompson	3a1a274e93	mlxbf_gige: compute MDIO period based on i1clk This patch adds logic to compute the MDIO period based on the i1clk, and thereafter write the MDIO period into the YU MDIO config register. The i1clk resource from the ACPI table is used to provide addressing to YU bootrecord PLL registers. The values in these registers are used to compute MDIO period. If the i1clk resource is not present in the ACPI table, then the current default hardcorded value of 430Mhz is used. The i1clk clock value of 430MHz is only accurate for boards with BF2 mid bin and main bin SoCs. The BF2 high bin SoCs have i1clk = 500MHz, but can support a slower MDIO period. Fixes: `f92e1869d7` ("Add Mellanox BlueField Gigabit Ethernet driver") Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com> Signed-off-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/20220826155916.12491-1-davthompson@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-31 12:13:46 -07:00
James Hilliard	14e5ce7994	libbpf: Add GCC support for bpf_tail_call_static The bpf_tail_call_static function is currently not defined unless using clang >= 8. To support bpf_tail_call_static on GCC we can check if __clang__ is not defined to enable bpf_tail_call_static. We need to use GCC assembly syntax when the compiler does not define __clang__ as LLVM inline assembly is not fully compatible with GCC. Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220829210546.755377-1-james.hilliard1@gmail.com	2022-08-31 20:54:23 +02:00
Linus Torvalds	c5e4d5e991	Merge tag 'fscache-fixes-20220831' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull fscache/cachefiles fixes from David Howells: - Fix kdoc on fscache_use/unuse_cookie(). - Fix the error returned by cachefiles_ondemand_copen() from an upcall result. - Fix the distribution of requests in on-demand mode in cachefiles to be fairer by cycling through them rather than picking the one with the lowest ID each time (IDs being reused). * tag 'fscache-fixes-20220831' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: cachefiles: make on-demand request distribution fairer cachefiles: fix error return code in cachefiles_ondemand_copen() fscache: fix misdocumented parameter	2022-08-31 10:13:34 -07:00
Linus Torvalds	a1f764268f	Merge tag 'for-linus-2022083101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - NULL pointer dereference fix for Steam driver (Lee Jones) - memory leak fix for hidraw (Karthik Alapati) - regression fix for functionality of some UCLogic tables (Benjamin Tissoires) - a few new device IDs and device-specific quirks * tag 'for-linus-2022083101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: nintendo: fix rumble worker null pointer deref HID: intel-ish-hid: ipc: Add Meteor Lake PCI device ID HID: input: fix uclogic tablets HID: Add Apple Touchbar on T2 Macs in hid_have_special_driver list HID: add Lenovo Yoga C630 battery quirk HID: AMD_SFH: Add a DMI quirk entry for Chromebooks HID: thrustmaster: Add sparco wheel and fix array length hid: intel-ish-hid: ishtp: Fix ishtp client sending disordered message HID: ishtp-hid-clientHID: ishtp-hid-client: Fix comment typo HID: asus: ROG NKey: Ignore portion of 0x5a report HID: hidraw: fix memory leak in hidraw_release() HID: steam: Prevent NULL pointer dereference in steam_{recv,send}_report	2022-08-31 09:54:14 -07:00
Linus Torvalds	2361d3841f	Merge tag 'v6.0-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a boot performance regression due to an unnecessary dependency on XOR_BLOCKS" * tag 'v6.0-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: lib - remove unneeded selection of XOR_BLOCKS	2022-08-31 09:47:06 -07:00
Linus Torvalds	9c9d1896fa	Merge tag 'lsm-pr-20220829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull LSM support for IORING_OP_URING_CMD from Paul Moore: "Add SELinux and Smack controls to the io_uring IORING_OP_URING_CMD. These are necessary as without them the IORING_OP_URING_CMD remains outside the purview of the LSMs (Luis' LSM patch, Casey's Smack patch, and my SELinux patch). They have been discussed at length with the io_uring folks, and Jens has given his thumbs-up on the relevant patches (see the commit descriptions). There is one patch that is not strictly necessary, but it makes testing much easier and is very trivial: the /dev/null IORING_OP_URING_CMD patch." * tag 'lsm-pr-20220829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: Smack: Provide read control for io_uring_cmd /dev/null: add IORING_OP_URING_CMD support selinux: implement the security_uring_cmd() LSM hook lsm,io_uring: add LSM hooks for the new uring_cmd file op	2022-08-31 09:23:16 -07:00
Xin Yin	1122f40072	cachefiles: make on-demand request distribution fairer For now, enqueuing and dequeuing on-demand requests all start from idx 0, this makes request distribution unfair. In the weighty concurrent I/O scenario, the request stored in higher idx will starve. Searching requests cyclically in cachefiles_ondemand_daemon_read, makes distribution fairer. Fixes: `c838305450` ("cachefiles: notify the user daemon when looking up cookie") Reported-by: Yongqing Li <liyongqing@bytedance.com> Signed-off-by: Xin Yin <yinxin.x@bytedance.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220817065200.11543-1-yinxin.x@bytedance.com/ # v1 Link: https://lore.kernel.org/r/20220825020945.2293-1-yinxin.x@bytedance.com/ # v2	2022-08-31 16:41:10 +01:00
Sun Ke	c93ccd63b1	cachefiles: fix error return code in cachefiles_ondemand_copen() The cache_size field of copen is specified by the user daemon. If cache_size < 0, then the OPEN request is expected to fail, while copen itself shall succeed. However, returning 0 is indeed unexpected when cache_size is an invalid error code. Fix this by returning error when cache_size is an invalid error code. Changes ======= v4: update the code suggested by Dan v3: update the commit log suggested by Jingbo. Fixes: `c838305450` ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Sun Ke <sunke32@huawei.com> Suggested-by: Jeffle Xu <jefflexu@linux.alibaba.com> Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20220818111935.1683062-1-sunke32@huawei.com/ # v2 Link: https://lore.kernel.org/r/20220818125038.2247720-1-sunke32@huawei.com/ # v3 Link: https://lore.kernel.org/r/20220826023515.3437469-1-sunke32@huawei.com/ # v4	2022-08-31 16:41:10 +01:00
Khalid Masum	ec1bd37123	fscache: fix misdocumented parameter This patch fixes two warnings generated by make docs. The functions fscache_use_cookie and fscache_unuse_cookie, both have a parameter named cookie. But they are documented with the name "object" with unclear description. Which generates the warning when creating docs. This commit will replace the currently misdocumented parameter names with the correct ones while adding proper descriptions. CC: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20220521142446.4746-1-khalid.masum.92@gmail.com/ # v1 Link: https://lore.kernel.org/r/20220818040738.12036-1-khalid.masum.92@gmail.com/ # v2 Link: https://lore.kernel.org/r/880d7d25753fb326ee17ac08005952112fcf9bdb.1657360984.git.mchehab@kernel.org/ # Mauro's version	2022-08-31 14:57:28 +01:00
David S. Miller	6edd302a1c	Merge branch 'hns3-next' Guangbin Huang says: ==================== net: hns3: updates for -next This series includes some updates for the HNS3 ethernet driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:10:50 +01:00
Guangbin Huang	08aa17a0c1	net: hns3: net: hns3: add querying and setting fec off mode from firmware For some new devices, the FEC mode can not be set to OFF in speed 200G. In order to flexibly adapt to all types of devices, driver queries fec ability from firmware to decide whether OFF mode can be supported. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:10:49 +01:00
Hao Lan	5c4f72842d	net: hns3: add querying and setting fec llrs mode from firmware This patch supports llrs fec mode in speed 200G for some new devices, and suppoprts querying llrs fec ability from firmware. Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:10:49 +01:00
Guangbin Huang	eaf83ae59e	net: hns3: add querying fec ability from firmware For some new devices, driver can queries fec ability from firmware to decide which FEC mode can be supported. If devices of old version which not support querying fec ability, driver sets fixed ability according to current speed. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:10:49 +01:00
Guangbin Huang	507e46ae26	net: hns3: add getting capabilities of gro offload and fd from firmware As some new devices may not support GRO offload and flow table director, to support these devices, driver needs to querying capabilities of GRO offload and flow table director from firmware. Whether the driver supports these two features depends on capabilities. For old device of version HNAE3_DEVICE_VERSION_V2, driver sets their capabilities of these two features to fixed value. Setting default features of netdev and debugfs also need to identify whether support these two features. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:10:49 +01:00
David S. Miller	39a7d7261a	Merge branch 'thunderbolt-end-to-end-flow-control' Mika Westerberg says: ==================== thunderbolt: net: Enable full end-to-end flow control Thunderbolt/USB4 host controllers support full end-to-end flow control that prevents dropping packets if there are not enough hardware receive buffers. So far it has not been enabled for the networking driver yet but this series changes that. There is one snag though: the second generation (Intel Falcon Ridge) had a bug that needs special quirk to get it working. We had that in the early stages of the Thunderbolt/USB4 driver but it got dropped because it was not needed at the time. Now we add it back as a quirk for the host controller (NHI). The first patch of this series is a bugfix that I'm planning to push for v6.0-rc. Rest are v6.1 material. This also includes a patch that shows the XDomain link type in sysfs the same way we do for USB4 routers and updates the networking driver module description. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:05:12 +01:00
Mika Westerberg	e550ed4b87	net: thunderbolt: Update module description with mention of USB4 It is Thunderbolt/USB4 now so reflect that in the module description too to avoid any confusion. No functional changes. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:05:12 +01:00
Mika Westerberg	8bdc25cf62	net: thunderbolt: Enable full end-to-end flow control USB4NET protocol allows the networking drivers to take advantage of end-to-end flow control supported by the USB4 host interface. This should prevent the receiving side from dropping network packets. In adddition add a module parameter that can be used to turn this off just in case it causes problems. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:05:12 +01:00
Mika Westerberg	54669e2f17	thunderbolt: Add back Intel Falcon Ridge end-to-end flow control workaround As we are now enabling full end-to-end flow control to the Thunderbolt networking driver, in order for it to work properly on second generation Thunderbolt hardware (Falcon Ridge), we need to add back the workaround that was removed with commit `53f13319d1` ("thunderbolt: Get rid of E2E workaround"). However, this time we only apply it for Falcon Ridge controllers as a form of an additional quirk. For non-Falcon Ridge this does nothing. While there fix a typo 'reqister' -> 'register' in the comment. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:05:12 +01:00
Mika Westerberg	f9cad07b84	thunderbolt: Show link type for XDomain connections too Following what we do for routers already, extend this to XDomain connections as well. This will show in sysfs whether the link is in USB4 or Thunderbolt mode. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:05:12 +01:00
Mika Westerberg	ff7cd07f30	net: thunderbolt: Enable DMA paths only after rings are enabled If the other host starts sending packets early on it is possible that we are still in the middle of populating the initial Rx ring packets to the ring. This causes the tbnet_poll() to mess over the queue and causes list corruption. This happens specifically when connected with macOS as it seems start sending various IP discovery packets as soon as its side of the paths are configured. To prevent this we move the DMA path enabling to happen after we have primed the Rx ring. This makes sure no incoming packets can arrive before we are ready to handle them. Fixes: `e69b6c02b4` ("net: Add support for networking over Thunderbolt cable") Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:05:11 +01:00
Duoming Zhou	c0955bf957	ethernet: rocker: fix sleep in atomic context bug in neigh_timer_handler The function neigh_timer_handler() is a timer handler that runs in an atomic context. When used by rocker, neigh_timer_handler() calls "kzalloc(.., GFP_KERNEL)" that may sleep. As a result, the sleep in atomic context bug will happen. One of the processes is shown below: ofdpa_fib4_add() ... neigh_add_timer() (wait a timer) neigh_timer_handler() neigh_release() neigh_destroy() rocker_port_neigh_destroy() rocker_world_port_neigh_destroy() ofdpa_port_neigh_destroy() ofdpa_port_ipv4_neigh() kzalloc(sizeof(.., GFP_KERNEL) //may sleep This patch changes the gfp_t parameter of kzalloc() from GFP_KERNEL to GFP_ATOMIC in order to mitigate the bug. Fixes: `00fc0c51e3` ("rocker: Change world_ops API and implementation to be switchdev independant") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-08-31 14:01:29 +01:00

... 4 5 6 7 8 ...

1123098 Commits