linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-03 18:12:25 -04:00

Author	SHA1	Message	Date
Lorenzo Bianconi	7c48cb0176	xdp: add frags support to xdp_return_{buff/frame} Take into account if the received xdp_buff/xdp_frame is non-linear recycling/returning the frame memory to the allocator or into xdp_frame_bulk. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/a961069febc868508ce1bdf5e53a343eb4e57cb2.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-21 14:14:01 -08:00
Lorenzo Bianconi	ed7a58cb40	net: marvell: rely on xdp_update_skb_shared_info utility routine Rely on xdp_update_skb_shared_info routine in order to avoid resetting frags array in skb_shared_info structure building the skb in mvneta_swbm_build_skb(). Frags array is expected to be initialized by the receiving driver building the xdp_buff and here we just need to update memory metadata. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/e0dad97f5d02b13f189f99f1e5bc8e61bef73412.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-21 14:14:01 -08:00
Lorenzo Bianconi	d65a1906b3	net: xdp: add xdp_update_skb_shared_info utility routine Introduce xdp_update_skb_shared_info routine to update frags array metadata in skb_shared_info data structure converting to a skb from a xdp_buff or xdp_frame. According to the current skb_shared_info architecture in xdp_frame/xdp_buff and to the xdp frags support, there is no need to run skb_add_rx_frag() and reset frags array converting the buffer to a skb since the frag array will be in the same position for xdp_buff/xdp_frame and for the skb, we just need to update memory metadata. Introduce XDP_FLAGS_PF_MEMALLOC flag in xdp_buff_flags in order to mark the xdp_buff or xdp_frame as under memory-pressure if pages of the frags array are under memory pressure. Doing so we can avoid looping over all fragments in xdp_update_skb_shared_info routine. The driver is expected to set the flag constructing the xdp_buffer using xdp_buff_set_frag_pfmemalloc utility routine. Rely on xdp_update_skb_shared_info in __xdp_build_skb_from_frame routine converting the non-linear xdp_frame to a skb after performing a XDP_REDIRECT. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/bfd23fb8a8d7438724f7819c567cdf99ffd6226f.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-21 14:14:01 -08:00
Lorenzo Bianconi	d094c9851a	net: mvneta: simplify mvneta_swbm_add_rx_fragment management Relying on xdp frags bit, remove skb_shared_info structure allocated on the stack in mvneta_rx_swbm routine and simplify mvneta_swbm_add_rx_fragment accessing skb_shared_info in the xdp_buff structure directly. There is no performance penalty in this approach since mvneta_swbm_add_rx_fragment is run just for xdp frags use-case. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/45f050c094ccffce49d6bc5112939ed35250ba90.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-21 14:14:01 -08:00
Lorenzo Bianconi	76a676947b	net: mvneta: update frags bit before passing the xdp buffer to eBPF layer Update frags bit (XDP_FLAGS_HAS_FRAGS) in xdp_buff to notify XDP/eBPF layer and XDP remote drivers if this is a "non-linear" XDP buffer. Access skb_shared_info only if XDP_FLAGS_HAS_FRAGS flag is set in order to avoid possible cache-misses. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/c00a73097f8a35860d50dae4a36e6cc9ef7e172f.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-21 14:14:01 -08:00
Lorenzo Bianconi	2e88d4ff03	xdp: introduce flags field in xdp_buff/xdp_frame Introduce flags field in xdp_frame and xdp_buffer data structures to define additional buffer features. At the moment the only supported buffer feature is frags bit (XDP_FLAGS_HAS_FRAGS). frags bit is used to specify if this is a linear buffer (XDP_FLAGS_HAS_FRAGS not set) or a frags frame (XDP_FLAGS_HAS_FRAGS set). In the latter case the driver is expected to initialize the skb_shared_info structure at the end of the first buffer to link together subsequent buffers belonging to the same frame. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/e389f14f3a162c0a5bc6a2e1aa8dd01a90be117d.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-21 14:14:01 -08:00
Lorenzo Bianconi	d16697cb62	net: skbuff: add size metadata to skb_shared_info for xdp Introduce xdp_frags_size field in skb_shared_info data structure to store xdp_buff/xdp_frame frame paged size (xdp_frags_size will be used in xdp frags support). In order to not increase skb_shared_info size we will use a hole due to skb_shared_info alignment. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/8a849819a3e0a143d540f78a3a5add76e17e980d.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-21 14:14:01 -08:00
David S. Miller	03c82e80ec	Merge branch 'octeontx2-af-fixes' Subbaraya Sundeep says: ==================== octeontx-af2: Fixes for CN10K and CN9xxx platforms This patchset has consolidated fixes in Octeontx2 driver handling CN10K and CN9xxx platforms. When testing the new CN10K hardware some issues resurfaced like accessing wrong register for CN10K and enabling loopback on not supported interfaces. Some fixes are needed for CN9xxx platforms as well. Below is the description of patches Patch 1: AF sets RX RSS action for all the VFs when a VF is brought up. But when a PF sets RX action for its VF like Drop/Direct to a queue in ntuple filter it is not retained because of AF fixup. This patch skips modifying VF RX RSS action if PF has already set its action. Patch 2: When configuring backpressure wrong register is being read for LBKs hence fixed it. Patch 3: Some RVU blocks may take longer time to reset but are guaranteed to complete the reset. Hence wait till reset is complete. Patch 4: For enabling LMAC CN10K needs another register compared to CN9xxx platforms. Hence changed it. Patch 5: Adds missing barrier before submitting memory pointer to the aura hardware. Patch 6: Increase polling time while link credit restore and also return proper error code when timeout occurs. Patch 7: Internal loopback not supported on LPCS interfaces like SGMII/QSGMII so do not enable it. Patch 8: When there is a error in message processing, AF sets the error response and replies back to requestor. PF forwards a invalid message to VF back if AF reply has error in it. This way VF lacks the actual error set by AF for its message. This is changed such that PF simply forwards the actual reply and let VF handle the error. Patch 9: ntuple filter with "flow-type ether proto 0x8842 vlan 0x92e" was not working since ethertype 0x8842 is NGIO protocol. Hardware parser explicitly parses such NGIO packets and sets the packet as NGIO and do not set it as tagged packet. Fix this by changing parser such that it sets the packet as both NGIO and tagged by using separate layer types. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 14:32:21 +00:00
Kiran Kumar K	745166fcf0	octeontx2-af: Add KPU changes to parse NGIO as separate layer With current KPU profile NGIO is being parsed along with CTAG as a single layer. Because of this MCAM/ntuple rules installed with ethertype as 0x8842 are not being hit. Adding KPU profile changes to parse NGIO in separate ltype and CTAG in separate ltype. Fixes: `f9c49be90c` ("octeontx2-af: Update the default KPU profile and fixes") Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 14:32:21 +00:00
Subbaraya Sundeep	a8db854be2	octeontx2-pf: Forward error codes to VF PF forwards its VF messages to AF and corresponding replies from AF to VF. AF sets proper error code in the replies after processing message requests. Currently PF checks the error codes in replies and sends invalid message to VF. This way VF lacks the information of error code set by AF for its messages. This patch changes that such that PF simply forwards AF replies so that VF can handle error codes. Fixes: `d424b6c024` ("octeontx2-pf: Enable SRIOV and added VF mbox handling") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 14:32:21 +00:00
Geetha sowjanya	df66b6ebc5	octeontx2-af: cn10k: Do not enable RPM loopback for LPC interfaces Internal loopback is not supported on low rate LPCS interfaces like SGMII/QSGMII. Hence don't allow enabling it for such interfaces. Fixes: `3ad3f8f93c` ("octeontx2-af: cn10k: MAC internal loopback support") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 14:32:20 +00:00
Geetha sowjanya	1581d61b42	octeontx2-af: Increase link credit restore polling timeout It's been observed that sometimes link credit restore takes a lot of time than the current timeout. This patch increases the default timeout value and return the proper error value on failure. Fixes: `1c74b89171` ("octeontx2-af: Wait for TX link idle for credits change") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 14:32:20 +00:00
Geetha sowjanya	c5d731c54a	octeontx2-pf: cn10k: Ensure valid pointers are freed to aura While freeing SQB pointers to aura, driver first memcpy to target address and then triggers lmtst operation to free pointer to the aura. We need to ensure(by adding dmb barrier)that memcpy is finished before pointers are freed to the aura. This patch also adds the missing sq context structure entry in debugfs. Fixes: `ef6c8da71e` ("octeontx2-pf: cn10K: Reserve LMTST lines per core") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 14:32:20 +00:00
Geetha sowjanya	fae80edeaf	octeontx2-af: cn10k: Use appropriate register for LMAC enable CN10K platforms uses RPM(0..2)_MTI_MAC100(0..3)_COMMAND_CONFIG register for lmac TX/RX enable whereas CN9xxx platforms use CGX_CMRX_CONFIG register. This config change was missed when adding support for CN10K RPM. Fixes: `91c6945ea1` ("octeontx2-af: cn10k: Add RPM MAC support") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 14:32:20 +00:00
Geetha sowjanya	03ffbc9914	octeontx2-af: Retry until RVU block reset complete Few RVU blocks like SSO require more time for reset on some silicons. Hence retrying the block reset until success. Fixes: `c0fa2cff88` ("octeontx2-af: Handle return value in block reset") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 14:32:20 +00:00
Sunil Goutham	00bfe94e38	octeontx2-af: Fix LBK backpressure id count In rvu_nix_get_bpid() lbk_bpid_cnt is being read from wrong register. Due to this backpressure enable is failing for LBK VF32 onwards. This patch fixes that. Fixes: `fe1939bb23` ("octeontx2-af: Add SDP interface support") Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 14:32:20 +00:00
Subbaraya Sundeep	d225c449ab	octeontx2-af: Do not fixup all VF action entries AF modifies all the rules destined for VF to use the action same as default RSS action. This fixup was needed because AF only installs default rules with RSS action. But the action in rules installed by a PF for its VFs should not be changed by this fixup. This is because action can be drop or direct to queue as specified by user(ntuple filters). This patch fixes that problem. Fixes: `967db3529e` ("octeontx2-af: add support for multicast/promisc packet") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 14:32:20 +00:00
David S. Miller	67ab55956e	Merge tag 'wireless-2022-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v5.17 First set of fixes for v5.17. This is the first pull request from the new wireless tree and only changes to MAINTAINERS file. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 11:23:33 +00:00
David S. Miller	0b6d8cf2ec	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-01-20 This series contains updates to i40e driver only. Jedrzej increases delay for EMP reset and adds checks to ensure a VF request to change queues can be met. Sylwester moves the placement of the Flow Director queue as to not fragment the queue pile which would cause later re-allocation issues. Karen prevents VF reset being invoked while another is still occurring to avoid reading invalid data. Joe Damato fixes some statistics fields to match the values of the fields they are based on. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-21 10:30:30 +00:00
Di Zhu	820e6e227c	selftests: bpf: test BPF_PROG_QUERY for progs attached to sockmap Add test for querying progs attached to sockmap. we use an existing libbpf query interface to query prog cnt before and after progs attaching to sockmap and check whether the queried prog id is right. Signed-off-by: Di Zhu <zhudi2@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20220119014005.1209-2-zhudi2@huawei.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:33:32 -08:00
Di Zhu	748cd5729a	bpf: support BPF_PROG_QUERY for progs attached to sockmap Right now there is no way to query whether BPF programs are attached to a sockmap or not. we can use the standard interface in libbpf to query, such as: bpf_prog_query(mapFd, BPF_SK_SKB_STREAM_PARSER, 0, NULL, ...); the mapFd is the fd of sockmap. Signed-off-by: Di Zhu <zhudi2@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20220119014005.1209-1-zhudi2@huawei.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:30:58 -08:00
Alexei Starovoitov	3f712d4691	Merge branch 'libbpf: streamline netlink-based XDP APIs' Andrii Nakryiko says: ==================== Revamp existing low-level XDP APIs provided by libbpf to follow more consistent naming (new APIs follow bpf_tc_xxx() approach where it makes sense) and be extensible without ABI breakages (OPTS-based). See patch #1 for details, remaining patches switch bpftool, selftests/bpf and samples/bpf to new APIs. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:22:03 -08:00
Andrii Nakryiko	d4e34bfcbe	samples/bpf: adapt samples/bpf to bpf_xdp_xxx() APIs Use new bpf_xdp_*() APIs across all XDP-related BPF samples. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120061422.2710637-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:22:02 -08:00
Andrii Nakryiko	544356524d	selftests/bpf: switch to new libbpf XDP APIs Switch to using new bpf_xdp_*() APIs across all selftests. Take advantage of a more straightforward and user-friendly semantics of old_prog_fd (0 means "don't care") in few places. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120061422.2710637-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:22:02 -08:00
Andrii Nakryiko	c86575ecca	bpftool: use new API for attaching XDP program Switch to new bpf_xdp_attach() API to avoid deprecation warnings. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120061422.2710637-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:22:02 -08:00
Andrii Nakryiko	c359821ac6	libbpf: streamline low-level XDP APIs Introduce 4 new netlink-based XDP APIs for attaching, detaching, and querying XDP programs: - bpf_xdp_attach; - bpf_xdp_detach; - bpf_xdp_query; - bpf_xdp_query_id. These APIs replace bpf_set_link_xdp_fd, bpf_set_link_xdp_fd_opts, bpf_get_link_xdp_id, and bpf_get_link_xdp_info APIs ([0]). The latter don't follow a consistent naming pattern and some of them use non-extensible approaches (e.g., struct xdp_link_info which can't be modified without breaking libbpf ABI). The approach I took with these low-level XDP APIs is similar to what we did with low-level TC APIs. There is a nice duality of bpf_tc_attach vs bpf_xdp_attach, and so on. I left bpf_xdp_attach() to support detaching when -1 is specified for prog_fd for generality and convenience, but bpf_xdp_detach() is preferred due to clearer naming and associated semantics. Both bpf_xdp_attach() and bpf_xdp_detach() accept the same opts struct allowing to specify expected old_prog_fd. While doing the refactoring, I noticed that old APIs require users to specify opts with old_fd == -1 to declare "don't care about already attached XDP prog fd" condition. Otherwise, FD 0 is assumed, which is essentially never an intended behavior. So I made this behavior consistent with other kernel and libbpf APIs, in which zero FD means "no FD". This seems to be more in line with the latest thinking in BPF land and should cause less user confusion, hopefully. For querying, I left two APIs, both more generic bpf_xdp_query() allowing to query multiple IDs and attach mode, but also a specialization of it, bpf_xdp_query_id(), which returns only requested prog_id. Uses of prog_id returning bpf_get_link_xdp_id() were so prevalent across selftests and samples, that it seemed a very common use case and using bpf_xdp_query() for doing it felt very cumbersome with a highly branches if/else chain based on flags and attach mode. Old APIs are scheduled for deprecation in libbpf 0.8 release. [0] Closes: https://github.com/libbpf/libbpf/issues/309 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20220120061422.2710637-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:22:02 -08:00
Alexei Starovoitov	1713e33bfd	Merge branch 'libbpf: deprecate legacy BPF map definitions' Andrii Nakryiko says: ==================== Officially deprecate legacy BPF map definitions in libbpf. They've been slated for deprecation for a while in favor of more powerful BTF-defined map definitions and this patch set adds warnings and a way to enforce this in libbpf through LIBBPF_STRICT_MAP_DEFINITIONS strict mode flag. Selftests are fixed up and updated, BPF documentation is updated, bpftool's strict mode usage is adjusted to avoid breaking users unnecessarily. v1->v2: - replace missed bpf_map_def case in Documentation/bpf/btf.rst (Alexei). ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:19:05 -08:00
Andrii Nakryiko	96c85308ee	docs/bpf: update BPF map definition example Use BTF-defined map definition in the documentation example. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120060529.1890907-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:19:05 -08:00
Andrii Nakryiko	93b8952d22	libbpf: deprecate legacy BPF map definitions Enact deprecation of legacy BPF map definition in SEC("maps") ([0]). For the definitions themselves introduce LIBBPF_STRICT_MAP_DEFINITIONS flag for libbpf strict mode. If it is set, error out on any struct bpf_map_def-based map definition. If not set, libbpf will print out a warning for each legacy BPF map to raise awareness that it goes away. For any use of BPF_ANNOTATE_KV_PAIR() macro providing a legacy way to associate BTF key/value type information with legacy BPF map definition, warn through libbpf's pr_warn() error message (but don't fail BPF object open). BPF-side struct bpf_map_def is marked as deprecated. User-space struct bpf_map_def has to be used internally in libbpf, so it is left untouched. It should be enough for bpf_map__def() to be marked deprecated to raise awareness that it goes away. bpftool is an interesting case that utilizes libbpf to open BPF ELF object to generate skeleton. As such, even though bpftool itself uses full on strict libbpf mode (LIBBPF_STRICT_ALL), it has to relax it a bit for BPF map definition handling to minimize unnecessary disruptions. So opt-out of LIBBPF_STRICT_MAP_DEFINITIONS for bpftool. User's code that will later use generated skeleton will make its own decision whether to enforce LIBBPF_STRICT_MAP_DEFINITIONS or not. There are few tests in selftests/bpf that are consciously using legacy BPF map definitions to test libbpf functionality. For those, temporary opt out of LIBBPF_STRICT_MAP_DEFINITIONS mode for the duration of those tests. [0] Closes: https://github.com/libbpf/libbpf/issues/272 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120060529.1890907-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:19:05 -08:00
Andrii Nakryiko	ccc3f56918	selftests/bpf: convert remaining legacy map definitions Converted few remaining legacy BPF map definition to BTF-defined ones. For the remaining two bpf_map_def-based legacy definitions that we want to keep for testing purposes until libbpf 1.0 release, guard them in pragma to suppress deprecation warnings which will be added in libbpf in the next commit. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120060529.1890907-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:19:05 -08:00
Andrii Nakryiko	32b3429479	selftests/bpf: fail build on compilation warning It's very easy to miss compilation warnings without -Werror, which is not set for selftests. libbpf and bpftool are already strict about this, so make selftests/bpf also treat compilation warnings as errors to catch such regressions early. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120060529.1890907-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-20 21:19:05 -08:00
Jakub Kicinski	276c7635d7	Merge branch 'mptcp-a-few-fixes' Mat Martineau says: ==================== mptcp: A few fixes Patch 1 fixes a RCU locking issue when processing a netlink command that updates endpoint flags in the in-kernel MPTCP path manager. Patch 2 fixes a typo affecting available endpoint id tracking. Patch 3 fixes IPv6 routing in the MPTCP self tests. ==================== Link: https://lore.kernel.org/r/20220121003529.54930-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-20 20:24:04 -08:00
Paolo Abeni	9846921dba	selftests: mptcp: fix ipv6 routing setup MPJ ipv6 selftests currently lack per link route to the server net. Additionally, ipv6 subflows endpoints are created without any interface specified. The end-result is that in ipv6 self-tests subflows are created all on the same link, leading to expected delays and sporadic self-tests failures. Fix the issue by adding the missing setup bits. Fixes: `523514ed0a` ("selftests: mptcp: add ADD_ADDR IPv6 test cases") Reported-and-tested-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-20 20:24:02 -08:00
Geliang Tang	a4c0214fbe	mptcp: fix removing ids bitmap setting In mptcp_pm_nl_rm_addr_or_subflow(), the bit of rm_list->ids[i] in the id_avail_bitmap should be set, not rm_list->ids[1]. This patch fixed it. Fixes: `86e39e0448` ("mptcp: keep track of local endpoint still available for each msk") Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-20 20:24:01 -08:00
Paolo Abeni	8e9eacad7e	mptcp: fix msk traversal in mptcp_nl_cmd_set_flags() The MPTCP endpoint list is under RCU protection, guarded by the pernet spinlock. mptcp_nl_cmd_set_flags() traverses the list without acquiring the spin-lock nor under the RCU critical section. This change addresses the issue performing the lookup and the endpoint update under the pernet spinlock. Fixes: `0f9f696a50` ("mptcp: add set_flags command in PM netlink") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-20 20:24:01 -08:00
Jakub Kicinski	6f97fde869	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Incorrect helper module alias in netbios_ns, from Florian Westphal. 2) Remove unused variable in nf_tables. 3) Uninitialized last expression in nf_tables register tracking. 4) Memleak in nft_connlimit after moving stateful data out of the expression data area. 5) Bogus invalid stats update when NF_REPEAT is returned, from Florian. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf: netfilter: conntrack: don't increment invalid counter on NF_REPEAT netfilter: nft_connlimit: memleak if nf_ct_netns_get() fails netfilter: nf_tables: set last expression in register tracking area netfilter: nf_tables: remove unused variable netfilter: nf_conntrack_netbios_ns: fix helper module alias ==================== Link: https://lore.kernel.org/r/20220120125212.991271-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-20 20:22:31 -08:00
Eric Dumazet	aafc2e3285	ipv6: annotate accesses to fn->fn_sernum struct fib6_node's fn_sernum field can be read while other threads change it. Add READ_ONCE()/WRITE_ONCE() annotations. Do not change existing smp barriers in fib6_get_cookie_safe() and __fib6_update_sernum_upto_root() syzbot reported: BUG: KCSAN: data-race in fib6_clean_node / inet6_csk_route_socket write to 0xffff88813df62e2c of 4 bytes by task 1920 on cpu 1: fib6_clean_node+0xc2/0x260 net/ipv6/ip6_fib.c:2178 fib6_walk_continue+0x38e/0x430 net/ipv6/ip6_fib.c:2112 fib6_walk net/ipv6/ip6_fib.c:2160 [inline] fib6_clean_tree net/ipv6/ip6_fib.c:2240 [inline] __fib6_clean_all+0x1a9/0x2e0 net/ipv6/ip6_fib.c:2256 fib6_flush_trees+0x6c/0x80 net/ipv6/ip6_fib.c:2281 rt_genid_bump_ipv6 include/net/net_namespace.h:488 [inline] addrconf_dad_completed+0x57f/0x870 net/ipv6/addrconf.c:4230 addrconf_dad_work+0x908/0x1170 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 worker_thread+0x616/0xa70 kernel/workqueue.c:2454 kthread+0x1bf/0x1e0 kernel/kthread.c:359 ret_from_fork+0x1f/0x30 read to 0xffff88813df62e2c of 4 bytes by task 15701 on cpu 0: fib6_get_cookie_safe include/net/ip6_fib.h:285 [inline] rt6_get_cookie include/net/ip6_fib.h:306 [inline] ip6_dst_store include/net/ip6_route.h:234 [inline] inet6_csk_route_socket+0x352/0x3c0 net/ipv6/inet6_connection_sock.c:109 inet6_csk_xmit+0x91/0x1e0 net/ipv6/inet6_connection_sock.c:121 __tcp_transmit_skb+0x1323/0x1840 net/ipv4/tcp_output.c:1402 tcp_transmit_skb net/ipv4/tcp_output.c:1420 [inline] tcp_write_xmit+0x1450/0x4460 net/ipv4/tcp_output.c:2680 __tcp_push_pending_frames+0x68/0x1c0 net/ipv4/tcp_output.c:2864 tcp_push+0x2d9/0x2f0 net/ipv4/tcp.c:725 mptcp_push_release net/mptcp/protocol.c:1491 [inline] __mptcp_push_pending+0x46c/0x490 net/mptcp/protocol.c:1578 mptcp_sendmsg+0x9ec/0xa50 net/mptcp/protocol.c:1764 inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:643 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] kernel_sendmsg+0x97/0xd0 net/socket.c:745 sock_no_sendpage+0x84/0xb0 net/core/sock.c:3086 inet_sendpage+0x9d/0xc0 net/ipv4/af_inet.c:834 kernel_sendpage+0x187/0x200 net/socket.c:3492 sock_sendpage+0x5a/0x70 net/socket.c:1007 pipe_to_sendpage+0x128/0x160 fs/splice.c:364 splice_from_pipe_feed fs/splice.c:418 [inline] __splice_from_pipe+0x207/0x500 fs/splice.c:562 splice_from_pipe fs/splice.c:597 [inline] generic_splice_sendpage+0x94/0xd0 fs/splice.c:746 do_splice_from fs/splice.c:767 [inline] direct_splice_actor+0x80/0xa0 fs/splice.c:936 splice_direct_to_actor+0x345/0x650 fs/splice.c:891 do_splice_direct+0x106/0x190 fs/splice.c:979 do_sendfile+0x675/0xc40 fs/read_write.c:1245 __do_sys_sendfile64 fs/read_write.c:1310 [inline] __se_sys_sendfile64 fs/read_write.c:1296 [inline] __x64_sys_sendfile64+0x102/0x140 fs/read_write.c:1296 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x0000026f -> 0x00000271 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 15701 Comm: syz-executor.2 Not tainted 5.16.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 The Fixes tag I chose is probably arbitrary, I do not think we need to backport this patch to older kernels. Fixes: `c5cff8561d` ("ipv6: add rcu grace period before freeing fib6_node") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20220120174112.1126644-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-20 20:18:37 -08:00
Eric Dumazet	ebdc1a0309	tcp: add a missing sk_defer_free_flush() in tcp_splice_read() Without it, splice users can hit the warning added in commit `79074a72d3` ("net: Flush deferred skb free on socket destroy") Fixes: `f35f821935` ("tcp: defer skb freeing after socket lock is released") Fixes: `79074a72d3` ("net: Flush deferred skb free on socket destroy") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Gal Pressman <gal@nvidia.com> Link: https://lore.kernel.org/r/20220120124530.925607-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-20 20:17:50 -08:00
Gal Pressman	48cec899e3	tcp: Add a stub for sk_defer_free_flush() When compiling the kernel with CONFIG_INET disabled, the sk_defer_free_flush() should be defined as a nop. This resolves the following compilation error: ld: net/core/sock.o: in function `sk_defer_free_flush': ./include/net/tcp.h:1378: undefined reference to `__sk_defer_free_flush' Fixes: `79074a72d3` ("net: Flush deferred skb free on socket destroy") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220120123440.9088-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-20 20:17:32 -08:00
Marek Behún	cbda1b1668	phylib: fix potential use-after-free Commit `bafbdd527d` ("phylib: Add device reset GPIO support") added call to phy_device_reset(phydev) after the put_device() call in phy_detach(). The comment before the put_device() call says that the phydev might go away with put_device(). Fix potential use-after-free by calling phy_device_reset() before put_device(). Fixes: `bafbdd527d` ("phylib: Add device reset GPIO support") Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220119162748.32418-1-kabel@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-01-20 20:17:04 -08:00
Kumar Kartikeya Dwivedi	1058b6a78d	selftests/bpf: Do not fail build if CONFIG_NF_CONNTRACK=m/n Some users have complained that selftests fail to build when CONFIG_NF_CONNTRACK=m. It would be useful to allow building as long as it is set to module or built-in, even though in case of building as module, user would need to load it before running the selftest. Note that this also allows building selftest when CONFIG_NF_CONNTRACK is disabled. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220120164932.2798544-1-memxor@gmail.com	2022-01-20 14:34:50 -08:00
Felix Maurer	8c0be0631d	selftests: bpf: Fix bind on used port The bind_perm BPF selftest failed when port 111/tcp was already in use during the test. To fix this, the test now runs in its own network name space. To use unshare, it is necessary to reorder the includes. The style of the includes is adapted to be consistent with the other prog_tests. v2: Replace deprecated CHECK macro with ASSERT_OK Fixes: `8259fdeb30` ("selftests/bpf: Verify that rebinding to port < 1024 from BPF works") Signed-off-by: Felix Maurer <fmaurer@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/551ee65533bb987a43f93d88eaf2368b416ccd32.1642518457.git.fmaurer@redhat.com	2022-01-20 14:32:00 -08:00
Andrii Nakryiko	38f033a16a	Merge branch 'rely on ASSERT marcos in xdp_bpf2bpf.c/xdp_adjust_tail.c' Lorenzo Bianconi says: ==================== Rely on ASSERT* macros and get rid of deprecated CHECK ones in xdp_bpf2bpf and xdp_adjust_tail bpf selftests. This is a preliminary series for XDP multi-frags support. Changes since v1: - run each ASSERT test separately - drop unnecessary return statements - drop unnecessary if condition in test_xdp_bpf2bpf() ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-01-20 13:54:58 -08:00
Lorenzo Bianconi	fa6fde350b	bpf: selftests: Get rid of CHECK macro in xdp_bpf2bpf.c Rely on ASSERT* macros and get rid of deprecated CHECK ones in xdp_bpf2bpf bpf selftest. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/df7e5098465016e27d91f2c69a376a35d63a7621.1642679130.git.lorenzo@kernel.org	2022-01-20 13:54:57 -08:00
Lorenzo Bianconi	791cad0250	bpf: selftests: Get rid of CHECK macro in xdp_adjust_tail.c Rely on ASSERT* macros and get rid of deprecated CHECK ones in xdp_adjust_tail bpf selftest. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/c0ab002ffa647a20ec9e584214bf0d4373142b54.1642679130.git.lorenzo@kernel.org	2022-01-20 13:54:57 -08:00
Joe Damato	3b8428b845	i40e: fix unsigned stat widths Change i40e_update_vsi_stats and struct i40e_vsi to use u64 fields to match the width of the stats counters in struct i40e_rx_queue_stats. Update debugfs code to use the correct format specifier for u64. Fixes: `41c445ff0f` ("i40e: main driver core") Signed-off-by: Joe Damato <jdamato@fastly.com> Reported-by: kernel test robot <lkp@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-20 10:25:41 -08:00
Karen Sornek	0f344c8129	i40e: Fix for failed to init adminq while VF reset Fix for failed to init adminq: -53 while VF is resetting via MAC address changing procedure. Added sync module to avoid reading deadbeef value in reinit adminq during software reset. Without this patch it is possible to trigger VF reset procedure during reinit adminq. This resulted in an incorrect reading of value from the AQP registers and generated the -53 error. Fixes: `5c3c48ac6b` ("i40e: implement virtual device interface") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Karen Sornek <karen.sornek@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-20 10:25:40 -08:00
Sylwester Dziedziuch	92947844b8	i40e: Fix queues reservation for XDP When XDP was configured on a system with large number of CPUs and X722 NIC there was a call trace with NULL pointer dereference. i40e 0000:87:00.0: failed to get tracking for 256 queues for VSI 0 err -12 i40e 0000:87:00.0: setup of MAIN VSI failed BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:i40e_xdp+0xea/0x1b0 [i40e] Call Trace: ? i40e_reconfig_rss_queues+0x130/0x130 [i40e] dev_xdp_install+0x61/0xe0 dev_xdp_attach+0x18a/0x4c0 dev_change_xdp_fd+0x1e6/0x220 do_setlink+0x616/0x1030 ? ahci_port_stop+0x80/0x80 ? ata_qc_issue+0x107/0x1e0 ? lock_timer_base+0x61/0x80 ? __mod_timer+0x202/0x380 rtnl_setlink+0xe5/0x170 ? bpf_lsm_binder_transaction+0x10/0x10 ? security_capable+0x36/0x50 rtnetlink_rcv_msg+0x121/0x350 ? rtnl_calcit.isra.0+0x100/0x100 netlink_rcv_skb+0x50/0xf0 netlink_unicast+0x1d3/0x2a0 netlink_sendmsg+0x22a/0x440 sock_sendmsg+0x5e/0x60 __sys_sendto+0xf0/0x160 ? __sys_getsockname+0x7e/0xc0 ? _copy_from_user+0x3c/0x80 ? __sys_setsockopt+0xc8/0x1a0 __x64_sys_sendto+0x20/0x30 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f83fa7a39e0 This was caused by PF queue pile fragmentation due to flow director VSI queue being placed right after main VSI. Because of this main VSI was not able to resize its queue allocation for XDP resulting in no queues allocated for main VSI when XDP was turned on. Fix this by always allocating last queue in PF queue pile for a flow director VSI. Fixes: `41c445ff0f` ("i40e: main driver core") Fixes: `74608d17fe` ("i40e: add support for XDP_TX action") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-20 10:25:40 -08:00
Jedrzej Jagielski	d701658a50	i40e: Fix issue when maximum queues is exceeded Before this patch VF interface vanished when maximum queue number was exceeded. Driver tried to add next queues even if there was not enough space. PF sent incorrect number of queues to the VF when there were not enough of them. Add an additional condition introduced to check available space in 'qp_pile' before proceeding. This condition makes it impossible to add queues if they number is greater than the number resulting from available space. Also add the search for free space in PF queue pair piles. Without this patch VF interfaces are not seen when available space for queues has been exceeded and following logs appears permanently in dmesg: "Unable to get VF config (-32)". "VF 62 failed opcode 3, retval: -5" "Unable to get VF config due to PF error condition, not retrying" Fixes: `7daa6bf329` ("i40e: driver core headers") Fixes: `41c445ff0f` ("i40e: main driver core") Signed-off-by: Jaroslaw Gawin <jaroslawx.gawin@intel.com> Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-20 10:25:40 -08:00
Jedrzej Jagielski	9b13bd5313	i40e: Increase delay to 1 s after global EMP reset Recently simplified i40e_rebuild causes that FW sometimes is not ready after NVM update, the ping does not return. Increase the delay in case of EMP reset. Old delay of 300 ms was introduced for specific cards for 710 series. Now it works for all the cards and delay was increased. Fixes: `1fa51a650e` ("i40e: Add delay after EMP reset for firmware to recover") Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-20 10:25:39 -08:00

1071816 Commits