Daniel Borkmann says:
====================
pull-request: bpf-next 2023-11-21
We've added 85 non-merge commits during the last 12 day(s) which contain
a total of 63 files changed, 4464 insertions(+), 1484 deletions(-).
The main changes are:
1) Huge batch of verifier changes to improve BPF register bounds logic
and range support along with a large test suite, and verifier log
improvements, all from Andrii Nakryiko.
2) Add a new kfunc which acquires the associated cgroup of a task within
a specific cgroup v1 hierarchy where the latter is identified by its id,
from Yafang Shao.
3) Extend verifier to allow bpf_refcount_acquire() of a map value field
obtained via direct load which is a use-case needed in sched_ext,
from Dave Marchevsky.
4) Fix bpf_get_task_stack() helper to add the correct crosstask check
for the get_perf_callchain(), from Jordan Rome.
5) Fix BPF task_iter internals where lockless usage of next_thread()
was wrong. The rework also simplifies the code, from Oleg Nesterov.
6) Fix uninitialized tail padding via LIBBPF_OPTS_RESET, and another
   fix for certain BPF UAPI structs to address verifier failures seen
   in bpf_dynptr usage, from Yonghong Song.
7) Add BPF selftest fixes for map_percpu_stats flakes due to the per-CPU BPF
   memory allocator not being able to allocate a per-CPU pointer successfully,
   from Hou Tao.
8) Add prep work around dynptr and string handling for kfuncs which
is later going to be used by file verification via BPF LSM and fsverity,
from Song Liu.
9) Improve BPF selftests to update multiple prog_tests to use ASSERT_*
macros, from Yuran Pereira.
10) Optimize LPM trie lookup to check prefixlen before walking the trie,
from Florian Lehner.
11) Consolidate virtio/9p configs from BPF selftests in config.vm file
given they are needed consistently across archs, from Manu Bretelle.
12) Small BPF verifier refactor to remove register_is_const(),
from Shung-Hsi Yu.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in vmlinux
selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_obj_id
selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bind_perm
selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_tcp_ca
selftests/bpf: reduce verboseness of reg_bounds selftest logs
bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos)
bpf: bpf_iter_task_next: use __next_thread() rather than next_thread()
bpf: task_group_seq_get_next: use __next_thread() rather than next_thread()
bpf: emit frameno for PTR_TO_STACK regs if it differs from current one
bpf: smarter verifier log number printing logic
bpf: omit default off=0 and imm=0 in register state log
bpf: emit map name in register state if applicable and available
bpf: print spilled register state in stack slot
bpf: extract register state printing
bpf: move verifier state printing code to kernel/bpf/log.c
bpf: move verbose_linfo() into kernel/bpf/log.c
bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS
bpf: Remove test for MOVSX32 with offset=32
selftests/bpf: add iter test requiring range x range logic
veristat: add ability to set BPF_F_TEST_SANITY_STRICT flag with -r flag
...
====================
Link: https://lore.kernel.org/r/20231122000500.28126-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the spirit of failing early, time out on unbounded loops that take
longer than 20 ticks to complete. Such loops ensure that created
objects are already visible so tests can proceed without any issues.
If a test setup takes more than 20 ticks to see an object, there's
definitely something wrong.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231117171208.2066136-6-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We have observed a lot of lock contention and test instability when
running with more than 8 cores, enough to actually make the tests run
slower than with fewer cores.
Cap the number of cores used by parallel tdc to 4, which testing showed
to be a reasonable number for efficiency and stability in different
kernel config scenarios.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231117171208.2066136-2-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit 29f834aa32 ("net_sched: sch_fq: add 3 bands and WRR
scheduling") introduces multiple traffic bands, and per-band maximum
packet count.
Per-band limits ensure that packets in one class cannot fill the
entire qdisc and so cause DoS to the traffic in the other classes.
Verify this behavior:
1. set the limit to 10 per band
2. send 20 pkts on band A: verify that 10 are queued, 10 dropped
3. send 20 pkts on band A: verify that 0 are queued, 20 dropped
4. send 20 pkts on band B: verify that 10 are queued, 10 dropped
Packets must remain queued for a period to trigger this behavior.
Use SO_TXTIME to store packets for 100 msec.
The test reuses existing upstream test infra. The script is a fork of
cmsg_time.sh. The scripts call cmsg_sender.
The test extends cmsg_sender with two arguments (a socket-level sketch
of both follows below):
* '-P' SO_PRIORITY
There is a subtle difference between IPv4 and IPv6 stack behavior:
PF_INET/IP_TOS sets IP header bits and sk_priority
PF_INET6/IPV6_TCLASS sets IP header bits BUT NOT sk_priority
* '-n' num pkts
Send multiple packets in quick succession.
I first attempted a for loop in the script, but this is too slow in
virtualized environments, causing flakiness as the 100ms timeout is
reached and packets are dequeued.
Also do not wait for timestamps to be queued unless timestamps are
requested.
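For reference, roughly what the two options translate to at the socket
API level; a minimal sketch only, where the function is hypothetical and
the use of CLOCK_MONOTONIC as the fq delivery-time clock is an
assumption:

#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <linux/net_tstamp.h>
#include <linux/types.h>

/* sketch: SO_PRIORITY selects the band, SCM_TXTIME delays dequeue */
static int set_prio_and_txtime(int fd, int prio, struct msghdr *msg,
			       char *cbuf, size_t cbuf_sz, __u64 txtime_ns)
{
	struct sock_txtime cfg = { .clockid = CLOCK_MONOTONIC };
	struct cmsghdr *cm;

	if (setsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio)) ||
	    setsockopt(fd, SOL_SOCKET, SO_TXTIME, &cfg, sizeof(cfg)))
		return -1;

	if (cbuf_sz < CMSG_SPACE(sizeof(txtime_ns)))
		return -1;

	/* per-packet delivery time, e.g. now + 100 msec */
	msg->msg_control = cbuf;
	msg->msg_controllen = CMSG_SPACE(sizeof(txtime_ns));
	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_TXTIME;
	cm->cmsg_len = CMSG_LEN(sizeof(txtime_ns));
	memcpy(CMSG_DATA(cm), &txtime_ns, sizeof(txtime_ns));
	return 0;
}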
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231116203449.2627525-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reduce the verbosity of test_progs' output in the reg_bounds set of
tests with two changes.
First, instead of each different operator (<, <=, >, ...) being its own
subtest, combine all different ops for the same (x, y, init_t, cond_t)
values into a single subtest. Instead of getting 6 subtests, we get one
generic one, e.g.:
#192/53 reg_bounds_crafted/(s64)[0xffffffffffffffff; 0] (s64)<op> 0xffffffff00000000:OK
Second, for random generated test cases, treat all of them as a single
test to eliminate very verbose output with random values in them. So now
we'll just get one line per each combination of (init_t, cond_t),
instead of 6 x 25 = 150 subtests before this change:
#225 reg_bounds_rand_consts_s32_s32:OK
Given we reduce verbosity so much, it makes sense to do a bit more
random testing, so we also bump the default number of random tests to
100, up from 25. This doesn't increase runtime significantly, especially
in parallelized mode.
With all the above changes, we still make sure that we have all the
information necessary to reproduce a test case if it happens to fail.
That includes reporting the random seed and the specific operator that
is failing. Those will only be printed to the console if the related
test/subtest fails, so this has no added verbosity implications.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231120180452.145849-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Instead of always printing numbers as decimals (or, in some cases,
like for "imm=%llx", always as hexadecimals), decide the form based on
the actual values. For numbers in a reasonably small range (currently,
[0, U16_MAX] for unsigned values, and [S16_MIN, S16_MAX] for signed ones),
emit them as decimals. In all other cases, even for signed values,
emit them in hexadecimals.
For large values, the hex form is often much more useful: it's easier to
see the exact difference between 0xffffffff80000000 and 0xffffffff7fffffff
than between 18446744071562067966 and 18446744071562067967, as one
particular example.
Small values representing small pointer offsets or application
constants, on the other hand, are much more useful when represented in
decimal notation.
Adjust the reg_bounds register state parsing logic to take this change
into account.
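As a sketch, the decision rule amounts to something like the helper
below (hypothetical name; the actual implementation lives in the
verifier log code in kernel/bpf/log.c and differs in detail):

#include <stdint.h>
#include <stdio.h>

/* emit small values as decimal, everything else as hex */
static void snprint_num(char *buf, size_t sz, uint64_t v, int is_signed)
{
	if (!is_signed && v <= UINT16_MAX)
		snprintf(buf, sz, "%llu", (unsigned long long)v);
	else if (is_signed && (int64_t)v >= INT16_MIN && (int64_t)v <= INT16_MAX)
		snprintf(buf, sz, "%lld", (long long)(int64_t)v);
	else
		snprintf(buf, sz, "%#llx", (unsigned long long)v);
}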
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231118034623.3320920-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Simplify the BPF verifier log further by omitting the default (and
frequently irrelevant) off=0 and imm=0 parts for non-SCALAR_VALUE
registers. As can be seen from the fixed tests, this is often visual
noise for the PTR_TO_CTX register and even for PTR_TO_PACKET registers.
Omitting default values follows the rest of the register state logic: we
omit default values to keep the verifier log succinct and to highlight
interesting state that deviates from the default. E.g., we do the same
for var_off when it's unknown, as it gives no additional information.
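For illustration, given the register-state format seen in these logs, a
line that used to read

  R1_w=ctx(off=0,imm=0)

should now come out as just

  R1_w=ctx()

(the "before" form is the one visible in older logs; the "after"
rendering is inferred from the omission rule above and may differ
cosmetically).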
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231118034623.3320920-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
In complicated real-world applications, whenever debugging some
verification error through the verifier log, it is often very useful
to see the map name for a PTR_TO_MAP_VALUE register. Usually this has to
be inferred from key/value sizes and maybe by trying to guess the C code
location, but it's not always clear.
Given the verifier has the name, and it's never too long, let's just
emit it for ptr_to_map_key, ptr_to_map_value, and const_ptr_to_map
registers. We reshuffle the order a bit, so that map name, key size, and
value size appear before offset and immediate values, which seems like
a more logical order.
Current output:
R1_w=map_ptr(map=array_map,ks=4,vs=8,off=0,imm=0)
But we'll get rid of useless off=0 and imm=0 parts in the next patch.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231118034623.3320920-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Test that PCI reset works correctly by verifying that only the expected
reset methods are supported and that after issuing the reset the ifindex
of the port changes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sockets used by udpgso_bench_tx aren't always ready when
udpgso_bench_tx transmits packets. This issue is more prevalent in -rt
kernels, but can occur in non-rt kernels as well. Replace the hacky
sleep calls with a function that checks whether the ports in the
namespace are ready for use.
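One generic way such a readiness check can be implemented is sketched
below; this is an illustration only, and the helper actually added to
the selftests may probe readiness differently (e.g. by inspecting socket
state in the namespace):

#include <arpa/inet.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* poll until a receiver has bound the UDP port, or ~1 second elapsed */
static int wait_for_udp_port(int port)
{
	struct sockaddr_in addr = {
		.sin_family = AF_INET,
		.sin_port = htons(port),
		.sin_addr.s_addr = htonl(INADDR_ANY),
	};
	int i, fd;

	for (i = 0; i < 100; i++) {
		fd = socket(AF_INET, SOCK_DGRAM, 0);
		if (fd < 0)
			return -1;
		/* once the receiver is up, binding the same port fails */
		if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) &&
		    errno == EADDRINUSE) {
			close(fd);
			return 0;
		}
		close(fd);
		usleep(10000);
	}
	return -1;
}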
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Lucas Karpinski <lkarpins@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Leverage parallel tests in kselftests using all the available CPUs.
We tested this extensively in tuxsuite and locally, and it seems ready
for prime time.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tdc tests run in parallel, they can race over the module loading
done by tc and fail the run with random errors.
Avoid this by preloading all required modules before running tdc in
kselftests.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As mentioned at the TC Workshop 0x17, our recent changes to tdc broke
downstream CI systems like tuxsuite. The issue is the classic problem
with rcu/workqueue objects, where you can miss them if not enough wall
time has passed. The required wait is specific to the system and kernel
config: on my machine it could be nanoseconds, while on another it could
be microseconds or more.
In order to make the suite deterministic, poll for the existence
of the objects in a reasonable manner. Talking netlink directly is the
best solution for avoiding the cost of multiple 'fork()' calls, so
introduce a netlink-based setup routine using pyroute2. We leave the
iproute2 one as a fallback for when pyroute2 is not available.
Also rework the iproute2 side to mimic the netlink routine: it creates
DEV0 as the peer of DEV1 and moves DEV1 into the net namespace. This
way, when the namespace is deleted, DEV0 is also deleted automatically,
leaving no margin for resource leaks.
Another bonus of this change is that our setup time speeds up by a
factor of 2 when using netlink.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This argument would bypass the net namespace creation and run the test
in the root namespace, even if nsPlugin was specified.
Drop it, as it is equivalent to commenting out the nsPlugin in a test
and only adds complexity to the plugin code.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from Paolo Abeni:
"Including fixes from BPF and netfilter.
Current release - regressions:
- core: fix undefined behavior in netdev name allocation
- bpf: do not allocate percpu memory at init stage
- netfilter: nf_tables: split async and sync catchall in two
functions
- mptcp: fix possible NULL pointer dereference on close
Current release - new code bugs:
- eth: ice: dpll: fix initial lock status of dpll
Previous releases - regressions:
- bpf: fix precision backtracking instruction iteration
- af_unix: fix use-after-free in unix_stream_read_actor()
- tipc: fix kernel-infoleak due to uninitialized TLV value
- eth: bonding: stop the device in bond_setup_by_slave()
- eth: mlx5:
- fix double free of encap_header
- avoid referencing skb after free-ing in drop path
- eth: hns3: fix VF reset
- eth: mvneta: fix calls to page_pool_get_stats
Previous releases - always broken:
- core: set SOCK_RCU_FREE before inserting socket into hashtable
- bpf: fix control-flow graph checking in privileged mode
- eth: ppp: limit MRU to 64K
- eth: stmmac: avoid rx queue overrun
- eth: icssg-prueth: fix error cleanup on failing initialization
- eth: hns3: fix out-of-bounds access may occur when coalesce info is
read via debugfs
- eth: cortina: handle large frames
Misc:
- selftests: gso: support CONFIG_MAX_SKB_FRAGS up to 45"
* tag 'net-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (78 commits)
macvlan: Don't propagate promisc change to lower dev in passthru
net: sched: do not offload flows with a helper in act_ct
net/mlx5e: Check return value of snprintf writing to fw_version buffer for representors
net/mlx5e: Check return value of snprintf writing to fw_version buffer
net/mlx5e: Reduce the size of icosq_str
net/mlx5: Increase size of irq name buffer
net/mlx5e: Update doorbell for port timestamping CQ before the software counter
net/mlx5e: Track xmit submission to PTP WQ after populating metadata map
net/mlx5e: Avoid referencing skb after free-ing in drop path of mlx5e_sq_xmit_wqe
net/mlx5e: Don't modify the peer sent-to-vport rules for IPSec offload
net/mlx5e: Fix pedit endianness
net/mlx5e: fix double free of encap_header in update funcs
net/mlx5e: fix double free of encap_header
net/mlx5: Decouple PHC .adjtime and .adjphase implementations
net/mlx5: DR, Allow old devices to use multi destination FTE
net/mlx5: Free used cpus mask when an IRQ is released
Revert "net/mlx5: DR, Supporting inline WQE when possible"
bpf: Do not allocate percpu memory at init stage
net: Fix undefined behavior in netdev name allocation
dt-bindings: net: ethernet-controller: Fix formatting error
...
Alexei Starovoitov says:
====================
pull-request: bpf 2023-11-15
We've added 7 non-merge commits during the last 6 day(s) which contain
a total of 9 files changed, 200 insertions(+), 49 deletions(-).
The main changes are:
1) Do not allocate bpf specific percpu memory unconditionally, from Yonghong.
2) Fix precision backtracking instruction iteration, from Andrii.
3) Fix control flow graph checking, from Andrii.
4) Fix xskxceiver selftest build, from Anders.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Do not allocate percpu memory at init stage
selftests/bpf: add more test cases for check_cfg()
bpf: fix control-flow graph checking in privileged mode
selftests/bpf: add edge case backtracking logic test
bpf: fix precision backtracking instruction iteration
bpf: handle ldimm64 properly in check_cfg()
selftests: bpf: xskxceiver: ksft_print_msg: fix format type error
====================
Link: https://lore.kernel.org/r/20231115214949.48854-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a simple verifier test that requires deriving reg bounds for one
register from another register that's not a constant. This is
a realistic example of iterating over the elements of an array with a
fixed maximum number of elements, but a smaller actual number of
elements.
This small example was the original motivation for doing this whole
patch set in the first place.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231112010609.848406-14-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a new flag -r (--test-sanity), similar to -t (--test-states), to add
extra BPF program flags when loading BPF programs.
This allows using veristat to easily catch sanity violations in
production BPF programs.
reg_bounds tests now also enforce the BPF_F_TEST_SANITY_STRICT flag.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231112010609.848406-13-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Make sure to set the BPF_F_TEST_SANITY_STRICT program flag by default
across most verifier tests (and a bunch of others that set custom prog
flags).
There are currently two tests that do fail validation, if enforced
strictly: verifier_bounds/crossing_64_bit_signed_boundary_2 and
verifier_bounds/crossing_32_bit_signed_boundary_2. To accommodate them,
we teach test_loader flag negation:
__flag(!<flagname>) will *clear* the specified flag, allowing easy opt-out.
We apply __flag(!BPF_F_TEST_SANITY_STRICT) to these two tests.
Also sprinkle BPF_F_TEST_SANITY_STRICT everywhere we already set the
test-only BPF_F_TEST_RND_HI32 flag, for completeness.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231112010609.848406-12-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now that the verifier supports range vs range bounds adjustments,
validate that by checking each generated range against every other
generated range, across all supported operators (everything but JSET).
We also add a few cases that were problematic during development, either
for the verifier or for the selftest's range tracking implementation.
Note that we utilize the same trick of splitting everything into
multiple independent parallelizable tests, by init_t and cond_t. This
brings down verification time in parallel mode from more than 8 hours
down to less than 1.5 hours. 106 million cases were successfully
validated for range vs range logic, in addition to about 7 million range
vs const cases added in an earlier patch.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231112010609.848406-10-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a test to validate the BPF verifier's register range bounds
tracking logic. The main bulk is a lot of auto-generated tests based on
a small set of seed values for the lower and upper 32 bits of full
64-bit values.
Currently we validate only range vs const comparisons, but the idea is
to start validating range over range comparisons in a subsequent patch
set.
When setting up the initial register ranges, we treat registers as one
of the u64/s64/u32/s32 numeric types, and then independently perform
conditional comparisons based on a potentially different u64/s64/u32/s32
type. This tests lots of tricky cases of deriving bounds information
across different numeric domains.
Given there are lots of auto-generated cases, we guard them behind a
SLOW_TESTS=1 envvar requirement, and skip them altogether otherwise.
With the current full set of upper/lower seed values, all supported
comparison operators, and all the combinations of u64/s64/u32/s32 number
domains, we get about 7.7 million tests, which run in about 35 minutes
on my local qemu instance without parallelization. But we also split
those tests by init/cond numeric types, which allows us to rely on
test_progs's parallelization of tests with the `-j` option, getting the
run time down to about 5 minutes on 8 cores. It's still something that
shouldn't be run during a normal test_progs run, but it completes in
a reasonable time, so perhaps a nightly CI test run (once we have it)
would be a good option for this.
We also add a small set of tricky conditions that came up during
development and triggered various bugs or corner cases, either in the
selftest's reimplementation of range bounds logic or in the verifier's
logic itself. These are fast enough to be run as part of a normal
test_progs run and are great for quick sanity checking.
Let's take a look at test output to understand what's going on:
$ sudo ./test_progs -t reg_bounds_crafted
#191/1 reg_bounds_crafted/(u64)[0; 0xffffffff] (u64)< 0:OK
...
#191/115 reg_bounds_crafted/(u64)[0; 0x17fffffff] (s32)< 0:OK
...
#191/137 reg_bounds_crafted/(u64)[0xffffffff; 0x100000000] (u64)== 0:OK
Each test case is uniquely and fully described by this generated string.
E.g.: "(u64)[0; 0x17fffffff] (s32)< 0". This means that we
initialize a register (R6) in such a way that the verifier knows it can
have a value in the [(u64)0; (u64)0x17fffffff] range. Another
register (R7) is also set up as u64, but this time as a constant (zero
in this case). They are then compared using the 32-bit signed < operation.
The resulting TRUE/FALSE branches are evaluated (including cases where
it's known that one of the branches will never be taken, in which case
we validate that the verifier also determines this to be dead code). The
test validates that the verifier's final register state matches the
expected state based on the selftest's own reg_state logic, implemented
from scratch for cross-checking purposes.
These test names can be conveniently used for further debugging, and if
-vv verbosity is requested, we can get the corresponding verifier log
(with mark_precise logs filtered out as irrelevant and distracting). The
example below is slightly redacted for brevity, omitting irrelevant
register output in some places, marked with [...].
$ sudo ./test_progs -a 'reg_bounds_crafted/(u32)[0; U32_MAX] (s32)< -1' -vv
...
VERIFIER LOG:
========================
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (05) goto pc+2
3: (85) call bpf_get_current_pid_tgid#14 ; R0_w=scalar()
4: (bc) w6 = w0 ; R0_w=scalar() R6_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
5: (85) call bpf_get_current_pid_tgid#14 ; R0_w=scalar()
6: (bc) w7 = w0 ; R0_w=scalar() R7_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
7: (b4) w1 = 0 ; R1_w=0
8: (b4) w2 = -1 ; R2=4294967295
9: (ae) if w6 < w1 goto pc-9
9: R1=0 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
10: (2e) if w6 > w2 goto pc-10
10: R2=4294967295 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
11: (b4) w1 = -1 ; R1_w=4294967295
12: (b4) w2 = -1 ; R2_w=4294967295
13: (ae) if w7 < w1 goto pc-13 ; R1_w=4294967295 R7=4294967295
14: (2e) if w7 > w2 goto pc-14
14: R2_w=4294967295 R7=4294967295
15: (bc) w0 = w6 ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff))
16: (bc) w0 = w7 ; [...] R7=4294967295
17: (ce) if w6 s< w7 goto pc+3 ; R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff)) R7=4294967295
18: (bc) w0 = w6 ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff))
19: (bc) w0 = w7 ; [...] R7=4294967295
20: (95) exit
from 17 to 21: [...]
21: (bc) w0 = w6 ; [...] R6=scalar(id=1,smin=umin=umin32=2147483648,smax=umax=umax32=4294967294,smax32=-2,var_off=(0x80000000; 0x7fffffff))
22: (bc) w0 = w7 ; [...] R7=4294967295
23: (95) exit
from 13 to 1: [...]
1: [...]
1: (b7) r0 = 0 ; R0_w=0
2: (95) exit
processed 24 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
=====================
The verifier log above is for the `(u32)[0; U32_MAX] (s32)< -1` use
case, where a u32 range is used for initialization, followed by the
signed < operator. Note how we use w6/w7 in this case for register
initialization (it would be R6/R7 for 64-bit types) and then
`if w6 s< w7` for the comparison at instruction #17. It would be
`if R6 < R7` for a 64-bit unsigned comparison.
The above example gives a good impression of the overall structure of
the BPF programs generated for reg_bounds tests.
In the future, this "framework" can be extended to test not just
conditional jumps, but also arithmetic operations. Adding randomized
testing is another possibility.
Some implementation notes. We basically have our own generics-like
operations on numbers, where all the numbers are stored in u64, but how
they are interpreted is passed as a runtime argument of type enum num_t.
Further, `struct range` represents a bounds range, and those are
collected together into a minimal `struct reg_state`, which collects
range bounds across all four numerical domains: u64, s64, u32, s32.
Based on these primitives and `enum op`, representing the possible
conditional operations (<, <=, >, >=, ==, !=), there is a set of generic
helpers to perform "range arithmetics", which is used to maintain struct
reg_state. We simulate what the verifier will do for the reg bounds of
the R6 and R7 registers using these range and reg_state primitives. The
simulated information is used to determine the branch-taken conclusion
and the expected exact register state across all four number domains.
The implementation of "range arithmetics" is more generic than what the
verifier currently performs: it allows range over range comparisons and
adjustments. This is the intended end goal of this patch set overall,
and the verifier logic is enhanced in subsequent patches in this series
to handle range vs range operations, at which point the selftests are
extended to validate these conditions as well. For now it's range vs
const cases only.
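A minimal sketch of what these primitives could look like, following the
description above (illustrative declarations, not the selftest's exact
code):

#include <stdint.h>

/* how a raw u64 payload should be interpreted */
enum num_t { U64, S64, U32, S32 };

/* inclusive [a, b] bounds in one numeric domain, stored as raw u64 */
struct range {
	uint64_t a, b;
};

/* simulated register state: bounds across all four numeric domains */
struct reg_state {
	struct range r[4];	/* indexed by enum num_t */
};

/* supported conditional operations */
enum op { OP_LT, OP_LE, OP_GT, OP_GE, OP_EQ, OP_NE };

/* compare two raw values interpreted according to @t */
static int num_lt(enum num_t t, uint64_t x, uint64_t y)
{
	switch (t) {
	case U64: return x < y;
	case S64: return (int64_t)x < (int64_t)y;
	case U32: return (uint32_t)x < (uint32_t)y;
	case S32: return (int32_t)x < (int32_t)y;
	}
	return 0;
}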
Note that the tests are split into multiple groups by their numeric
types for the initialization of ranges and for the comparison operation.
This allows using test_progs's -j parallelization to speed up the tests,
as we now have 16 groups of tests running in parallel. The overall
reduction in running time is substantial: we go down from more than 30
minutes to slightly less than 5 minutes of running time.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Link: https://lore.kernel.org/r/20231112010609.848406-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Running the mp_join selftest manually with the following command line:
./mptcp_join.sh -z -C
leads to some failures:
002 fastclose server test
# ...
rtx [fail] got 1 MP_RST[s] TX expected 0
# ...
rstrx [fail] got 1 MP_RST[s] RX expected 0
The problem really lies in wrong expectations for the RST checks
implied by the csum validation. Note that the same check is repeated
explicitly in the same test-case, with the correct expectation, and
passes successfully.
Address the issue by explicitly setting the correct expectation for
the failing checks.
Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: 6bf41020b7 ("selftests: mptcp: update and extend fastclose test-cases")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-5-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add selftests for cgroup1 hierarchy.
The results are as follows:
$ tools/testing/selftests/bpf/test_progs --name=cgroup1_hierarchy
#36/1 cgroup1_hierarchy/test_cgroup1_hierarchy:OK
#36/2 cgroup1_hierarchy/test_root_cgid:OK
#36/3 cgroup1_hierarchy/test_invalid_level:OK
#36/4 cgroup1_hierarchy/test_invalid_cgid:OK
#36/5 cgroup1_hierarchy/test_invalid_hid:OK
#36/6 cgroup1_hierarchy/test_invalid_cgrp_name:OK
#36/7 cgroup1_hierarchy/test_invalid_cgrp_name2:OK
#36/8 cgroup1_hierarchy/test_sleepable_prog:OK
#36 cgroup1_hierarchy:OK
Summary: 1/8 PASSED, 0 SKIPPED, 0 FAILED
Besides, I also did some stress testing similar to the one in patch #2
of this series, as follows (with CONFIG_PROVE_RCU_LIST enabled):
- Continuously mounting and unmounting named cgroups in some tasks,
for example:
cgrp_name=$1
while true
do
mount -t cgroup -o none,name=$cgrp_name none /$cgrp_name
umount /$cgrp_name
done
- Continuously running this selftest concurrently:
while true; do ./test_progs --name=cgroup1_hierarchy; done
They all ran successfully without any RCU warnings in dmesg.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20231111090034.4248-7-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Include the current pid in the classid cgroup path. This way, different
testers relying on classid-based configurations will have distinct classid
cgroup directories, enabling them to run concurrently. Additionally, we
leverage the current pid as the classid, ensuring unique identification.
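Roughly, this amounts to deriving the path (and classid) from getpid();
an illustrative sketch, where the exact format used by the cgroup
helpers may differ:

#include <stdio.h>
#include <unistd.h>

/* distinct per-tester net_cls path derived from the current pid */
static void format_classid_path(char *buf, size_t sz)
{
	snprintf(buf, sz, "/sys/fs/cgroup/net_cls/cgroup-test-work-dir%d",
		 (int)getpid());
}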
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20231111090034.4248-4-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
If the net_cls subsystem is already mounted, attempting to mount it again
in setup_classid_environment() will result in a failure with the error code
EBUSY. Despite this, tmpfs will have been successfully mounted at
/sys/fs/cgroup/net_cls. Consequently, the /sys/fs/cgroup/net_cls directory
will be empty, causing subsequent setup operations to fail.
Here's an error log excerpt illustrating the issue when net_cls has already
been mounted at /sys/fs/cgroup/net_cls prior to running
setup_classid_environment():
- Before that change
$ tools/testing/selftests/bpf/test_progs --name=cgroup_v1v2
test_cgroup_v1v2:PASS:server_fd 0 nsec
test_cgroup_v1v2:PASS:client_fd 0 nsec
test_cgroup_v1v2:PASS:cgroup_fd 0 nsec
test_cgroup_v1v2:PASS:server_fd 0 nsec
run_test:PASS:skel_open 0 nsec
run_test:PASS:prog_attach 0 nsec
test_cgroup_v1v2:PASS:cgroup-v2-only 0 nsec
(cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup.procs
(cgroup_helpers.c:540: errno: No such file or directory) Opening cgroup classid: /sys/fs/cgroup/net_cls/cgroup-test-work-dir/net_cls.classid
run_test:PASS:skel_open 0 nsec
run_test:PASS:prog_attach 0 nsec
(cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup-test-work-dir/cgroup.procs
run_test:FAIL:join_classid unexpected error: 1 (errno 2)
test_cgroup_v1v2:FAIL:cgroup-v1v2 unexpected error: -1 (errno 2)
(cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup.procs
#44 cgroup_v1v2:FAIL
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
- After that change
$ tools/testing/selftests/bpf/test_progs --name=cgroup_v1v2
#44 cgroup_v1v2:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
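A minimal sketch of the kind of guard this implies (the path is taken
from the log above; the function name is hypothetical and the actual fix
in cgroup_helpers.c may differ):

#include <errno.h>
#include <sys/mount.h>

#define NETCLS_MOUNT_PATH "/sys/fs/cgroup/net_cls"

/* mount net_cls, tolerating the case where it is already mounted */
static int mount_net_cls(void)
{
	if (mount("none", NETCLS_MOUNT_PATH, "cgroup", 0, "net_cls")) {
		if (errno != EBUSY)
			return -1;
		/* already mounted: reuse the existing hierarchy instead
		 * of leaving an empty tmpfs-backed directory behind
		 */
	}
	return 0;
}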
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20231111090034.4248-3-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull LoongArch updates from Huacai Chen:
- support PREEMPT_DYNAMIC with static keys
- relax memory ordering for atomic operations
- support BPF CPU v4 instructions for LoongArch
- some build and runtime warning fixes
* tag 'loongarch-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
selftests/bpf: Enable cpu v4 tests for LoongArch
LoongArch: BPF: Support signed mod instructions
LoongArch: BPF: Support signed div instructions
LoongArch: BPF: Support 32-bit offset jmp instructions
LoongArch: BPF: Support unconditional bswap instructions
LoongArch: BPF: Support sign-extension mov instructions
LoongArch: BPF: Support sign-extension load instructions
LoongArch: Add more instruction opcodes and emit_* helpers
LoongArch/smp: Call rcutree_report_cpu_starting() earlier
LoongArch: Relax memory ordering for atomic operations
LoongArch: Mark __percpu functions as always inline
LoongArch: Disable module from accessing external data directly
LoongArch: Support PREEMPT_DYNAMIC with static keys
With latest clang18 (main branch of llvm-project repo), when building bpf selftests,
[~/work/bpf-next (master)]$ make -C tools/testing/selftests/bpf LLVM=1 -j
The following compilation error happens:
fatal error: error in backend: Branch target out of insn range
...
Stack dump:
0. Program arguments: clang -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian
-I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include
-I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -I/home/yhs/work/bpf-next/tools/include/uapi
-I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -idirafter
/home/yhs/work/llvm-project/llvm/build.18/install/lib/clang/18/include -idirafter /usr/local/include
-idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 --target=bpf
-c progs/pyperf180.c -mcpu=v3 -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/pyperf180.bpf.o
1. <eof> parser at end of file
2. Code generation
...
The compilation failure only happens for cpu=v2 and cpu=v3. cpu=v4 is
okay since cpu=v4 supports 32-bit branch target offsets.
The above failure is due to upstream llvm patch [1], where some inlining
behavior changed in clang18.
To work around the issue: previously, all 180 loop iterations were fully
unrolled. Now, the bpf macro __BPF_CPU_VERSION__ (implemented in clang18
recently) is used to keep the unrolling unchanged if cpu=v4. If
__BPF_CPU_VERSION__ is not available and the compiler is clang18, the
unrolling amount is unconditionally reduced.
[1] 1a2e77cf9e
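The guard boils down to a preprocessor check along these lines (a
sketch; the LOOPS macro and the reduced iteration count are
illustrative, only __BPF_CPU_VERSION__ is per the description above):

/* keep full unrolling only where 32-bit branch offsets are available */
#if defined(__BPF_CPU_VERSION__) && __BPF_CPU_VERSION__ >= 4
#define LOOPS 180
#elif defined(__clang__) && __clang_major__ >= 18
#define LOOPS 90	/* reduced unrolling for clang18 with cpu=v2/v3 */
#else
#define LOOPS 180
#endif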
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20231110193644.3130906-1-yonghong.song@linux.dev
When a BPF program is verified in privileged mode, the BPF verifier
allows bounded loops. This means that from the CFG point of view there
are definitely some back-edges. The original commit adjusted check_cfg()
logic to not detect back-edges in the control flow graph if they result
from conditional jumps, with the idea that the subsequent full BPF
verification process will determine whether such loops are bounded or
not, and either accept or reject the BPF program. At least that's my
reading of the intent.
Unfortunately, the implementation of this idea doesn't work correctly
in all possible situations. A conditional jump might not result in an
immediate back-edge; just a few unconditional instructions later we can
arrive at a back-edge. In such situations check_cfg() would reject the
BPF program even in privileged mode, despite it possibly being a bounded
loop. The next patch adds one simple program demonstrating such a
scenario.
To keep things simple, instead of trying to detect back edges in
privileged mode, just assume every back edge is valid and let subsequent
BPF verification prove or reject bounded loops.
Note a few test changes. For unknown reasons, we have a few tests that
are specified to detect a back-edge in privileged mode, but looking at
their code it seems like the right outcome is passing check_cfg() and
letting subsequent verification make a decision about bounded or not
bounded looping.
The bounded recursion case is also interesting. The example should
pass, as the recursion is limited to just a few levels, so we never
reach the maximum number of nested frames and never exhaust the maximum
stack depth. But the way the max stack depth logic works today, it
falsely detects this as exceeding the max nested frame count. This patch
series doesn't attempt to fix that orthogonal problem, so we just adjust
the expected verifier failure.
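For illustration (this is not the selftest added in the series), even a
trivial bounded loop like the one below compiles to a conditional jump
plus a CFG back-edge, which privileged-mode check_cfg() now simply
admits and leaves for full verification to judge:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

char _license[] SEC("license") = "GPL";

SEC("socket")
int bounded_loop(void *ctx)
{
	int i, sum = 0;

	/* conditional jump with a back-edge; bounded, so the verifier
	 * accepts it in privileged mode
	 */
	for (i = 0; i < 16; i++)
		sum += i;
	return sum;
}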
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Fixes: 2589726d12 ("bpf: introduce bounded loops")
Reported-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231110061412.2995786-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a dedicated selftest that tries to set up conditions to have a
state with the same first and last instruction index, while it actually
is a loop 3->4->1->2->3. This confuses mark_chain_precision() if the
verifier doesn't take jump history into account.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231110002638.4168352-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
ldimm64 instructions are 16 bytes long, and so have to be handled
appropriately in check_cfg(), just like the rest of the BPF verifier
does. This has implications in three places:
- when determining the next instruction for non-jump instructions;
- when determining the next instruction for callback address ldimm64
instructions (in visit_func_call_insn());
- when checking for unreachable instructions, where the second half of
an ldimm64 is expected to be unreachable.
We take this also as an opportunity to report jumps into the middle of
an ldimm64. And adjust a few test_verifier tests accordingly.
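The stepping rule this implies can be sketched like so (a simplified
illustration, not the exact check_cfg() code):

#include <linux/bpf.h>

/* an ldimm64 (BPF_LD | BPF_IMM | BPF_DW) occupies two struct bpf_insn
 * slots, so its fall-through successor is i + 2 instead of i + 1
 */
static int next_insn_idx(const struct bpf_insn *insns, int i)
{
	if (insns[i].code == (BPF_LD | BPF_IMM | BPF_DW))
		return i + 2;
	return i + 1;
}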
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reported-by: Hao Sun <sunhao.th@gmail.com>
Fixes: 475fb78fbf ("bpf: verifier (add branch/goto checks)")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231110002638.4168352-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When cross-building selftests/bpf for the arm64 architecture, format
specifier type errors show up like:
xskxceiver.c:912:34: error: format specifies type 'int' but the argument
has type '__u64' (aka 'unsigned long long') [-Werror,-Wformat]
ksft_print_msg("[%s] expected meta_count [%d], got meta_count [%d]\n",
~~
%llu
__func__, pkt->pkt_nb, meta->count);
^~~~~~~~~~~
xskxceiver.c:929:55: error: format specifies type 'unsigned long long' but
the argument has type 'u64' (aka 'unsigned long') [-Werror,-Wformat]
ksft_print_msg("Frag invalid addr: %llx len: %u\n", addr, len);
~~~~ ^~~~
Fix the issues by casting to (unsigned long long) and changing the
specifiers from %d and %u to %llu, since with u64s the right specifier
might otherwise be %llx or %lx, depending on the architecture.
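The resulting portable pattern, as a stand-alone illustration:

#include <stdint.h>
#include <stdio.h>

/* u64 may be unsigned long or unsigned long long depending on the
 * architecture; the cast makes the ll length modifier correct everywhere
 */
static void print_meta_count(uint64_t count)
{
	printf("got meta_count [%llu]\n", (unsigned long long)count);
}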
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Link: https://lore.kernel.org/r/20231109174328.1774571-1-anders.roxell@linaro.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch demonstrates that verifier changes earlier in this series
result in bpf_refcount_acquire(mapval->stashed_kptr) passing
verification. The added test additionally validates that stashing a kptr
in mapval and - in a separate BPF program - refcount_acquiring the kptr
without unstashing works as expected at runtime.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20231107085639.3016113-7-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The test added in this patch exercises the logic fixed in the previous
patch in this series. Before the previous patch's changes,
bpf_refcount_acquire accepted MAYBE_NULL local kptrs; after the change,
the verifier correctly rejects such a call.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20231107085639.3016113-3-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>