linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-03-26 21:31:13 -04:00

Author	SHA1	Message	Date
Andrii Nakryiko	002d3afce6	selftests/bpf: add CO-RE relocs struct flavors tests Add tests verifying that BPF program can use various struct/union "flavors" to extract data from the same target struct/union. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-07 14:43:49 -07:00
Andrii Nakryiko	df36e62141	selftests/bpf: add CO-RE relocs testing setup Add CO-RE relocation test runner. Add one simple test validating that libbpf's logic for searching for kernel image and loading BTF out of it works. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-07 14:43:49 -07:00
Andrii Nakryiko	2dc26d5a4f	selftests/bpf: add BPF_CORE_READ relocatable read macro Add BPF_CORE_READ macro used in tests to do bpf_core_read(), which automatically captures offset relocation. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-07 14:43:49 -07:00
Andrii Nakryiko	ddc7c30426	libbpf: implement BPF CO-RE offset relocation algorithm This patch implements the core logic for BPF CO-RE offsets relocations. Every instruction that needs to be relocated has corresponding bpf_offset_reloc as part of BTF.ext. Relocations are performed by trying to match recorded "local" relocation spec against potentially many compatible "target" types, creating corresponding spec. Details of the algorithm are noted in corresponding comments in the code. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-07 14:43:49 -07:00
Andrii Nakryiko	4cedc0dad9	libbpf: add .BTF.ext offset relocation section loading Add support for BPF CO-RE offset relocations. Add section/record iteration macros for .BTF.ext. These macro are useful for iterating over each .BTF.ext record, either for dumping out contents or later for BPF CO-RE relocation handling. To enable other parts of libbpf to work with .BTF.ext contents, moved a bunch of type definitions into libbpf_internal.h. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-07 14:43:49 -07:00
Andrii Nakryiko	b03bc6853c	libbpf: convert libbpf code to use new btf helpers Simplify code by relying on newly added BTF helper functions. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-07 14:43:49 -07:00
Andrii Nakryiko	ef20a9b27c	libbpf: add helpers for working with BTF types Add lots of frequently used helpers that simplify working with BTF types. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-07 14:43:49 -07:00
Dan Carpenter	e7e6c6320c	IB/mlx5: Check the correct variable in error handling code The code accidentally checks "event_sub" instead of "event_sub->eventfd". Fixes: `7597385371` ("IB/mlx5: Enable subscription for device events over DEVX") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190807123236.GA11452@mwanda Signed-off-by: Doug Ledford <dledford@redhat.com>	2019-08-07 16:46:41 -04:00
Mark Zhang	d97de8887a	RDMA/counter: Prevent QP counter binding if counters unsupported In case of rdma_counter_init() fails, counter allocation and QP bind should not be allowed. Fixes: `413d334750` ("RDMA/counter: Add set/clear per-port auto mode support") Fixes: `1bd8e0a9d0` ("RDMA/counter: Allow manual mode configuration support") Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190807101819.7581-1-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>	2019-08-07 16:09:23 -04:00
Yishai Hadas	f591822c3c	IB/mlx5: Fix implicit MR release flow Once implicit MR is being called to be released by ib_umem_notifier_release() its leaves were marked as "dying". However, when dereg_mr()->mlx5_ib_free_implicit_mr()->mr_leaf_free() is called, it skips running the mr_leaf_free_action (i.e. umem_odp->work) when those leaves were marked as "dying". As such ib_umem_release() for the leaves won't be called and their MRs will be leaked as well. When an application exits/killed without calling dereg_mr we might hit the above flow. This fatal scenario is reported by WARN_ON() upon mlx5_ib_dealloc_ucontext() as ibcontext->per_mm_list is not empty, the call trace can be seen below. Originally the "dying" mark as part of ib_umem_notifier_release() was introduced to prevent pagefault_mr() from returning a success response once this happened. However, we already have today the completion mechanism so no need for that in those flows any more. Even in case a success response will be returned the firmware will not find the pages and an error will be returned in the following call as a released mm will cause ib_umem_odp_map_dma_pages() to permanently fail mmget_not_zero(). Fix the above issue by dropping the "dying" from the above flows. The other flows that are using "dying" are still needed it for their synchronization purposes. WARNING: CPU: 1 PID: 7218 at drivers/infiniband/hw/mlx5/main.c:2004 mlx5_ib_dealloc_ucontext+0x84/0x90 [mlx5_ib] CPU: 1 PID: 7218 Comm: ibv_rc_pingpong Tainted: G E 5.2.0-rc6+ #13 Call Trace: uverbs_destroy_ufile_hw+0xb5/0x120 [ib_uverbs] ib_uverbs_close+0x1f/0x80 [ib_uverbs] __fput+0xbe/0x250 task_work_run+0x88/0xa0 do_exit+0x2cb/0xc30 ? __fput+0x14b/0x250 do_group_exit+0x39/0xb0 get_signal+0x191/0x920 ? _raw_spin_unlock_bh+0xa/0x20 ? inet_csk_accept+0x229/0x2f0 do_signal+0x36/0x5e0 ? put_unused_fd+0x5b/0x70 ? __sys_accept4+0x1a6/0x1e0 ? inet_hash+0x35/0x40 ? release_sock+0x43/0x90 ? _raw_spin_unlock_bh+0xa/0x20 ? inet_listen+0x9f/0x120 exit_to_usermode_loop+0x5c/0xc6 do_syscall_64+0x182/0x1b0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `81713d3788` ("IB/mlx5: Add implicit MR support") Link: https://lore.kernel.org/r/20190805083010.21777-1-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-08-07 15:36:43 -03:00
Jens Axboe	752ead4449	libata: add SG safety checks in SFF pio transfers Abort processing of a command if we run out of mapped data in the SG list. This should never happen, but a previous bug caused it to be possible. Play it safe and attempt to abort nicely if we don't have more SG segments left. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-08-07 12:23:57 -06:00
Jens Axboe	2d72715017	libata: have ata_scsi_rw_xlat() fail invalid passthrough requests For passthrough requests, libata-scsi takes what the user passes in as gospel. This can be problematic if the user fills in the CDB incorrectly. One example of that is in request sizes. For read/write commands, the CDB contains fields describing the transfer length of the request. These should match with the SG_IO header fields, but libata-scsi currently does no validation of that. Check that the number of blocks in the CDB for passthrough requests matches what was mapped into the request. If the CDB asks for more data then the validated SG_IO header fields, error it. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-08-07 12:20:52 -06:00
Jens Axboe	e15c2ffa10	block: fix O_DIRECT error handling for bio fragments `0eb6ddfb86` tried to fix this up, but introduced a use-after-free of dio. Additionally, we still had an issue with error handling, as reported by Darrick: "I noticed a regression in xfs/747 (an unreleased xfstest for the xfs_scrub media scanning feature) on 5.3-rc3. I'll condense that down to a simpler reproducer: error-test: 0 209 linear 8:48 0 error-test: 209 1 error error-test: 210 6446894 linear 8:48 210 Basically we have a ~3G /dev/sdd and we set up device mapper to fail IO for sector 209 and to pass the io to the scsi device everywhere else. On 5.3-rc3, performing a directio pread of this range with a < 1M buffer (in other words, a request for fewer than MAX_BIO_PAGES bytes) yields EIO like you'd expect: pread64(3, 0x7f880e1c7000, 1048576, 0) = -1 EIO (Input/output error) pread: Input/output error +++ exited with 0 +++ But doing it with a larger buffer succeeds(!): pread64(3, "XFSB\0\0\20\0\0\0\0\0\0\fL\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1146880, 0) = 1146880 read 1146880/1146880 bytes at offset 0 1 MiB, 1 ops; 0.0009 sec (1.124 GiB/sec and 1052.6316 ops/sec) +++ exited with 0 +++ (Note that the part of the buffer corresponding to the dm-error area is uninitialized) On 5.3-rc2, both commands would fail with EIO like you'd expect. The only change between rc2 and rc3 is commit `0eb6ddfb86` ("block: Fix __blkdev_direct_IO() for bio fragments"). AFAICT we end up in __blkdev_direct_IO with a 1120K buffer, which gets split into two bios: one for the first BIO_MAX_PAGES worth of data (1MB) and a second one for the 96k after that." Fix this by noting that it's always safe to dereference dio if we get BLK_QC_T_EAGAIN returned, as end_io hasn't been run for that case. So we can safely increment the dio size before calling submit_bio(), and then decrement it on failure (not that it really matters, as the bio and dio are going away). For error handling, return to the original method of just using 'ret' for tracking the error, and the size tracking in dio->size. Fixes: `0eb6ddfb86` ("block: Fix __blkdev_direct_IO() for bio fragments") Fixes: `6a43074e2f` ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO") Reported-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-08-07 12:19:43 -06:00
Chuhong Yuan	94f3e14e00	mlx5: Use refcount_t for refcount Reference counters are preferred to use refcount_t instead of atomic_t. This is because the implementation of refcount_t can prevent overflows and detect possible use-after-free. So convert atomic_t ref counters to refcount_t. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-08-07 11:01:48 -07:00
Trond Myklebust	67e7b52d44	NFSv4: Ensure state recovery handles ETIMEDOUT correctly Ensure that the state recovery code handles ETIMEDOUT correctly, and also that we set RPC_TASK_TIMEOUT when recovering open state. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-08-07 12:55:11 -04:00
Alex Deucher	4b3e30ed3e	Revert "drm/amdkfd: New IOCTL to allocate queue GWS" This reverts commit `1a058c3376`. This interface is still in too much flux. Revert until it's sorted out. Acked-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-07 10:21:38 -05:00
Qu Wenruo	07301df7d2	btrfs: trim: Check the range passed into to prevent overflow Normally the range->len is set to default value (U64_MAX), but when it's not default value, we should check if the range overflows. And if it overflows, return -EINVAL before doing anything. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-08-07 16:42:39 +02:00
Filipe Manana	d7cd4dd907	Btrfs: fix sysfs warning and missing raid sysfs directories In the 5.3 merge window, commit `7c7e301406` ("btrfs: sysfs: Replace default_attrs in ktypes with groups"), we started using the member "defaults_groups" for the kobject type "btrfs_raid_ktype". That leads to a series of warnings when running some test cases of fstests, such as btrfs/027, btrfs/124 and btrfs/176. The traces produced by those warnings are like the following: [116648.059212] kernfs: can not remove 'total_bytes', no directory [116648.060112] WARNING: CPU: 3 PID: 28500 at fs/kernfs/dir.c:1504 kernfs_remove_by_name_ns+0x75/0x80 (...) [116648.066482] CPU: 3 PID: 28500 Comm: umount Tainted: G W 5.3.0-rc3-btrfs-next-54 #1 (...) [116648.069376] RIP: 0010:kernfs_remove_by_name_ns+0x75/0x80 (...) [116648.072385] RSP: 0018:ffffabfd0090bd08 EFLAGS: 00010282 [116648.073437] RAX: 0000000000000000 RBX: ffffffffc0c11998 RCX: 0000000000000000 [116648.074201] RDX: ffff9fff603a7a00 RSI: ffff9fff603978a8 RDI: ffff9fff603978a8 [116648.074956] RBP: ffffffffc0b9ca2f R08: 0000000000000000 R09: 0000000000000001 [116648.075708] R10: ffff9ffe1f72e1c0 R11: 0000000000000000 R12: ffffffffc0b94120 [116648.076434] R13: ffffffffb3d9b4e0 R14: 0000000000000000 R15: dead000000000100 [116648.077143] FS: 00007f9cdc78a2c0(0000) GS:ffff9fff60380000(0000) knlGS:0000000000000000 [116648.077852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [116648.078546] CR2: 00007f9fc4747ab4 CR3: 00000005c7832003 CR4: 00000000003606e0 [116648.079235] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [116648.079907] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [116648.080585] Call Trace: [116648.081262] remove_files+0x31/0x70 [116648.081929] sysfs_remove_group+0x38/0x80 [116648.082596] sysfs_remove_groups+0x34/0x70 [116648.083258] kobject_del+0x20/0x60 [116648.083933] btrfs_free_block_groups+0x405/0x430 [btrfs] [116648.084608] close_ctree+0x19a/0x380 [btrfs] [116648.085278] generic_shutdown_super+0x6c/0x110 [116648.085951] kill_anon_super+0xe/0x30 [116648.086621] btrfs_kill_super+0x12/0xa0 [btrfs] [116648.087289] deactivate_locked_super+0x3a/0x70 [116648.087956] cleanup_mnt+0xb4/0x160 [116648.088620] task_work_run+0x7e/0xc0 [116648.089285] exit_to_usermode_loop+0xfa/0x100 [116648.089933] do_syscall_64+0x1cb/0x220 [116648.090567] entry_SYSCALL_64_after_hwframe+0x49/0xbe [116648.091197] RIP: 0033:0x7f9cdc073b37 (...) [116648.100046] ---[ end trace 22e24db328ccadf8 ]--- [116648.100618] ------------[ cut here ]------------ [116648.101175] kernfs: can not remove 'used_bytes', no directory [116648.101731] WARNING: CPU: 3 PID: 28500 at fs/kernfs/dir.c:1504 kernfs_remove_by_name_ns+0x75/0x80 (...) [116648.105649] CPU: 3 PID: 28500 Comm: umount Tainted: G W 5.3.0-rc3-btrfs-next-54 #1 (...) [116648.107461] RIP: 0010:kernfs_remove_by_name_ns+0x75/0x80 (...) [116648.109336] RSP: 0018:ffffabfd0090bd08 EFLAGS: 00010282 [116648.109979] RAX: 0000000000000000 RBX: ffffffffc0c119a0 RCX: 0000000000000000 [116648.110625] RDX: ffff9fff603a7a00 RSI: ffff9fff603978a8 RDI: ffff9fff603978a8 [116648.111283] RBP: ffffffffc0b9ca41 R08: 0000000000000000 R09: 0000000000000001 [116648.111940] R10: ffff9ffe1f72e1c0 R11: 0000000000000000 R12: ffffffffc0b94120 [116648.112603] R13: ffffffffb3d9b4e0 R14: 0000000000000000 R15: dead000000000100 [116648.113268] FS: 00007f9cdc78a2c0(0000) GS:ffff9fff60380000(0000) knlGS:0000000000000000 [116648.113939] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [116648.114607] CR2: 00007f9fc4747ab4 CR3: 00000005c7832003 CR4: 00000000003606e0 [116648.115286] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [116648.115966] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [116648.116649] Call Trace: [116648.117326] remove_files+0x31/0x70 [116648.117997] sysfs_remove_group+0x38/0x80 [116648.118671] sysfs_remove_groups+0x34/0x70 [116648.119342] kobject_del+0x20/0x60 [116648.120022] btrfs_free_block_groups+0x405/0x430 [btrfs] [116648.120707] close_ctree+0x19a/0x380 [btrfs] [116648.121396] generic_shutdown_super+0x6c/0x110 [116648.122057] kill_anon_super+0xe/0x30 [116648.122702] btrfs_kill_super+0x12/0xa0 [btrfs] [116648.123335] deactivate_locked_super+0x3a/0x70 [116648.123961] cleanup_mnt+0xb4/0x160 [116648.124586] task_work_run+0x7e/0xc0 [116648.125210] exit_to_usermode_loop+0xfa/0x100 [116648.125830] do_syscall_64+0x1cb/0x220 [116648.126463] entry_SYSCALL_64_after_hwframe+0x49/0xbe [116648.127080] RIP: 0033:0x7f9cdc073b37 (...) [116648.135923] ---[ end trace 22e24db328ccadf9 ]--- These happen because, during the unmount path, we call kobject_del() for raid kobjects that are not fully initialized, meaning that we set their ktype (as btrfs_raid_ktype) through link_block_group() but we didn't set their parent kobject, which is done through btrfs_add_raid_kobjects(). We have this split raid kobject setup since commit `75cb379d26` ("btrfs: defer adding raid type kobject until after chunk relocation") in order to avoid triggering reclaim during contextes where we can not (either we are holding a transaction handle or some lock required by the transaction commit path), so that we do the calls to kobject_add(), which triggers GFP_KERNEL allocations, through btrfs_add_raid_kobjects() in contextes where it is safe to trigger reclaim. That change expected that a new raid kobject can only be created either when mounting the filesystem or after raid profile conversion through the relocation path. However, we can have new raid kobject created in other two cases at least: 1) During device replace (or scrub) after adding a device a to the filesystem. The replace procedure (and scrub) do calls to btrfs_inc_block_group_ro() which can allocate a new block group with a new raid profile (because we now have more devices). This can be triggered by test cases btrfs/027 and btrfs/176. 2) During a degraded mount trough any write path. This can be triggered by test case btrfs/124. Fixing this by adding extra calls to btrfs_add_raid_kobjects(), not only makes things more complex and fragile, can also introduce deadlocks with reclaim the following way: 1) Calling btrfs_add_raid_kobjects() at btrfs_inc_block_group_ro() or anywhere in the replace/scrub path will cause a deadlock with reclaim because if reclaim happens and a transaction commit is triggered, the transaction commit path will block at btrfs_scrub_pause(). 2) During degraded mounts it is essentially impossible to figure out where to add extra calls to btrfs_add_raid_kobjects(), because allocation of a block group with a new raid profile can happen anywhere, which means we can't safely figure out which contextes are safe for reclaim, as we can either hold a transaction handle or some lock needed by the transaction commit path. So it is too complex and error prone to have this split setup of raid kobjects. So fix the issue by consolidating the setup of the kobjects in a single place, at link_block_group(), and setup a nofs context there in order to prevent reclaim being triggered by the memory allocations done through the call chain of kobject_add(). Besides fixing the sysfs warnings during kobject_del(), this also ensures the sysfs directories for the new raid profiles end up created and visible to users (a bug that existed before the 5.3 commit `7c7e301406` ("btrfs: sysfs: Replace default_attrs in ktypes with groups")). Fixes: `75cb379d26` ("btrfs: defer adding raid type kobject until after chunk relocation") Fixes: `7c7e301406` ("btrfs: sysfs: Replace default_attrs in ktypes with groups") Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-08-07 16:25:44 +02:00
Gustavo A. R. Silva	7468a4eae5	x86: mtrr: cyrix: Mark expected switch fall-through Mark switch cases where we are expecting to fall through. Fix the following warning (Building: i386_defconfig i386): arch/x86/kernel/cpu/mtrr/cyrix.c:99:6: warning: this statement may fall through [-Wimplicit-fallthrough=] Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20190805201712.GA19927@embeddedor	2019-08-07 15:12:01 +02:00
Gustavo A. R. Silva	4ab9ab656a	x86/ptrace: Mark expected switch fall-through Mark switch cases where we are expecting to fall through. Fix the following warning (Building: allnoconfig i386): arch/x86/kernel/ptrace.c:202:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (unlikely(value == 0)) ^ arch/x86/kernel/ptrace.c:206:2: note: here default: ^~~~~~~ Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20190805195654.GA17831@embeddedor	2019-08-07 15:12:01 +02:00
Takashi Iwai	c02f77d32d	ALSA: hda - Workaround for crackled sound on AMD controller (1022:1457) A long-time problem on the recent AMD chip (X370, X470, B450, etc with PCI ID 1022:1457) with Realtek codecs is the crackled or distorted sound for capture streams, as well as occasional playback hiccups. After lengthy debugging sessions, the workarounds we've found are like the following: - Set up the proper driver caps for this controller, similar as the other AMD controller. - Correct the DMA position reporting with the fixed FIFO size, which is similar like as workaround used for VIA chip set. - Even after the position correction, PulseAudio still shows mysterious stalls of playback streams when a capture is triggered in timer-scheduled mode. Since we have no clear way to eliminate the stall, pass the BATCH PCM flag for PA to suppress the tsched mode as a temporary workaround. This patch implements the workarounds. For the driver caps, it defines a new preset, AXZ_DCAPS_PRESET_AMD_SB. It enables the FIFO- corrected position reporting (corresponding to the new position_fix=6) and enforces the SNDRV_PCM_INFO_BATCH flag. Note that the current implementation is merely a workaround. Hopefully we'll find a better alternative in future, especially about removing the BATCH flag hack again. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=195303 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2019-08-07 13:47:46 +02:00
Mika Westerberg	0617bdede5	Revert "PCI: Add missing link delays required by the PCIe spec" Commit `c2bf1fc212` ("PCI: Add missing link delays required by the PCIe spec") turned out causing issues with some systems either by making them unresponsive or slowing down runtime and system wide resume of PCIe devices. While root cause for the unresponsiveness is still under investigation given the amount of issues reported better to revert it for now. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204413 Link: https://lore.kernel.org/linux-pci/SL2P216MB01878BBCD75F21D882AEEA2880C60@SL2P216MB0187.KORP216.PROD.OUTLOOK.COM/ Link: https://lore.kernel.org/linux-pci/2857501d-c167-547d-c57d-d5d24ea1f1dc@molgen.mpg.de/ Reported-by: Matthias Andree <matthias.andree@gmx.de> Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Reported-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-08-07 13:06:42 +02:00
Wenwen Wang	3d92aa45fb	ALSA: hiface: fix multiple memory leak bugs In hiface_pcm_init(), 'rt' is firstly allocated through kzalloc(). Later on, hiface_pcm_init_urb() is invoked to initialize 'rt->out_urbs[i]'. In hiface_pcm_init_urb(), 'rt->out_urbs[i].buffer' is allocated through kzalloc(). However, if hiface_pcm_init_urb() fails, both 'rt' and 'rt->out_urbs[i].buffer' are not deallocated, leading to memory leak bugs. Also, 'rt->out_urbs[i].buffer' is not deallocated if snd_pcm_new() fails. To fix the above issues, free 'rt' and 'rt->out_urbs[i].buffer'. Fixes: `a91c3fb2f8` ("Add M2Tech hiFace USB-SPDIF driver") Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2019-08-07 12:20:00 +02:00
Marek Olšák	d9dfe768b3	Revert "drm/amdgpu: fix transform feedback GDS hang on gfx10 (v2)" This reverts commit `9ed2c993d7`. SET_CONFIG_REG writes to memory if register shadowing is enabled, causing a VM fault. NGG streamout is unstable anyway, so all UMDs should use legacy streamout. I think Mesa is the only driver using NGG streamout. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-06 23:28:41 -05:00
David S. Miller	13dfb3fa49	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Just minor overlapping changes in the conflicts here. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 18:44:57 -07:00
Alexei Starovoitov	682cdbdc21	Merge branch 'test_progs-stdio' Stanislav Fomichev says: ==================== I was looking into converting test_sockops* to test_progs framework and that requires using cgroup_helpers.c which rely on stdio/stderr. Let's use open_memstream to override stdout into buffer during subtests instead of custom test_{v,}printf wrappers. That lets us continue to use stdio in the subtests and dump it on failure if required. That would also fix bpf_find_map which currently uses printf to signal failure (missed during test_printf conversion). ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-06 17:17:53 -07:00
Stanislav Fomichev	16e910d446	selftests/bpf: test_progs: drop extra trailing tab Small (un)related cleanup. Cc: Andrii Nakryiko <andriin@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-06 17:17:52 -07:00
Stanislav Fomichev	66bd2ec1e0	selftests/bpf: test_progs: test__printf -> printf Now that test__printf is a simple wraper around printf, let's drop it (and test__vprintf as well). Cc: Andrii Nakryiko <andriin@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-06 17:17:52 -07:00
Stanislav Fomichev	946152b3c5	selftests/bpf: test_progs: switch to open_memstream Use open_memstream to override stdout during test execution. The copy of the original stdout is held in env.stdout and used to print subtest info and dump failed log. test_{v,}printf are now simple wrappers around stdout and will be removed in the next patch. v5: * fix -v crash by always setting env.std{in,err} (Alexei Starovoitov) * drop force_log check from stdio_hijack (Andrii Nakryiko) v4: * one field per line for stdout/stderr (Andrii Nakryiko) v3: * don't do strlen over log_buf, log_cnt has it already (Andrii Nakryiko) v2: * add ifdef __GLIBC__ around open_memstream (maybe pointless since we already depend on glibc for argp_parse) * hijack stderr as well (Andrii Nakryiko) * don't hijack for every test, do it once (Andrii Nakryiko) * log_cap -> log_size (Andrii Nakryiko) * do fseeko in a proper place (Andrii Nakryiko) * check open_memstream returned value (Andrii Nakryiko) Cc: Andrii Nakryiko <andriin@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-06 17:17:52 -07:00
Linus Torvalds	33920f1ec5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from David Miller: "Yeah I should have sent a pull request last week, so there is a lot more here than usual: 1) Fix memory leak in ebtables compat code, from Wenwen Wang. 2) Several kTLS bug fixes from Jakub Kicinski (circular close on disconnect etc.) 3) Force slave speed check on link state recovery in bonding 802.3ad mode, from Thomas Falcon. 4) Clear RX descriptor bits before assigning buffers to them in stmmac, from Jose Abreu. 5) Several missing of_node_put() calls, mostly wrt. for_each_() OF loops, from Nishka Dasgupta. 6) Double kfree_skb() in peak_usb can driver, from Stephane Grosjean. 7) Need to hold sock across skb->destructor invocation, from Cong Wang. 8) IP header length needs to be validated in ipip tunnel xmit, from Haishuang Yan. 9) Use after free in ip6 tunnel driver, also from Haishuang Yan. 10) Do not use MSI interrupts on r8169 chips before RTL8168d, from Heiner Kallweit. 11) Upon bridge device init failure, we need to delete the local fdb. From Nikolay Aleksandrov. 12) Handle erros from of_get_mac_address() properly in stmmac, from Martin Blumenstingl. 13) Handle concurrent rename vs. dump in netfilter ipset, from Jozsef Kadlecsik. 14) Setting NETIF_F_LLTX on mac80211 causes complete breakage with some devices, so revert. From Johannes Berg. 15) Fix deadlock in rxrpc, from David Howells. 16) Fix Kconfig deps of enetc driver, we must have PHYLIB. From Yue Haibing. 17) Fix mvpp2 crash on module removal, from Matteo Croce. 18) Fix race in genphy_update_link, from Heiner Kallweit. 19) bpf_xdp_adjust_head() stopped working with generic XDP when we fixes generic XDP to support stacked devices properly, fix from Jesper Dangaard Brouer. 20) Unbalanced RCU locking in rt6_update_exception_stamp_rt(), from David Ahern. 21) Several memory leaks in new sja1105 driver, from Vladimir Oltean" git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (214 commits) net: dsa: sja1105: Fix memory leak on meta state machine error path net: dsa: sja1105: Fix memory leak on meta state machine normal path net: dsa: sja1105: Really fix panic on unregistering PTP clock net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well net: dsa: sja1105: Fix broken learning with vlan_filtering disabled net: dsa: qca8k: Add of_node_put() in qca8k_setup_mdio_bus() net: sched: sample: allow accessing psample_group with rtnl net: sched: police: allow accessing police->params with rtnl net: hisilicon: Fix dma_map_single failed on arm64 net: hisilicon: fix hip04-xmit never return TX_BUSY net: hisilicon: make hip04_tx_reclaim non-reentrant tc-testing: updated vlan action tests with batch create/delete net sched: update vlan action for batched events operations net: stmmac: tc: Do not return a fragment entry net: stmmac: Fix issues when number of Queues >= 4 net: stmmac: xgmac: Fix XGMAC selftests be2net: disable bh with spin_lock in be_process_mcc net: cxgb3_main: Fix a resource leak in a error path in 'init_one()' net: ethernet: sun4i-emac: Support phy-handle property for finding PHYs net: bridge: move default pvid init/deinit to NETDEV_REGISTER/UNREGISTER ...	2019-08-06 17:11:59 -07:00
David S. Miller	05bb520376	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2019-08-05 This series contains updates to i40e driver only. Dmitrii adds missing statistic counters for VEB and VEB TC's. Slawomir adds support for logging the "Disable Firmware LLDP" flag option and its current status. Jake fixes an issue where VF's being notified of their link status before their queues are enabled which was causing issues. So always report link status down when the VF queues are not enabled. Also adds future proofing when statistics are added or removed by adding checks to ensure the data pointer for the strings lines up with the expected statistics count. Czeslaw fixes the advertised mode reported in ethtool for FEC, where the "None BaseR RS" was always being displayed no matter what the mode it was in. Also added logging information when the PF is entering or leaving "allmulti" (or promiscuous) mode. Fixed up the logging logic for VF's when leaving multicast mode to not include unicast as well. v2: drop Aleksandr's patch (previously patch #2 in the series) to display the VF MAC address that is set by the VF while community feedback is addressed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:41:45 -07:00
Yifeng Sun	aa733660db	openvswitch: Print error when ovs_execute_actions() fails Currently in function ovs_dp_process_packet(), return values of ovs_execute_actions() are silently discarded. This patch prints out an debug message when error happens so as to provide helpful hints for debugging. Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:38:12 -07:00
Atish Patra	713203e303	RISC-V: Remove per cpu clocksource There is only one clocksource in RISC-V. The boot cpu initializes that clocksource. No need to keep a percpu data structure. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>	2019-08-06 14:37:58 -07:00
David S. Miller	feac1d6802	Merge branch 'sja1105-fixes' Vladimir Oltean says: ==================== Fixes for SJA1105 DSA: FDBs, Learning and PTP This is an assortment of functional fixes for the sja1105 switch driver targeted for the "net" tree (although they apply on net-next just as well). Patch 1/5 ("net: dsa: sja1105: Fix broken learning with vlan_filtering disabled") repairs a breakage introduced in the early development stages of the driver: support for traffic from the CPU has broken "normal" frame forwarding (based on DMAC) - there is connectivity through the switch only because all frames are flooded. I debated whether this patch qualifies as a fix, since it puts the switch into a mode it has never operated in before (aka SVL). But "normal" forwarding did use to work before the "Traffic support for SJA1105 DSA driver" patchset, and arguably this patch should have been part of that. Also, it would be strange for this feature to be broken in the 5.2 LTS. Patch 2/5 ("net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well") is a simplification of a previous FDB-related patch that is currently in the 5.3 rc's. Patches 3/5 - 5/5 fix various crashes found while running linuxptp over the switch ports for extended periods of time, or in conjunction with other error conditions. The fixed-up commits were all introduced in 5.2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:37:02 -07:00
Vladimir Oltean	93fa8587b2	net: dsa: sja1105: Fix memory leak on meta state machine error path When RX timestamping is enabled and two link-local (non-meta) frames are received in a row, this constitutes an error. The tagger is always caching the last link-local frame, in an attempt to merge it with the meta follow-up frame when that arrives. To recover from the above error condition, the initial cached link-local frame is dropped and the second frame in a row is cached (in expectance of the second meta frame). However, when dropping the initial link-local frame, its backing memory was being leaked. Fixes: `f3097be21b` ("net: dsa: sja1105: Add a state machine for RX timestamping") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:37:02 -07:00
Vladimir Oltean	f163fed276	net: dsa: sja1105: Fix memory leak on meta state machine normal path After a meta frame is received, it is associated with the cached sp->data->stampable_skb from the DSA tagger private structure. Cached means its refcount is incremented with skb_get() in order for dsa_switch_rcv() to not free it when the tagger .rcv returns NULL. The mistake is that skb_unref() is not the correct function to use. It will correctly decrement the refcount (which will go back to zero) but the skb memory will not be freed. That is the job of kfree_skb(), which also calls skb_unref(). But it turns out that freeing the cached stampable_skb is in fact not necessary. It is still a perfectly valid skb, and now it is even annotated with the partial RX timestamp. So remove the skb_copy() altogether and simply pass the stampable_skb with a refcount of 1 (incremented by us, decremented by dsa_switch_rcv) up the stack. Fixes: `f3097be21b` ("net: dsa: sja1105: Add a state machine for RX timestamping") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:37:02 -07:00
Vladimir Oltean	6cb0abbdf9	net: dsa: sja1105: Really fix panic on unregistering PTP clock The IS_ERR_OR_NULL(priv->clock) check inside sja1105_ptp_clock_unregister() is preventing cancel_delayed_work_sync from actually being run. Additionally, sja1105_ptp_clock_unregister() does not actually get run, when placed in sja1105_remove(). The DSA switch gets torn down, but the sja1105 module does not get unregistered. So sja1105_ptp_clock_unregister needs to be moved to sja1105_teardown, to be symmetrical with sja1105_ptp_clock_register which is called from the DSA sja1105_setup. It is strange to fix a "fixes" patch, but the probe failure can only be seen when the attached PHY does not respond to MDIO (issue which I can't pinpoint the reason to) and it goes away after I power-cycle the board. This time the patch was validated on a failing board, and the kernel panic from the fixed commit's message can no longer be seen. Fixes: `29dd908d35` ("net: dsa: sja1105: Cancel PTP delayed work on unregister") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:37:02 -07:00
Vladimir Oltean	4b7da3d808	net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well It looks like the FDB dump taken from first-generation switches also contains information on whether entries are static or not. So use that instead of searching through the driver's tables. Fixes: `d763778224` ("net: dsa: sja1105: Implement is_static for FDB entries on E/T") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:37:02 -07:00
Vladimir Oltean	6d7c7d948a	net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still does work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly because SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0, so that we can prune the duplicates at insertion time. The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: `227d07a07e` ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:37:02 -07:00
Nishka Dasgupta	f26e0cca14	net: dsa: qca8k: Add of_node_put() in qca8k_setup_mdio_bus() Each iteration of for_each_available_child_of_node() puts the previous node, but in the case of a return from the middle of the loop, there is no put, thus causing a memory leak. Hence add an of_node_put() before the return. Additionally, the local variable ports in the function qca8k_setup_mdio_bus() takes the return value of of_get_child_by_name(), which gets a node but does not put it. If the function returns without putting ports, it may cause a memory leak. Hence put ports before the mid-loop return statement, and also outside the loop after its last usage in this function. Issues found with Coccinelle. Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:35:13 -07:00
David S. Miller	ef68de56c7	Merge branch 'Support-tunnels-over-VLAN-in-NFP' John Hurley says: ==================== Support tunnels over VLAN in NFP This patchset deals with tunnel encap and decap when the end-point IP address is on an internal port (for example and OvS VLAN port). Tunnel encap without VLAN is already supported in the NFP driver. This patchset extends that to include a push VLAN along with tunnel header push. Patches 1-4 extend the flow_offload IR API to include actions that use skbedit to set the ptype of an SKB and that send a packet to port ingress from the act_mirred module. Such actions are used in flower rules that forward tunnel packets to internal ports where they can be decapsulated. OvS and its TC API is an example of a user-space app that produces such rules. Patch 5 modifies the encap offload code to allow the pushing of a VLAN header after a tunnel header push. Patches 6-10 deal with tunnel decap when the end-point is on an internal port. They detect 'pre-tunnel rules' which do not deal with tunnels themselves but, rather, forward packets to internal ports where they can be decapped if required. Such rules are offloaded to a table in HW along with an indication of whether packets need to be passed to this table of not (based on their destination MAC address). Matching against this table prior to decapsulation in HW allows the correct parsing and handling of outer VLANs on tunnelled packets and the correct updating of stats for said 'pre-tunnel' rules. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:24:22 -07:00
John Hurley	2e0bc7f3cb	nfp: flower: encode mac indexes with pre-tunnel rule check When a tunnel packet arrives on the NFP card, its destination MAC is looked up and MAC index returned for it. This index can help verify the tunnel by, for example, ensuring that the packet arrived on the expected port. If the packet is destined for a known MAC that is not connected to a given physical port then the mac index can have a global value (e.g. when a series of bonded ports shared the same MAC). If the packet is to be detunneled at a bridge device or internal port like an Open vSwitch VLAN port, then it should first match a 'pre-tunnel' rule to direct it to that internal port. Use the MAC index to indicate if a packet should match a pre-tunnel rule before decap is allowed. Do this by tracking the number of internal ports associated with a MAC address and, if the number if >0, set a bit in the mac_index to forward the packet to the pre-tunnel table before continuing with decap. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:24:22 -07:00
John Hurley	09aa811bb7	nfp: flower: remove offloaded MACs when reprs are applied to OvS bridges MAC addresses along with an identifying index are offloaded to firmware to allow tunnel decapsulation. If a tunnel packet arrives with a matching destination MAC address and a verified index, it can continue on the decapsulation process. This replicates the MAC verifications carried out in the kernel network stack. When a netdev is added to a bridge (e.g. OvS) then packets arriving on that dev are directed through the bridge datapath instead of passing through the network stack. Therefore, tunnelled packets matching the MAC of that dev will not be decapped here. Replicate this behaviour on firmware by removing offloaded MAC addresses when a MAC representer is added to an OvS bridge. This can prevent any false positive tunnel decaps. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:24:22 -07:00
John Hurley	f12725d98c	nfp: flower: offload pre-tunnel rules Pre-tunnel rules are TC flower and OvS rules that forward a packet to the tunnel end point where it can then pass through the network stack and be decapsulated. These are required if the tunnel end point is, say, an OvS internal port. Currently, firmware determines that a packet is in a tunnel and decaps it if it has a known destination IP and MAC address. However, this bypasses the flower pre-tunnel rule and so does not update the stats. Further to this it ignores VLANs that may exist outside of the tunnel header. Offload pre-tunnel rules to the NFP. This embeds the pre-tunnel rule into the tunnel decap process based on (firmware) mac index and VLAN. This means that decap can be carried out correctly with VLANs and that stats can be updated for all kernel rules correctly. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:24:22 -07:00
John Hurley	120ffd84a9	nfp: flower: verify pre-tunnel rules Pre-tunnel rules must direct packets to an internal port based on L2 information. Rules that egress to an internal port are already indicated by a non-NULL device in its nfp_fl_payload struct. Verfiy the rest of the match fields indicate that the rule is a pre-tunnel rule. This requires a full match on the destination MAC address, an option VLAN field, and no specific matches on other lower layer fields (with the exception of L4 proto and flags). If a rule is identified as a pre-tunnel rule then mark it for offload to the pre-tunnel table. Similarly, remove it from the pre-tunnel table on rule deletion. The actual offloading of these commands is left to a following patch. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:24:22 -07:00
John Hurley	f5c977eed7	nfp: flower: detect potential pre-tunnel rules Pre-tunnel rules are used when the tunnel end-point is on an 'internal port'. These rules are used to direct the tunnelled packets (based on outer header fields) to the internal port where they can be detunnelled. The rule must send the packet to ingress the internal port at the TC layer. Currently FW does not support an action to send to ingress so cannot offload such rules. However, in preparation for populating the pre-tunnel table to represent such rules, check for rules that send to the ingress of an internal port and mark them as such. Further validation of such rules is left to subsequent patches. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:24:21 -07:00
John Hurley	4b10c53d81	nfp: flower: push vlan after tunnel in merge NFP allows the merging of 2 flows together into a single offloaded flow. In the kernel datapath the packet must match 1 flow, impliment its actions, recirculate, match the 2nd flow and also impliment its actions. Merging creates a single flow with all actions from the 2 original flows. Firmware impliments a tunnel header push as the packet is about to egress the card. Therefore, if the first merge rule candiate pushes a tunnel, then the second rule can only have an egress action for a valid merge to occur (or else the action ordering will be incorrect). This prevents the pushing of a tunnel header followed by the pushing of a vlan header. In order to support this behaviour, firmware allows VLAN information to be encoded in the tunnel push action. If this is non zero then the fw will push a VLAN after the tunnel header push meaning that 2 such flows with these actions can be merged (with action order being maintained). Support tunnel in VLAN pushes by encoding VLAN information in the tunnel push action of any merge flow requiring this. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:24:21 -07:00
John Hurley	48e584ac58	net: sched: add ingress mirred action to hardware IR TC mirred actions (redirect and mirred) can send to egress or ingress of a device. Currently only egress is used for hw offload rules. Modify the intermediate representation for hw offload to include mirred actions that go to ingress. This gives drivers access to such rules and can decide whether or not to offload them. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:24:21 -07:00
John Hurley	d7609c96c6	net: tc_act: add helpers to detect ingress mirred actions TC mirred actions can send to egress or ingress on a given netdev. Helpers exist to detect actions that are mirred to egress. Extend the header file to include helpers to detect ingress mirred actions. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:24:21 -07:00
John Hurley	fb1b775a24	net: sched: add skbedit of ptype action to hardware IR TC rules can impliment skbedit actions. Currently actions that modify the skb mark are passed to offloading drivers via the hardware intermediate representation in the flow_offload API. Extend this to include skbedit actions that modify the packet type of the skb. Such actions may be used to set the ptype to HOST when redirecting a packet to ingress. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-06 14:24:21 -07:00

... 33 34 35 36 37 ...

858756 Commits