linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-15 16:33:33 -04:00

Author	SHA1	Message	Date
Kent Overstreet	6447544c3d	bcachefs: Improve error printing in btree_node_check_topology() We had a bug report where the errors from btree_node_check_topology() don't seem to be getting printed; log_fsck_err() does some fancy ratelimiting-type stuff that we don't want here. Instead, just use bch2_count_fsck_err(); this is simpler, and modelled after how we're currently handling bucket ref update errors in buckets.c. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	f402d9710b	bcachefs: bch2_readdir() now calls str_hash_check_key() More self healing code: readdir will now notice if there are dirents hashed incorrectly, and it'll repair them if errors=fix_safe. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	a592268260	bcachefs: bch2_str_hash_check_key() may now be called without snapshots_seen We don't track snapshot overwrites outside of fsck, so for this to be called at runtime outside of fsck we need to create it on demand, when we have repair to do. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	cb6f5d0dec	bcachefs: __bch2_insert_snapshot_whiteouts() refactoring Now uses bch2_get_snapshot_overwrites(), and much shorter. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	801cb2bd6c	bcachefs: bch2_get_snapshot_overwrites() New helper for getting a list of snapshot IDs that have overwritten a given key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	d21262d4e3	bcachefs: bch2_dev_journal_bucket_delete() Recover from "journal and btree in same bucket". Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	0224d17d76	bcachefs: Runtime self healing for keys for deleted snapshots If snapshot deletion incorrectly missing some keys and leaves keys for deleted snapshots, that causes a bit of a problem for data move - we can't move an extent for a nonexistent snapshot, because the extent might have to be fragmented, and maintaining correct visibility in child snapshots doesn't work if it doesn't have a snapshot. Previously we'd just skip these keys, but it turns out that causes copygc to spin. So we need runtime self healing, i.e. calling check_key_has_snapshot() from the data move path. Snapshot deletion v2 included sentinal values for deleted snapshot nodes, so this is quite safe. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	f02d153274	bcachefs: Don't unlock trans before data_update_init() data_update_init() does need to do btree operations, delay doing the unlock-before-io. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	642c1aabb0	bcachefs: Use bch2_err_matches() for BCH_ERR_fsck_(fix\|ignore) We'll be adding subtypes of these errors, and new error code tracing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:16 -04:00
Linus Torvalds	00c010e130	Merge tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "Add folio_mk_pte()" from Matthew Wilcox simplifies the act of creating a pte which addresses the first page in a folio and reduces the amount of plumbing which architecture must implement to provide this. - "Misc folio patches for 6.16" from Matthew Wilcox is a shower of largely unrelated folio infrastructure changes which clean things up and better prepare us for future work. - "memory,x86,acpi: hotplug memory alignment advisement" from Gregory Price adds early-init code to prevent x86 from leaving physical memory unused when physical address regions are not aligned to memory block size. - "mm/compaction: allow more aggressive proactive compaction" from Michal Clapinski provides some tuning of the (sadly, hard-coded (more sadly, not auto-tuned)) thresholds for our invokation of proactive compaction. In a simple test case, the reduction of a guest VM's memory consumption was dramatic. - "Minor cleanups and improvements to swap freeing code" from Kemeng Shi provides some code cleaups and a small efficiency improvement to this part of our swap handling code. - "ptrace: introduce PTRACE_SET_SYSCALL_INFO API" from Dmitry Levin adds the ability for a ptracer to modify syscalls arguments. At this time we can alter only "system call information that are used by strace system call tampering, namely, syscall number, syscall arguments, and syscall return value. This series should have been incorporated into mm.git's "non-MM" branch, but I goofed. - "fs/proc: extend the PAGEMAP_SCAN ioctl to report guard regions" from Andrei Vagin extends the info returned by the PAGEMAP_SCAN ioctl against /proc/pid/pagemap. This permits CRIU to more efficiently get at the info about guard regions. - "Fix parameter passed to page_mapcount_is_type()" from Gavin Shan implements that fix. No runtime effect is expected because validate_page_before_insert() happens to fix up this error. - "kernel/events/uprobes: uprobe_write_opcode() rewrite" from David Hildenbrand basically brings uprobe text poking into the current decade. Remove a bunch of hand-rolled implementation in favor of using more current facilities. - "mm/ptdump: Drop assumption that pxd_val() is u64" from Anshuman Khandual provides enhancements and generalizations to the pte dumping code. This might be needed when 128-bit Page Table Descriptors are enabled for ARM. - "Always call constructor for kernel page tables" from Kevin Brodsky ensures that the ctor/dtor is always called for kernel pgtables, as it already is for user pgtables. This permits the addition of more functionality such as "insert hooks to protect page tables". This change does result in various architectures performing unnecesary work, but this is fixed up where it is anticipated to occur. - "Rust support for mm_struct, vm_area_struct, and mmap" from Alice Ryhl adds plumbing to permit Rust access to core MM structures. - "fix incorrectly disallowed anonymous VMA merges" from Lorenzo Stoakes takes advantage of some VMA merging opportunities which we've been missing for 15 years. - "mm/madvise: batch tlb flushes for MADV_DONTNEED and MADV_FREE" from SeongJae Park optimizes process_madvise()'s TLB flushing. Instead of flushing each address range in the provided iovec, we batch the flushing across all the iovec entries. The syscall's cost was approximately halved with a microbenchmark which was designed to load this particular operation. - "Track node vacancy to reduce worst case allocation counts" from Sidhartha Kumar makes the maple tree smarter about its node preallocation. stress-ng mmap performance increased by single-digit percentages and the amount of unnecessarily preallocated memory was dramaticelly reduced. - "mm/gup: Minor fix, cleanup and improvements" from Baoquan He removes a few unnecessary things which Baoquan noted when reading the code. - ""Enhance sysfs handling for memory hotplug in weighted interleave" from Rakie Kim "enhances the weighted interleave policy in the memory management subsystem by improving sysfs handling, fixing memory leaks, and introducing dynamic sysfs updates for memory hotplug support". Fixes things on error paths which we are unlikely to hit. - "mm/damon: auto-tune DAMOS for NUMA setups including tiered memory" from SeongJae Park introduces new DAMOS quota goal metrics which eliminate the manual tuning which is required when utilizing DAMON for memory tiering. - "mm/vmalloc.c: code cleanup and improvements" from Baoquan He provides cleanups and small efficiency improvements which Baoquan found via code inspection. - "vmscan: enforce mems_effective during demotion" from Gregory Price changes reclaim to respect cpuset.mems_effective during demotion when possible. because presently, reclaim explicitly ignores cpuset.mems_effective when demoting, which may cause the cpuset settings to violated. This is useful for isolating workloads on a multi-tenant system from certain classes of memory more consistently. - "Clean up split_huge_pmd_locked() and remove unnecessary folio pointers" from Gavin Guo provides minor cleanups and efficiency gains in in the huge page splitting and migrating code. - "Use kmem_cache for memcg alloc" from Huan Yang creates a slab cache for `struct mem_cgroup', yielding improved memory utilization. - "add max arg to swappiness in memory.reclaim and lru_gen" from Zhongkun He adds a new "max" argument to the "swappiness=" argument for memory.reclaim MGLRU's lru_gen. This directs proactive reclaim to reclaim from only anon folios rather than file-backed folios. - "kexec: introduce Kexec HandOver (KHO)" from Mike Rapoport is the first step on the path to permitting the kernel to maintain existing VMs while replacing the host kernel via file-based kexec. At this time only memblock's reserve_mem is preserved. - "mm: Introduce for_each_valid_pfn()" from David Woodhouse provides and uses a smarter way of looping over a pfn range. By skipping ranges of invalid pfns. - "sched/numa: Skip VMA scanning on memory pinned to one NUMA node via cpuset.mems" from Libo Chen removes a lot of pointless VMA scanning when a task is pinned a single NUMA mode. Dramatic performance benefits were seen in some real world cases. - "JFS: Implement migrate_folio for jfs_metapage_aops" from Shivank Garg addresses a warning which occurs during memory compaction when using JFS. - "move all VMA allocation, freeing and duplication logic to mm" from Lorenzo Stoakes moves some VMA code from kernel/fork.c into the more appropriate mm/vma.c. - "mm, swap: clean up swap cache mapping helper" from Kairui Song provides code consolidation and cleanups related to the folio_index() function. - "mm/gup: Cleanup memfd_pin_folios()" from Vishal Moola does that. - "memcg: Fix test_memcg_min/low test failures" from Waiman Long addresses some bogus failures which are being reported by the test_memcontrol selftest. - "eliminate mmap() retry merge, add .mmap_prepare hook" from Lorenzo Stoakes commences the deprecation of file_operations.mmap() in favor of the new file_operations.mmap_prepare(). The latter is more restrictive and prevents drivers from messing with things in ways which, amongst other problems, may defeat VMA merging. - "memcg: decouple memcg and objcg stocks"" from Shakeel Butt decouples the per-cpu memcg charge cache from the objcg's one. This is a step along the way to making memcg and objcg charging NMI-safe, which is a BPF requirement. - "mm/damon: minor fixups and improvements for code, tests, and documents" from SeongJae Park is yet another batch of miscellaneous DAMON changes. Fix and improve minor problems in code, tests and documents. - "memcg: make memcg stats irq safe" from Shakeel Butt converts memcg stats to be irq safe. Another step along the way to making memcg charging and stats updates NMI-safe, a BPF requirement. - "Let unmap_hugepage_range() and several related functions take folio instead of page" from Fan Ni provides folio conversions in the hugetlb code. * tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (285 commits) mm: pcp: increase pcp->free_count threshold to trigger free_high mm/hugetlb: convert use of struct page to folio in __unmap_hugepage_range() mm/hugetlb: refactor __unmap_hugepage_range() to take folio instead of page mm/hugetlb: refactor unmap_hugepage_range() to take folio instead of page mm/hugetlb: pass folio instead of page to unmap_ref_private() memcg: objcg stock trylock without irq disabling memcg: no stock lock for cpu hot-unplug memcg: make __mod_memcg_lruvec_state re-entrant safe against irqs memcg: make count_memcg_events re-entrant safe against irqs memcg: make mod_memcg_state re-entrant safe against irqs memcg: move preempt disable to callers of memcg_rstat_updated memcg: memcg_rstat_updated re-entrant safe against irqs mm: khugepaged: decouple SHMEM and file folios' collapse selftests/eventfd: correct test name and improve messages alloc_tag: check mem_profiling_support in alloc_tag_init Docs/damon: update titles and brief introductions to explain DAMOS selftests/damon/_damon_sysfs: read tried regions directories in order mm/damon/tests/core-kunit: add a test for damos_set_filters_default_reject() mm/damon/paddr: remove unused variable, folio_list, in damon_pa_stat() mm/damon/sysfs-schemes: fix wrong comment on damons_sysfs_quota_goal_metric_strs ...	2025-05-31 15:44:16 -07:00
Uday Shankar	08652bd86e	Documentation: ublk: document UBLK_F_PER_IO_DAEMON Explain the restrictions imposed on ublk servers in two cases: 1. When UBLK_F_PER_IO_DAEMON is set (current ublk_drv) 2. When UBLK_F_PER_IO_DAEMON is not set (legacy) Remove most references to per-queue daemons, as the new UBLK_F_PER_IO_DAEMON feature renders that concept obsolete. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-9-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-05-31 14:38:57 -06:00
Uday Shankar	17574aa2a0	selftests: ublk: add stress test for per io daemons Add a new test_stress_06 for the per io daemons feature. This is just a copy of test_stress_01 with the per_io_tasks flag added, with varying amounts of nthreads. This test is able to reproduce a panic which was caught manually during development [1]; in the current version of this patch set, it passes. Note that this commit also makes all stress tests using the run_io_and_remove helper more stressful by additionally exercising the batch submit (queue_rqs) path. [1] https://lore.kernel.org/linux-block/aDgwGoGCEpwd1mFY@fedora/ Suggested-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Uday Shankar <ushankar@purestorage.com> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-8-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-05-31 14:38:53 -06:00
Uday Shankar	236918d3e9	selftests: ublk: add functional test for per io daemons Add a new test test_generic_12 which: - sets up a ublk server with per_io_tasks and a different number of ublk server threads and ublk_queues. This is possible now that these objects are decoupled - runs some I/O load from a single CPU - verifies that all the ublk server threads handle some I/O Before this changeset, this test fails, since I/O issued from one CPU is always handled by the one ublk server thread. After this changeset, the test passes. In the future, the last check above may be strengthened to "verify that all ublk server threads handle the same amount of I/O." However, this requires some adjustments/bugfixes to tag allocation, so this work is postponed to a followup. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-7-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-05-31 14:38:48 -06:00
Uday Shankar	abe54c1603	selftests: ublk: kublk: decouple ublk_queues from ublk server threads Add support in kublk for decoupled ublk_queues and ublk server threads. kublk now has two modes of operation: - (preexisting mode) threads and queues are paired 1:1, and each thread services all the I/Os of one queue - (new mode) thread and queue counts are independently configurable. threads service I/Os in a way that balances load across threads even if load is not balanced over queues. The default is the preexisting mode. The new mode is activated by passing the --per_io_tasks flag. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-6-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-05-31 14:38:43 -06:00
Uday Shankar	b9848ca7a7	selftests: ublk: kublk: move per-thread data out of ublk_queue Towards the goal of decoupling ublk_queues from ublk server threads, move resources/data that should be per-thread rather than per-queue out of ublk_queue and into a new struct ublk_thread. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-5-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-05-31 14:38:39 -06:00
Uday Shankar	8f75ba28b8	selftests: ublk: kublk: lift queue initialization out of thread Currently, each ublk server I/O handler thread initializes its own queue. However, as we move towards decoupled ublk_queues and ublk server threads, this model does not make sense anymore, as there will no longer be a concept of a thread having "its own" queue. So lift queue initialization out of the per-thread ublk_io_handler_fn and into a loop in ublk_start_daemon (which runs once for each device). There is a part of ublk_queue_init (ring initialization) which does actually need to happen on the thread that will use the ring; that is separated into a separate ublk_thread_init which is still called by each I/O handler thread. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-4-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-05-31 14:38:35 -06:00
Uday Shankar	9773709752	selftests: ublk: kublk: tie sqe allocation to io instead of queue We currently have a helper ublk_queue_alloc_sqes which the ublk targets use to allocate SQEs for their own operations. However, as we move towards decoupled ublk_queues and ublk server threads, this helper does not make sense anymore. SQEs are allocated from rings, and we will have one ring per thread to avoid locking. Change the SQE allocation helper to ublk_io_alloc_sqes. Currently this still allocates SQEs from the io's queue's ring, but when we fully decouple threads and queues, it will allocate from the io's thread's ring instead. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-3-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-05-31 14:38:30 -06:00
Uday Shankar	bf098d7269	selftests: ublk: kublk: plumb q_id in io_uring user_data Currently, when we process CQEs, we know which ublk_queue we are working on because we know which ring we are working on, and ublk_queues and rings are in 1:1 correspondence. However, as we decouple ublk_queues from ublk server threads, ublk_queues and rings will no longer be in 1:1 correspondence - each ublk server thread will have a ring, and each thread may issue commands against more than one ublk_queue. So in order to know which ublk_queue a CQE refers to, plumb that information in the associated SQE's user_data. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-2-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-05-31 14:38:26 -06:00
Uday Shankar	ab03a61c66	ublk: have a per-io daemon instead of a per-queue daemon Currently, ublk_drv associates to each hardware queue (hctx) a unique task (called the queue's ubq_daemon) which is allowed to issue COMMIT_AND_FETCH commands against the hctx. If any other task attempts to do so, the command fails immediately with EINVAL. When considered together with the block layer architecture, the result is that for each CPU C on the system, there is a unique ublk server thread which is allowed to handle I/O submitted on CPU C. This can lead to suboptimal performance under imbalanced load generation. For an extreme example, suppose all the load is generated on CPUs mapping to a single ublk server thread. Then that thread may be fully utilized and become the bottleneck in the system, while other ublk server threads are totally idle. This issue can also be addressed directly in the ublk server without kernel support by having threads dequeue I/Os and pass them around to ensure even load. But this solution requires inter-thread communication at least twice for each I/O (submission and completion), which is generally a bad pattern for performance. The problem gets even worse with zero copy, as more inter-thread communication would be required to have the buffer register/unregister calls to come from the correct thread. Therefore, address this issue in ublk_drv by allowing each I/O to have its own daemon task. Two I/Os in the same queue are now allowed to be serviced by different daemon tasks - this was not possible before. Imbalanced load can then be balanced across all ublk server threads by having the ublk server threads issue FETCH_REQs in a round-robin manner. As a small toy example, consider a system with a single ublk device having 2 queues, each of depth 4. A ublk server having 4 threads could issue its FETCH_REQs against this device as follows (where each entry is the qid,tag pair that the FETCH_REQ targets): ublk server thread: T0 T1 T2 T3 0,0 0,1 0,2 0,3 1,3 1,0 1,1 1,2 This setup allows for load that is concentrated on one hctx/ublk_queue to be spread out across all ublk server threads, alleviating the issue described above. Add the new UBLK_F_PER_IO_DAEMON feature to ublk_drv, which ublk servers can use to essentially test for the presence of this change and tailor their behavior accordingly. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-1-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-05-31 14:38:21 -06:00
Linus Torvalds	b42966552b	Merge tag 'fbdev-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev updates from Helge Deller: "Many small but important fixes for special cases in the fbdev, fbcon and vgacon code which were found with Syzkaller, Svace and other tools by various people and teams (e.g. Linux Verification Center). Some smaller code cleanups in the nvidiafb, arkfb, atyfb and viafb drivers and two spelling fixes" * tag 'fbdev-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: fbdev: Fix fb_set_var to prevent null-ptr-deref in fb_videomode_to_var fbdev: Fix do_register_framebuffer to prevent null-ptr-deref in fb_videomode_to_var fbdev: sstfb.rst: Fix spelling mistake fbdev: core: fbcvt: avoid division by 0 in fb_cvt_hperiod() fbcon: Make sure modelist not set on unregistered console vgacon: Add check for vc_origin address range in vgacon_scroll() fbdev: arkfb: Cast ics5342_init() allocation type fbdev: nvidiafb: Correct const string length in nvidiafb_setup() fbdev: atyfb: Remove unused PCI vendor ID fbdev: carminefb: Fix spelling mistake of CARMINE_TOTAL_DIPLAY_MEM fbdev: via: use new GPIO line value setter callbacks	2025-05-31 11:34:27 -07:00
Mark Brown	4cb6c8af85	selftests/filesystems: Fix build of anon_inode_test The newly added anon_inode_test test fails to build due to attempting to include a nonexisting overlayfs/wrapper.h: anon_inode_test.c:10:10: fatal error: overlayfs/wrappers.h: No such file or directory 10 \| #include "overlayfs/wrappers.h" \| ^~~~~~~~~~~~~~~~~~~~~~ This is due to `0bd92b9fe5` ("selftests/filesystems: move wrapper.h out of overlayfs subdir") which was added in the vfs-6.16.selftests branch which was based on -rc5 and did not contain the newly added test so once things were merged into mainline the build started failing - both parent commits are fine. Fixes: `3e406741b1` ("Merge tag 'vfs-6.16-rc1.selftests' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs") Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-05-31 08:43:53 -07:00
Linus Torvalds	dee264c16a	Merge tag 'gcc-minimum-version-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull compiler version requirement update from Arnd Bergmann: "Require gcc-8 and binutils-2.30 x86 already uses gcc-8 as the minimum version, this changes all other architectures to the same version. gcc-8 is used is Debian 10 and Red Hat Enterprise Linux 8, both of which are still supported, and binutils 2.30 is the oldest corresponding version on those. Ubuntu Pro 18.04 and SUSE Linux Enterprise Server 15 both use gcc-7 as the system compiler but additionally include toolchains that remain supported. With the new minimum toolchain versions, a number of workarounds for older versions can be dropped, in particular on x86_64 and arm64. Importantly, the updated compiler version allows removing two of the five remaining gcc plugins, as support for sancov and structeak features is already included in modern compiler versions. I tried collecting the known changes that are possible based on the new toolchain version, but expect that more cleanups will be possible. Since this touches multiple architectures, I merged the patches through the asm-generic tree." * tag 'gcc-minimum-version-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: Makefile.kcov: apply needed compiler option unconditionally in CFLAGS_KCOV Documentation: update binutils-2.30 version reference gcc-plugins: remove SANCOV gcc plugin Kbuild: remove structleak gcc plugin arm64: drop binutils version checks raid6: skip avx512 checks kbuild: require gcc-8 and binutils-2.30	2025-05-31 08:16:52 -07:00
Linus Torvalds	31848987f1	Merge tag 'soc-newsoc-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull sophgo SoC devicetree updates from Arnd Bergmann: "The Sophgo SG2044 SoC is their second generation server chip with 64 cores, following the SG2042. In addition, there are minor updates for the cv180x SoCs" * tag 'soc-newsoc-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: riscv: dts: sophgo: switch precise compatible for existed clock device for CV18XX riscv: dts: sophgo: Add initial device tree of Sophgo SRD3-10 dt-bindings: riscv: sophgo: Add SG2044 compatible string dt-bindings: interrupt-controller: Add Sophgo SG2044 PLIC dt-bindings: interrupt-controller: Add Sophgo SG2044 CLINT mswi riscv: dts: sopgho: use SOC_PERIPHERAL_IRQ to calculate interrupt number riscv: dts: sophgo: rename header file cv18xx.dtsi to cv180x.dtsi riscv: dts: sophgo: Move riscv cpu definition to a separate file riscv: dts: sophgo: Move all soc specific device into soc dtsi file riscv: sophgo: dts: Add spi controller for SG2042 riscv: dts: sophgo: sg2042: add pinctrl support	2025-05-31 08:14:37 -07:00
Linus Torvalds	ec71f661a5	Merge tag 'soc-dt-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC devicetree updates from Arnd Bergmann: "There are 11 newly supported SoCs, but these are all either new variants of existing designs, or straight reuses of the existing chip in a new package: - RK3562 is a new chip based on the old Cortex-A53 core, apparently a low-cost version of the Cortex-A55 based RK3568/RK3566. - NXP i.MX94 is a minor variation of i.MX93/i.MX95 with a different set of on-chip peripherals. - Renesas RZ/V2N (R9A09G056) is a new member of the larger RZ/V2 family - Amlogic S6/S7/S7D - Samsung Exynos7870 is an older chip similar to Exynos7885 - WonderMedia wm8950 is a minor variation on the wm8850 chip - Amlogic s805y is almost idential to s805x - Allwinner A523 is similar to A527 and T527 - Qualcomm MSM8926 is a variant of MSM8226 - Qualcomm Snapdragon X1P42100 is related to R1E80100 There are also 65 boards, including reference designs for the chips above, this includes - 12 new boards based on TI K3 series chips, most of them from Toradex - 10 devices using Rockchips RK35xx and PX30 chips - 2 phones and 2 laptops based on Qualcomm Snapdragon designs - 10 NXP i.MX8/i.MX9 boards, mostly for embedded/industrial uses - 3 Samsung Galaxy phones based on Exynos7870 - 5 Allwinner based boards using a variety of ARMv8 chips - 9 32-bit machines, each based on a different SoC family Aside from the new hardware, there is the usual set of cleanups and newly added hardware support on existing machines, for a total of 965 devicetree changesets" * tag 'soc-dt-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (956 commits) MAINTAINERS, mailmap: update Sven Peter's email address arm64: dts: renesas: rzg3e-smarc-som: Reduce I2C2 clock frequency arm64: dts: nuvoton: Add pinctrl ARM: dts: samsung: sp5v210-aries: Align wifi node name with bindings arm64: dts: blaize-blzp1600: Enable GPIO support dt-bindings: clock: socfpga: convert to yaml arm64: dts: rockchip: move rk3562 pinctrl node outside the soc node arm64: dts: rockchip: fix rk3562 pcie unit addresses arm64: dts: rockchip: move rk3528 pinctrl node outside the soc node arm64: dts: rockchip: remove a double-empty line from rk3576 core dtsi arm64: dts: rockchip: move rk3576 pinctrl node outside the soc node arm64: dts: rockchip: fix rk3576 pcie unit addresses arm64: dts: rockchip: Drop assigned-clock* from cpu nodes on rk3588 arm64: dts: rockchip: Add missing SFC power-domains to rk3576 Revert "arm64: dts: mediatek: mt8390-genio-common: Add firmware-name for scp0" arm64: dts: mediatek: mt8188: Address binding warnings for MDP3 nodes arm64: dts: mt6359: Rename RTC node to match binding expectations arm64: dts: mt8365-evk: Add goodix touchscreen support arm64: dts: mediatek: mt8188: Add missing #reset-cells property arm64: dts: airoha: en7581: Add PCIe nodes to EN7581 SoC evaluation board ...	2025-05-31 08:08:56 -07:00
Linus Torvalds	f79749a8e4	Merge tag 'soc-defconfig-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC defconfig updates from Arnd Bergmann: "The usual defconfig updates enable configuration options for drivers that got added. A few SoC specific options are enabled in Kconfig files instead, in place of the defconfig files" * tag 'soc-defconfig-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: arm64: defconfig: enable ACPM protocol and Exynos mailbox arm64: defconfig: Enable configs for MediaTek Genio EVK boards arm64: defconfig: mediatek: enable PHY drivers arm64: defconfig: Enable Rockchip SAI and ES8328 arm64: defconfig: Add Toradex Embedded Controller config arm64: defconfig: Enable TPIC2810 GPIO expander riscv: defconfig: spacemit: enable clock controller driver for SpacemiT K1 arm64: defconfig: Enable TMP102 as module arm64: defconfig: Enable hwspinlock and eQEP for K3 arm64: defconfig: Add CDNS_DSI and CDNS_PHY config riscv: defconfig: spacemit: enable gpio support for K1 SoC arm64: defconfig: Enable IPQ5424 RDP466 base configs riscv: Enable PM_GENERIC_DOMAINS for T-Head SoCs	2025-05-31 08:06:14 -07:00
Linus Torvalds	500af1defb	Merge tag 'soc-arm-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC updates from Arnd Bergmann: "The main update in size is the removal of the TI DaVinci DA830 SoC support. DA830 is similar to DA850, which remain supported, but only the reference board was ever supported, and we removed that one 3 years ago as it had never been converted to devicetree. There are some other cleanups for OMAP4 and a few boards using old GPIO interfaces" * tag 'soc-arm-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: s3c: stop including gpio.h ARM: dts: davinci: da850-evm: Increase fifo threshold ARM: OMAP2+: Fix l4ls clk domain handling in STANDBY ARM: broadcom: MAINTAINERS: Cover bcm2712 files bus: ti-sysc: PRUSS OCP configuration ARM: davinci: remove support for da830 ARM: omap: pmic-cpcap: do not mess around without CPCAP or OMAP4 ARM: omap2plus_defconfig: enable I2C devices of GTA04 ARM: s3c/gpio: use new line value setter callbacks ARM: scoop/gpio: use new line value setter callbacks ARM: sa1100/gpio: use new line value setter callbacks ARM: orion/gpio: use new line value setter callbacks	2025-05-31 08:03:09 -07:00
Linus Torvalds	297d9111e9	Merge tag 'soc-drivers-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC driver updates from Arnd Bergmann: "Updates are across the usual driver subsystems with SoC specific drivers: - added soc specicific drivers for sophgo cv1800 and sg2044, qualcomm sm8750, and amlogic c3 and s4 chips. - cache controller updates for sifive chips, plus binding changes for other cache descriptions. - memory controller drivers for mediatek mt6893, stm32 and cleanups for a few more drivers - reset controller drivers for T-Head TH1502, Sophgo sg2044 and Renesas RZ/V2H(P) - SCMI firmware updates to better deal with buggy firmware, plus better support for Qualcomm X1E and NXP i.MX specific interfaces - a new platform driver for the crypto firmware on Cznic Turris Omnia/MOX - cleanups for the TEE firmware subsystem and amdtee driver - minor updates and fixes for freescale/nxp, qualcomm, google, aspeed, wondermedia, ti, nxp, renesas, hisilicon, mediatek, broadcom and samsung SoCs" * tag 'soc-drivers-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (133 commits) soc: aspeed: Add NULL check in aspeed_lpc_enable_snoop() soc: aspeed: lpc: Fix impossible judgment condition ARM: aspeed: Don't select SRAM docs: firmware: qcom_scm: Fix kernel-doc warning soc: fsl: qe: Consolidate chained IRQ handler install/remove firmware: qcom: scm: Allow QSEECOM for HP EliteBook Ultra G1q dt-bindings: mfd: qcom,tcsr: Add compatible for ipq5018 dt-bindings: cache: add QiLai compatible to ax45mp memory: stm32_omm: Fix error handling in stm32_omm_disable_child() dt-bindings: cache: Convert marvell,tauros2-cache to DT schema dt-bindings: cache: Convert marvell,{feroceon,kirkwood}-cache to DT schema soc: samsung: exynos-pmu: enable CPU hotplug support for gs101 MAINTAINERS: Add google,gs101-pmu-intr-gen.yaml binding file dt-bindings: soc: samsung: exynos-pmu: gs101: add google,pmu-intr-gen phandle dt-bindings: soc: google: Add gs101-pmu-intr-gen binding documentation bus: fsl-mc: Use strscpy() instead of strscpy_pad() soc: fsl: qbman: Remove const from portal->cgrs allocation type bus: fsl_mc: Fix driver_managed_dma check bus: fsl-mc: increase MC_CMD_COMPLETION_TIMEOUT_MS value bus: fsl-mc: drop useless cleanup ...	2025-05-31 07:53:30 -07:00
Helge Deller	213205889d	parisc/unaligned: Fix hex output to show 8 hex chars Change back printk format to 0x%08lx instead of %#08lx, since the latter does not seem to reliably format the value to 8 hex chars. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v5.18+ Fixes: `e5e9e7f222` ("parisc/unaligned: Enhance user-space visible output")	2025-05-31 16:50:39 +02:00
Linus Torvalds	9d49da4388	Revert "iommu: make inclusion of arm/arm-smmu-v3 directory conditional" This reverts commit `e436576b02`. That commit is very broken, and seems to have missed the fact that CONFIG_ARM_SMMU_V3 is not just a yes-or-no thing, but also can be modular. So it caused build errors on arm64 allmodconfig setups: ERROR: modpost: "arm_smmu_make_cdtable_ste" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ERROR: modpost: "arm_smmu_make_s2_domain_ste" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ERROR: modpost: "arm_smmu_make_s1_cd" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ... (and six more symbols just the same). Link: https://lore.kernel.org/all/CAHk-=wh4qRwm7AQ8sBmQj7qECzgAhj4r73RtCDfmHo5SdcN0Jw@mail.gmail.com/ Cc: Joerg Roedel <joro@8bytes.org> Cc: Rolf Eike Beer <eb@emlix.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-05-31 07:43:16 -07:00
Ian Rogers	a913ef6fd8	perf callchain: Always populate the addr_location map when adding IP Dropping symbols also meant the callchain maps wasn't populated, but the callchain map is needed to find the DSO. Plumb the symbols option better, falling back to thread__find_map() rather than thread__find_symbol() when symbols are disabled. Fixes: `02b2705017` ("perf callchain: Allow symbols to be optional when resolving a callchain") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Graham Woodward <graham.woodward@arm.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <martin.liska@hey.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Matt Fleming <matt@readmodwrite.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steve Clevenger <scclevenger@os.amperecomputing.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Yujie Liu <yujie.liu@intel.com> Cc: Zhongqiu Han <quic_zhonhan@quicinc.com> Cc: Zixian Cai <fzczx123@gmail.com> Link: https://lore.kernel.org/r/20250529044000.759937-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-05-31 08:58:30 -03:00
Namhyung Kim	0df14c1f1e	perf lock contention: Reject more than 10ms delays for safety Delaying kernel operations can be dangerous and the kernel may kill (non-sleepable) BPF programs running for long in the future. Limit the max delay to 10ms and update the document about it. $ sudo ./perf lock con -abl -J 100000us@cgroup_mutex true lock delay is too long: 100000us (> 10ms) Usage: perf lock contention [<options>] -J, --inject-delay <TIME@FUNC> Inject delays to specific locks Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250515181042.555189-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-05-31 08:45:24 -03:00
Murad Masimov	05f6e18387	fbdev: Fix fb_set_var to prevent null-ptr-deref in fb_videomode_to_var If fb_add_videomode() in fb_set_var() fails to allocate memory for fb_videomode, later it may lead to a null-ptr dereference in fb_videomode_to_var(), as the fb_info is registered while not having the mode in modelist that is expected to be there, i.e. the one that is described in fb_info->var. ================================================================ general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 1 PID: 30371 Comm: syz-executor.1 Not tainted 5.10.226-syzkaller #0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:fb_videomode_to_var+0x24/0x610 drivers/video/fbdev/core/modedb.c:901 Call Trace: display_to_var+0x3a/0x7c0 drivers/video/fbdev/core/fbcon.c:929 fbcon_resize+0x3e2/0x8f0 drivers/video/fbdev/core/fbcon.c:2071 resize_screen drivers/tty/vt/vt.c:1176 [inline] vc_do_resize+0x53a/0x1170 drivers/tty/vt/vt.c:1263 fbcon_modechanged+0x3ac/0x6e0 drivers/video/fbdev/core/fbcon.c:2720 fbcon_update_vcs+0x43/0x60 drivers/video/fbdev/core/fbcon.c:2776 do_fb_ioctl+0x6d2/0x740 drivers/video/fbdev/core/fbmem.c:1128 fb_ioctl+0xe7/0x150 drivers/video/fbdev/core/fbmem.c:1203 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x19a/0x210 fs/ioctl.c:739 do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x67/0xd1 ================================================================ The reason is that fb_info->var is being modified in fb_set_var(), and then fb_videomode_to_var() is called. If it fails to add the mode to fb_info->modelist, fb_set_var() returns error, but does not restore the old value of fb_info->var. Restore fb_info->var on failure the same way it is done earlier in the function. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru> Signed-off-by: Helge Deller <deller@gmx.de>	2025-05-31 10:24:02 +02:00
Murad Masimov	17186f1f90	fbdev: Fix do_register_framebuffer to prevent null-ptr-deref in fb_videomode_to_var If fb_add_videomode() in do_register_framebuffer() fails to allocate memory for fb_videomode, it will later lead to a null-ptr dereference in fb_videomode_to_var(), as the fb_info is registered while not having the mode in modelist that is expected to be there, i.e. the one that is described in fb_info->var. ================================================================ general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 1 PID: 30371 Comm: syz-executor.1 Not tainted 5.10.226-syzkaller #0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:fb_videomode_to_var+0x24/0x610 drivers/video/fbdev/core/modedb.c:901 Call Trace: display_to_var+0x3a/0x7c0 drivers/video/fbdev/core/fbcon.c:929 fbcon_resize+0x3e2/0x8f0 drivers/video/fbdev/core/fbcon.c:2071 resize_screen drivers/tty/vt/vt.c:1176 [inline] vc_do_resize+0x53a/0x1170 drivers/tty/vt/vt.c:1263 fbcon_modechanged+0x3ac/0x6e0 drivers/video/fbdev/core/fbcon.c:2720 fbcon_update_vcs+0x43/0x60 drivers/video/fbdev/core/fbcon.c:2776 do_fb_ioctl+0x6d2/0x740 drivers/video/fbdev/core/fbmem.c:1128 fb_ioctl+0xe7/0x150 drivers/video/fbdev/core/fbmem.c:1203 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x19a/0x210 fs/ioctl.c:739 do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x67/0xd1 ================================================================ Even though fbcon_init() checks beforehand if fb_match_mode() in var_to_display() fails, it can not prevent the panic because fbcon_init() does not return error code. Considering this and the comment in the code about fb_match_mode() returning NULL - "This should not happen" - it is better to prevent registering the fb_info if its mode was not set successfully. Also move fb_add_videomode() closer to the beginning of do_register_framebuffer() to avoid having to do the cleanup on fail. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru> Signed-off-by: Helge Deller <deller@gmx.de>	2025-05-31 10:24:02 +02:00
Rujra Bhatt	9c221db509	fbdev: sstfb.rst: Fix spelling mistake Fix spelling: "tweeks" to "tweaks" Signed-off-by: Rujra Bhatt <braker.noob.kernel@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>	2025-05-31 10:24:02 +02:00
Sergey Shtylyov	3f6dae09fc	fbdev: core: fbcvt: avoid division by 0 in fb_cvt_hperiod() In fb_find_mode_cvt(), iff mode->refresh somehow happens to be 0x80000000, cvt.f_refresh will become 0 when multiplying it by 2 due to overflow. It's then passed to fb_cvt_hperiod(), where it's used as a divider -- division by 0 will result in kernel oops. Add a sanity check for cvt.f_refresh to avoid such overflow... Found by Linux Verification Center (linuxtesting.org) with the Svace static analysis tool. Fixes: `96fe6a2109` ("[PATCH] fbdev: Add VESA Coordinated Video Timings (CVT) support") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Helge Deller <deller@gmx.de>	2025-05-31 10:24:02 +02:00
Kees Cook	cedc1b6339	fbcon: Make sure modelist not set on unregistered console It looks like attempting to write to the "store_modes" sysfs node will run afoul of unregistered consoles: UBSAN: array-index-out-of-bounds in drivers/video/fbdev/core/fbcon.c:122:28 index -1 is out of range for type 'fb_info [32]' ... fbcon_info_from_console+0x192/0x1a0 drivers/video/fbdev/core/fbcon.c:122 fbcon_new_modelist+0xbf/0x2d0 drivers/video/fbdev/core/fbcon.c:3048 fb_new_modelist+0x328/0x440 drivers/video/fbdev/core/fbmem.c:673 store_modes+0x1c9/0x3e0 drivers/video/fbdev/core/fbsysfs.c:113 dev_attr_store+0x55/0x80 drivers/base/core.c:2439 static struct fb_info fbcon_registered_fb[FB_MAX]; ... static signed char con2fb_map[MAX_NR_CONSOLES]; ... static struct fb_info *fbcon_info_from_console(int console) ... return fbcon_registered_fb[con2fb_map[console]]; If con2fb_map contains a -1 things go wrong here. Instead, return NULL, as callers of fbcon_info_from_console() are trying to compare against existing "info" pointers, so error handling should kick in correctly. Reported-by: syzbot+a7d4444e7b6e743572f7@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/679d0a8f.050a0220.163cdc.000c.GAE@google.com/ Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Helge Deller <deller@gmx.de>	2025-05-31 10:24:02 +02:00
GONG Ruiqi	864f9963ec	vgacon: Add check for vc_origin address range in vgacon_scroll() Our in-house Syzkaller reported the following BUG (twice), which we believed was the same issue with [1]: ================================================================== BUG: KASAN: slab-out-of-bounds in vcs_scr_readw+0xc2/0xd0 drivers/tty/vt/vt.c:4740 Read of size 2 at addr ffff88800f5bef60 by task syz.7.2620/12393 ... Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x72/0xa0 lib/dump_stack.c:106 print_address_description.constprop.0+0x6b/0x3d0 mm/kasan/report.c:364 print_report+0xba/0x280 mm/kasan/report.c:475 kasan_report+0xa9/0xe0 mm/kasan/report.c:588 vcs_scr_readw+0xc2/0xd0 drivers/tty/vt/vt.c:4740 vcs_write_buf_noattr drivers/tty/vt/vc_screen.c:493 [inline] vcs_write+0x586/0x840 drivers/tty/vt/vc_screen.c:690 vfs_write+0x219/0x960 fs/read_write.c:584 ksys_write+0x12e/0x260 fs/read_write.c:639 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x59/0x110 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x78/0xe2 ... </TASK> Allocated by task 5614: kasan_save_stack+0x20/0x40 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 ____kasan_kmalloc mm/kasan/common.c:374 [inline] __kasan_kmalloc+0x8f/0xa0 mm/kasan/common.c:383 kasan_kmalloc include/linux/kasan.h:201 [inline] __do_kmalloc_node mm/slab_common.c:1007 [inline] __kmalloc+0x62/0x140 mm/slab_common.c:1020 kmalloc include/linux/slab.h:604 [inline] kzalloc include/linux/slab.h:721 [inline] vc_do_resize+0x235/0xf40 drivers/tty/vt/vt.c:1193 vgacon_adjust_height+0x2d4/0x350 drivers/video/console/vgacon.c:1007 vgacon_font_set+0x1f7/0x240 drivers/video/console/vgacon.c:1031 con_font_set drivers/tty/vt/vt.c:4628 [inline] con_font_op+0x4da/0xa20 drivers/tty/vt/vt.c:4675 vt_k_ioctl+0xa10/0xb30 drivers/tty/vt/vt_ioctl.c:474 vt_ioctl+0x14c/0x1870 drivers/tty/vt/vt_ioctl.c:752 tty_ioctl+0x655/0x1510 drivers/tty/tty_io.c:2779 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x12d/0x190 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x59/0x110 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x78/0xe2 Last potentially related work creation: kasan_save_stack+0x20/0x40 mm/kasan/common.c:45 __kasan_record_aux_stack+0x94/0xa0 mm/kasan/generic.c:492 __call_rcu_common.constprop.0+0xc3/0xa10 kernel/rcu/tree.c:2713 netlink_release+0x620/0xc20 net/netlink/af_netlink.c:802 __sock_release+0xb5/0x270 net/socket.c:663 sock_close+0x1e/0x30 net/socket.c:1425 __fput+0x408/0xab0 fs/file_table.c:384 __fput_sync+0x4c/0x60 fs/file_table.c:465 __do_sys_close fs/open.c:1580 [inline] __se_sys_close+0x68/0xd0 fs/open.c:1565 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x59/0x110 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x78/0xe2 Second to last potentially related work creation: kasan_save_stack+0x20/0x40 mm/kasan/common.c:45 __kasan_record_aux_stack+0x94/0xa0 mm/kasan/generic.c:492 __call_rcu_common.constprop.0+0xc3/0xa10 kernel/rcu/tree.c:2713 netlink_release+0x620/0xc20 net/netlink/af_netlink.c:802 __sock_release+0xb5/0x270 net/socket.c:663 sock_close+0x1e/0x30 net/socket.c:1425 __fput+0x408/0xab0 fs/file_table.c:384 task_work_run+0x154/0x240 kernel/task_work.c:239 exit_task_work include/linux/task_work.h:45 [inline] do_exit+0x8e5/0x1320 kernel/exit.c:874 do_group_exit+0xcd/0x280 kernel/exit.c:1023 get_signal+0x1675/0x1850 kernel/signal.c:2905 arch_do_signal_or_restart+0x80/0x3b0 arch/x86/kernel/signal.c:310 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x1b3/0x1e0 kernel/entry/common.c:218 do_syscall_64+0x66/0x110 arch/x86/entry/common.c:87 entry_SYSCALL_64_after_hwframe+0x78/0xe2 The buggy address belongs to the object at ffff88800f5be000 which belongs to the cache kmalloc-2k of size 2048 The buggy address is located 2656 bytes to the right of allocated 1280-byte region [ffff88800f5be000, ffff88800f5be500) ... Memory state around the buggy address: ffff88800f5bee00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88800f5bee80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88800f5bef00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff88800f5bef80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88800f5bf000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== By analyzing the vmcore, we found that vc->vc_origin was somehow placed one line prior to vc->vc_screenbuf when vc was in KD_TEXT mode, and further writings to /dev/vcs caused out-of-bounds reads (and writes right after) in vcs_write_buf_noattr(). Our further experiments show that in most cases, vc->vc_origin equals to vga_vram_base when the console is in KD_TEXT mode, and it's around vc->vc_screenbuf for the KD_GRAPHICS mode. But via triggerring a TIOCL_SETVESABLANK ioctl beforehand, we can make vc->vc_origin be around vc->vc_screenbuf while the console is in KD_TEXT mode, and then by writing the special 'ESC M' control sequence to the tty certain times (depends on the value of `vc->state.y - vc->vc_top`), we can eventually move vc->vc_origin prior to vc->vc_screenbuf. Here's the PoC, tested on QEMU: ``` int main() { const int RI_NUM = 10; // should be greater than `vc->state.y - vc->vc_top` int tty_fd, vcs_fd; const char tty_path = "/dev/tty0"; const char vcs_path = "/dev/vcs"; const char escape_seq[] = "\x1bM"; // ESC + M const char trigger_seq[] = "Let's trigger an OOB write."; struct vt_sizes vt_size = { 70, 2 }; int blank = TIOCL_BLANKSCREEN; tty_fd = open(tty_path, O_RDWR); char vesa_mode[] = { TIOCL_SETVESABLANK, 1 }; ioctl(tty_fd, TIOCLINUX, vesa_mode); ioctl(tty_fd, TIOCLINUX, &blank); ioctl(tty_fd, VT_RESIZE, &vt_size); for (int i = 0; i < RI_NUM; ++i) write(tty_fd, escape_seq, sizeof(escape_seq) - 1); vcs_fd = open(vcs_path, O_RDWR); write(vcs_fd, trigger_seq, sizeof(trigger_seq)); close(vcs_fd); close(tty_fd); return 0; } ``` To solve this problem, add an address range validation check in vgacon_scroll(), ensuring vc->vc_origin never precedes vc_screenbuf. Reported-by: syzbot+9c09fda97a1a65ea859b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9c09fda97a1a65ea859b [1] Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Co-developed-by: Yi Yang <yiyang13@huawei.com> Signed-off-by: Yi Yang <yiyang13@huawei.com> Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com> Signed-off-by: Helge Deller <deller@gmx.de>	2025-05-31 10:24:02 +02:00
Kees Cook	ede481f6da	fbdev: arkfb: Cast ics5342_init() allocation type In preparation for making the kmalloc family of allocators type aware, we need to make sure that the returned type from the allocation matches the type of the variable being assigned. (Before, the allocator would always return "void ", which can be implicitly cast to any pointer type.) The assigned type is "struct dac_info " but the returned type will be "struct ics5342_info *", which has a larger allocation size. This is by design, as struct ics5342_info contains struct dac_info as its first member. (patch slightly modified by Helge Deller) Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Helge Deller <deller@gmx.de>	2025-05-31 10:24:02 +02:00
Zijun Hu	34fe05cd2d	fbdev: nvidiafb: Correct const string length in nvidiafb_setup() The actual length of const string "noaccel" is 7, but the strncmp() branch in nvidiafb_setup() wrongly hard codes it as 6. Fix by using actual length 7 as argument of the strncmp(). Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Helge Deller <deller@gmx.de>	2025-05-31 10:24:01 +02:00
Andy Shevchenko	c9b26429c8	fbdev: atyfb: Remove unused PCI vendor ID The custom definition of PCI vendor ID in video/mach64.h is unused. Remove it. Note, that the proper one is available in pci_ids.h. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Helge Deller <deller@gmx.de>	2025-05-31 10:24:01 +02:00
Colin Ian King	67ebb5890a	fbdev: carminefb: Fix spelling mistake of CARMINE_TOTAL_DIPLAY_MEM There is a spelling mistake in macro CARMINE_TOTAL_DIPLAY_MEM. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>	2025-05-31 10:24:01 +02:00
Bartosz Golaszewski	2be358790e	fbdev: via: use new GPIO line value setter callbacks struct gpio_chip now has callbacks for setting line values that return an integer, allowing to indicate failures. Convert the driver to using them. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Helge Deller <deller@gmx.de>	2025-05-31 10:24:01 +02:00
Dapeng Mi	86aa94cd50	perf/x86/intel: Fix incorrect MSR index calculations in intel_pmu_config_acr() The MSR offset calculations in intel_pmu_config_acr() are buggy. To calculate fixed counter MSR addresses in intel_pmu_config_acr(), the HW counter index "idx" is subtracted by INTEL_PMC_IDX_FIXED. This leads to the ACR mask value of fixed counters to be incorrectly saved to the positions of GP counters in acr_cfg_b[], e.g. For fixed counter 0, its ACR counter mask should be saved to acr_cfg_b[32], but it's saved to acr_cfg_b[0] incorrectly. Fix this issue. [ mingo: Clarified & improved the changelog. ] Fixes: `ec980e4fac` ("perf/x86/intel: Support auto counter reload") Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250529080236.2552247-2-dapeng1.mi@linux.intel.com	2025-05-31 10:05:16 +02:00
Steven Rostedt	99850a1c93	x86/fpu: Remove unused trace events The following trace events are not used and defining them just wastes memory: x86_fpu_before_restore x86_fpu_after_restore x86_fpu_init_state Simply remove them. Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linux Trace Kernel <linux-trace-kernel@vger.kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/all/20250529130138.544ffec4@gandalf.local.home # background Link: https://lore.kernel.org/r/20250529131024.7c2ef96f@gandalf.local.home # x86 submission	2025-05-31 09:40:40 +02:00
Linus Torvalds	8bf722c684	Merge tag 'trace-ringbuffer-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring-buffer updates from Steven Rostedt: - Allow the persistent ring buffer to be memory mapped In the last merge window there was issues with the implementation of mapping the persistent ring buffer because it was assumed that the persistent memory was just physical memory without being part of the kernel virtual address space. But this was incorrect and the persistent ring buffer can be mapped the same way as the allocated ring buffer is mapped. The metadata for the persistent ring buffer is different than the normal ring buffer and the organization of mapping it to user space is a little different. Make the updates needed to the meta data to allow the persistent ring buffer to be mapped to user space. - Fix cpus_read_lock() with buffer->mutex and cpu_buffer->mapping_lock Mapping the ring buffer to user space uses the cpu_buffer->mapping_lock. The buffer->mutex can be taken when the mapping_lock is held, giving the locking order of: cpu_buffer->mapping_lock -->> buffer->mutex. But there also exists the ordering: buffer->mutex -->> cpus_read_lock() mm->mmap_lock -->> cpu_buffer->mapping_lock cpus_read_lock() -->> mm->mmap_lock causing a circular chain of: cpu_buffer->mapping_lock -> buffer->mutex -->> cpus_read_lock() -->> mm->mmap_lock -->> cpu_buffer->mapping_lock By moving the cpus_read_lock() outside the buffer->mutex where: cpus_read_lock() -->> buffer->mutex, breaks the deadlock chain. - Do not trigger WARN_ON() for commit overrun When the ring buffer is user space mapped and there's a "commit overrun" (where an interrupt preempted an event, and then added so many events it filled the buffer having to drop events when it hit the preempted event) a WARN_ON() was triggered if this was read via a memory mapped buffer. This is due to "missed events" being non zero when the reader page ended up with the commit page. The idea was, if the writer is on the reader page, there's only one page that has been written to and there should be no missed events. But if a commit overrun is done where the writer is off the commit page and looped around to the commit page causing missed events, it is possible that the reader page is the commit page with missed events. Instead of triggering a WARN_ON() when the reader page is the commit page with missed events, trigger it when the reader page is the tail_page with missed events. That's because the writer is always on the tail_page if an event was interrupted (which holds the commit event) and continues off the commit page. - Reset the persistent buffer if it is fully consumed On boot up, if the user fully consumes the last boot buffer of the persistent buffer, if it reboots without enabling it, there will still be events in the buffer which can cause confusion. Instead, reset the buffer when it is fully consumed, so that the data is not read again. - Clean up some goto out jumps There's a few cases that the code jumps to the "out:" label that simply returns a value. There used to be more work done at those labels but now that they simply return a value use a return instead of jumping to a label. - Use guard() to simplify some of the code Add guard() around some locking instead of jumping to a label to do the unlocking. - Use free() to simplify some of the code Use free(kfree) on variables that will get freed on error and use return_ptr() to return the variable when its not freed. There's one instance where free(kfree) simplifies the code on a temp variable that was allocated just for the function use. * tag 'trace-ringbuffer-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Simplify functions with __free(kfree) to free allocations ring-buffer: Make ring_buffer_{un}map() simpler with guard(mutex) ring-buffer: Simplify ring_buffer_read_page() with guard() ring-buffer: Simplify reset_disabled_cpu_buffer() with use of guard() ring-buffer: Remove jump to out label in ring_buffer_swap_cpu() ring-buffer: Removed unnecessary if() goto out where out is the next line tracing: Reset last-boot buffers when reading out all cpu buffers ring-buffer: Allow reserve_mem persistent ring buffers to be mmapped ring-buffer: Do not trigger WARN_ON() due to a commit_overrun ring-buffer: Move cpus_read_lock() outside of buffer->mutex	2025-05-30 21:20:11 -07:00
Linus Torvalds	03ebff0c83	Merge tag 'microblaze-v6.16' of git://git.monstr.eu/linux-2.6-microblaze Pull microblaze update from Michal Simek: - Small OF update * tag 'microblaze-v6.16' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Use of_property_present() for non-boolean properties	2025-05-30 21:11:21 -07:00
Jakub Kicinski	558428921e	Merge branch 'net-fix-inet_proto_csum_replace_by_diff-for-ipv6' Paul Chaignon says: ==================== net: Fix inet_proto_csum_replace_by_diff for IPv6 This patchset fixes a bug that causes skb->csum to hold an incorrect value when calling inet_proto_csum_replace_by_diff for an IPv6 packet in CHECKSUM_COMPLETE state. This bug affects BPF helper bpf_l4_csum_replace and IPv6 ILA in adj-transport mode. In those cases, inet_proto_csum_replace_by_diff updates the L4 checksum field after an IPv6 address change. These two changes cancel each other in terms of checksum, so skb->csum shouldn't be updated. v2: https://lore.kernel.org/aCz84JU60wd8etiT@mail.gmail.com ==================== Link: https://patch.msgid.link/cover.1748509484.git.paul.chaignon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-05-30 19:53:53 -07:00
Paul Chaignon	ead7f9b8de	bpf: Fix L4 csum update on IPv6 in CHECKSUM_COMPLETE In Cilium, we use bpf_csum_diff + bpf_l4_csum_replace to, among other things, update the L4 checksum after reverse SNATing IPv6 packets. That use case is however not currently supported and leads to invalid skb->csum values in some cases. This patch adds support for IPv6 address changes in bpf_l4_csum_update via a new flag. When calling bpf_l4_csum_replace in Cilium, it ends up calling inet_proto_csum_replace_by_diff: 1: void inet_proto_csum_replace_by_diff(__sum16 sum, struct sk_buff skb, 2: __wsum diff, bool pseudohdr) 3: { 4: if (skb->ip_summed != CHECKSUM_PARTIAL) { 5: csum_replace_by_diff(sum, diff); 6: if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr) 7: skb->csum = ~csum_sub(diff, skb->csum); 8: } else if (pseudohdr) { 9: sum = ~csum_fold(csum_add(diff, csum_unfold(sum))); 10: } 11: } The bug happens when we're in the CHECKSUM_COMPLETE state. We've just updated one of the IPv6 addresses. The helper now updates the L4 header checksum on line 5. Next, it updates skb->csum on line 7. It shouldn't. For an IPv6 packet, the updates of the IPv6 address and of the L4 checksum will cancel each other. The checksums are set such that computing a checksum over the packet including its checksum will result in a sum of 0. So the same is true here when we update the L4 checksum on line 5. We'll update it as to cancel the previous IPv6 address update. Hence skb->csum should remain untouched in this case. The same bug doesn't affect IPv4 packets because, in that case, three fields are updated: the IPv4 address, the IP checksum, and the L4 checksum. The change to the IPv4 address and one of the checksums still cancel each other in skb->csum, but we're left with one checksum update and should therefore update skb->csum accordingly. That's exactly what inet_proto_csum_replace_by_diff does. This special case for IPv6 L4 checksums is also described atop inet_proto_csum_replace16, the function we should be using in this case. This patch introduces a new bpf_l4_csum_replace flag, BPF_F_IPV6, to indicate that we're updating the L4 checksum of an IPv6 packet. When the flag is set, inet_proto_csum_replace_by_diff will skip the skb->csum update. Fixes: `7d672345ed` ("bpf: add generic bpf_csum_diff helper") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/96a6bc3a443e6f0b21ff7b7834000e17fb549e05.1748509484.git.paul.chaignon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-05-30 19:53:51 -07:00
Paul Chaignon	6043b794c7	net: Fix checksum update for ILA adj-transport During ILA address translations, the L4 checksums can be handled in different ways. One of them, adj-transport, consist in parsing the transport layer and updating any found checksum. This logic relies on inet_proto_csum_replace_by_diff and produces an incorrect skb->csum when in state CHECKSUM_COMPLETE. This bug can be reproduced with a simple ILA to SIR mapping, assuming packets are received with CHECKSUM_COMPLETE: $ ip a show dev eth0 14: eth0@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 62:ae:35:9e:0f:8d brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet6 3333:0:0:1::c078/64 scope global valid_lft forever preferred_lft forever inet6 fd00:10:244:1::c078/128 scope global nodad valid_lft forever preferred_lft forever inet6 fe80::60ae:35ff:fe9e:f8d/64 scope link proto kernel_ll valid_lft forever preferred_lft forever $ ip ila add loc_match fd00:10:244:1 loc 3333:0:0:1 \ csum-mode adj-transport ident-type luid dev eth0 Then I hit [fd00:10:244:1::c078]:8000 with a server listening only on [3333:0:0:1::c078]:8000. With the bug, the SYN packet is dropped with SKB_DROP_REASON_TCP_CSUM after inet_proto_csum_replace_by_diff changed skb->csum. The translation and drop are visible on pwru [1] traces: IFACE TUPLE FUNC eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) ipv6_rcv eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) ip6_rcv_core eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) nf_hook_slow eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) inet_proto_csum_replace_by_diff eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) tcp_v6_early_demux eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_route_input eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_input eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_input_finish eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_protocol_deliver_rcu eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) raw6_local_deliver eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ipv6_raw_deliver eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) tcp_v6_rcv eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) __skb_checksum_complete eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) kfree_skb_reason(SKB_DROP_REASON_TCP_CSUM) eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_release_head_state eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_release_data eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_free_head eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) kfree_skbmem This is happening because inet_proto_csum_replace_by_diff is updating skb->csum when it shouldn't. The L4 checksum is updated such that it "cancels" the IPv6 address change in terms of checksum computation, so the impact on skb->csum is null. Note this would be different for an IPv4 packet since three fields would be updated: the IPv4 address, the IP checksum, and the L4 checksum. Two would cancel each other and skb->csum would still need to be updated to take the L4 checksum change into account. This patch fixes it by passing an ipv6 flag to inet_proto_csum_replace_by_diff, to skip the skb->csum update if we're in the IPv6 case. Note the behavior of the only other user of inet_proto_csum_replace_by_diff, the BPF subsystem, is left as is in this patch and fixed in the subsequent patch. With the fix, using the reproduction from above, I can confirm skb->csum is not touched by inet_proto_csum_replace_by_diff and the TCP SYN proceeds to the application after the ILA translation. Link: https://github.com/cilium/pwru [1] Fixes: `65d7ab8de5` ("net: Identifier Locator Addressing module") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/b5539869e3550d46068504feb02d37653d939c0b.1748509484.git.paul.chaignon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-05-30 19:53:51 -07:00
Jakub Kicinski	44abca1fe3	Merge branch 'net-stmmac-prevent-div-by-0' Alexis Lothoré says: ==================== net: stmmac: prevent div by 0 fix a small splat I am observing on a STM32MP157 platform at boot (see commit 1) due to a division by 0. v3: https://lore.kernel.org/20250528-stmmac_tstamp_div-v3-0-b525ecdfd84c@bootlin.com v2: https://lore.kernel.org/20250527-stmmac_tstamp_div-v2-1-663251b3b542@bootlin.com v1: https://lore.kernel.org/20250523-stmmac_tstamp_div-v1-1-bca8a5a3a477@bootlin.com ==================== Link: https://patch.msgid.link/20250529-stmmac_tstamp_div-v4-0-d73340a794d5@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-05-30 19:33:30 -07:00

... 23 24 25 26 27 ...

1368207 Commits