linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-21 16:38:07 -04:00

Author	SHA1	Message	Date
Linus Torvalds	388a810192	Merge tag 'erofs-for-6.3-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: "The most important one reverts an improper fix which can cause an unexpected warning more often on specific images, and another one fixes LZMA decompression on 32-bit platforms. The others are minor fixes and cleanups. - Fix LZMA decompression failure on HIGHMEM platforms - Revert an inproper fix since it is actually an implementation issue of vmalloc() - Avoid a wrong DBG_BUGON since it could be triggered with -EINTR - Minor cleanups" * tag 'erofs-for-6.3-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: use wrapper i_blocksize() in erofs_file_read_iter() erofs: get rid of a useless DBG_BUGON erofs: Revert "erofs: fix kvcalloc() misuse with __GFP_NOFAIL" erofs: fix wrong kunmap when using LZMA on HIGHMEM platforms erofs: mark z_erofs_lzma_init/erofs_pcpubuf_init w/ __init	2023-03-10 08:51:57 -08:00
Linus Torvalds	92cadfcffa	Merge tag 'nfsd-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Protect NFSD writes against filesystem freezing - Fix a potential memory leak during server shutdown * tag 'nfsd-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Fix a server shutdown leak NFSD: Protect against filesystem freezing	2023-03-10 08:45:30 -08:00
Linus Torvalds	ae195ca1a8	Merge tag 'for-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "First batch of fixes. Among them there are two updates to sysfs and ioctl which are not strictly fixes but are used for testing so there's no reason to delay them. - fix block group item corruption after inserting new block group - fix extent map logging bit not cleared for split maps after dropping range - fix calculation of unusable block group space reporting bogus values due to 32/64b division - fix unnecessary increment of read error stat on write error - improve error handling in inode update - export per-device fsid in DEV_INFO ioctl to distinguish seeding devices, needed for testing - allocator size classes: - fix potential dead lock in size class loading logic - print sysfs stats for the allocation classes" * tag 'for-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix block group item corruption after inserting new block group btrfs: fix extent map logging bit not cleared for split maps after dropping range btrfs: fix percent calculation for bg reclaim message btrfs: fix unnecessary increment of read error stat on write error btrfs: handle btrfs_del_item errors in __btrfs_update_delayed_inode btrfs: ioctl: return device fsid from DEV_INFO ioctl btrfs: fix potential dead lock in size class loading logic btrfs: sysfs: add size class stats	2023-03-10 08:39:13 -08:00
Yue Hu	3993f4f456	erofs: use wrapper i_blocksize() in erofs_file_read_iter() linux/fs.h has a wrapper for this operation. Signed-off-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230306075527.1338-1-zbestahu@gmail.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-03-09 23:36:04 +08:00
Gao Xiang	9ff471800b	erofs: get rid of a useless DBG_BUGON `err` could be -EINTR and it should not be the case. Actually such DBG_BUGON is useless. Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230309053148.9223-2-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-03-09 23:36:01 +08:00
Gao Xiang	647dd2c3f0	erofs: Revert "erofs: fix kvcalloc() misuse with __GFP_NOFAIL" Let's revert commit `12724ba389` ("erofs: fix kvcalloc() misuse with __GFP_NOFAIL") since kvmalloc() already supports __GFP_NOFAIL in commit `a421ef3030` ("mm: allow !GFP_KERNEL allocations for kvmalloc"). So the original fix was wrong. Actually there was some issue as [1] discussed, so before that mm fix is landed, the warn could still happen but applying this commit first will cause less. [1] https://lore.kernel.org/r/20230305053035.1911-1-hsiangkao@linux.alibaba.com Fixes: `12724ba389` ("erofs: fix kvcalloc() misuse with __GFP_NOFAIL") Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230309053148.9223-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-03-09 23:36:01 +08:00
Gao Xiang	8f121dfb15	erofs: fix wrong kunmap when using LZMA on HIGHMEM platforms As the call trace shown, the root cause is kunmap incorrect pages: BUG: kernel NULL pointer dereference, address: 00000000 CPU: 1 PID: 40 Comm: kworker/u5:0 Not tainted 6.2.0-rc5 #4 Workqueue: erofs_worker z_erofs_decompressqueue_work EIP: z_erofs_lzma_decompress+0x34b/0x8ac z_erofs_decompress+0x12/0x14 z_erofs_decompress_queue+0x7e7/0xb1c z_erofs_decompressqueue_work+0x32/0x60 process_one_work+0x24b/0x4d8 ? process_one_work+0x1a4/0x4d8 worker_thread+0x14c/0x3fc kthread+0xe6/0x10c ? rescuer_thread+0x358/0x358 ? kthread_complete_and_exit+0x18/0x18 ret_from_fork+0x1c/0x28 ---[ end trace 0000000000000000 ]--- The bug is trivial and should be fixed now. It has no impact on !HIGHMEM platforms. Fixes: `622ceaddb7` ("erofs: lzma compression support") Cc: <stable@vger.kernel.org> # 5.16+ Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230305134455.88236-1-hsiangkao@linux.alibaba.com	2023-03-09 22:49:57 +08:00
Yangtao Li	a279adedbb	erofs: mark z_erofs_lzma_init/erofs_pcpubuf_init w/ __init They are used during the erofs module init phase. Let's mark it as __init like any other function. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230303063731.66760-1-frank.li@vivo.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-03-09 22:49:30 +08:00
Filipe Manana	675dfe1223	btrfs: fix block group item corruption after inserting new block group We can often end up inserting a block group item, for a new block group, with a wrong value for the used bytes field. This happens if for the new allocated block group, in the same transaction that created the block group, we have tasks allocating extents from it as well as tasks removing extents from it. For example: 1) Task A creates a metadata block group X; 2) Two extents are allocated from block group X, so its "used" field is updated to 32K, and its "commit_used" field remains as 0; 3) Transaction commit starts, by some task B, and it enters btrfs_start_dirty_block_groups(). There it tries to update the block group item for block group X, which currently has its "used" field with a value of 32K. But that fails since the block group item was not yet inserted, and so on failure update_block_group_item() sets the "commit_used" field of the block group back to 0; 4) The block group item is inserted by task A, when for example btrfs_create_pending_block_groups() is called when releasing its transaction handle. This results in insert_block_group_item() inserting the block group item in the extent tree (or block group tree), with a "used" field having a value of 32K, but without updating the "commit_used" field in the block group, which remains with value of 0; 5) The two extents are freed from block X, so its "used" field changes from 32K to 0; 6) The transaction commit by task B continues, it enters btrfs_write_dirty_block_groups() which calls update_block_group_item() for block group X, and there it decides to skip the block group item update, because "used" has a value of 0 and "commit_used" has a value of 0 too. As a result, we end up with a block item having a 32K "used" field but no extents allocated from it. When this issue happens, a btrfs check reports an error like this: [1/7] checking root items [2/7] checking extents block group [1104150528 1073741824] used 39796736 but extent items used 0 ERROR: errors found in extent allocation tree or chunk allocation (...) Fix this by making insert_block_group_item() update the block group's "commit_used" field. Fixes: `7248e0cebb` ("btrfs: skip update of block group item if used bytes are the same") CC: stable@vger.kernel.org # 6.2+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-08 01:14:01 +01:00
Chuck Lever	fd9a2e1d51	NFSD: Protect against filesystem freezing Flole observes this WARNING on occasion: [1210423.486503] WARNING: CPU: 8 PID: 1524732 at fs/ext4/ext4_jbd2.c:75 ext4_journal_check_start+0x68/0xb0 Reported-by: <flole@flole.de> Suggested-by: Jan Kara <jack@suse.cz> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217123 Fixes: `73da852e38` ("nfsd: use vfs_iter_read/write") Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-03-07 09:33:42 -05:00
Filipe Manana	e4cc1483f3	btrfs: fix extent map logging bit not cleared for split maps after dropping range At btrfs_drop_extent_map_range() we are clearing the EXTENT_FLAG_LOGGING bit on a 'flags' variable that was not initialized. This makes static checkers complain about it, so initialize the 'flags' variable before clearing the bit. In practice this has no consequences, because EXTENT_FLAG_LOGGING should not be set when btrfs_drop_extent_map_range() is called, as an fsync locks the inode in exclusive mode, locks the inode's mmap semaphore in exclusive mode too and it always flushes all delalloc. Also add a comment about why we clear EXTENT_FLAG_LOGGING on a copy of the flags of the split extent map. Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/linux-btrfs/Y%2FyipSVozUDEZKow@kili/ Fixes: `db21370bff` ("btrfs: drop extent map range more efficiently") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-06 19:28:19 +01:00
Johannes Thumshirn	95cd356ca2	btrfs: fix percent calculation for bg reclaim message We have a report, that the info message for block-group reclaim is crossing the 100% used mark. This is happening as we were truncating the divisor for the division (the block_group->length) to a 32bit value. Fix this by using div64_u64() to not truncate the divisor. In the worst case, it can lead to a div by zero error and should be possible to trigger on 4 disks RAID0, and each device is large enough: $ mkfs.btrfs -f /dev/test/scratch[1234] -m raid1 -d raid0 btrfs-progs v6.1 [...] Filesystem size: 40.00GiB Block group profiles: Data: RAID0 4.00GiB <<< Metadata: RAID1 256.00MiB System: RAID1 8.00MiB Reported-by: Forza <forza@tnonline.net> Link: https://lore.kernel.org/linux-btrfs/e99483.c11a58d.1863591ca52@tnonline.net/ Fixes: `5f93e776c6` ("btrfs: zoned: print unusable percentage when reclaiming block groups") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> [ add Qu's note ] Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-06 19:28:19 +01:00
Naohiro Aota	98e8d36a26	btrfs: fix unnecessary increment of read error stat on write error Current btrfs_log_dev_io_error() increases the read error count even if the erroneous IO is a WRITE request. This is because it forget to use "else if", and all the error WRITE requests counts as READ error as there is (of course) no REQ_RAHEAD bit set. Fixes: `c3a62baf21` ("btrfs: use chained bios when cloning") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-06 19:28:19 +01:00
void0red	c06016a02a	btrfs: handle btrfs_del_item errors in __btrfs_update_delayed_inode Even if the slot is already read out, we may still need to re-balance the tree, thus it can cause error in that btrfs_del_item() call and we need to handle it properly. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: void0red <void0red@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-06 19:28:19 +01:00
Qu Wenruo	2943868a90	btrfs: ioctl: return device fsid from DEV_INFO ioctl Currently user space utilizes dev info ioctl to grab the info of a certain devid, this includes its device uuid. But the returned info is not enough to determine if a device is a seed. Commit `a26d60dedf` ("btrfs: sysfs: add devinfo/fsid to retrieve actual fsid from the device") exports the same value in sysfs so this is for parity with ioctl. Add a new member, fsid, into btrfs_ioctl_dev_info_args, and populate the member with fsid value. This should not cause any compatibility problem, following the combinations: - Old user space, old kernel - Old user space, new kernel User space tool won't even check the new member. - New user space, old kernel The kernel won't touch the new member, and user space tool should zero out its argument, thus the new member is all zero. User space tool can then know the kernel doesn't support this fsid reporting, and falls back to whatever they can. - New user space, new kernel Go as planned. Would find the fsid member is no longer zero, and trust its value. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-06 19:28:19 +01:00
Boris Burkov	12148367d7	btrfs: fix potential dead lock in size class loading logic As reported by Filipe, there's a potential deadlock caused by using btrfs_search_forward on commit_root. The locking there is unconditional, even if ->skip_locking and ->search_commit_root is set. It's not meant to be used for commit roots, so it always needs to do locking. So if another task is COWing a child node of the same root node and then needs to wait for block group caching to complete when trying to allocate a metadata extent, it deadlocks. For example: [539604.239315] sysrq: Show Blocked State [539604.240133] task:kworker/u16:6 state:D stack:0 pid:2119594 ppid:2 flags:0x00004000 [539604.241613] Workqueue: btrfs-cache btrfs_work_helper [btrfs] [539604.242673] Call Trace: [539604.243129] <TASK> [539604.243925] __schedule+0x41d/0xee0 [539604.244797] ? rcu_read_lock_sched_held+0x12/0x70 [539604.245399] ? rwsem_down_read_slowpath+0x185/0x490 [539604.246111] schedule+0x5d/0xf0 [539604.246593] rwsem_down_read_slowpath+0x2da/0x490 [539604.247290] ? rcu_barrier_tasks_trace+0x10/0x20 [539604.248090] __down_read_common+0x3d/0x150 [539604.248702] down_read_nested+0xc3/0x140 [539604.249280] __btrfs_tree_read_lock+0x24/0x100 [btrfs] [539604.250097] btrfs_read_lock_root_node+0x48/0x60 [btrfs] [539604.250915] btrfs_search_forward+0x59/0x460 [btrfs] [539604.251781] ? btrfs_global_root+0x50/0x70 [btrfs] [539604.252476] caching_thread+0x1be/0x920 [btrfs] [539604.253167] btrfs_work_helper+0xf6/0x400 [btrfs] [539604.253848] process_one_work+0x24f/0x5a0 [539604.254476] worker_thread+0x52/0x3b0 [539604.255166] ? __pfx_worker_thread+0x10/0x10 [539604.256047] kthread+0xf0/0x120 [539604.256591] ? __pfx_kthread+0x10/0x10 [539604.257212] ret_from_fork+0x29/0x50 [539604.257822] </TASK> [539604.258233] task:btrfs-transacti state:D stack:0 pid:2236474 ppid:2 flags:0x00004000 [539604.259802] Call Trace: [539604.260243] <TASK> [539604.260615] __schedule+0x41d/0xee0 [539604.261205] ? rcu_read_lock_sched_held+0x12/0x70 [539604.262000] ? rwsem_down_read_slowpath+0x185/0x490 [539604.262822] schedule+0x5d/0xf0 [539604.263374] rwsem_down_read_slowpath+0x2da/0x490 [539604.266228] ? lock_acquire+0x160/0x310 [539604.266917] ? rcu_read_lock_sched_held+0x12/0x70 [539604.267996] ? lock_contended+0x19e/0x500 [539604.268720] __down_read_common+0x3d/0x150 [539604.269400] down_read_nested+0xc3/0x140 [539604.270057] __btrfs_tree_read_lock+0x24/0x100 [btrfs] [539604.271129] btrfs_read_lock_root_node+0x48/0x60 [btrfs] [539604.272372] btrfs_search_slot+0x143/0xf70 [btrfs] [539604.273295] update_block_group_item+0x9e/0x190 [btrfs] [539604.274282] btrfs_start_dirty_block_groups+0x1c4/0x4f0 [btrfs] [539604.275381] ? __mutex_unlock_slowpath+0x45/0x280 [539604.276390] btrfs_commit_transaction+0xee/0xed0 [btrfs] [539604.277391] ? lock_acquire+0x1a4/0x310 [539604.278080] ? start_transaction+0xcb/0x6c0 [btrfs] [539604.279099] transaction_kthread+0x142/0x1c0 [btrfs] [539604.279996] ? __pfx_transaction_kthread+0x10/0x10 [btrfs] [539604.280673] kthread+0xf0/0x120 [539604.281050] ? __pfx_kthread+0x10/0x10 [539604.281496] ret_from_fork+0x29/0x50 [539604.281966] </TASK> [539604.282255] task:fsstress state:D stack:0 pid:2236483 ppid:1 flags:0x00004006 [539604.283897] Call Trace: [539604.284700] <TASK> [539604.285088] __schedule+0x41d/0xee0 [539604.285660] schedule+0x5d/0xf0 [539604.286175] btrfs_wait_block_group_cache_progress+0xf2/0x170 [btrfs] [539604.287342] ? __pfx_autoremove_wake_function+0x10/0x10 [539604.288450] find_free_extent+0xd93/0x1750 [btrfs] [539604.289256] ? _raw_spin_unlock+0x29/0x50 [539604.289911] ? btrfs_get_alloc_profile+0x127/0x2a0 [btrfs] [539604.290843] btrfs_reserve_extent+0x147/0x290 [btrfs] [539604.291943] btrfs_alloc_tree_block+0xcb/0x3e0 [btrfs] [539604.292903] __btrfs_cow_block+0x138/0x580 [btrfs] [539604.293773] btrfs_cow_block+0x10e/0x240 [btrfs] [539604.294595] btrfs_search_slot+0x7f3/0xf70 [btrfs] [539604.295585] btrfs_update_device+0x71/0x1b0 [btrfs] [539604.296459] btrfs_chunk_alloc_add_chunk_item+0xe0/0x340 [btrfs] [539604.297489] btrfs_chunk_alloc+0x1bf/0x490 [btrfs] [539604.298335] find_free_extent+0x6fa/0x1750 [btrfs] [539604.299174] ? _raw_spin_unlock+0x29/0x50 [539604.299950] ? btrfs_get_alloc_profile+0x127/0x2a0 [btrfs] [539604.300918] btrfs_reserve_extent+0x147/0x290 [btrfs] [539604.301797] btrfs_alloc_tree_block+0xcb/0x3e0 [btrfs] [539604.303017] ? lock_release+0x224/0x4a0 [539604.303855] __btrfs_cow_block+0x138/0x580 [btrfs] [539604.304789] btrfs_cow_block+0x10e/0x240 [btrfs] [539604.305611] btrfs_search_slot+0x7f3/0xf70 [btrfs] [539604.306682] ? btrfs_global_root+0x50/0x70 [btrfs] [539604.308198] lookup_inline_extent_backref+0x17b/0x7a0 [btrfs] [539604.309254] lookup_extent_backref+0x43/0xd0 [btrfs] [539604.310122] __btrfs_free_extent+0xf8/0x810 [btrfs] [539604.310874] ? lock_release+0x224/0x4a0 [539604.311724] ? btrfs_merge_delayed_refs+0x17b/0x1d0 [btrfs] [539604.313023] __btrfs_run_delayed_refs+0x2ba/0x1260 [btrfs] [539604.314271] btrfs_run_delayed_refs+0x8f/0x1c0 [btrfs] [539604.315445] ? rcu_read_lock_sched_held+0x12/0x70 [539604.316706] btrfs_commit_transaction+0xa2/0xed0 [btrfs] [539604.317855] ? do_raw_spin_unlock+0x4b/0xa0 [539604.318544] ? _raw_spin_unlock+0x29/0x50 [539604.319240] create_subvol+0x53d/0x6e0 [btrfs] [539604.320283] btrfs_mksubvol+0x4f5/0x590 [btrfs] [539604.321220] __btrfs_ioctl_snap_create+0x11b/0x180 [btrfs] [539604.322307] btrfs_ioctl_snap_create_v2+0xc6/0x150 [btrfs] [539604.323295] btrfs_ioctl+0x9f7/0x33e0 [btrfs] [539604.324331] ? rcu_read_lock_sched_held+0x12/0x70 [539604.325137] ? lock_release+0x224/0x4a0 [539604.325808] ? __x64_sys_ioctl+0x87/0xc0 [539604.326467] __x64_sys_ioctl+0x87/0xc0 [539604.327109] do_syscall_64+0x38/0x90 [539604.327875] entry_SYSCALL_64_after_hwframe+0x72/0xdc [539604.328792] RIP: 0033:0x7f05a7babaeb This needs to use regular btrfs_search_slot() with some skip and stop logic. Since we only consider five samples (five search slots), don't bother with the complexity of looking for commit_root_sem contention. If necessary, it can be added to the load function in between samples. Reported-by: Filipe Manana <fdmanana@kernel.org> Link: https://lore.kernel.org/linux-btrfs/CAL3q7H7eKMD44Z1+=Kb-1RFMMeZpAm2fwyO59yeBwCcSOU80Pg@mail.gmail.com/ Fixes: `c7eec3d9aa` ("btrfs: load block group size class when caching") Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-06 19:28:19 +01:00
Jan Kara	63bceed808	udf: Warn if block mapping is done for in-ICB files Now that address space operations are merge dfor in-ICB and normal files, it is more likely some code mistakenly tries to map blocks for in-ICB files. WARN and return error instead of silently returning garbage. Signed-off-by: Jan Kara <jack@suse.cz>	2023-03-06 16:38:25 +01:00
Jan Kara	cecb1f0654	udf: Fix reading of in-ICB files After merging address space operations of normal and in-ICB files, readahead could get called for in-ICB files which resulted in udf_get_block() being called for these files. udf_get_block() is not prepared to be called for in-ICB files and ends up returning garbage results as it interprets file data as extent list. Fix the problem by skipping readahead for in-ICB files. Fixes: `37a8a39f7a` ("udf: Switch to single address_space_operations") Signed-off-by: Jan Kara <jack@suse.cz>	2023-03-06 16:38:25 +01:00
Jan Kara	49854d3ccc	udf: Fix lost writes in udf_adinicb_writepage() The patch converting udf_adinicb_writepage() to avoid manually kmapping the page used memcpy_to_page() however that copies in the wrong direction (effectively overwriting file data with the old contents). What we should be using is memcpy_from_page() to copy data from the page into the inode and then mark inode dirty to store the data. Fixes: `5cfc45321a` ("udf: Convert udf_adinicb_writepage() to memcpy_to_page()") Signed-off-by: Jan Kara <jack@suse.cz>	2023-03-06 16:38:25 +01:00
Linus Torvalds	20fdfd55ab	Merge tag 'mm-hotfixes-stable-2023-03-04-13-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "17 hotfixes. Eight are for MM and seven are for other parts of the kernel. Seven are cc:stable and eight address post-6.3 issues or were judged unsuitable for -stable backporting" * tag 'mm-hotfixes-stable-2023-03-04-13-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mailmap: map Dikshita Agarwal's old address to his current one mailmap: map Vikash Garodia's old address to his current one fs/cramfs/inode.c: initialize file_ra_state fs: hfsplus: fix UAF issue in hfsplus_put_super panic: fix the panic_print NMI backtrace setting lib: parser: update documentation for match_NUMBER functions kasan, x86: don't rename memintrinsics in uninstrumented files kasan: test: fix test for new meminstrinsic instrumentation kasan: treat meminstrinsic as builtins in uninstrumented files kasan: emit different calls for instrumentable memintrinsics ocfs2: fix non-auto defrag path not working issue ocfs2: fix defrag path triggering jbd2 ASSERT mailmap: map Georgi Djakov's old Linaro address to his current one mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON lib/zlib: DFLTCC deflate does not write all available bits for Z_NO_FLUSH mm/damon/paddr: fix missing folio_put() mm/mremap: fix dup_anon_vma() in vma_merge() case 4	2023-03-04 13:32:50 -08:00
Linus Torvalds	3162745aad	Merge tag '6.3-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 Pull more cifs updates from Steve French: - xfstest generic/208 fix (memory leak) - minor netfs fix (to address smatch warning) - a DFS fix for stable - a reconnect race fix - two multichannel fixes - RDMA (smbdirect) fix - two additional writeback fixes from David * tag '6.3-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix memory leak in direct I/O cifs: prevent data race in cifs_reconnect_tcon() cifs: improve checking of DFS links over STATUS_OBJECT_NAME_INVALID iov: Fix netfs_extract_user_to_sg() cifs: Fix cifs_write_back_from_locked_folio() cifs: reuse cifs_match_ipaddr for comparison of dstaddr too cifs: match even the scope id for ipv6 addresses cifs: Fix an uninitialised variable cifs: Add some missing xas_retry() calls	2023-03-03 16:26:43 -08:00
Andrew Morton	3e35102666	fs/cramfs/inode.c: initialize file_ra_state file_ra_state_init() assumes that the file_ra_state has been zeroed out. Fixes a KMSAN used-unintialized issue (at least). Fixes: `cf948cbc35` ("cramfs: read_mapping_page() is synchronous") Reported-by: syzbot <syzbot+8ce7f8308d91e6b8bbe2@syzkaller.appspotmail.com> Link: https://lkml.kernel.org/r/0000000000008f74e905f56df987@google.com Cc: Matthew Wilcox <willy@infradead.org> Cc: Nicolas Pitre <nico@fluxnic.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-03-02 21:54:23 -08:00
Dongliang Mu	07db5e247a	fs: hfsplus: fix UAF issue in hfsplus_put_super The current hfsplus_put_super first calls hfs_btree_close on sbi->ext_tree, then invokes iput on sbi->hidden_dir, resulting in an use-after-free issue in hfsplus_release_folio. As shown in hfsplus_fill_super, the error handling code also calls iput before hfs_btree_close. To fix this error, we move all iput calls before hfsplus_btree_close. Note that this patch is tested on Syzbot. Link: https://lkml.kernel.org/r/20230226124948.3175736-1-mudongliangabcd@gmail.com Reported-by: syzbot+57e3e98f7e3b80f64d56@syzkaller.appspotmail.com Tested-by: Dongliang Mu <mudongliangabcd@gmail.com> Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-03-02 21:54:23 -08:00
Linus Torvalds	c3f9b9fa10	Merge tag 'ceph-for-6.3-rc1' of https://github.com/ceph/ceph-client Pull ceph fixes from Ilya Dryomov: "Two small fixes from Xiubo and myself, marked for stable" * tag 'ceph-for-6.3-rc1' of https://github.com/ceph/ceph-client: rbd: avoid use-after-free in do_rbd_add() when rbd_dev_create() fails ceph: update the time stamps and try to drop the suid/sgid	2023-03-02 10:48:30 -08:00
David Howells	71562809e4	cifs: Fix memory leak in direct I/O When __cifs_readv() and __cifs_writev() extract pages from a user-backed iterator into a BVEC-type iterator, they set ->bv_need_unpin to note whether they need to unpin the pages later. However, in both cases they examine the BVEC-type iterator and not the source iterator - and so bv_need_unpin doesn't get set and the pages are leaked. I think this may be responsible for the generic/208 xfstest failing occasionally with: WARNING: CPU: 0 PID: 3064 at mm/gup.c:218 try_grab_page+0x65/0x100 RIP: 0010:try_grab_page+0x65/0x100 follow_page_pte+0x1a7/0x570 __get_user_pages+0x1a2/0x650 __gup_longterm_locked+0xdc/0xb50 internal_get_user_pages_fast+0x17f/0x310 pin_user_pages_fast+0x46/0x60 iov_iter_extract_pages+0xc9/0x510 ? __kmalloc_large_node+0xb1/0x120 ? __kmalloc_node+0xbe/0x130 netfs_extract_user_iter+0xbf/0x200 [netfs] __cifs_writev+0x150/0x330 [cifs] vfs_write+0x2a8/0x3c0 ksys_pwrite64+0x65/0xa0 with the page refcount going negative. This is less unlikely than it seems because the page is being pinned, not simply got, and so the refcount increased by 1024 each time, and so only needs to be called around ~2097152 for the refcount to go negative. Further, the test program (aio-dio-invalidate-failure) uses a 32MiB static buffer and all the PTEs covering it refer to the same page because it's never written to. The warning in try_grab_page(): if (WARN_ON_ONCE(folio_ref_count(folio) <= 0)) return -ENOMEM; then trips and prevents us ever using the page again for DIO at least. Fixes: `d08089f649` ("cifs: Change the I/O paths to use an iterator rather than a page list") Reported-by: Murphy Zhou <jencce.kernel@gmail.com> Link: https://lore.kernel.org/r/CAH2r5mvaTsJ---n=265a4zqRA7pP+o4MJ36WCQUS6oPrOij8cw@mail.gmail.com Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2023-03-01 18:18:25 -06:00
Paulo Alcantara	1bcd548d93	cifs: prevent data race in cifs_reconnect_tcon() Make sure to get an up-to-date TCP_Server_Info::nr_targets value prior to waiting the server to be reconnected in cifs_reconnect_tcon(). It is set in cifs_tcp_ses_needs_reconnect() and protected by TCP_Server_Info::srv_lock. Create a new cifs_wait_for_server_reconnect() helper that can be used by both SMB2+ and CIFS reconnect code. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2023-03-01 18:18:25 -06:00
Paulo Alcantara	b9ee2e307c	cifs: improve checking of DFS links over STATUS_OBJECT_NAME_INVALID Do not map STATUS_OBJECT_NAME_INVALID to -EREMOTE under non-DFS shares, or 'nodfs' mounts or CONFIG_CIFS_DFS_UPCALL=n builds. Otherwise, in the slow path, get a referral to figure out whether it is an actual DFS link. This could be simply reproduced under a non-DFS share by running the following $ mount.cifs //srv/share /mnt -o ... $ cat /mnt/$(printf '\U110000') cat: '/mnt/'$'\364\220\200\200': Object is remote Fixes: `c877ce47e1` ("cifs: reduce roundtrips on create/qinfo requests") CC: stable@vger.kernel.org # 6.2 Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2023-03-01 18:18:25 -06:00
David Howells	4c0421fa6d	iov: Fix netfs_extract_user_to_sg() Fix the loop check in netfs_extract_user_to_sg() for extraction from user-backed iterators to do the body if npages > 0, not if npages < 0 (which it can never be). This isn't currently used by cifs, which only ever extracts data from BVEC, KVEC and XARRAY iterators at this level, user-backed iterators having being decanted into BVEC iterators at a higher level to accommodate the work being done in a kernel thread. Found by smatch: fs/netfs/iterator.c:139 netfs_extract_user_to_sg() warn: unsigned 'npages' is never less than zero. Fixes: `0185846975` ("netfs: Add a function to extract an iterator into a scatterlist") Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202302261115.P3TQi1ZO-lkp@intel.com/ Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/Y/yYnAhoAYDBKixX@kili Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: linux-cachefs@redhat.com Signed-off-by: Steve French <stfrench@microsoft.com>	2023-03-01 18:18:25 -06:00
David Howells	0268792f77	cifs: Fix cifs_write_back_from_locked_folio() cifs_write_back_from_locked_folio() should return the number of bytes read, but returns the result of ->async_writev(), which will be 0 on success. As it happens, this doesn't prevent cifs_writepages_region() from working as it will then examine and ignore the pages that are no longer dirty rather than just skipping over them. Fixes: `d08089f649` ("cifs: Change the I/O paths to use an iterator rather than a page list") Signed-off-by: David Howells <dhowells@redhat.com> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Tom Talpey <tom@talpey.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2023-03-01 18:18:25 -06:00
Shyam Prasad N	410612b072	cifs: reuse cifs_match_ipaddr for comparison of dstaddr too We have two pieces of code that does pretty much the same comparison. This change reuses cifs_match_ipaddr within match_address. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2023-03-01 18:18:24 -06:00
Shyam Prasad N	a21be1f5df	cifs: match even the scope id for ipv6 addresses match_address function matches the scope id for ipv6 addresses, but cifs_match_ipaddr (which is another function used for comparison) does not use scope id. Doing so with this change. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2023-03-01 18:18:24 -06:00
David Howells	4bd3e4c7e4	cifs: Fix an uninitialised variable Fix an uninitialised variable introduced in cifs. Fixes: `3d78fe73fa` ("cifs: Build the RDMA SGE list directly from an iterator") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Tom Talpey <tom@talpey.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: linux-rdma@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2023-03-01 18:17:36 -06:00
David Howells	1907e0fec1	cifs: Add some missing xas_retry() calls The xas_for_each loops added into fs/cifs/file.c need to go round again if indicated by xas_retry(). Fixes: `b8713c4dbf` ("cifs: Add some helper functions") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Tom Talpey <tom@talpey.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2023-03-01 18:17:16 -06:00
Boris Burkov	fcd9531b30	btrfs: sysfs: add size class stats Make it possible to see the distribution of size classes for block groups. Helpful for testing and debugging the allocator w.r.t. to size classes. The new stats can be found at the path: /sys/fs/btrfs/<FSID>/allocation/<bg-type>/size_class but they will only be non-zero for bg-type = data. Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-03-01 19:27:20 +01:00
Linus Torvalds	f122a08b19	capability: just use a 'u64' instead of a 'u32[2]' array Back in 2008 we extended the capability bits from 32 to 64, and we did it by extending the single 32-bit capability word from one word to an array of two words. It was then obfuscated by hiding the "2" behind two macro expansions, with the reasoning being that maybe it gets extended further some day. That reasoning may have been valid at the time, but the last thing we want to do is to extend the capability set any more. And the array of values not only causes source code oddities (with loops to deal with it), but also results in worse code generation. It's a lose-lose situation. So just change the 'u32[2]' into a 'u64' and be done with it. We still have to deal with the fact that the user space interface is designed around an array of these 32-bit values, but that was the case before too, since the array layouts were different (ie user space doesn't use an array of 32-bit values for individual capability masks, but an array of 32-bit slices of multiple masks). So that marshalling of data is actually simplified too, even if it does remain somewhat obscure and odd. This was all triggered by my reaction to the new "cap_isidentical()" introduced recently. By just using a saner data structure, it went from unsigned __capi; CAP_FOR_EACH_U32(__capi) { if (a.cap[__capi] != b.cap[__capi]) return false; } return true; to just being return a.val == b.val; instead. Which is rather more obvious both to humans and to compilers. Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Serge Hallyn <serge@hallyn.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Paul Moore <paul@paul-moore.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-03-01 10:01:22 -08:00
Linus Torvalds	64e851689e	Merge tag 'uml-for-linus-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Add support for rust (yay!) - Add support for LTO - Add platform bus support to virtio-pci - Various virtio fixes - Coding style, spelling cleanups * tag 'uml-for-linus-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (27 commits) Documentation: rust: Fix arch support table uml: vector: Remove unused definitions VECTOR_{WRITE,HEADERS} um: virt-pci: properly remove PCI device from bus um: virtio_uml: move device breaking into workqueue um: virtio_uml: mark device as unregistered when breaking it um: virtio_uml: free command if adding to virtqueue failed UML: define RUNTIME_DISCARD_EXIT virt-pci: add platform bus support um-virt-pci: Make max delay configurable um: virt-pci: implement pcibios_get_phb_of_node() um: Support LTO um: put power options in a menu um: Use CFLAGS_vmlinux um: Prevent building modules incompatible with MODVERSIONS um: Avoid pcap multiple definition errors um: Make the definition of cpu_data more compatible x86: um: vdso: Add '%rcx' and '%r11' to the syscall clobber list rust: arch/um: Add support for CONFIG_RUST under x86_64 UML rust: arch/um: Disable FP/SIMD instruction to match x86 rust: arch/um: Use 'pie' relocation mode under UML ...	2023-03-01 09:13:00 -08:00
Linus Torvalds	e31b283a58	Merge tag 'ubifs-for-linus-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull jffs2, ubi and ubifs updates from Richard Weinberger: "JFFS2: - Fix memory corruption in error path - Spelling and coding style fixes UBI: - Switch to BLK_MQ_F_BLOCKING in ubiblock - Wire up partent device (for sysfs) - Multiple UAF bugfixes - Fix for an infinite loop in WL error path UBIFS: - Fix for multiple memory leaks in error paths - Fixes for wrong space accounting - Minor cleanups - Spelling and coding style fixes" * tag 'ubifs-for-linus-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: (36 commits) ubi: block: Fix a possible use-after-free bug in ubiblock_create() ubifs: make kobj_type structures constant mtd: ubi: block: wire-up device parent mtd: ubi: wire-up parent MTD device ubi: use correct names in function kernel-doc comments ubi: block: set BLK_MQ_F_BLOCKING jffs2: Fix list_del corruption if compressors initialized failed jffs2: Use function instead of macro when initialize compressors jffs2: fix spelling mistake "neccecary"->"necessary" ubifs: Fix kernel-doc ubifs: Fix some kernel-doc comments UBI: Fastmap: Fix kernel-doc ubi: ubi_wl_put_peb: Fix infinite loop when wear-leveling work failed ubi: Fix UAF wear-leveling entry in eraseblk_count_seq_show() ubi: fastmap: Fix missed fm_anchor PEB in wear-leveling after disabling fastmap ubifs: ubifs_releasepage: Remove ubifs_assert(0) to valid this process ubifs: ubifs_writepage: Mark page dirty after writing inode failed ubifs: dirty_cow_znode: Fix memleak in error handling path ubifs: Re-statistic cleaned znode count if commit failed ubi: Fix permission display of the debugfs files ...	2023-03-01 09:06:51 -08:00
Linus Torvalds	3808330b20	Merge tag '9p-6.3-for-linus-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p updates from Eric Van Hensbergen: - some fixes and cleanup setting up for a larger set of performance patches I've been working on - a contributed fixes relating to 9p/rdma - some contributed fixes relating to 9p/xen * tag '9p-6.3-for-linus-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: fix error reporting in v9fs_dir_release net/9p: fix bug in client create for .L 9p/rdma: unmap receive dma buffer in rdma_request()/post_recv() 9p/xen: fix connection sequence 9p/xen: fix version parsing fs/9p: Expand setup of writeback cache to all levels net/9p: Adjust maximum MSIZE to account for p9 header	2023-03-01 08:52:49 -08:00
Linus Torvalds	6e110580bc	Merge tag 'jfs-6.3' of https://github.com/kleikamp/linux-shaggy Pull jfs update from Dave Kleikamp: "Just one simple sanity check" * tag 'jfs-6.3' of https://github.com/kleikamp/linux-shaggy: fs/jfs: fix shift exponent db_agl2size negative	2023-03-01 08:47:19 -08:00
Linus Torvalds	e103ecedce	Merge tag 'exfat-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat updates from Namjae Jeon: - Handle vendor extension and allocation entries as unrecognized benign secondary entries - Fix wrong ->i_blocks on devices with non-512 byte sector - Add the check to avoid returning -EIO from exfat_readdir() at current position exceeding the directory size - Fix a bug that reach the end of the directory stream at a position not aligned with the dentry size - Redefine DIR_DELETED as 0xFFFFFFF7, the bad cluster number - Two cleanup fixes and fix cluster leakage in error handling * tag 'exfat-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: fix the newly allocated clusters are not freed in error handling exfat: don't print error log in normal case exfat: remove unneeded code from exfat_alloc_cluster() exfat: handle unreconized benign secondary entries exfat: fix inode->i_blocks for non-512 byte sector size device exfat: redefine DIR_DELETED as the bad cluster number exfat: fix reporting fs error when reading dir beyond EOF exfat: fix unexpected EOF while reading dir	2023-03-01 08:42:27 -08:00
Linus Torvalds	c0927a7a53	Merge tag 'xfs-6.3-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull moar xfs updates from Darrick Wong: "This contains a fix for a deadlock in the allocator. It continues the slow march towards being able to offline AGs, and it refactors the interface to the xfs allocator to be less indirection happy. Summary: - Fix a deadlock in the free space allocator due to the AG-walking algorithm forgetting to follow AG-order locking rules - Make the inode allocator prefer existing free inodes instead of failing to allocate new inode chunks when free space is low - Set minleft correctly when setting allocator parameters for bmap changes - Fix uninitialized variable access in the getfsmap code - Make a distinction between active and passive per-AG structure references. For now, active references are taken to perform some work in an AG on behalf of a high level operation; passive references are used by lower level code to finish operations started by other threads. Eventually this will become part of online shrink - Split out all the different allocator strategies into separate functions to move us away from design antipattern of filling out a huge structure for various differentish things and issuing a single function multiplexing call - Various cleanups in the filestreams allocator code, which we might very well want to deprecate instead of continuing - Fix a bug with the agi rotor code that was introduced earlier in this series" * tag 'xfs-6.3-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (44 commits) xfs: restore old agirotor behavior xfs: fix uninitialized variable access xfs: refactor the filestreams allocator pick functions xfs: return a referenced perag from filestreams allocator xfs: pass perag to filestreams tracing xfs: use for_each_perag_wrap in xfs_filestream_pick_ag xfs: track an active perag reference in filestreams xfs: factor out MRU hit case in xfs_filestream_select_ag xfs: remove xfs_filestream_select_ag() longest extent check xfs: merge new filestream AG selection into xfs_filestream_select_ag() xfs: merge filestream AG lookup into xfs_filestream_select_ag() xfs: move xfs_bmap_btalloc_filestreams() to xfs_filestreams.c xfs: use xfs_bmap_longest_free_extent() in filestreams xfs: get rid of notinit from xfs_bmap_longest_free_extent xfs: factor out filestreams from xfs_bmap_btalloc_nullfb xfs: convert trim to use for_each_perag_range xfs: convert xfs_alloc_vextent_iterate_ags() to use perag walker xfs: move the minimum agno checks into xfs_alloc_vextent_check_args xfs: fold xfs_alloc_ag_vextent() into callers xfs: move allocation accounting to xfs_alloc_vextent_set_fsbno() ...	2023-02-28 16:08:30 -08:00
Linus Torvalds	b07ce43db6	Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Improve performance for ext4 by allowing multiple process to perform direct I/O writes to preallocated blocks by using a shared inode lock instead of taking an exclusive lock. In addition, multiple bug fixes and cleanups" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix incorrect options show of original mount_opt and extend mount_opt2 ext4: Fix possible corruption when moving a directory ext4: init error handle resource before init group descriptors ext4: fix task hung in ext4_xattr_delete_inode jbd2: fix data missing when reusing bh which is ready to be checkpointed ext4: update s_journal_inum if it changes after journal replay ext4: fail ext4_iget if special inode unallocated ext4: fix function prototype mismatch for ext4_feat_ktype ext4: remove unnecessary variable initialization ext4: fix inode tree inconsistency caused by ENOMEM ext4: refuse to create ea block when umounted ext4: optimize ea_inode block expansion ext4: remove dead code in updating backup sb ext4: dio take shared inode lock when overwriting preallocated blocks ext4: don't show commit interval if it is zero ext4: use ext4_fc_tl_mem in fast-commit replay path ext4: improve xattr consistency checking and error reporting	2023-02-28 09:05:47 -08:00
Yuezhang Mo	d5c514b6a0	exfat: fix the newly allocated clusters are not freed in error handling In error handling 'free_cluster', before num_alloc clusters allocated, p_chain->size will not updated and always 0, thus the newly allocated clusters are not freed. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>	2023-02-28 20:01:40 +09:00
Yuezhang Mo	3ce937cb8c	exfat: don't print error log in normal case When allocating a new cluster, exFAT first allocates from the next cluster of the last cluster of the file. If the last cluster of the file is the last cluster of the volume, allocate from the first cluster. This is a normal case, but the following error log will be printed. It makes users confused, so this commit removes the error log. [1960905.181545] exFAT-fs (sdb1): hint_cluster is invalid (262130) Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>	2023-02-28 20:01:36 +09:00
Yuezhang Mo	8d2909eeca	exfat: remove unneeded code from exfat_alloc_cluster() In the removed code, num_clusters is 0, nothing is done in exfat_chain_cont_cluster(), so it is unneeded, remove it. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>	2023-02-28 20:01:22 +09:00
Heming Zhao via Ocfs2-devel	236b9254f8	ocfs2: fix non-auto defrag path not working issue This fixes three issues on move extents ioctl without auto defrag: a) In ocfs2_find_victim_alloc_group(), we have to convert bits to block first in case of global bitmap. b) In ocfs2_probe_alloc_group(), when finding enough bits in block group bitmap, we have to back off move_len to start pos as well, otherwise it may corrupt filesystem. c) In ocfs2_ioctl_move_extents(), set me_threshold both for non-auto and auto defrag paths. Otherwise it will set move_max_hop to 0 and finally cause unexpectedly ENOSPC error. Currently there are no tools triggering the above issues since defragfs.ocfs2 enables auto defrag by default. Tested with manually changing defragfs.ocfs2 to run non auto defrag path. Link: https://lkml.kernel.org/r/20230220050526.22020-1-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-27 17:00:15 -08:00
Heming Zhao via Ocfs2-devel	60eed1e3d4	ocfs2: fix defrag path triggering jbd2 ASSERT code path: ocfs2_ioctl_move_extents ocfs2_move_extents ocfs2_defrag_extent __ocfs2_move_extent + ocfs2_journal_access_di + ocfs2_split_extent //sub-paths call jbd2_journal_restart + ocfs2_journal_dirty //crash by jbs2 ASSERT crash stacks: PID: 11297 TASK: ffff974a676dcd00 CPU: 67 COMMAND: "defragfs.ocfs2" #0 [ffffb25d8dad3900] machine_kexec at ffffffff8386fe01 #1 [ffffb25d8dad3958] __crash_kexec at ffffffff8395959d #2 [ffffb25d8dad3a20] crash_kexec at ffffffff8395a45d #3 [ffffb25d8dad3a38] oops_end at ffffffff83836d3f #4 [ffffb25d8dad3a58] do_trap at ffffffff83833205 #5 [ffffb25d8dad3aa0] do_invalid_op at ffffffff83833aa6 #6 [ffffb25d8dad3ac0] invalid_op at ffffffff84200d18 [exception RIP: jbd2_journal_dirty_metadata+0x2ba] RIP: ffffffffc09ca54a RSP: ffffb25d8dad3b70 RFLAGS: 00010207 RAX: 0000000000000000 RBX: ffff9706eedc5248 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff97337029ea28 RDI: ffff9706eedc5250 RBP: ffff9703c3520200 R8: 000000000f46b0b2 R9: 0000000000000000 R10: 0000000000000001 R11: 00000001000000fe R12: ffff97337029ea28 R13: 0000000000000000 R14: ffff9703de59bf60 R15: ffff9706eedc5250 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffffb25d8dad3ba8] ocfs2_journal_dirty at ffffffffc137fb95 [ocfs2] #8 [ffffb25d8dad3be8] __ocfs2_move_extent at ffffffffc139a950 [ocfs2] #9 [ffffb25d8dad3c80] ocfs2_defrag_extent at ffffffffc139b2d2 [ocfs2] Analysis This bug has the same root cause of 'commit `7f27ec978b` ("ocfs2: call ocfs2_journal_access_di() before ocfs2_journal_dirty() in ocfs2_write_end_nolock()")'. For this bug, jbd2_journal_restart() is called by ocfs2_split_extent() during defragmenting. How to fix For ocfs2_split_extent() can handle journal operations totally by itself. Caller doesn't need to call journal access/dirty pair, and caller only needs to call journal start/stop pair. The fix method is to remove journal access/dirty from __ocfs2_move_extent(). The discussion for this patch: https://oss.oracle.com/pipermail/ocfs2-devel/2023-February/000647.html Link: https://lkml.kernel.org/r/20230217003717.32469-1-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-27 17:00:15 -08:00
Mateusz Guzik	981ee95cc1	vfs: avoid duplicating creds in faccessat if possible access(2) remains commonly used, for example on exec: access("/etc/ld.so.preload", R_OK) or when running gcc: strace -c gcc empty.c % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 0.00 0.000000 0 42 26 access It falls down to do_faccessat without the AT_EACCESS flag, which in turn results in allocation of new creds in order to modify fsuid/fsgid and caps. This is a very expensive process single-threaded and most notably multi-threaded, with numerous structures getting refed and unrefed on imminent new cred destruction. Turns out for typical consumers the resulting creds would be identical and this can be checked upfront, avoiding the hard work. An access benchmark plugged into will-it-scale running on Cascade Lake shows: test proc before after access1 1 1310582 2908735 (+121%) # distinct files access1 24 4716491 63822173 (+1353%) # distinct files access2 24 2378041 5370335 (+125%) # same file The above benchmarks are not integrated into will-it-scale, but can be found in a pull request: https://github.com/antonblanchard/will-it-scale/pull/36/files Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-02-27 16:39:19 -08:00
Linus Torvalds	103830683c	Merge tag 'f2fs-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've got a huge number of patches that improve code readability along with minor bug fixes, while we've mainly fixed some critical issues in recently-added per-block age-based extent_cache, atomic write support, and some folio cases. Enhancements: - add sysfs nodes to set last_age_weight and manage discard_io_aware_gran - show ipu policy in debugfs - reduce stack memory cost by using bitfield in struct f2fs_io_info - introduce trace_f2fs_replace_atomic_write_block - enhance iostat support and adds flush commands Bug fixes: - revert "f2fs: truncate blocks in batch in __complete_revoke_list()" - fix kernel crash on the atomic write abort flow - call clear_page_private_reference in .{release,invalid}_folio - support .migrate_folio for compressed inode - fix cgroup writeback accounting with fs-layer encryption - retry to update the inode page given data corruption - fix kernel crash due to NULL io->bio - fix some bugs in per-block age-based extent_cache: - wrong calculation of block age - update age extent in f2fs_do_zero_range() - update age extent correctly during truncation" * tag 'f2fs-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (81 commits) f2fs: drop unnecessary arg for f2fs_ioc_*() f2fs: Revert "f2fs: truncate blocks in batch in __complete_revoke_list()" f2fs: synchronize atomic write aborts f2fs: fix wrong segment count f2fs: replace si->sbi w/ sbi in stat_show() f2fs: export ipu policy in debugfs f2fs: make kobj_type structures constant f2fs: fix to do sanity check on extent cache correctly f2fs: add missing description for ipu_policy node f2fs: fix to set ipu policy f2fs: fix typos in comments f2fs: fix kernel crash due to null io->bio f2fs: use iostat_lat_type directly as a parameter in the iostat_update_and_unbind_ctx() f2fs: add sysfs nodes to set last_age_weight f2fs: fix f2fs_show_options to show nogc_merge mount option f2fs: fix cgroup writeback accounting with fs-layer encryption f2fs: fix wrong calculation of block age f2fs: fix to update age extent in f2fs_do_zero_range() f2fs: fix to update age extent correctly during truncation f2fs: fix to avoid potential memory corruption in __update_iostat_latency() ...	2023-02-27 16:18:51 -08:00
Linus Torvalds	11c7052998	Merge tag 'soc-drivers-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "As usual, there are lots of minor driver changes across SoC platforms from NXP, Amlogic, AMD Zynq, Mediatek, Qualcomm, Apple and Samsung. These usually add support for additional chip variations in existing drivers, but also add features or bugfixes. The SCMI firmware subsystem gains a unified raw userspace interface through debugfs, which can be used for validation purposes. Newly added drivers include: - New power management drivers for StarFive JH7110, Allwinner D1 and Renesas RZ/V2M - A driver for Qualcomm battery and power supply status - A SoC device driver for identifying Nuvoton WPCM450 chips - A regulator coupler driver for Mediatek MT81xxv" * tag 'soc-drivers-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (165 commits) power: supply: Introduce Qualcomm PMIC GLINK power supply soc: apple: rtkit: Do not copy the reg state structure to the stack soc: sunxi: SUN20I_PPU should depend on PM memory: renesas-rpc-if: Remove redundant division of dummy soc: qcom: socinfo: Add IDs for IPQ5332 and its variant dt-bindings: arm: qcom,ids: Add IDs for IPQ5332 and its variant dt-bindings: power: qcom,rpmpd: add RPMH_REGULATOR_LEVEL_LOW_SVS_L1 firmware: qcom_scm: Move qcom_scm.h to include/linux/firmware/qcom/ MAINTAINERS: Update qcom CPR maintainer entry dt-bindings: firmware: document Qualcomm SM8550 SCM dt-bindings: firmware: qcom,scm: add qcom,scm-sa8775p compatible soc: qcom: socinfo: Add Soc IDs for IPQ8064 and variants dt-bindings: arm: qcom,ids: Add Soc IDs for IPQ8064 and variants soc: qcom: socinfo: Add support for new field in revision 17 soc: qcom: smd-rpm: Add IPQ9574 compatible soc: qcom: pmic_glink: remove redundant calculation of svid soc: qcom: stats: Populate all subsystem debugfs files dt-bindings: soc: qcom,rpmh-rsc: Update to allow for generic nodes soc: qcom: pmic_glink: add CONFIG_NET/CONFIG_OF dependencies soc: qcom: pmic_glink: Introduce altmode support ...	2023-02-27 10:04:49 -08:00

1 2 3 4 5 ...

80793 Commits