Avoid linker error for randomly generated config file that
has CONFIG_BRANCH_PROFILE_NONE enabled and make it similar
to riscv, x86 and also to commit 4bf3ec384e ("s390: disable
branch profiling for vdso").
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Currently, if the device is offline and all the channel paths are
either configured or varied offline, the associated subchannel gets
unregistered. Don't unregister the subchannel, instead unregister
offline device.
Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Commit f1045056c7 ("topology/sysfs: rework book and drawer topology
ifdefery") activates the book and drawer topology, previously activated by
CONFIG_SCHED_{BOOK,DRAWER}, dependent on the existence of certain macro
definitions. Hence, since then, CONFIG_SCHED_{BOOK,DRAWER} have no effect
and any further purpose.
Remove the obsolete configs SCHED_{BOOK,DRAWER}.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20230508040916.16733-1-lukas.bulwahn@gmail.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
s390's struct statfs and struct statfs64 contain padding, which
field-by-field copying does not set. Initialize the respective structs
with zeros before filling them and copying them to userspace, like it's
already done for the compat versions of these structs.
Found by KMSAN.
[agordeev@linux.ibm.com: fixed typo in patch description]
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20230504144021.808932-2-iii@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Use "a" constraint instead of "d" constraint to pass the state parameter to
the do_sqbs() inline assembly. This prevents that general purpose register
zero is used for the state parameter.
If the compiler would select general purpose register zero this would be
problematic for the used instruction in rsy format: the register used for
the state parameter is a base register. If the base register is general
purpose register zero the contents of the register are unexpectedly ignored
when the instruction is executed.
This only applies to z/VM guests using QIOASSIST with dedicated (pass through)
QDIO-based devices such as FCP [zfcp driver] as well as real OSA or
HiperSockets [qeth driver].
A possible symptom for this case using zfcp is the following repeating kernel
message pattern:
zfcp <devbusid>: A QDIO problem occurred
zfcp <devbusid>: A QDIO problem occurred
zfcp <devbusid>: qdio: ZFCP on SC <sc> using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W
zfcp <devbusid>: A QDIO problem occurred
zfcp <devbusid>: A QDIO problem occurred
Each of the qdio problem message can be accompanied by the following entries
for the affected subchannel <sc> in
/sys/kernel/debug/s390dbf/qdio_error/hex_ascii for zfcp or qeth:
<sc> ccq: 69....
<sc> SQBS ERROR.
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Cc: Steffen Maier <maier@linux.ibm.com>
Fixes: 8129ee1642 ("[PATCH] s390: qdio V=V pass-through")
Cc: <stable@vger.kernel.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
The pointer to IPL Parameter Information Block is stored
in the absolute lowcore for later use by dump tools. That
pointer is a virtual address, though it should be physical
instead.
Note, this does not fix a real issue, since virtual and
physical addresses are currently the same.
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
When SMT thread CPUs are added to CPU masks the nr_cpu_ids
limit is not checked and could be exceeded. This leads to
a warning for example if CONFIG_DEBUG_PER_CPU_MAPS is set
and the command line parameter nr_cpus is set to 1.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Currently when the new channel-path is enabled, we do evaluation only
on the subchannels with a device connected on it. This is because,
in the past, if the device in the subchannel is not working or not
available, we used to unregister the subchannels. But, from the 'commit
2297791c92 ("s390/cio: dont unregister subchannel from child-drivers")'
we allow subchannels with or without an active device connected
on it. So, when we do the io_subchannel_verify, make sure that,
we are evaluating the subchannels without any device too.
Fixes: 2297791c92 ("s390/cio: dont unregister subchannel from child-drivers")
Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Set CONFIG_INIT_STACK_NONE=y in defconfigs to avoid the extra overhead of
initializing all stack variables by default. Users who want to have that
must change the configuration on their own.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Key blobs for the IOCTLs PKEY_KBLOB2PROTK[23] may contain clear key
material. Zeroize the copies of these keys in kernel memory after
creating the protected key.
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Pull compute express link fixes from Dan Williams:
- Fix a compilation issue with DEFINE_STATIC_SRCU() in the unit tests
- Fix leaking kernel memory to a root-only sysfs attribute
* tag 'cxl-fixes-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl: Add missing return to cdat read error path
tools/testing/cxl: Use DEFINE_STATIC_SRCU()
Pull parisc architecture fixes from Helge Deller:
- Fix encoding of swp_entry due to added SWP_EXCLUSIVE flag
- Include reboot.h to avoid gcc-12 compiler warning
* tag 'parisc-for-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix encoding of swp_entry due to added SWP_EXCLUSIVE flag
parisc: kexec: include reboot.h
Pull ARM fixes from Russell King:
- fix unwinder for uleb128 case
- fix kernel-doc warnings for HP Jornada 7xx
- fix unbalanced stack on vfp success path
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9297/1: vfp: avoid unbalanced stack on 'success' return path
ARM: 9296/1: HP Jornada 7XX: fix kernel-doc warnings
ARM: 9295/1: unwind:fix unwind abort for uleb128 case
Pull locking fix from Borislav Petkov:
- Make sure __down_read_common() is always inlined so that the callers'
names land in traceevents output and thus the blocked function can be
identified
* tag 'locking_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwsem: Add __always_inline annotation to __down_read_common() and inlined callers
Pull perf fixes from Borislav Petkov:
- Make sure the PEBS buffer is flushed before reprogramming the
hardware so that the correct record sizes are used
- Update the sample size for AMD BRS events
- Fix a confusion with using the same on-stack struct with different
events in the event processing path
* tag 'perf_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG
perf/x86: Fix missing sample size update on AMD BRS
perf/core: Fix perf_sample_data not properly initialized for different swevents in perf_tp_event()
Pull scheduler fix from Borislav Petkov:
- Fix a couple of kernel-doc warnings
* tag 'sched_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: fix cid_lock kernel-doc warnings
Pull x86 fix from Borislav Petkov:
- Add the required PCI IDs so that the generic SMN accesses provided by
amd_nb.c work for drivers which switch to them. Add a PCI device ID
to k10temp's table so that latter is loaded on such systems too
* tag 'x86_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hwmon: (k10temp) Add PCI ID for family 19, model 78h
x86/amd_nb: Add PCI ID for family 19h model 78h
Pull timer fix from Borislav Petkov:
- Prevent CPU state corruption when an active clockevent broadcast
device is replaced while the system is already in oneshot mode
* tag 'timers_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/broadcast: Make broadcast device replacement work correctly
Pull ext4 fixes from Ted Ts'o:
"Some ext4 bug fixes (mostly to address Syzbot reports)"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: bail out of ext4_xattr_ibody_get() fails for any reason
ext4: add bounds checking in get_max_inline_xattr_value_size()
ext4: add indication of ro vs r/w mounts in the mount message
ext4: fix deadlock when converting an inline directory in nojournal mode
ext4: improve error recovery code paths in __ext4_remount()
ext4: improve error handling from ext4_dirhash()
ext4: don't clear SB_RDONLY when remounting r/w until quota is re-enabled
ext4: check iomap type only if ext4_iomap_begin() does not fail
ext4: avoid a potential slab-out-of-bounds in ext4_group_desc_csum
ext4: fix data races when using cached status extents
ext4: avoid deadlock in fs reclaim with page writeback
ext4: fix invalid free tracking in ext4_xattr_move_to_block()
ext4: remove a BUG_ON in ext4_mb_release_group_pa()
ext4: allow ext4_get_group_info() to fail
ext4: fix lockdep warning when enabling MMP
ext4: fix WARNING in mb_find_extent
Pull SCSI fix from James Bottomley:
"A single small fix for the UFS driver to fix a power management
failure"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Fix I/O hang that occurs when BKOPS fails in W-LUN suspend
In ext4_update_inline_data(), if ext4_xattr_ibody_get() fails for any
reason, it's best if we just fail as opposed to stumbling on,
especially if the failure is EFSCORRUPTED.
Cc: stable@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Whether the file system is mounted read-only or read/write is more
important than the quota mode, which we are already printing. Add the
ro vs r/w indication since this can be helpful in debugging problems
from the console log.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If there are failures while changing the mount options in
__ext4_remount(), we need to restore the old mount options.
This commit fixes two problem. The first is there is a chance that we
will free the old quota file names before a potential failure leading
to a use-after-free. The second problem addressed in this commit is
if there is a failed read/write to read-only transition, if the quota
has already been suspended, we need to renable quota handling.
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20230506142419.984260-2-tytso@mit.edu
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When a file system currently mounted read/only is remounted
read/write, if we clear the SB_RDONLY flag too early, before the quota
is initialized, and there is another process/thread constantly
attempting to create a directory, it's possible to trigger the
WARN_ON_ONCE(dquot_initialize_needed(inode));
in ext4_xattr_block_set(), with the following stack trace:
WARNING: CPU: 0 PID: 5338 at fs/ext4/xattr.c:2141 ext4_xattr_block_set+0x2ef2/0x3680
RIP: 0010:ext4_xattr_block_set+0x2ef2/0x3680 fs/ext4/xattr.c:2141
Call Trace:
ext4_xattr_set_handle+0xcd4/0x15c0 fs/ext4/xattr.c:2458
ext4_initxattrs+0xa3/0x110 fs/ext4/xattr_security.c:44
security_inode_init_security+0x2df/0x3f0 security/security.c:1147
__ext4_new_inode+0x347e/0x43d0 fs/ext4/ialloc.c:1324
ext4_mkdir+0x425/0xce0 fs/ext4/namei.c:2992
vfs_mkdir+0x29d/0x450 fs/namei.c:4038
do_mkdirat+0x264/0x520 fs/namei.c:4061
__do_sys_mkdirat fs/namei.c:4076 [inline]
__se_sys_mkdirat fs/namei.c:4074 [inline]
__x64_sys_mkdirat+0x89/0xa0 fs/namei.c:4074
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20230506142419.984260-1-tytso@mit.edu
Reported-by: syzbot+6385d7d3065524c5ca6d@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=6513f6cb5cd6b5fc9f37e3bb70d273b94be9c34c
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Ext4 has a filesystem wide lock protecting ext4_writepages() calls to
avoid races with switching of journalled data flag or inode format. This
lock can however cause a deadlock like:
CPU0 CPU1
ext4_writepages()
percpu_down_read(sbi->s_writepages_rwsem);
ext4_change_inode_journal_flag()
percpu_down_write(sbi->s_writepages_rwsem);
- blocks, all readers block from now on
ext4_do_writepages()
ext4_init_io_end()
kmem_cache_zalloc(io_end_cachep, GFP_KERNEL)
fs_reclaim frees dentry...
dentry_unlink_inode()
iput() - last ref =>
iput_final() - inode dirty =>
write_inode_now()...
ext4_writepages() tries to acquire sbi->s_writepages_rwsem
and blocks forever
Make sure we cannot recurse into filesystem reclaim from writeback code
to avoid the deadlock.
Reported-by: syzbot+6898da502aef574c5f8a@syzkaller.appspotmail.com
Link: https://lore.kernel.org/all/0000000000004c66b405fa108e27@google.com
Fixes: c8585c6fca ("ext4: fix races between changing inode journal mode and ext4_writepages")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230504124723.20205-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Previously, ext4_get_group_info() would treat an invalid group number
as BUG(), since in theory it should never happen. However, if a
malicious attaker (or fuzzer) modifies the superblock via the block
device while it is the file system is mounted, it is possible for
s_first_data_block to get set to a very large number. In that case,
when calculating the block group of some block number (such as the
starting block of a preallocation region), could result in an
underflow and very large block group number. Then the BUG_ON check in
ext4_get_group_info() would fire, resutling in a denial of service
attack that can be triggered by root or someone with write access to
the block device.
For a quality of implementation perspective, it's best that even if
the system administrator does something that they shouldn't, that it
will not trigger a BUG. So instead of BUG'ing, ext4_get_group_info()
will call ext4_error and return NULL. We also add fallback code in
all of the callers of ext4_get_group_info() that it might NULL.
Also, since ext4_get_group_info() was already borderline to be an
inline function, un-inline it. The results in a next reduction of the
compiled text size of ext4 by roughly 2k.
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20230430154311.579720-2-tytso@mit.edu
Reported-by: syzbot+e2efa3efc15a1c9e95c3@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=69b28112e098b070f639efb356393af3ffec4220
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Pull block fixes from Jens Axboe:
"Just a few minor fixes for drivers, and a deletion of a file that is
woefully out-of-date these days"
* tag 'block-6.4-2023-05-13' of git://git.kernel.dk/linux:
Documentation/block: drop the request.rst file
ublk: fix command op code check
block/rnbd: replace REQ_OP_FLUSH with REQ_OP_WRITE
nbd: Fix debugfs_create_dir error checking
SYM_FUNC_START_LOCAL_NOALIGN() adds an endbr leading to this layout
(leaving only the last 2 bytes of the address):
3bff <zen_untrain_ret>:
3bff: f3 0f 1e fa endbr64
3c03: f6 test $0xcc,%bl
3c04 <__x86_return_thunk>:
3c04: c3 ret
3c05: cc int3
3c06: 0f ae e8 lfence
However, "the RET at __x86_return_thunk must be on a 64 byte boundary,
for alignment within the BTB."
Use SYM_START instead.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull more btrfs fixes from David Sterba:
- fix incorrect number of bitmap entries for space cache if loading is
interrupted by some error
- fix backref walking, this breaks a mode of LOGICAL_INO_V2 ioctl that
is used in deduplication tools
- zoned mode fixes:
- properly finish zone reserved for relocation
- correctly calculate super block zone end on ZNS
- properly initialize new extent buffer for redirty
- make mount option clear_cache work with block-group-tree, to rebuild
free-space-tree instead of temporarily disabling it that would lead
to a forced read-only mount
- fix alignment check for offset when printing extent item
* tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: make clear_cache mount option to rebuild FST without disabling it
btrfs: zero the buffer before marking it dirty in btrfs_redirty_list_add
btrfs: zoned: fix full zone super block reading on ZNS
btrfs: zoned: zone finish data relocation BG with last IO
btrfs: fix backref walking not returning all inode refs
btrfs: fix space cache inconsistency after error loading it from disk
btrfs: print-tree: parent bytenr must be aligned to sector size
Pull cifs client fixes from Steve French:
- fix for copy_file_range bug for very large files that are multiples
of rsize
- do not ignore "isolated transport" flag if set on share
- set rasize default better
- three fixes related to shutdown and freezing (fixes 4 xfstests, and
closes deferred handles faster in some places that were missed)
* tag '6.4-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: release leases for deferred close handles when freezing
smb3: fix problem remounting a share after shutdown
SMB3: force unmount was failing to close deferred close files
smb3: improve parallel reads of large files
do not reuse connection if share marked as isolated
cifs: fix pcchunk length type in smb2_copychunk_range
Pull vfs fix from Christian Brauner:
"During the pipe nonblock rework the check for both O_NONBLOCK and
IOCB_NOWAIT was dropped. Both checks need to be performed to ensure
that files without O_NONBLOCK but IOCB_NOWAIT don't block when writing
to or reading from a pipe.
This just contains the fix adding the check for IOCB_NOWAIT back in"
* tag 'vfs/v6.4-rc1/pipe' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
pipe: check for IOCB_NOWAIT alongside O_NONBLOCK
Pull io_uring fix from Jens Axboe:
"Just a single fix making io_uring_sqe_cmd() available regardless of
CONFIG_IO_URING, fixing a regression introduced during the merge
window if nvme was selected but io_uring was not"
* tag 'io_uring-6.4-2023-05-12' of git://git.kernel.dk/linux:
io_uring: make io_uring_sqe_cmd() unconditionally available
Pull RISC-V fix from Palmer Dabbelt:
"Just a single fix this week for a build issue. That'd usually be a
good sign, but we've started to get some reports of boot failures on
some hardware/bootloader configurations. Nothing concrete yet, but
I've got a funny feeling that's where much of the bug hunting is going
right now.
Nothing's reproducing on my end, though, and this fixes some pretty
concrete issues so I figured there's no reason to delay it:
- a fix to the linker script to avoid orpahaned sections in
kernel/pi"
* tag 'riscv-for-linus-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix orphan section warnings caused by kernel/pi