When filesystem's ->get_block function does not map the buffer head when
called from __mpage_writepage(), __mpage_writepage() will happily go and
pass bogus bdev and block number to bio allocation routines which leads to
crashes sooner or later.
E.g. UDF can do this because it doesn't want to allocate blocks from
->writepages callbacks. It allocates blocks on write or page fault but
writeback can still spot dirty buffers without underlying blocks allocated
e.g. if blocksize < pagesize, the tail page is dirtied (which means all
its buffers are dirtied), and truncate extends the file so that some
buffer starts to be within i_size.
Link: https://lkml.kernel.org/r/20230126085155.26395-1-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
ext4_update_bh_state. x86 CMPXCHG instruction returns success in ZF flag,
so this change saves a compare after cmpxchg (and related move instruction
in front of cmpxchg).
Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
fails. There is no need to re-read the value in the loop.
No functional change intended.
Link: https://lkml.kernel.org/r/20221102071147.6642-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "checkpatch.pl: warn about discouraged tags and missing Link:
tags", v4.
The first two changes make checkpatch.pl check for a few mistakes wrt to
links to bug reports Linus recently complained about a few times.
Avoiding those is also important for my regression tracking efforts a lot,
as the automated tracking performed by regzbot relies on the proper usage
of the Link: tag.
The third patch fixes a few small oddities noticed in existing code during
review of the two changes.
This patch (of 3):
Issue a warning when encountering URLs behind unknown tags, as Linus
recently stated ```please stop making up random tags that make no sense.
Just use "Link:"```[1]. That statement was triggered by an use of
'BugLink', but that's not the only tag people invented:
$ git log -100000 --no-merges --format=email -P \
--grep='^\w+:[ ]*http' | grep -Poh '^\w+:[ ]*http' | \
sort | uniq -c | sort -rn | head -n 20
103958 Link: http
418 BugLink: http
372 Patchwork: http
280 Closes: http
224 Bug: http
123 References: http
84 Bugzilla: http
61 URL: http
42 v1: http
38 Datasheet: http
20 v2: http
9 Ref: http
9 Fixes: http
9 Buglink: http
8 v3: http
8 Reference: http
7 See: http
6 1: http
5 link: http
3 Link:http
Some of these non-standard tags make it harder for external tools that
rely on use of proper tags. One of those tools is the regression tracking
bot 'regzbot', which looks out for "Link:" tags pointing to reports of
tracked regressions.
The initial idea was to use a disallow list to raise an error when
encountering known unwanted tags like BugLink:; during review it was
requested to use a list of allowed tags instead[2].
Link: https://lkml.kernel.org/r/cover.1674217480.git.linux@leemhuis.info
Link: https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/ [1]
Link: https://lore.kernel.org/all/15f7df96d49082fb7799dda6e187b33c84f38831.camel@perches.com/ [2]
Link: https://lkml.kernel.org/r/3b036087d80b8c0e07a46a1dbaaf4ad0d018f8d5.1674217480.git.linux@leemhuis.info
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Co-developed-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
{set,clear}_bits_ll. x86 CMPXCHG instruction returns success in ZF flag,
so this change saves a compare after cmpxchg (and related move instruction
in front of cmpxchg).
Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
fails.
Note that the value from *ptr should be read using READ_ONCE to prevent
the compiler from merging, refetching or reordering the read.
The patch also declares these two functions inline, to ensure inlining.
No functional change intended.
Link: https://lkml.kernel.org/r/20230118150703.4024-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
kexec allows replacing the current kernel with a different one. This is
usually a source of concerns for sysadmins that want to harden a system.
Linux already provides a way to disable loading new kexec kernel via
kexec_load_disabled, but that control is very coard, it is all or nothing
and does not make distinction between a panic kexec and a normal kexec.
This patch introduces new sysctl parameters, with finer tuning to specify
how many times a kexec kernel can be loaded. The sysadmin can set
different limits for kexec panic and kexec reboot kernels. The value can
be modified at runtime via sysctl, but only with a stricter value.
With these new parameters on place, a system with loadpin and verity
enabled, using the following kernel parameters:
sysctl.kexec_load_limit_reboot=0 sysct.kexec_load_limit_panic=1 can have a
good warranty that if initrd tries to load a panic kernel, a malitious
user will have small chances to replace that kernel with a different one,
even if they can trigger timeouts on the disk where the panic kernel
lives.
Link: https://lkml.kernel.org/r/20221114-disable-kexec-reset-v6-3-6a8531a09b9a@chromium.org
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Philipp Rudo <prudo@redhat.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Fix many W=1 kernel-doc warnings in fs/ntfs/:
fs/ntfs/aops.c:30: warning: Incorrect use of kernel-doc format: * ntfs_end_buffer_async_read - async io completion for reading attributes
fs/ntfs/aops.c:46: warning: expecting prototype for aops.c(). Prototype was for ntfs_end_buffer_async_read() instead
fs/ntfs/aops.c:1655: warning: cannot understand function prototype: 'const struct address_space_operations ntfs_normal_aops = '
fs/ntfs/aops.c:1670: warning: cannot understand function prototype: 'const struct address_space_operations ntfs_compressed_aops = '
fs/ntfs/aops.c:1685: warning: cannot understand function prototype: 'const struct address_space_operations ntfs_mst_aops = '
fs/ntfs/compress.c:22: warning: Incorrect use of kernel-doc format: * ntfs_compression_constants - enum of constants used in the compression code
fs/ntfs/compress.c:24: warning: cannot understand function prototype: 'typedef enum '
fs/ntfs/compress.c:47: warning: cannot understand function prototype: 'u8 *ntfs_compression_buffer; '
fs/ntfs/compress.c:52: warning: expecting prototype for ntfs_cb_lock(). Prototype was for DEFINE_SPINLOCK() instead
fs/ntfs/dir.c:21: warning: Incorrect use of kernel-doc format: * The little endian Unicode string $I30 as a global constant.
fs/ntfs/dir.c:23: warning: cannot understand function prototype: 'ntfschar I30[5] = '
fs/ntfs/inode.c:31: warning: Incorrect use of kernel-doc format: * ntfs_test_inode - compare two (possibly fake) inodes for equality
fs/ntfs/inode.c:47: warning: expecting prototype for inode.c(). Prototype was for ntfs_test_inode() instead
fs/ntfs/inode.c:2956: warning: expecting prototype for ntfs_write_inode(). Prototype was for __ntfs_write_inode() instead
fs/ntfs/mft.c:24: warning: expecting prototype for mft.c - NTFS kernel mft record operations. Part of the Linux(). Prototype was for MAX_BHS() instead
fs/ntfs/namei.c:263: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Inode operations for directories.
fs/ntfs/namei.c:368: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Export operations allowing NFS exporting of mounted NTFS partitions.
fs/ntfs/runlist.c:16: warning: Incorrect use of kernel-doc format: * ntfs_rl_mm - runlist memmove
fs/ntfs/runlist.c:22: warning: expecting prototype for runlist.c - NTFS runlist handling code. Part of the Linux(). Prototype was for ntfs_rl_mm() instead
fs/ntfs/super.c:61: warning: missing initial short description on line:
* simple_getbool -
fs/ntfs/super.c:2661: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* The complete super operations.
Link: https://lkml.kernel.org/r/20230109010041.21442-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Fix multiple kernel-doc warnings in freevxfs:
fs/freevxfs/vxfs_subr.c:45: warning: Function parameter or member 'mapping' not described in 'vxfs_get_page'
fs/freevxfs/vxfs_subr.c:45: warning: Excess function parameter 'ip' description in 'vxfs_get_page'
2 warnings
fs/freevxfs/vxfs_subr.c:101: warning: expecting prototype for vxfs_get_block(). Prototype was for vxfs_getblk() instead
fs/freevxfs/vxfs_super.c:184: warning: expecting prototype for vxfs_read_super(). Prototype was for vxfs_fill_super() instead
Link: https://lkml.kernel.org/r/20230109022915.17504-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This command provides a way to traverse the entire page hierarchy by a
given virtual address on x86. In addition to qemu's commands info
tlb/info mem it provides the complete information about the paging
structure for an arbitrary virtual address. It supports 4KB/2MB/1GB and 5
level paging.
Here is an example output for 2MB success translation:
(gdb) translate-vm address
cr3:
cr3 binary data 0x1085be003
next entry physical address 0x1085be000
---
bit 3 page level write through False
bit 4 page level cache disabled False
level 4:
entry address 0xffff8881085be7f8
page entry binary data 0x800000010ac83067
next entry physical address 0x10ac83000
---
bit 0 entry present True
bit 1 read/write access allowed True
bit 2 user access allowed True
bit 3 page level write through False
bit 4 page level cache disabled False
bit 5 entry has been accessed True
bit 7 page size False
bit 11 restart to ordinary False
bit 63 execute disable True
level 3:
entry address 0xffff88810ac83a48
page entry binary data 0x101af7067
next entry physical address 0x101af7000
---
bit 0 entry present True
bit 1 read/write access allowed True
bit 2 user access allowed True
bit 3 page level write through False
bit 4 page level cache disabled False
bit 5 entry has been accessed True
bit 7 page size False
bit 11 restart to ordinary False
bit 63 execute disable False
level 2:
entry address 0xffff888101af7368
page entry binary data 0x80000001634008e7
page size 2MB
page physical address 0x163400000
---
bit 0 entry present True
bit 1 read/write access allowed True
bit 2 user access allowed True
bit 3 page level write through False
bit 4 page level cache disabled False
bit 5 entry has been accessed True
bit 7 page size True
bit 6 page dirty True
bit 8 global translation False
bit 11 restart to ordinary True
bit 12 pat False
bits (59, 62) protection key 0
bit 63 execute disable True
[dmitrii.bundin.a@gmail.com: add SPDX line, other tweaks]
Link: https://lkml.kernel.org/r/20230113175151.22278-1-dmitrii.bundin.a@gmail.com
[akpm@linux-foundation.org: s/physicall/physical/]
Link: https://lkml.kernel.org/r/20230102171014.31408-1-dmitrii.bundin.a@gmail.com
Signed-off-by: Dmitrii Bundin <dmitrii.bundin.a@gmail.com>
Acked by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When working on SoC bring-up, (a full) userspace may not be available,
making it hard to benchmark the CPU performance of the system under
development. Still, one may want to have a rough idea of the (relative)
performance of one or more CPU cores, especially when working on e.g. the
clock driver that controls the CPU core clock(s).
Hence make the classical Dhrystone 2.1 benchmark available as a Linux
kernel test module, based on[1].
When built-in, this benchmark can be run without any userspace present.
Parallel runs (run on multiple CPU cores) are supported, just kick the
"run" file multiple times.
Note that the actual figures depend on the configuration options that
control compiler optimization (e.g. CONFIG_CC_OPTIMIZE_FOR_SIZE vs.
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE), and on the compiler options used when
building the kernel in general. Hence numbers may differ from those
obtained by running similar benchmarks in userspace.
[1] https://github.com/qris/dhrystone-deb.git
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lkml.kernel.org/r/4d07ad990740a5f1e426ce4566fb514f60ec9bdd.1670509558.git.geert+renesas@glider.be
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Gow <davidgow@google.com>
[geert+renesas@glider.be: fix uninitialized use of ret]
Link: https://lkml.kernel.org/r/alpine.DEB.2.22.394.2212190857310.137329@ramsan.of.borg
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull x86 fixes from Borislav Petkov:
- Make sure the poking PGD is pinned for Xen PV as it requires it this
way
- Fixes for two resctrl races when moving a task or creating a new
monitoring group
- Fix SEV-SNP guests running under HyperV where MTRRs are disabled to
not return a UC- type mapping type on memremap() and thus cause a
serious slowdown
- Fix insn mnemonics in bioscall.S now that binutils is starting to fix
confusing insn suffixes
* tag 'x86_urgent_for_v6.2_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: fix poking_init() for Xen PV guests
x86/resctrl: Fix event counts regression in reused RMIDs
x86/resctrl: Fix task CLOSID/RMID update race
x86/pat: Fix pat_x_mtrr_type() for MTRR disabled case
x86/boot: Avoid using Intel mnemonics in AT&T syntax asm
Pull EDAC fixes from Borislav Petkov:
- Fix the EDAC device's confusion in the polling setting units
- Fix a memory leak in highbank's probing function
* tag 'edac_urgent_for_v6.2_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/highbank: Fix memory leak in highbank_mc_probe()
EDAC/device: Fix period calculation in edac_device_reset_delay_period()
Pull powerpc fixes from Michael Ellerman:
- Fix a build failure with some versions of ld that have an odd version
string
- Fix incorrect use of mutex in the IMC PMU driver
Thanks to Kajol Jain, Michael Petlan, Ojaswin Mujoo, Peter Zijlstra, and
Yang Yingliang.
* tag 'powerpc-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s/hash: Make stress_hpt_timer_fn() static
powerpc/imc-pmu: Fix use of mutex in IRQs disabled section
powerpc/boot: Fix incorrect version calculation issue in ld_version
Pull memblock fix from Mike Rapoport:
"memblock: always release pages to the buddy allocator in
memblock_free_late()
If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
only releases pages to the buddy allocator if they are not in the
deferred range. This is correct for free pages (as defined by
for_each_free_mem_pfn_range_in_zone()) because free pages in the
deferred range will be initialized and released as part of the
deferred init process.
memblock_free_pages() is called by memblock_free_late(), which is used
to free reserved ranges after memblock_free_all() has run. All pages
in reserved ranges have been initialized at that point, and
accordingly, those pages are not touched by the deferred init process.
This means that currently, if the pages that memblock_free_late()
intends to release are in the deferred range, they will never be
released to the buddy allocator. They will forever be reserved.
In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
which is also correct for free pages but is not correct for reserved
pages. KMSAN metadata for reserved pages is initialized by
kmsan_init_shadow(), which runs shortly before memblock_free_all().
For both of these reasons, memblock_free_pages() should only be called
for free pages, and memblock_free_late() should call
__free_pages_core() directly instead.
One case where this issue can occur in the wild is EFI boot on x86_64.
The x86 EFI code reserves all EFI boot services memory ranges via
memblock_reserve() and frees them later via memblock_free_late()
(efi_reserve_boot_services() and efi_free_boot_services(),
respectively).
If any of those ranges happens to fall within the deferred init range,
the pages will not be released and that memory will be unavailable.
For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:
v6.2-rc2:
Node 0, zone DMA
spanned 4095
present 3999
managed 3840
Node 0, zone DMA32
spanned 246652
present 245868
managed 178867
v6.2-rc2 + patch:
Node 0, zone DMA
spanned 4095
present 3999
managed 3840
Node 0, zone DMA32
spanned 246652
present 245868
managed 222816 # +43,949 pages"
* tag 'fixes-2023-01-14' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
mm: Always release pages to the buddy allocator in memblock_free_late().
Pull kernel hardening fixes from Kees Cook:
- Fix CFI hash randomization with KASAN (Sami Tolvanen)
- Check size of coreboot table entry and use flex-array
* tag 'hardening-v6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
kbuild: Fix CFI hash randomization with KASAN
firmware: coreboot: Check size of table entry and use flex-array
Pull module fix from Luis Chamberlain:
"Just one fix for modules by Nick"
* tag 'modules-6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
kallsyms: Fix scheduling with interrupts disabled in self-test
Pull cifs fixes from Steve French:
- memory leak and double free fix
- two symlink fixes
- minor cleanup fix
- two smb1 fixes
* tag '6.2-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix uninitialized memory read for smb311 posix symlink create
cifs: fix potential memory leaks in session setup
cifs: do not query ifaces on smb1 mounts
cifs: fix double free on failed kerberos auth
cifs: remove redundant assignment to the variable match
cifs: fix file info setting in cifs_open_file()
cifs: fix file info setting in cifs_query_path_info()
Pull SCSI fixes from James Bottomley:
"Two minor fixes in the hisi_sas driver which only impact enterprise
style multi-expander and shared disk situations and no core changes"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: hisi_sas: Set a port invalid only if there are no devices attached when refreshing port id
scsi: hisi_sas: Use abort task set to reset SAS disks when discovered
Pull ATA fix from Damien Le Moal:
"A single fix to prevent building the pata_cs5535 driver with user mode
linux as it uses msr operations that are not defined with UML"
* tag 'ata-6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: pata_cs5535: Don't build on UML
Pull block fixes from Jens Axboe:
"Nothing major in here, just a collection of NVMe fixes and dropping a
wrong might_sleep() that static checkers tripped over but which isn't
valid"
* tag 'block-6.2-2023-01-13' of git://git.kernel.dk/linux:
MAINTAINERS: stop nvme matching for nvmem files
nvme: don't allow unprivileged passthrough on partitions
nvme: replace the "bool vec" arguments with flags in the ioctl path
nvme: remove __nvme_ioctl
nvme-pci: fix error handling in nvme_pci_enable()
nvme-pci: add NVME_QUIRK_IDENTIFY_CNS quirk to Apple T2 controllers
nvme-apple: add NVME_QUIRK_IDENTIFY_CNS quirk to fix regression
block: Drop spurious might_sleep() from blk_put_queue()