Commit Graph

1215403 Commits

Author SHA1 Message Date
Ard Biesheuvel
e7761d827e efi/unaccepted: Use ACPI reclaim memory for unaccepted memory table
Kirill reports that crashkernels fail to work on confidential VMs that
rely on the unaccepted memory table, and this appears to be caused by
the fact that it is not considered part of the set of firmware tables
that the crashkernel needs to map.

This is an oversight, and a result of the use of the EFI_LOADER_DATA
memory type for this table. The correct memory type to use for any
firmware table is EFI_ACPI_RECLAIM_MEMORY (including ones created by the
EFI stub), even though the name suggests that it is specific to ACPI.
ACPI reclaim means that the memory is used by the firmware to expose
information to the operating system, but that the memory region has no
special significance to the firmware itself, and the OS is free to
reclaim the memory and use it as ordinary memory if it is not interested
in the contents, or if it has already consumed them. In Linux, this
memory is never reclaimed, but it is always covered by the kernel direct
map and generally made accessible as ordinary memory.

On x86, ACPI reclaim memory is translated into E820_ACPI, which the
kexec logic already recognizes as memory that the crashkernel may need
to access, and so it will be mapped and accessible to the booting
crash kernel.

Fixes: 745e3ed85f ("efi/libstub: Implement support for unaccepted memory")
Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-11 06:37:51 +00:00
Ard Biesheuvel
aba7e066c7 efi/x86: Ensure that EFI_RUNTIME_MAP is enabled for kexec
CONFIG_EFI_RUNTIME_MAP needs to be enabled in order for kexec to be able
to provide the required information about the EFI runtime mappings to
the incoming kernel, regardless of whether kexec_load() or
kexec_file_load() is being used. Without this information, kexec boot in
EFI mode is not possible.

The CONFIG_EFI_RUNTIME_MAP option is currently directly configurable if
CONFIG_EXPERT is enabled, so that it can be turned on for debugging
purposes even if KEXEC is not enabled. However, the upshot of this is
that it can also be disabled even when it shouldn't.

So tweak the Kconfig declarations to avoid this situation.

Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-11 06:37:50 +00:00
Ard Biesheuvel
762f169f5d efi/x86: Move EFI runtime call setup/teardown helpers out of line
Only the arch_efi_call_virt() macro that some architectures override
needs to be a macro, given that it is variadic and encapsulates calls
via function pointers that have different prototypes.

The associated setup and teardown code are not special in this regard,
and don't need to be instantiated at each call site. So turn them into
ordinary C functions and move them out of line.
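
The split described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct services`, `call_virt`, `call_wrapped`, `call_setup` and `call_teardown` are hypothetical names invented for this sketch, not the kernel's actual helpers, and the wrapper relies on GNU C extensions (`__typeof__` and statement expressions).

```c
/* Hypothetical service table: function pointers with different prototypes. */
struct services {
    int  (*get_time)(int *out);
    long (*set_var)(const char *name, long value);
};

static int setup_calls, teardown_calls;

/* Setup and teardown are the same for every call, so plain functions do. */
void call_setup(void)    { setup_calls++; }
void call_teardown(void) { teardown_calls++; }

/*
 * Only the call itself has to be a macro: it is variadic and forwards to
 * pointers whose prototypes differ.
 */
#define call_virt(tbl, fn, ...) ((tbl)->fn(__VA_ARGS__))

/* Wrapper that brackets the call with the out-of-line helpers (GNU C). */
#define call_wrapped(tbl, fn, ...)                          \
({                                                          \
    __typeof__(call_virt(tbl, fn, __VA_ARGS__)) ret_;       \
    call_setup();                                           \
    ret_ = call_virt(tbl, fn, __VA_ARGS__);                 \
    call_teardown();                                        \
    ret_;                                                   \
})

/* Demo implementations, purely for illustration. */
static int demo_get_time(int *out) { *out = 42; return 0; }
static long demo_set_var(const char *name, long value)
{
    return name[0] ? value : -1;
}

struct services demo = { demo_get_time, demo_set_var };
```

A caller would write, for example, `call_wrapped(&demo, set_var, "x", 7L)`; the setup and teardown code is emitted once in the binary instead of at every call site.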

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-11 06:37:50 +00:00
Christophe JAILLET
e97eb65dd4 ata: sata_mv: Fix incorrect string length computation in mv_dump_mem()
snprintf() returns the "number of characters which *would* be generated for
the given input", not the size *really* generated.

In order to avoid too large values for 'o' (and potential negative values
for "sizeof(linebuf) - o") use scnprintf() instead of snprintf().

Note that given the "w < 4" in the for loop, the buffer can NOT
overflow, but using the *right* function is always better.
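
The difference between the two return values can be shown in userspace C. This is a minimal sketch, not the driver code: `clamped_snprintf()` is a hypothetical approximation of the kernel's scnprintf() written for this example, and `dump_words()` only mimics the "o += ..." accumulation pattern.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace approximation of scnprintf(): returns the number of
 * characters actually stored in buf, excluding the trailing NUL.
 */
int clamped_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    if (size == 0)
        return 0;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (n < 0)
        return 0;               /* output error: count nothing */
    if ((size_t)n >= size)
        return (int)(size - 1); /* output was truncated */
    return n;
}

/* Append `count` words to linebuf; `o` can never exceed size - 1. */
size_t dump_words(char *linebuf, size_t size, const unsigned *words, int count)
{
    size_t o = 0;

    for (int w = 0; w < count; w++)
        o += clamped_snprintf(linebuf + o, size - o, "%08x ", words[w]);
    return o;
}
```

With plain snprintf() the same loop would add 9 to `o` on every iteration regardless of truncation, so `size - o` could wrap around once the buffer is full.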

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-11 15:13:35 +09:00
Niklas Cassel
24e0e61db3 ata: libata: disallow dev-initiated LPM transitions to unsupported states
In AHCI 1.3.1, the register description for CAP.SSC:
"When cleared to ‘0’, software must not allow the HBA to initiate
transitions to the Slumber state via aggressive link power management nor
the PxCMD.ICC field in each port, and the PxSCTL.IPM field in each port
must be programmed to disallow device initiated Slumber requests."

In AHCI 1.3.1, the register description for CAP.PSC:
"When cleared to ‘0’, software must not allow the HBA to initiate
transitions to the Partial state via aggressive link power management nor
the PxCMD.ICC field in each port, and the PxSCTL.IPM field in each port
must be programmed to disallow device initiated Partial requests."

Ensure that we always set the corresponding bits in PxSCTL.IPM, such that
a device is not allowed to initiate transitions to power states which are
unsupported by the HBA.

DevSleep is always initiated by the HBA, however, for completeness, set the
corresponding bit in PxSCTL.IPM such that aggressive link power management
cannot transition to DevSleep if DevSleep is not supported.

sata_link_scr_lpm() is used by libahci, ata_piix and libata-pmp.
However, only libahci has the ability to read the CAP/CAP2 register to see
if these features are supported. Therefore, in order to not introduce any
regressions on ata_piix or libata-pmp, create flags that indicate that the
respective feature is NOT supported. This way, the behavior for ata_piix
and libata-pmp should remain unchanged.
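
The "NOT supported" flag approach can be sketched as follows. This is an illustration only: the flag names are hypothetical, and the IPM layout (SControl bits 11:8, with one "disallow" bit each for Partial, Slumber and DevSleep) is an assumption stated here rather than something taken from the commit text.

```c
#include <stdint.h>

/* Hypothetical "feature NOT supported" host flags, for this sketch only. */
#define HOST_NO_PARTIAL   (1u << 0)
#define HOST_NO_SLUMBER   (1u << 1)
#define HOST_NO_DEVSLEEP  (1u << 2)

/* Assumed IPM layout: SControl bits 11:8, one disallow bit per state. */
#define IPM_SHIFT         8
#define IPM_NO_PARTIAL    (0x1u << IPM_SHIFT)
#define IPM_NO_SLUMBER    (0x2u << IPM_SHIFT)
#define IPM_NO_DEVSLEEP   (0x4u << IPM_SHIFT)
#define IPM_MASK          (0xfu << IPM_SHIFT)

/*
 * Start from "no restrictions", then disallow every transition the
 * host has flagged as unsupported. A host that sets no flags (the
 * drivers that cannot read CAP/CAP2) keeps the unrestricted value.
 */
uint32_t ipm_for_host(uint32_t scontrol, unsigned int host_flags)
{
    scontrol &= ~IPM_MASK;

    if (host_flags & HOST_NO_PARTIAL)
        scontrol |= IPM_NO_PARTIAL;
    if (host_flags & HOST_NO_SLUMBER)
        scontrol |= IPM_NO_SLUMBER;
    if (host_flags & HOST_NO_DEVSLEEP)
        scontrol |= IPM_NO_DEVSLEEP;

    return scontrol;
}
```

Because the flags express the absence of a feature, the zero value means "unchanged behavior", which is why drivers that never set them are not affected.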

This change is based on a patch originally submitted by Runa Guo-oc.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Fixes: 1152b2617a ("libata: implement sata_link_scr_lpm() and make ata_dev_set_feature() global")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-11 15:09:11 +09:00
Linus Torvalds
0bb80ecc33 Linux 6.6-rc1 v6.6-rc1 2023-09-10 16:28:41 -07:00
Linus Torvalds
1548b060d6 Merge tag 'topic/drm-ci-2023-08-31-1' of git://anongit.freedesktop.org/drm/drm
Pull drm ci scripts from Dave Airlie:
 "This is a bunch of ci integration for the freedesktop gitlab instance
  where we currently do upstream userspace testing on diverse sets of
  GPU hardware. From my perspective I think it's an experiment worth
  going with and seeing how the benefits/noise play out keeping these
  files useful.

  Ideally I'd like to get this so we can do pre-merge testing on PRs
  eventually.

  Below is some info from danvet on why we've ended up making the
  decision and how we can roll it back if we decide it was a bad plan.

  Why in upstream?

   - like documentation, testcases, tools CI integration is one of these
     things where you can waste endless amounts of time if you
     accidentally have a version that doesn't match your source code

   - but also like the above, there's a balance, this is the initial cut
     of what we think makes sense to keep in sync vs out-of-tree,
     probably needs adjustment

   - gitlab supports out-of-repo gitlab integration and that's what's
     been used for the kernel in drm, but it results in per-driver
     fragmentation and lots of duplicated effort. the simple act of
     smashing an arbitrary winner into a topic branch already started
     surfacing patches on dri-devel and sparking good cross driver team
     discussions

  Why gitlab?

   - it's not any more shit than any of the other CI

   - drm userspace uses it extensively for everything in userspace, we
     have a lot of people and experience with this, including
     integration of hw testing labs

   - media userspace like gstreamer is also on gitlab.fd.o, and there's
     discussion to extend this to the media subsystem in some fashion

  Can this be shared?

   - there's definitely a pile of code that could move to scripts/ if
     other subsystems adopt ci integration in upstream kernel git. other
     bits are more drm/gpu specific like the igt-gpu-tests/tools
     integration

   - docker images can be run locally or in other CI runners

  Will we regret this?

   - it's all in one directory, intentionally, for easy deletion

   - probably 1-2 years in upstream to see whether this is worth it or a
     Big Mistake. that's roughly what it took to _really_ roll out solid
     CI in the bigger userspace projects we have on gitlab.fd.o like
     mesa3d"

* tag 'topic/drm-ci-2023-08-31-1' of git://anongit.freedesktop.org/drm/drm:
  drm: ci: docs: fix build warning - add missing escape
  drm: Add initial ci/ subdirectory
2023-09-10 11:55:26 -07:00
David S. Miller
6eadb0b3d0 Merge branch 'smc-r-fixes'
Guangguan Wang says:

====================
Two fixes for SMC-R
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-10 19:31:43 +01:00
Guangguan Wang
f5146e3ef0 net/smc: use smc_lgr_list.lock to protect smc_lgr_list.list iterate in smcr_port_add
While smcr_port_add is running, link groups may be added to or deleted
from smc_lgr_list.list at the same time, which may result in a kernel crash.
So, use smc_lgr_list.lock to protect the iteration over smc_lgr_list.list
in smcr_port_add.

The crash call trace is shown below:
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 PID: 559726 Comm: kworker/0:92 Kdump: loaded Tainted: G
Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 449e491 04/01/2014
Workqueue: events smc_ib_port_event_work [smc]
RIP: 0010:smcr_port_add+0xa6/0xf0 [smc]
RSP: 0000:ffffa5a2c8f67de0 EFLAGS: 00010297
RAX: 0000000000000001 RBX: ffff9935e0650000 RCX: 0000000000000000
RDX: 0000000000000010 RSI: ffff9935e0654290 RDI: ffff9935c8560000
RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9934c0401918
R10: 0000000000000000 R11: ffffffffb4a5c278 R12: ffff99364029aae4
R13: ffff99364029aa00 R14: 00000000ffffffed R15: ffff99364029ab08
FS:  0000000000000000(0000) GS:ffff994380600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000f06a10003 CR4: 0000000002770ef0
PKRU: 55555554
Call Trace:
 smc_ib_port_event_work+0x18f/0x380 [smc]
 process_one_work+0x19b/0x340
 worker_thread+0x30/0x370
 ? process_one_work+0x340/0x340
 kthread+0x114/0x130
 ? __kthread_cancel_work+0x50/0x50
 ret_from_fork+0x1f/0x30

Fixes: 1f90a05d9f ("net/smc: add smcr_port_add() and smcr_link_up() processing")
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-10 19:31:42 +01:00
Guangguan Wang
6912e72483 net/smc: bugfix for smcr v2 server connect success statistic
In the macro SMC_STAT_SERV_SUCC_INC, the smcd_version is used
to determine whether to increase the v1 statistic or the v2
statistic. It is correct for SMCD. But for SMCR, smcr_version
should be used.

Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-10 19:31:42 +01:00
Ratheesh Kannoth
88e69af061 octeontx2-pf: Fix page pool cache index corruption.
The access to page pool `cache' array and the `count' variable
is not locked. Page pool cache access is fine as long as there
is only one consumer per pool.

The octeontx2 driver fills in rx buffers from the page pool in NAPI context.
If the system is stressed and cannot allocate buffers, the refilling work
is delegated to a delayed workqueue. This means that there are
two consumers of the page pool cache.

Either the workqueue or IRQ/NAPI can run on another CPU. This leads
to lockless access, and hence corruption of the cache pool indexes.

To fix this issue, NAPI is rescheduled from workqueue context to refill
rx buffers.

Fixes: b2e3406a38 ("octeontx2-pf: Add support for page pool")
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-10 19:29:15 +01:00
Jinjie Ruan
281f65d29d net: microchip: vcap api: Fix possible memory leak for vcap_dup_rule()
When fault injection is used with CONFIG_VCAP_KUNIT_TEST selected, the
memory leak below occurs. If kzalloc() for duprule succeeds, but the following
kmemdup() fails, the duprule, ckf and caf memory will be leaked. So kfree
them in the error path.

unreferenced object 0xffff122744c50600 (size 192):
  comm "kunit_try_catch", pid 346, jiffies 4294896122 (age 911.812s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 1e 00 00 00 2c 01 00 00  .'..........,...
    00 00 00 00 00 00 00 00 18 06 c5 44 27 12 ff ff  ...........D'...
  backtrace:
    [<00000000394b0db8>] __kmem_cache_alloc_node+0x274/0x2f8
    [<0000000001bedc67>] kmalloc_trace+0x38/0x88
    [<00000000b0612f98>] vcap_dup_rule+0x50/0x460
    [<000000005d2d3aca>] vcap_add_rule+0x8cc/0x1038
    [<00000000eef9d0f8>] test_vcap_xn_rule_creator.constprop.0.isra.0+0x238/0x494
    [<00000000cbda607b>] vcap_api_rule_remove_in_front_test+0x1ac/0x698
    [<00000000c8766299>] kunit_try_run_case+0xe0/0x20c
    [<00000000c4fe9186>] kunit_generic_run_threadfn_adapter+0x50/0x94
    [<00000000f6864acf>] kthread+0x2e8/0x374
    [<0000000022e639b3>] ret_from_fork+0x10/0x20
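
The error-path cleanup described above follows a common C pattern, sketched here in userspace. Everything in this block is hypothetical: `struct rule`, `dup_rule()` and the fault-injecting `test_alloc()` are stand-ins invented for the example and do not reproduce vcap_dup_rule().

```c
#include <stdlib.h>

struct field { int key; };
struct rule  { struct field *keys; struct field *actions; };

static int live_allocs;               /* allocations not yet freed */
static int allocs_until_failure = -1; /* -1: never inject a failure */

/* Allocator with simple fault injection, for the sketch only. */
static void *test_alloc(size_t size)
{
    void *p;

    if (allocs_until_failure == 0)
        return NULL;
    if (allocs_until_failure > 0)
        allocs_until_failure--;
    p = malloc(size);
    if (p)
        live_allocs++;
    return p;
}

static void test_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

void free_rule(struct rule *r)
{
    if (!r)
        return;
    test_free(r->actions);
    test_free(r->keys);
    test_free(r);
}

/* Duplicate a rule; on any failure, release everything allocated so far. */
struct rule *dup_rule(const struct rule *src)
{
    struct rule *dup = test_alloc(sizeof(*dup));

    if (!dup)
        return NULL;
    dup->keys = NULL;
    dup->actions = NULL;

    dup->keys = test_alloc(sizeof(*dup->keys));
    if (!dup->keys)
        goto err;
    *dup->keys = *src->keys;

    dup->actions = test_alloc(sizeof(*dup->actions));
    if (!dup->actions)
        goto err;
    *dup->actions = *src->actions;

    return dup;

err:
    free_rule(dup); /* without this, dup and dup->keys would leak */
    return NULL;
}
```

Setting `allocs_until_failure = 2` makes the third allocation fail, which exercises the same "first allocation succeeded, later one failed" path that the leak report points at.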

Fixes: 814e769320 ("net: microchip: vcap api: Add a storage state to a VCAP rule")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-10 18:51:42 +01:00
Julia Lawall
e73d1ab6cd net: bcmasp: add missing of_node_put
for_each_available_child_of_node performs an of_node_get
on each iteration, so a break out of the loop requires an
of_node_put.
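
The reference-counting rule can be modelled in userspace C. This is a mock, not the device tree API: `struct node`, `node_get()`, `node_put()`, `next_child()` and `for_each_mock_child()` are hypothetical names that only imitate an iterator which takes a reference on each child and drops it when advancing.

```c
#include <stddef.h>

struct node { int refcount; int id; };

struct node *node_get(struct node *n) { if (n) n->refcount++; return n; }
void node_put(struct node *n)         { if (n) n->refcount--; }

/* Take a reference on the next child and drop the one on the previous. */
struct node *next_child(struct node *children, int count, struct node *prev)
{
    struct node *next = NULL;

    if (!prev) {
        if (count > 0)
            next = &children[0];
    } else if ((prev - children) + 1 < count) {
        next = prev + 1;
    }
    node_get(next);
    node_put(prev);
    return next;
}

#define for_each_mock_child(children, count, child)           \
    for (child = next_child(children, count, NULL); child;    \
         child = next_child(children, count, child))

/* Returns 1 if a child with `id` exists. */
int find_child(struct node *children, int count, int id, int put_on_break)
{
    struct node *child;
    int found = 0;

    for_each_mock_child(children, count, child) {
        if (child->id == id) {
            found = 1;
            if (put_on_break)
                node_put(child); /* balance the iterator's get */
            break;
        }
    }
    return found;
}
```

When the loop runs to completion the iterator drops the last reference itself; only the early exit leaves one reference held, which is the case the patch addresses.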

This was done using the Coccinelle semantic patch
iterators/for_each_child.cocci

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-10 18:50:37 +01:00
Juntong Deng
ced33ca07d selftests/net: Improve bind_bhash.sh to accommodate predictable network interface names
Starting with v197, systemd uses predictable network interface names and
the traditional interface naming scheme (eth0) is deprecated, so
it cannot be assumed that the eth0 interface exists on the host.

This modification makes the bind_bhash test program run in a separate
network namespace and no longer needs to consider the name of the
network interface on the host.

Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-10 18:49:29 +01:00
Linus Torvalds
e56b2b6057 Merge tag 'x86-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Fix preemption delays in the SGX code, remove unnecessarily
  UAPI-exported code, fix a ld.lld linker (in)compatibility quirk and
  make the x86 SMP init code a bit more conservative to fix kexec()
  lockups"

* tag 'x86-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sgx: Break up long non-preemptible delays in sgx_vepc_release()
  x86: Remove the arch_calc_vm_prot_bits() macro from the UAPI
  x86/build: Fix linker fill bytes quirk/incompatibility for ld.lld
  x86/smp: Don't send INIT to non-present and non-booted CPUs
2023-09-10 10:39:31 -07:00
Linus Torvalds
e79dbf03d8 Merge tag 'perf-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf event fix from Ingo Molnar:
 "Work around a firmware bug in the uncore PMU driver, affecting certain
  Intel systems"

* tag 'perf-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/uncore: Correct the number of CHAs on EMR
2023-09-10 10:34:46 -07:00
Linus Torvalds
535a265d7f Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
 "perf tools maintainership:

   - Add git information for perf-tools and perf-tools-next trees and
     branches to the MAINTAINERS file. That is where development now
     takes place and myself and Namhyung Kim have write access, more
     people to come as we emulate other maintainer groups.

  perf record:

   - Record kernel data maps when 'perf record --data' is used, so that
     global variables can be resolved and used in tools that do data
     profiling.

  perf trace:

   - Remove the old, experimental support for BPF events in which a .c
     file was passed as an event: "perf trace -e hello.c" to then get
     compiled and loaded.

     The only known usage for that, that shipped with the kernel as an
     example for such events, augmented the raw_syscalls tracepoints and
     was converted to a libbpf skeleton, reusing all the user space
     components and the BPF code connected to the syscalls.

     In the end just the way to glue the BPF part and the user space
     type beautifiers changed, now being performed by libbpf skeletons.

     The next step is to use BTF to do pretty printing of all syscall
     types, as discussed with Alan Maguire and others.

     Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all
     path/filenames/strings, some of the networking data structures,
     perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls
     and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5
     seconds:

      # perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5
         0.000 (   9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
         9.039 (   0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
             ? (           ): gpm/991  ... [continued]: clock_nanosleep())               = 0
        10.133 (           ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ...
             ? (           ): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
        30.276 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
       223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
        30.276 (2000.394 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
      1230.814 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
      1230.814 (1000.404 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
      2030.886 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
      2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
             ? (           ): crond/1172  ... [continued]: clock_nanosleep())            = 0
      3242.699 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
      2030.886 (2000.385 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
      3728.078 (           ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ...
      3242.699 (1000.158 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
      4031.409 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
        10.133 (5000.375 ms): sleep/327642  ... [continued]: clock_nanosleep())          = 0

      Performance counter stats for 'sleep 5':

             2,617,347      cycles
             1,855,997      instructions                     #    0.71  insn per cycle

           5.002282128 seconds time elapsed

           0.000855000 seconds user
           0.000852000 seconds sys

  perf annotate:

   - Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1)
     for licensing reasons, and we missed a build test on
     tools/perf/tests makefile.

     Since we now default to NDEBUG=1, we ended up segfaulting when
     building with BUILD_NONDISTRO=1 because a needed initialization
     routine was being "error checked" via an assert.

     Fix it by explicitly checking the result and aborting instead if it
     fails.

     We better back propagate the error, but at least 'perf annotate' on
     samples collected for a BPF program is back working when perf is
     built with BUILD_NONDISTRO=1.

  perf report/top:

   - Add back TUI hierarchy mode header, that is seen when using 'perf
     report/top --hierarchy'.

   - Fix the number of entries for 'e' key in the TUI that was
     preventing navigation of lines when expanding an entry.

  perf report/script:

   - Support cross platform register handling, allowing a perf.data file
     collected on one architecture to have registers sampled correctly
     displayed when analysis tools such as 'perf report' and 'perf
     script' are used on a different architecture.

   - Fix handling of event attributes in pipe mode, i.e. when one uses:

  	perf record -o - | perf report -i -

     When no perf.data files are used.

   - Handle files generated via pipe mode with a version of perf and
     then read also via pipe mode with a different version of perf,
     where the event attr record may have changed, use the record size
     field to properly support this version mismatch.

  perf probe:

   - Accessing global variables from uprobes isn't supported, make the
     error message state that instead of stating that some minimal
     kernel version is needed to have that feature. This seems just a
     tool limitation, the kernel probably has all that is needed.

  perf tests:

   - Fix a reference count related leak in the dlfilter v0 API where the
     result of a thread__find_symbol_fb() is not matched with an
     addr_location__exit() to drop the reference counts of the resolved
     components (machine, thread, map, symbol, etc). Add a dlfilter test
     to make sure that doesn't regress.

   - Lots of fixes for the 'perf test' written in shell script related
     to problems found with the shellcheck utility.

   - Fixes for 'perf test' shell scripts testing features enabled when
     perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf
     counters.

   - Add perf record sample filtering test, things like the following
     example, that gets implemented as a BPF filter attached to the
     event:

       # perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'

   - Improve the way the task_analyzer test checks if libtraceevent is
     linked, using 'perf version --build-options' instead of the more
     expensive 'perf record -e "sched:sched_switch"'.

   - Add support for riscv in the mmap-basic test. (This went as well
     via the RiscV tree, same contents).

  libperf:

   - Implement riscv mmap support (This went as well via the RiscV tree,
     same contents).

  perf script:

   - New tool that converts perf.data files to the firefox profiler
     format so that one can use the visualizer at
     https://profiler.firefox.com/. Done by Anup Sharma as part of this
     year's Google Summer of Code.

     One can generate the output and upload it to the web interface but
     Anup also automated everything:

       perf script gecko -F 99 -a sleep 60

   - Support syscall name parsing on arm64.

   - Print "cgroup" field on the same line as "comm".

  perf bench:

   - Add new 'uprobe' benchmark to measure the overhead of uprobes
     with/without BPF programs attached to it.

   - breakpoints are not available on power9, skip that test.

  perf stat:

   - Add #num_cpus_online literal to be used in 'perf stat' metrics, and
     add this extra 'perf test' check that exemplifies its purpose:

  	TEST_ASSERT_VAL("#num_cpus_online",
                         expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0);
  	TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
  	TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);

  Miscellaneous:

   - Improve tool startup time by lazily reading PMU, JSON, sysfs data.

   - Improve error reporting in the parsing of events, passing YYLTYPE
     to error routines, so that the output can show where the parsing
     error was found.

   - Add 'perf test' entries to check the parsing of events
     improvements.

   - Fix various leaks for things detected by -fsanitize=address, mostly
     things that would be freed at tool exit, including:

       - Free evsel->filter on the destructor.

       - Allow tools to register a thread->priv destructor and use it in
         'perf trace'.

       - Free evsel->priv in 'perf trace'.

       - Free string returned by synthesize_perf_probe_point() when the
         caller fails to do all it needs.

   - Adjust various compiler options to not consider errors some
     warnings when building with broken headers found in things like
     python, flex, bison, as we otherwise build with -Werror. Some for
     gcc, some for clang, some for some specific version of those, some
     for some specific version of flex or bison, or some specific
     combination of these components, bah.

   - Allow customization of clang options for BPF target, this helps
     building on gentoo where there are other oddities where BPF targets
     gets passed some compiler options intended for the native build, so
     building with WERROR=0 helps while these oddities are fixed.

   - Don't pass ERR_PTR() values to perf_session__delete() in 'perf top'
     and 'perf lock', fixing some segfaults when handling some odd
     failures.

   - Add LTO build option.

   - Fix format of unordered lists in the perf docs
     (tools/perf/Documentation)

   - Overhaul the bison files, using constructs such as YYNOMEM.

   - Remove unused tokens from the bison .y files.

   - Add more comments to various structs.

   - A few LoongArch enablement patches.

  Vendor events (JSON):

   - Add JSON metrics for Yitian 710 DDR (aarch64). Things like:

  	EventName, BriefDescription
  	visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.",
  	visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.",
  	op_is_dqsosc_mpc	       , "A DQS Oscillator MPC command to DRAM.",
  	op_is_dqsosc_mrr	       , "A DQS Oscillator MRR command to DRAM.",
  	op_is_tcr_mrr		       , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",

   - Add AmpereOne metrics (aarch64).

   - Update N2 and V2 metrics (aarch64) and events using Arm telemetry
     repo.

   - Update scale units and descriptions of common topdown metrics on
     aarch64. Things like:
       - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
       - "BriefDescription": "Frontend bound L1 topdown metric",
       + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
       + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",

   - Update events for intel: meteorlake to 1.04, sapphirerapids to
     1.15, Icelake+ metric constraints.

   - Update files for the power10 platform"

* tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits)
  perf parse-events: Fix driver config term
  perf parse-events: Fixes relating to no_value terms
  perf parse-events: Fix propagation of term's no_value when cloning
  perf parse-events: Name the two term enums
  perf list: Don't print Unit for "default_core"
  perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake
  perf dlfilter: Avoid leak in v0 API test use of resolve_address()
  perf metric: Add #num_cpus_online literal
  perf pmu: Remove str from perf_pmu_alias
  perf parse-events: Make common term list to strbuf helper
  perf parse-events: Minor help message improvements
  perf pmu: Avoid uninitialized use of alias->str
  perf jevents: Use "default_core" for events with no Unit
  perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
  perf test shell stat_bpf_counters: Fix test on Intel
  perf test shell record_bpf_filter: Skip 6.2 kernel
  libperf: Get rid of attr.id field
  perf tools: Convert to perf_record_header_attr_id()
  libperf: Add perf_record_header_attr_id()
  perf tools: Handle old data in PERF_RECORD_ATTR
  ...
2023-09-09 20:06:17 -07:00
Linus Torvalds
fd3a5940e6 Merge tag '6.6-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:

 - six smb3 client fixes including ones to allow controlling smb3
   directory caching timeout and limits, and one debugging improvement

 - one fix for nls Kconfig (don't need to expose NLS_UCS2_UTILS option)

 - one minor spnego registry update

* tag '6.6-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  spnego: add missing OID to oid registry
  smb3: fix minor typo in SMB2_GLOBAL_CAP_LARGE_MTU
  cifs: update internal module version number for cifs.ko
  smb3: allow controlling maximum number of cached directories
  smb3: add trace point for queryfs (statfs)
  nls: Hide new NLS_UCS2_UTILS
  smb3: allow controlling length of time directory entries are cached with dir leases
  smb: propagate error code of extract_sharename()
2023-09-09 19:56:23 -07:00
David Howells
a3c57ab79a iov_iter: Kunit tests for page extraction
Add some kunit tests for page extraction for ITER_BVEC, ITER_KVEC and
ITER_XARRAY type iterators.  ITER_UBUF and ITER_IOVEC aren't dealt with
as they require userspace VM interaction.  ITER_DISCARD isn't dealt with
either as that can't be extracted.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-09-09 15:11:49 -07:00
David Howells
2d71340ff1 iov_iter: Kunit tests for copying to/from an iterator
Add some kunit tests for copying to/from ITER_BVEC, ITER_KVEC and
ITER_XARRAY type iterators.  ITER_UBUF and ITER_IOVEC aren't dealt with
as they require userspace VM interaction.  ITER_DISCARD isn't dealt with
either as that does nothing.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-09-09 15:11:49 -07:00
David Howells
f741bd7178 iov_iter: Fix iov_iter_extract_pages() with zero-sized entries
iov_iter_extract_pages() doesn't correctly handle skipping over initial
zero-length entries in ITER_KVEC and ITER_BVEC-type iterators.

The problem is that it accidentally reduces maxsize to 0 while
skipping and thus runs to the end of the array and returns 0.

Fix this by sticking the calculated size-to-copy in a new variable
rather than back in maxsize.
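
The bug and the fix can be reduced to a small userspace model. This is a sketch under simplifying assumptions: segments are represented only by an array of lengths, and `first_extent_buggy()` and `first_extent_fixed()` are hypothetical names, not functions from lib/iov_iter.c.

```c
#include <stddef.h>

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/*
 * Buggy variant: the clamped value is written back into maxsize, so the
 * first zero-length segment forces maxsize to 0 and every later segment
 * also yields 0.
 */
size_t first_extent_buggy(const size_t *seg_len, size_t nr_segs,
                          size_t maxsize)
{
    for (size_t i = 0; i < nr_segs; i++) {
        maxsize = min_size(maxsize, seg_len[i]);
        if (maxsize)
            return maxsize;
    }
    return 0;
}

/* Fixed variant: keep the per-segment size in its own variable. */
size_t first_extent_fixed(const size_t *seg_len, size_t nr_segs,
                          size_t maxsize)
{
    for (size_t i = 0; i < nr_segs; i++) {
        size_t size = min_size(maxsize, seg_len[i]);

        if (size)
            return size;
    }
    return 0;
}
```

With segment lengths {0, 0, 4096} and a maxsize of 8192, the buggy variant returns 0 while the fixed one returns 4096.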

Fixes: 7d58fe7310 ("iov_iter: Add a function to extract a page list from an iterator")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-09-09 15:11:49 -07:00
Linus Torvalds
6b8bb5b8d9 Merge tag 'sh-for-v6.6-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux
Pull sh updates from Adrian Glaubitz:

 - Fix a use-after-free bug in the push-switch driver (Duoming Zhou)

 - Fix calls to dma_declare_coherent_memory() that incorrectly passed
   the buffer end address instead of the buffer size as the size
   parameter

* tag 'sh-for-v6.6-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
  sh: push-switch: Reorder cleanup operations to avoid use-after-free bug
  sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory()
2023-09-09 14:46:57 -07:00
Linus Torvalds
1b37a0a2d4 Merge tag 'riscv-for-linus-6.6-mw2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:

 - The kernel now dynamically probes for misaligned access speed, as
   opposed to relying on a table of known implementations.

 - Support for non-coherent devices on systems using the Andes AX45MP
   core, including the RZ/Five SoCs.

 - Support for the V extension in ptrace(), again.

 - Support for KASLR.

 - Support for the BPF prog pack allocator in RISC-V.

 - A handful of bug fixes and cleanups.

* tag 'riscv-for-linus-6.6-mw2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (25 commits)
  soc: renesas: Kconfig: For ARCH_R9A07G043 select the required configs if dependencies are met
  riscv: Kconfig.errata: Add dependency for RISCV_SBI in ERRATA_ANDES config
  riscv: Kconfig.errata: Drop dependency for MMU in ERRATA_ANDES_CMO config
  riscv: Kconfig: Select DMA_DIRECT_REMAP only if MMU is enabled
  bpf, riscv: use prog pack allocator in the BPF JIT
  riscv: implement a memset like function for text
  riscv: extend patch_text_nosync() for multiple pages
  bpf: make bpf_prog_pack allocator portable
  riscv: libstub: Implement KASLR by using generic functions
  libstub: Fix compilation warning for rv32
  arm64: libstub: Move KASLR handling functions to kaslr.c
  riscv: Dump out kernel offset information on panic
  riscv: Introduce virtual kernel mapping KASLR
  RISC-V: Add ptrace support for vectors
  soc: renesas: Kconfig: Select the required configs for RZ/Five SoC
  cache: Add L2 cache management for Andes AX45MP RISC-V core
  dt-bindings: cache: andestech,ax45mp-cache: Add DT binding documentation for L2 cache controller
  riscv: mm: dma-noncoherent: nonstandard cache operations support
  riscv: errata: Add Andes alternative ports
  riscv: asm: vendorid_list: Add Andes Technology to the vendors list
  ...
2023-09-09 14:25:11 -07:00
Duoming Zhou
246f80a0b1 sh: push-switch: Reorder cleanup operations to avoid use-after-free bug
The original code puts flush_work() before timer_shutdown_sync()
in switch_drv_remove(). Although we use flush_work() to stop
the worker, it could be rescheduled in switch_timer(). As a result,
a use-after-free bug can occur. The details are shown below:

      (cpu 0)                    |      (cpu 1)
switch_drv_remove()              |
 flush_work()                    |
  ...                            |  switch_timer // timer
                                 |   schedule_work(&psw->work)
 timer_shutdown_sync()           |
 ...                             |  switch_work_handler // worker
 kfree(psw) // free              |
                                 |   psw->state = 0 // use

This patch puts timer_shutdown_sync() before flush_work() to
mitigate the bug. As a result, the worker and timer will be
stopped safely before the memory is freed.

Fixes: 9f5e8eee5c ("sh: generic push-switch framework.")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/r/20230802033737.9738-1-duoming@zju.edu.cn
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2023-09-09 21:54:20 +02:00
Petr Tesarik
fb60211f37 sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory()
In all these cases, the last argument to dma_declare_coherent_memory() is
the buffer end address, but the expected value should be the size of the
reserved region.

Fixes: 39fb993038 ("media: arch: sh: ap325rxa: Use new renesas-ceu camera driver")
Fixes: c2f9b05fd5 ("media: arch: sh: ecovec: Use new renesas-ceu camera driver")
Fixes: f3590dc329 ("media: arch: sh: kfr2r09: Use new renesas-ceu camera driver")
Fixes: 186c446f4b ("media: arch: sh: migor: Use new renesas-ceu camera driver")
Fixes: 1a3c230b41 ("media: arch: sh: ms7724se: Use new renesas-ceu camera driver")
Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://lore.kernel.org/r/20230724120742.2187-1-petrtesarik@huaweicloud.com
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2023-09-09 21:54:20 +02:00
Linus Torvalds
2a5a4326e5 Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
 "Mostly small stragglers that missed the initial merge.

  Driver updates are qla2xxx and smartpqi (mpt3sas has a high diffstat
  due to the volatile qualifier removal, fnic due to unused function
  removal and sd.c has a lot of code shuffling to remove forward
  declarations)"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (38 commits)
  scsi: ufs: core: No need to update UPIU.header.flags and lun in advanced RPMB handler
  scsi: ufs: core: Add advanced RPMB support where UFSHCI 4.0 does not support EHS length in UTRD
  scsi: mpt3sas: Remove volatile qualifier
  scsi: mpt3sas: Perform additional retries if doorbell read returns 0
  scsi: libsas: Simplify sas_queue_reset() and remove unused code
  scsi: ufs: Fix the build for the old ARM OABI
  scsi: qla2xxx: Fix unused variable warning in qla2xxx_process_purls_pkt()
  scsi: fnic: Remove unused functions fnic_scsi_host_start/end_tag()
  scsi: qla2xxx: Fix spelling mistake "tranport" -> "transport"
  scsi: fnic: Replace sgreset tag with max_tag_id
  scsi: qla2xxx: Remove unused variables in qla24xx_build_scsi_type_6_iocbs()
  scsi: qla2xxx: Fix nvme_fc_rcv_ls_req() undefined error
  scsi: smartpqi: Change driver version to 2.1.24-046
  scsi: smartpqi: Enhance error messages
  scsi: smartpqi: Enhance controller offline notification
  scsi: smartpqi: Enhance shutdown notification
  scsi: smartpqi: Simplify lun_number assignment
  scsi: smartpqi: Rename pciinfo to pci_info
  scsi: smartpqi: Rename MACRO to clarify purpose
  scsi: smartpqi: Add abort handler
  ...
2023-09-09 12:01:33 -07:00
Linus Torvalds
6b41fb277e Merge tag 'driver-core-6.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver symbol lookup fix from Greg KH:
 "Here is one last fixup for your tree for 6.6-rc1. It resolves a
  problem with the way that symbol_get was changed in the module tree
  merge in your tree to fix up the DVB drivers which rely on this old
  api to attach new devices.

  As the changelog comment says:

    In commit 9011e49d54 ("modules: only allow symbol_get of
    EXPORT_SYMBOL_GPL modules") the use of symbol_get is properly
    restricted to GPL-only marked symbols. This interacts oddly with the
    DVB logic which only uses dvb_attach() to load the dvb driver which
    then uses symbol_get().

    Fix this up by properly marking all of the dvb_attach attach symbols
    as EXPORT_SYMBOL_GPL().

  This has been acked by Hans from the V4L driver side, Luis from the
  module side, Mauro on the media side, and Christoph said it was the
  correct solution, and was tested by the original reporter of the
  issue.

  It has passed 0-day testing, but has not been in linux-next due to it
  only being sent yesterday"

* tag 'driver-core-6.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  media: dvb: symbol fixup for dvb_attach()
2023-09-09 11:49:05 -07:00
Linus Torvalds
474197a4f7 Merge tag 'dma-mapping-6.6-2023-09-09' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:

 - move a dma-debug call that prints a message out from a lock that's
   causing problems with the lock order in serial drivers (Sergey
   Senozhatsky)

 - fix the CONFIG_DMA_NUMA_CMA Kconfig entry to have the right
   dependency and not default to y (Christoph Hellwig)

 - move an ifdef a bit to remove a __maybe_unused that seems to trip up
   some sensitivities (Christoph Hellwig)

 - revert a bogus check in the CMA allocator (Zhenhua Huang)

* tag 'dma-mapping-6.6-2023-09-09' of git://git.infradead.org/users/hch/dma-mapping:
  Revert "dma-contiguous: check for memory region overlap"
  dma-pool: remove a __maybe_unused label in atomic_pool_expand
  dma-contiguous: fix the Kconfig entry for CONFIG_DMA_NUMA_CMA
  dma-debug: don't call __dma_entry_alloc_check_leak() under free_entries_lock
2023-09-09 11:41:22 -07:00
Linus Torvalds
060249b5d3 Merge tag 'pci-v6.6-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI fixes from Bjorn Helgaas:

 - Add PCI_DYNAMIC_OF_NODES dependency on OF_IRQ to fix sparc64 build
   error (Lizhi Hou)

 - After coalescing host bridge resources, free any released resources
   to avoid a leak (Ross Lagerwall)

 - Revert a quirk that prevented NVIDIA T4 GPUs from using Secondary Bus
   Reset. The quirk worked around an issue that we now think is related
   to the Root Port, not the GPU (Bjorn Helgaas)

* tag 'pci-v6.6-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  Revert "PCI: Mark NVIDIA T4 GPUs to avoid bus reset"
  PCI: Free released resource after coalescing
  PCI: Fix CONFIG_PCI_DYNAMIC_OF_NODES kconfig dependencies
2023-09-09 11:35:28 -07:00
Linus Torvalds
fa9d4bf5b7 Merge tag 'ntb-6.6' of https://github.com/jonmason/ntb
Pull NTB updates from Jon Mason:
 "Link toggling fixes and debugfs error path fixes"

[ And for everybody like me who always have to remind themselves what
  the TLA of the day is, and what NTB stands for - it's a PCIe
  "Non-Transparent Bridge" thing - Linus ]

* tag 'ntb-6.6' of https://github.com/jonmason/ntb:
  ntb: Check tx descriptors outstanding instead of head/tail for tx queue
  ntb: Fix calculation ntb_transport_tx_free_entry()
  ntb: Drop packets when qp link is down
  ntb: Clean up tx tail index on link down
  ntb: amd: Drop unnecessary error check for debugfs_create_dir
  NTB: ntb_tool: Switch to memdup_user_nul() helper
  dtivers: ntb: fix parameter check in perf_setup_dbgfs()
  ntb: Remove error checking for debugfs_create_dir()
2023-09-09 11:30:16 -07:00
Jeff Layton
fdd2630a73 nfsd: fix change_info in NFSv4 RENAME replies
nfsd sends the transposed directory change info in the RENAME reply. The
source directory is in save_fh and the target is in current_fh.

Reported-by: Zhi Li <yieli@redhat.com>
Reported-by: Benjamin Coddington <bcodding@redhat.com>
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2218844
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-09-09 13:24:52 -04:00
Steve French
5d153cd128 spnego: add missing OID to oid registry
Add missing OID to the registry. Some servers and clients (including
Windows) now request "NEGOEX - SPNEGO Extended Negotiation Security".

See https://datatracker.ietf.org/doc/html/draft-zhu-negoex-02

Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-09-09 08:18:16 -05:00
Greg Kroah-Hartman
86495af117 media: dvb: symbol fixup for dvb_attach()
In commit 9011e49d54 ("modules: only allow symbol_get of
EXPORT_SYMBOL_GPL modules") the use of symbol_get is properly restricted
to GPL-only marked symbols.  This interacts oddly with the DVB logic
which only uses dvb_attach() to load the dvb driver which then uses
symbol_get().

Fix this up by properly marking all of the dvb_attach attach symbols as
EXPORT_SYMBOL_GPL().

Fixes: 9011e49d54 ("modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules")
Cc: stable <stable@kernel.org>
Reported-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-media@vger.kernel.org
Cc: linux-modules@vger.kernel.org
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Link: https://lore.kernel.org/r/20230908092035.3815268-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-09 08:15:11 +01:00
Linus Torvalds
6099776f9f Merge tag '6.6-rc-ksmbd' of git://git.samba.org/ksmbd
Pull smb server update from Steve French:
 "After two years, many fixes and much testing, ksmbd is no longer
  experimental"

* tag '6.6-rc-ksmbd' of git://git.samba.org/ksmbd:
  ksmbd: remove experimental warning
2023-09-08 22:01:55 -07:00
Linus Torvalds
3095dd99dd Merge tag 'xarray-6.6' of git://git.infradead.org/users/willy/xarray
Pull xarray fixes from Matthew Wilcox:

 - Fix a bug encountered by people using bittorrent where they'd get
   NULL pointer dereferences on page cache lookups when using XFS

 - Two documentation fixes

* tag 'xarray-6.6' of git://git.infradead.org/users/willy/xarray:
  idr: fix param name in idr_alloc_cyclic() doc
  xarray: Document necessary flag in alloc functions
  XArray: Do not return sibling entries from xa_load()
2023-09-08 21:46:26 -07:00
Linus Torvalds
7402e635ed Merge tag 'block-6.6-2023-09-08' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:

 - Fix null_blk polled IO timeout handling (Chengming)

 - Regression fix for swapped arguments in drbd bvec_set_page()
   (Christoph)

 - String length handling fix for s390 dasd (Heiko)

 - Fixes for blk-throttle accounting (Yu)

 - Fix page pinning issue for same page segments (Christoph)

 - Remove redundant file_remove_privs() call (Christoph)

 - Fix a regression in partition handling for devices not supporting
   partitions (Li)

* tag 'block-6.6-2023-09-08' of git://git.kernel.dk/linux:
  drbd: swap bvec_set_page len and offset
  block: fix pin count management when merging same-page segments
  null_blk: fix poll request timeout handling
  s390/dasd: fix string length handling
  block: don't add or resize partition on the disk with GENHD_FL_NO_PART
  block: remove the call to file_remove_privs in blkdev_write_iter
  blk-throttle: consider 'carryover_ios/bytes' in throtl_trim_slice()
  blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice()
  blk-throttle: fix wrong comparation while 'carryover_ios/bytes' is negative
  blk-throttle: print signed value 'carryover_bytes/ios' for user
2023-09-08 21:39:54 -07:00
Linus Torvalds
7ccc3ebf0c Merge tag 'io_uring-6.6-2023-09-08' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
 "A few fixes that should go into the 6.6-rc merge window:

   - Fix for a regression this merge window caused by the SQPOLL
     affinity patch, where we can race with SQPOLL thread shutdown and
     cause an oops when trying to set affinity (Gabriel)

   - Fix for a regression this merge window where fdinfo reading for
     a ring set up with IORING_SETUP_NO_SQARRAY will attempt to
     dereference the non-existent SQ ring array (me)

   - Add the patch that allows more fine-grained control over who can use
     io_uring (Matteo)

   - Locking fix for a regression added this merge window for IOPOLL
     overflow (Pavel)

   - IOPOLL fix for stable, breaking our loop if helper threads are
     exiting (Pavel)

  Also had a fix for unreaped iopoll requests from io-wq from Ming, but
  we found an issue with that and hence it got reverted. Will get this
  sorted for a future rc"

* tag 'io_uring-6.6-2023-09-08' of git://git.kernel.dk/linux:
  Revert "io_uring: fix IO hang in io_wq_put_and_exit from do_exit()"
  io_uring: fix unprotected iopoll overflow
  io_uring: break out of iowq iopoll on teardown
  io_uring: add a sysctl to disable io_uring system-wide
  io_uring/fdinfo: only print ->sq_array[] if it's there
  io_uring: fix IO hang in io_wq_put_and_exit from do_exit()
  io_uring: Don't set affinity on a dying sqpoll thread
2023-09-08 21:32:28 -07:00
Naveen N Rao
145036f88d selftests/ftrace: Fix dependencies for some of the synthetic event tests
Commit b81a3a100c ("tracing/histogram: Add simple tests for
stacktrace usage of synthetic events") changed the output text in
tracefs README, but missed updating some of the dependencies specified
in selftests. This causes some of the tests to exit as unsupported.

Fix this by changing the grep pattern. Since we want these tests to work
on older kernels, match only against the common last part of the
pattern.

Link: https://lore.kernel.org/linux-trace-kernel/20230614091046.2178539-1-naveen@kernel.org

Cc: <linux-kselftest@vger.kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Fixes: b81a3a100c ("tracing/histogram: Add simple tests for stacktrace usage of synthetic events")
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-08 23:13:03 -04:00
Steven Rostedt (Google)
6fdac58c56 tracing: Remove unused trace_event_file dir field
Now that the eventfs structure is used to create the events directory via
the eventfs dynamic allocation code, the "dir" field of the trace_event_file
structure is no longer used. Remove it.

Link: https://lkml.kernel.org/r/20230908022001.580400115@goodmis.org

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ajay Kaher <akaher@vmware.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-08 23:13:02 -04:00
Steven Rostedt (Google)
1ef26d8b2c tracing: Use the new eventfs descriptor for print trigger
The check to create the print event "trigger" was using the obsolete "dir"
value of the trace_event_file to determine if it should create the trigger
or not. But that value will now be NULL because it uses the event file
descriptor.

Change it to test the "ef" field of the trace_event_file structure so that
the trace_marker "trigger" file appears again.

Link: https://lkml.kernel.org/r/20230908022001.371815239@goodmis.org

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ajay Kaher <akaher@vmware.com>
Fixes: 27152bceea ("eventfs: Move tracing/events to eventfs")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-08 23:13:01 -04:00
Steven Rostedt (Google)
95a404bd60 ring-buffer: Do not attempt to read past "commit"
When iterating over the ring buffer while the ring buffer is active, the
writer can corrupt the reader. There are barriers to help detect this and
handle it, but that code missed the case where the last event was at the
very end of the page and has only 4 bytes left.

The checks that detect corruption of a read by the writer need to see the
length of the event. If the length in the first 4 bytes is zero then the
length is stored in the second 4 bytes. But if the writer is in the process
of updating that code, there's a small window where the length in the first
4 bytes could be zero even though the length is only 4 bytes. That will
cause rb_event_length() to read the next 4 bytes, which could happen to be
off the allocated page.

To protect against this, fail immediately if the next event pointer is
less than 8 bytes from the end of the commit (last byte of data), as all
events must be a minimum of 8 bytes anyway.
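
The guard described above can be sketched in userspace like this. It is an
illustration under assumptions, not the ring-buffer code: the function name
event_header_readable() and the use of plain byte offsets are hypothetical,
and "commit" is taken to be the offset one past the last committed byte.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed minimum event size in bytes, per the description above. */
#define MIN_EVENT_SIZE 8

/*
 * next:   offset of the next event within the page data
 * commit: offset one past the last committed byte of data
 *
 * Returns true only if a full minimum-sized event fits before commit,
 * so reading the second 4-byte word cannot run past the committed data.
 */
bool event_header_readable(size_t next, size_t commit)
{
	if (next > commit)
		return false;
	return commit - next >= MIN_EVENT_SIZE;
}
```

An event pointer 4 bytes short of a 4096-byte commit offset is rejected,
while one exactly 8 bytes short is still accepted.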

Link: https://lore.kernel.org/all/20230905141245.26470-1-Tze-nan.Wu@mediatek.com/
Link: https://lore.kernel.org/linux-trace-kernel/20230907122820.0899019c@gandalf.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Reported-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-08 23:12:19 -04:00
Steven Rostedt (Google)
d508ee2dd5 tracefs/eventfs: Free top level files on removal
When an instance is removed, the top level files of the eventfs directory
are not cleaned up. Call eventfs_remove() on each of the entries to
free them.

This was found via kmemleak:

unreferenced object 0xffff8881047c1280 (size 96):
  comm "mkdir", pid 924, jiffies 4294906489 (age 2013.077s)
  hex dump (first 32 bytes):
    18 31 ed 03 81 88 ff ff 00 31 09 24 81 88 ff ff  .1.......1.$....
    00 00 00 00 00 00 00 00 98 19 7c 04 81 88 ff ff  ..........|.....
  backtrace:
    [<000000000fa46b4d>] kmalloc_trace+0x2a/0xa0
    [<00000000e729cd0c>] eventfs_prepare_ef.constprop.0+0x3a/0x160
    [<000000009032e6a8>] eventfs_add_events_file+0xa0/0x160
    [<00000000fe968442>] create_event_toplevel_files+0x6f/0x130
    [<00000000e364d173>] event_trace_add_tracer+0x14/0x140
    [<00000000411840fa>] trace_array_create_dir+0x52/0xf0
    [<00000000967804fa>] trace_array_create+0x208/0x370
    [<00000000da505565>] instance_mkdir+0x6b/0xb0
    [<00000000dc1215af>] tracefs_syscall_mkdir+0x5b/0x90
    [<00000000a8aca289>] vfs_mkdir+0x272/0x380
    [<000000007709b242>] do_mkdirat+0xfc/0x1d0
    [<00000000c0b6d219>] __x64_sys_mkdir+0x78/0xa0
    [<0000000097b5dd4b>] do_syscall_64+0x3f/0x90
    [<00000000a3f00cfa>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
unreferenced object 0xffff888103ed3118 (size 8):
  comm "mkdir", pid 924, jiffies 4294906489 (age 2013.077s)
  hex dump (first 8 bytes):
    65 6e 61 62 6c 65 00 00                          enable..
  backtrace:
    [<0000000010f75127>] __kmalloc_node_track_caller+0x51/0x160
    [<000000004b3eca91>] kstrdup+0x34/0x60
    [<0000000050074d7a>] eventfs_prepare_ef.constprop.0+0x53/0x160
    [<000000009032e6a8>] eventfs_add_events_file+0xa0/0x160
    [<00000000fe968442>] create_event_toplevel_files+0x6f/0x130
    [<00000000e364d173>] event_trace_add_tracer+0x14/0x140
    [<00000000411840fa>] trace_array_create_dir+0x52/0xf0
    [<00000000967804fa>] trace_array_create+0x208/0x370
    [<00000000da505565>] instance_mkdir+0x6b/0xb0
    [<00000000dc1215af>] tracefs_syscall_mkdir+0x5b/0x90
    [<00000000a8aca289>] vfs_mkdir+0x272/0x380
    [<000000007709b242>] do_mkdirat+0xfc/0x1d0
    [<00000000c0b6d219>] __x64_sys_mkdir+0x78/0xa0
    [<0000000097b5dd4b>] do_syscall_64+0x3f/0x90
    [<00000000a3f00cfa>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Link: https://lore.kernel.org/linux-trace-kernel/20230907175859.6fedbaa2@gandalf.local.home

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ajay Kaher <akaher@vmware.com>
Cc: Zheng Yejian <zhengyejian1@huawei.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Fixes: 5bdcd5f533 ("eventfs: Implement removal of meta data from eventfs")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-08 23:12:06 -04:00
Steve French
702c390bc8 smb3: fix minor typo in SMB2_GLOBAL_CAP_LARGE_MTU
There was a minor typo in the define for SMB2_GLOBAL_CAP_LARGE_MTU:
      0X00000004 instead of 0x00000004.
Make it consistent.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-09-08 19:01:16 -05:00
Linus Torvalds
32bf43e4ef Merge tag 'thermal-6.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
 "Eliminate an obsolete thermal zone registration function"

* tag 'thermal-6.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: core: Drop thermal_zone_device_register()
  thermal: Use thermal_tripless_zone_device_register()
  thermal: core: Add function for registering tripless thermal zones
  thermal: core: Clean up headers of thermal zone registration functions
2023-09-08 13:24:00 -07:00
Yu Kuai
99892147f0 md: fix warning for holder mismatch from export_rdev()
Commit a1d7671910 ("md: use mddev->external to select holder in
export_rdev()") fixed the problem that 'claim_rdev' is used for
blkdev_get_by_dev() while 'rdev' is used for blkdev_put().

However, if mddev->external is changed from 0 to 1, then 'rdev' is used
for blkdev_get_by_dev() while 'claim_rdev' is used for blkdev_put(). This
problem can be reproduced reliably with the following test:

New file: mdadm/tests/23rdev-lifetime

devname=${dev0##*/}
devt=`cat /sys/block/$devname/dev`
pid=""
runtime=2

clean_up_test() {
        kill -9 $pid
        echo clear > /sys/block/md0/md/array_state
}

trap 'clean_up_test' EXIT

add_by_sysfs() {
        while true; do
                echo $devt > /sys/block/md0/md/new_dev
        done
}

remove_by_sysfs(){
        while true; do
                echo remove > /sys/block/md0/md/dev-${devname}/state
        done
}

echo md0 > /sys/module/md_mod/parameters/new_array || die "create md0 failed"

add_by_sysfs &
pid="$pid $!"

remove_by_sysfs &
pid="$pid $!"

sleep $runtime
exit 0

Test cmd:

./test --save-logs --logdir=/tmp/ --keep-going --dev=loop --tests=23rdev-lifetime

Test result:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 960 at block/bdev.c:618 blkdev_put+0x27c/0x330
Modules linked in: multipath md_mod loop
CPU: 0 PID: 960 Comm: test Not tainted 6.5.0-rc2-00121-g01e55c376936-dirty #50
RIP: 0010:blkdev_put+0x27c/0x330
Call Trace:
 <TASK>
 export_rdev.isra.23+0x50/0xa0 [md_mod]
 mddev_unlock+0x19d/0x300 [md_mod]
 rdev_attr_store+0xec/0x190 [md_mod]
 sysfs_kf_write+0x52/0x70
 kernfs_fop_write_iter+0x19a/0x2a0
 vfs_write+0x3b5/0x770
 ksys_write+0x74/0x150
 __x64_sys_write+0x22/0x30
 do_syscall_64+0x40/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fix the problem by recording if 'rdev' is used as holder.
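
The idea of recording the holder can be modeled in a few lines of
userspace C. Everything here is a simplified, hypothetical model
(mddev_model, rdev_model, claim_rdev_token and both helpers are invented
for illustration); it only demonstrates that the release path must reuse
the choice made at open time instead of re-reading a flag that may have
changed in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified models of the md structures. */
struct mddev_model {
	bool external;
};

struct rdev_model {
	struct mddev_model *mddev;
	bool holder_is_rdev;	/* recorded when the device is opened */
};

/* Stand-in for the shared 'claim_rdev' holder cookie. */
static int claim_rdev_token;

/* Choose the holder at open time and remember which one was used. */
void *rdev_open_holder(struct rdev_model *rdev)
{
	assert(rdev != NULL && rdev->mddev != NULL);
	rdev->holder_is_rdev = !rdev->mddev->external;
	return rdev->holder_is_rdev ? (void *)rdev : (void *)&claim_rdev_token;
}

/* At release time, use the recorded choice, not the current flag. */
void *rdev_release_holder(struct rdev_model *rdev)
{
	assert(rdev != NULL);
	return rdev->holder_is_rdev ? (void *)rdev : (void *)&claim_rdev_token;
}
```

In this model, opening with external == 0 and then flipping external to 1
still yields the same holder on release, which is the invariant the
warning in blkdev_put() checks.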

Fixes: a1d7671910 ("md: use mddev->external to select holder in export_rdev()")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230825025532.1523008-3-yukuai1@huaweicloud.com
2023-09-08 13:16:40 -07:00
Yu Kuai
7deac114be md: don't dereference mddev after export_rdev()
Except for the initial reference, mddev->kobject is referenced by
rdev->kobject, and if the last rdev is freed, there is no guarantee that
mddev is still valid. Hence mddev should not be used anymore after
export_rdev().

This problem can be triggered, at a very low rate, by the following
mdadm test:

New file: mdadm/tests/23rdev-lifetime

devname=${dev0##*/}
devt=`cat /sys/block/$devname/dev`
pid=""
runtime=2

clean_up_test() {
        kill -9 $pid
        echo clear > /sys/block/md0/md/array_state
}

trap 'clean_up_test' EXIT

add_by_sysfs() {
        while true; do
                echo $devt > /sys/block/md0/md/new_dev
        done
}

remove_by_sysfs(){
        while true; do
                echo remove > /sys/block/md0/md/dev-${devname}/state
        done
}

echo md0 > /sys/module/md_mod/parameters/new_array || die "create md0 failed"

add_by_sysfs &
pid="$pid $!"

remove_by_sysfs &
pid="$pid $!"

sleep $runtime
exit 0

Test cmd:

./test --save-logs --logdir=/tmp/ --keep-going --dev=loop --tests=23rdev-lifetime

Test result:

general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bcb: 0000 [#4] PREEMPT SMP
CPU: 0 PID: 1292 Comm: test Tainted: G      D W          6.5.0-rc2-00121-g01e55c376936 #562
RIP: 0010:md_wakeup_thread+0x9e/0x320 [md_mod]
Call Trace:
 <TASK>
 mddev_unlock+0x1b6/0x310 [md_mod]
 rdev_attr_store+0xec/0x190 [md_mod]
 sysfs_kf_write+0x52/0x70
 kernfs_fop_write_iter+0x19a/0x2a0
 vfs_write+0x3b5/0x770
 ksys_write+0x74/0x150
 __x64_sys_write+0x22/0x30
 do_syscall_64+0x40/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fix this problem by not dereferencing mddev after export_rdev().

Fixes: 3ce94ce5d0 ("md: fix duplicate filename for rdev")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230825025532.1523008-2-yukuai1@huaweicloud.com
2023-09-08 13:16:10 -07:00
Linus Torvalds
fd88c59e79 Merge tag 'pm-6.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
 "Fix an Intel RAPL power capping driver regression introduced during
  the 6.5 development cycle (Srinivas Pandruvada)"

* tag 'pm-6.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  powercap: intel_rapl: Fix invalid setting of Power Limit 4
2023-09-08 13:16:09 -07:00
Linus Torvalds
d30c0d326b Merge tag 'gpio-fixes-for-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fix from Bartosz Golaszewski:

 - fix a regression in irqchip setup in gpio-zynq

* tag 'gpio-fixes-for-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: zynq: restore zynq_gpio_irq_reqres/zynq_gpio_irq_relres callbacks
2023-09-08 13:12:59 -07:00
Bjorn Helgaas
5260bd6d36 Revert "PCI: Mark NVIDIA T4 GPUs to avoid bus reset"
This reverts commit d5af729dc2.

d5af729dc2 ("PCI: Mark NVIDIA T4 GPUs to avoid bus reset") avoided
Secondary Bus Reset on the T4 because the reset seemed to not work when the
T4 was directly attached to a Root Port.

But NVIDIA thinks the issue is probably related to some issue with the Root
Port, not with the T4.  The T4 provides neither PM nor FLR reset, so
masking bus reset compromises this device for assignment scenarios.

Revert d5af729dc2 as requested by Wu Zongyong.  This will leave SBR
broken in the specific configuration Wu tested, as it was in v6.5, so Wu
will debug that further.

Link: https://lore.kernel.org/r/ZPqMCDWvITlOLHgJ@wuzongyong-alibaba
Link: https://lore.kernel.org/r/20230908201104.GA305023@bhelgaas
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-09-08 15:11:45 -05:00
Linus Torvalds
a3d231e44a Merge tag 'sound-fix-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "A collection of fixes for 6.6-rc1. All small and easy ones.

   - The corrections of the previous PCM iov_iter transitions

   - Regression fixes in MIDI 2.0 / USB changes

   - Various ASoC codec fixes for Cirrus, Realtek, WCD

   - ASoC AMD quirks and ASoC Intel AVS driver workaround"

* tag 'sound-fix-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
  ALSA: hda/realtek - ALC287 I2S speaker platform support
  ASoC: amd: yc: Fix a non-functional mic on Lenovo 82TL
  ASoC: Intel: avs: Provide support for fallback topology
  ALSA: seq: Fix snd_seq_expand_var_event() call to user-space
  ALSA: usb-audio: Fix potential memory leaks at error path for UMP open
  ALSA: hda/cirrus: Fix broken audio on hardware with two CS42L42 codecs.
  ASoC: rt5645: NULL pointer access when removing jack
  ASoC: amd: yc: Add DMI entries to support Victus by HP Gaming Laptop 15-fb0xxx (8A3E)
  MAINTAINERS: Update the MAINTAINERS enties for TEXAS INSTRUMENTS ASoC DRIVERS
  ALSA: sb: Fix wrong argument in commented code
  ALSA: pcm: Fix error checks of default read/write copy ops
  ASoC: Name iov_iter argument as iterator instead of buffer
  ASoC: dmaengine: Drop unused iov_iter for process callback
  ALSA: hda/tas2781: Use standard clamp() macro
  ASoC: cs35l56: Waiting for firmware to boot must be tolerant of I/O errors
  ASoC: dt-bindings: fsl_easrc: Add support for imx8mp-easrc
  ASoC: cs42l43: Fix missing error code in cs42l43_codec_probe()
  ASoC: cs35l45: Rename DACPCM1 Source control
  ASoC: cs35l45: Fix "Dead assigment" warning
  ASoC: cs35l45: Add support for Chip ID 0x35A460
  ...
2023-09-08 13:07:50 -07:00