Commit Graph

1185157 Commits

Author SHA1 Message Date
Stephen Boyd
1a86e99fa0 Merge branches 'clk-starfive', 'clk-fractional' and 'clk-devmof' into clk-next
- Shrink size of clk_fractional_divider a little
 - Convert various clk drivers to devm_of_clk_add_hw_provider()

* clk-starfive:
  clk: starfive: Delete the redundant dev_set_drvdata() in JH7110 clock drivers
  clk: starfive: Avoid casting iomem pointers
  MAINTAINERS: generalise StarFive clk/reset entries
  reset: starfive: Add StarFive JH7110 reset driver
  clk: starfive: Add StarFive JH7110 always-on clock driver
  clk: starfive: Add StarFive JH7110 system clock driver
  reset: starfive: jh71x0: Use 32bit I/O on 32bit registers
  reset: starfive: Rename "jh7100" to "jh71x0" for the common code
  reset: starfive: Extract the common JH71X0 reset code
  reset: starfive: Factor out common JH71X0 reset code
  reset: Create subdirectory for StarFive drivers
  reset: starfive: Replace SOC_STARFIVE with ARCH_STARFIVE
  clk: starfive: Rename "jh7100" to "jh71x0" for the common code
  clk: starfive: Rename clk-starfive-jh7100.h to clk-starfive-jh71x0.h
  clk: starfive: Factor out common JH7100 and JH7110 code
  clk: starfive: Replace SOC_STARFIVE with ARCH_STARFIVE
  dt-bindings: clock: Add StarFive JH7110 always-on clock and reset generator
  dt-bindings: clock: Add StarFive JH7110 system clock and reset generator

* clk-fractional:
  clk: Remove mmask and nmask fields in struct clk_fractional_divider
  clk: rockchip: Remove values for mmask and nmask in struct clk_fractional_divider
  clk: imx: Remove values for mmask and nmask in struct clk_fractional_divider
  clk: Compute masks for fractional_divider clk when needed.

* clk-devmof:
  clk: uniphier: Use managed `of_clk_add_hw_provider()`
  clk: si5351: Use managed `of_clk_add_hw_provider()`
  clk: si570: Use managed `of_clk_add_hw_provider()`
  clk: si514: Use managed `of_clk_add_hw_provider()`
  clk: lmk04832: Use managed `of_clk_add_hw_provider()`
  clk: hsdk-pll: Use managed `of_clk_add_hw_provider()`
  clk: cdce706: Use managed `of_clk_add_hw_provider()`
  clk: axs10x: Use managed `of_clk_add_hw_provider()`
  clk: axm5516: Use managed `of_clk_add_hw_provider()`
  clk: axi-clkgen: Use managed `of_clk_add_hw_provider()`
2023-04-25 11:50:49 -07:00
Stephen Boyd
caca6ad367 Merge branches 'clk-xilinx', 'clk-broadcom' and 'clk-platform' into clk-next
- BCM63268 timer clock and reset controller
 - Convert platform clk drivers to remove_new

* clk-xilinx:
  clocking-wizard: Support higher frequency accuracy
  clk: zynqmp: pll: Remove the limit

* clk-broadcom:
  clk: bcm: Add BCM63268 timer clock and reset driver
  dt-bindings: clock: Add BCM63268 timer binding
  dt-bindings: reset: add BCM63268 timer reset definitions
  dt-bindings: clk: add BCM63268 timer clock definitions

* clk-platform: (25 commits)
  clk: xilinx: Convert to platform remove callback returning void
  clk: x86: Convert to platform remove callback returning void
  clk: uniphier: Convert to platform remove callback returning void
  clk: ti: Convert to platform remove callback returning void
  clk: tegra: Convert to platform remove callback returning void
  clk: stm32: Convert to platform remove callback returning void
  clk: mvebu: Convert to platform remove callback returning void
  clk: mmp: Convert to platform remove callback returning void
  clk: keystone: Convert to platform remove callback returning void
  clk: hisilicon: Convert to platform remove callback returning void
  clk: stm32mp1: Convert to platform remove callback returning void
  clk: scpi: Convert to platform remove callback returning void
  clk: s2mps11: Convert to platform remove callback returning void
  clk: pwm: Convert to platform remove callback returning void
  clk: palmas: Convert to platform remove callback returning void
  clk: hsdk-pll: Convert to platform remove callback returning void
  clk: fixed-rate: Convert to platform remove callback returning void
  clk: fixed-mmio: Convert to platform remove callback returning void
  clk: fixed-factor: Convert to platform remove callback returning void
  clk: axm5516: Convert to platform remove callback returning void
  ...
2023-04-25 11:50:36 -07:00
Stephen Boyd
6f7478e3bb Merge branches 'clk-mediatek', 'clk-sunplus', 'clk-loongson' and 'clk-socfpga' into clk-next
- Frequency Hopping (FHCTL) on MediaTek MT6795, MT8173, MT8192 and
   MT8195 SoCs
 - Converted most Mediatek clock drivers to struct platform_driver
 - MediaTek clock drivers can be built as modules
 - Mediatek MT8188 SoC clk drivers
 - Clock driver for Sunplus SP7021 SoC
 - Reimplement Loongson-1 clk driver with DT support
 - Clk driver support for Loongson-2 SoCs
 - Migrate socfpga clk driver to of_clk_add_hw_provider()

* clk-mediatek: (84 commits)
  clk: mediatek: fhctl: Mark local variables static
  clk: mediatek: Use right match table, include mod_devicetable
  clk: mediatek: Add MT8188 adsp clock support
  clk: mediatek: Add MT8188 imp i2c wrapper clock support
  clk: mediatek: Add MT8188 wpesys clock support
  clk: mediatek: Add MT8188 vppsys1 clock support
  clk: mediatek: Add MT8188 vppsys0 clock support
  clk: mediatek: Add MT8188 vencsys clock support
  clk: mediatek: Add MT8188 vdosys1 clock support
  clk: mediatek: Add MT8188 vdosys0 clock support
  clk: mediatek: Add MT8188 vdecsys clock support
  clk: mediatek: Add MT8188 mfgcfg clock support
  clk: mediatek: Add MT8188 ipesys clock support
  clk: mediatek: Add MT8188 imgsys clock support
  clk: mediatek: Add MT8188 ccusys clock support
  clk: mediatek: Add MT8188 camsys clock support
  clk: mediatek: Add MT8188 infrastructure clock support
  clk: mediatek: Add MT8188 peripheral clock support
  clk: mediatek: Add MT8188 topckgen clock support
  clk: mediatek: Add MT8188 apmixedsys clock support
  ...

* clk-sunplus:
  clk: Add Sunplus SP7021 clock driver

* clk-loongson:
  clk: clk-loongson2: add clock controller driver support
  dt-bindings: clock: add loongson-2 boot clock index
  MAINTAINERS: remove obsolete file entry in MIPS/LOONGSON1 ARCHITECTURE
  MIPS: loongson32: Update the clock initialization
  clk: loongson1: Re-implement the clock driver
  clk: loongson1: Remove the outdated driver
  dt-bindings: clock: Add Loongson-1 clock

* clk-socfpga:
  clk: socfpga: arria10: use of_clk_add_hw_provider and improve error handling
  clk: socfpga: use of_clk_add_hw_provider and improve error handling
  clk: socfpga: arria10: use of_clk_add_hw_provider and improve error handling
  clk: socfpga: use of_clk_add_hw_provider and improve error handling
  clk: socfpga: arria10: use of_clk_add_hw_provider and improve error handling
  clk: socfpga: use of_clk_add_hw_provider and improve error handling
2023-04-25 11:50:08 -07:00
Stephen Boyd
4ec6a2f957 Merge branches 'clk-cleanup', 'clk-aspeed', 'clk-dt', 'clk-renesas' and 'clk-skyworks' into clk-next
- Support for i3c clks on Aspeed ast2600 SoCs
 - Clock driver for Skyworks Si521xx I2C PCIe clock generators

* clk-cleanup:
  clk: microchip: fix potential UAF in auxdev release callback
  clk: sifive: make SiFive clk drivers depend on ARCH_ symbols
  clk: stm32h7: Remove an unused field in struct stm32_fractional_divider
  clk: tegra20: fix gcc-7 constant overflow warning
  clock: milbeaut: use devm_platform_get_and_ioremap_resource()
  clk: Print an info line before disabling unused clocks
  clk: ti: Use of_address_to_resource()
  clk: remove unnecessary (void*) conversions
  clk: at91: clk-sam9x60-pll: fix return value check
  clk: visconti: remove unused visconti_pll_provider::regmap

* clk-aspeed:
  dt-bindings: clock: ast2600: Expand comment on reset definitions
  clk: ast2600: Add comment about combined clock + reset handling
  dt-bindings: clock: ast2600: remove IC36 & I3C7 clock definitions
  clk: ast2600: Add full configs for I3C clocks
  dt-bindings: clock: ast2600: Add top-level I3C clock
  clk: ast2600: allow empty entries in aspeed_g6_gates

* clk-dt:
  clk: mediatek: clk-pllfh: fix missing of_node_put() in fhctl_parse_dt()
  clk: Use of_property_present() for testing DT property presence

* clk-renesas:
  clk: renesas: r8a77980: Add I2C5 clock
  clk: rs9: Add support for 9FGV0441
  clk: rs9: Support device specific dif bit calculation
  dt-bindings: clk: rs9: Add 9FGV0441
  clk: rs9: Check for vendor/device ID
  clk: renesas: Convert to platform remove callback returning void
  clk: renesas: r9a06g032: Improve clock tables
  clk: renesas: r9a06g032: Document structs
  clk: renesas: r9a06g032: Drop unused fields
  clk: renesas: r9a06g032: Improve readability
  clk: renesas: r8a77980: Add Z2 clock
  clk: renesas: r8a77970: Add Z2 clock
  clk: renesas: r8a77995: Fix VIN parent clock
  clk: renesas: r8a77980: Add VIN clocks
  clk: renesas: r8a779g0: Add VIN clocks
  clk: renesas: r8a779g0: Add ISPCS clocks
  clk: renesas: r8a779g0: Add CSI-2 clocks
  clk: renesas: r8a779g0: Add thermal clock
  clk: renesas: r8a779g0: Add Audio clocks
  clk: renesas: cpg-mssr: Update MSSR register range for R-Car V4H

* clk-skyworks:
  clk: si521xx: Clock driver for Skyworks Si521xx I2C PCIe clock generators
  dt-bindings: clk: si521xx: Add Skyworks Si521xx I2C PCIe clock generators
2023-04-25 11:49:50 -07:00
Linus Torvalds
de10553fce Merge tag 'x86-apic-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 APIC updates from Thomas Gleixner:

 - Fix the incorrect handling of atomic offset updates in
   reserve_eilvt_offset()

   The check for the return value of atomic_cmpxchg() is not compared
   against the old value, it is compared against the new value, which
   makes it two round on success.

   Convert it to atomic_try_cmpxchg() which does the right thing.

 - Handle IO/APIC less systems correctly

   When IO/APIC is not advertised by ACPI then the computation of the
   lower bound for dynamically allocated interrupts like MSI goes wrong.

   This lower bound is used to exclude the IO/APIC legacy GSI space as
   that must stay reserved for the legacy interrupts.

   In case that the system, e.g. VM, does not advertise an IO/APIC the
   lower bound stays at 0.

   0 is an invalid interrupt number except for the legacy timer
   interrupt on x86. The return value is unchecked in the core code, so
   it ends up to allocate interrupt number 0 which is subsequently
   considered to be invalid by the caller, e.g. the MSI allocation code.

   A similar problem was already cured for device tree based systems
   years ago, but that missed - or did not envision - the zero IO/APIC
   case.

   Consolidate the zero check and return the provided "from" argument to
   the core code call site, which is guaranteed to be greater than 0.

 - Simplify the X2APIC cluster CPU mask logic for CPU hotplug

   Per cluster CPU masks are required for X2APIC in cluster mode to
   determine the correct cluster for a target CPU when calculating the
   destination for IPIs

   These masks are established when CPUs are borught up. The first CPU
   in a cluster must allocate a new cluster CPU mask. As this happens
   during the early startup of a CPU, where memory allocations cannot be
   done, the mask has to be allocated by the control CPU.

   The current implementation allocates a clustermask just in case and
   if the to be brought up CPU is the first in a cluster the CPU takes
   over this allocation from a global pointer.

   This works nicely in the fully serialized CPU bringup scenario which
   is used today, but would fail completely for parallel bringup of
   CPUs.

   The cluster association of a CPU can be computed from the APIC ID
   which is enumerated by ACPI/MADT.

   So the cluster CPU masks can be preallocated and associated upfront
   and the upcoming CPUs just need to set their corresponding bit.

   Aside of preparing for parallel bringup this is a valuable
   simplification on its own.

 - Remove global variables which control the early startup of secondary
   CPUs on 64-bit

   The only information which is needed by a starting CPU is the Linux
   CPU number. The CPU number allows it to retrieve the rest of the
   required data from already existing per CPU storage.

   So instead of initial_stack, early_gdt_desciptor and initial_gs
   provide a new variable smpboot_control which contains the Linux CPU
   number for now. The starting CPU can retrieve and compute all
   required information for startup from there.

   Aside of being a cleanup, this is also preparing for parallel CPU
   bringup, where starting CPUs will look up their Linux CPU number via
   the APIC ID, when smpboot_control has the corresponding control bit
   set.

 - Make cc_vendor globally accesible

   Subsequent parallel bringup changes require access to cc_vendor
   because confidental computing platforms need special treatment in the
   early startup phase vs. CPUID and APCI ID readouts.

   The change makes cc_vendor global and provides stub accessors in case
   that CONFIG_ARCH_HAS_CC_PLATFORM is not set.

   This was merged from the x86/cc branch in anticipation of further
   parallel bringup commits which require access to cc_vendor. Due to
   late discoveries of fundamental issue with those patches these
   commits never happened.

   The merge commit is unfortunately in the middle of the APIC commits
   so unraveling it would have required a rebase or revert. As the
   parallel bringup seems to be well on its way for 6.5 this would be
   just pointless churn. As the commit does not contain any functional
   change it's not a risk to keep it.

* tag 'x86-apic-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()
  x86/apic: Fix atomic update of offset in reserve_eilvt_offset()
  x86/coco: Export cc_vendor
  x86/smpboot: Reference count on smpboot_setup_warm_reset_vector()
  x86/smpboot: Remove initial_gs
  x86/smpboot: Remove early_gdt_descr on 64-bit
  x86/smpboot: Remove initial_stack on 64-bit
  x86/apic/x2apic: Allow CPU cluster_mask to be populated in parallel
2023-04-25 11:39:45 -07:00
Linus Torvalds
e7989789c6 Merge tag 'timers-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timers and timekeeping updates from Thomas Gleixner:

 - Improve the VDSO build time checks to cover all dynamic relocations

   VDSO does not allow dynamic relocations, but the build time check is
   incomplete and fragile.

   It's based on architectures specifying the relocation types to search
   for and does not handle R_*_NONE relocation entries correctly.
   R_*_NONE relocations are injected by some GNU ld variants if they
   fail to determine the exact .rel[a]/dyn_size to cover trailing zeros.
   R_*_NONE relocations must be ignored by dynamic loaders, so they
   should be ignored in the build time check too.

   Remove the architecture specific relocation types to check for and
   validate strictly that no other relocations than R_*_NONE end up in
   the VSDO .so file.

 - Prefer signal delivery to the current thread for
   CLOCK_PROCESS_CPUTIME_ID based posix-timers

   Such timers prefer to deliver the signal to the main thread of a
   process even if the context in which the timer expires is the current
   task. This has the downside that it might wake up an idle thread.

   As there is no requirement or guarantee that the signal has to be
   delivered to the main thread, avoid this by preferring the current
   task if it is part of the thread group which shares sighand.

   This not only avoids waking idle threads, it also distributes the
   signal delivery in case of multiple timers firing in the context of
   different threads close to each other better.

 - Align the tick period properly (again)

   For a long time the tick was starting at CLOCK_MONOTONIC zero, which
   allowed users space applications to either align with the tick or to
   place a periodic computation so that it does not interfere with the
   tick. The alignement of the tick period was more by chance than by
   intention as the tick is set up before a high resolution clocksource
   is installed, i.e. timekeeping is still tick based and the tick
   period advances from there.

   The early enablement of sched_clock() broke this alignement as the
   time accumulated by sched_clock() is taken into account when
   timekeeping is initialized. So the base value now(CLOCK_MONOTONIC) is
   not longer a multiple of tick periods, which breaks applications
   which relied on that behaviour.

   Cure this by aligning the tick starting point to the next multiple of
   tick periods, i.e 1000ms/CONFIG_HZ.

 - A set of NOHZ fixes and enhancements:

     * Cure the concurrent writer race for idle and IO sleeptime
       statistics

       The statitic values which are exposed via /proc/stat are updated
       from the CPU local idle exit and remotely by cpufreq, but that
       happens without any form of serialization. As a consequence
       sleeptimes can be accounted twice or worse.

       Prevent this by restricting the accumulation writeback to the CPU
       local idle exit and let the remote access compute the accumulated
       value.

     * Protect idle/iowait sleep time with a sequence count

       Reading idle/iowait sleep time, e.g. from /proc/stat, can race
       with idle exit updates. As a consequence the readout may result
       in random and potentially going backwards values.

       Protect this by a sequence count, which fixes the idle time
       statistics issue, but cannot fix the iowait time problem because
       iowait time accounting races with remote wake ups decrementing
       the remote runqueues nr_iowait counter. The latter is impossible
       to fix, so the only way to deal with that is to document it
       properly and to remove the assertion in the selftest which
       triggers occasionally due to that.

     * Restructure struct tick_sched for better cache layout

     * Some small cleanups and a better cache layout for struct
       tick_sched

 - Implement the missing timer_wait_running() callback for POSIX CPU
   timers

   For unknown reason the introduction of the timer_wait_running()
   callback missed to fixup posix CPU timers, which went unnoticed for
   almost four years.

   While initially only targeted to prevent livelocks between a timer
   deletion and the timer expiry function on PREEMPT_RT enabled kernels,
   it turned out that fixing this for mainline is not as trivial as just
   implementing a stub similar to the hrtimer/timer callbacks.

   The reason is that for CONFIG_POSIX_CPU_TIMERS_TASK_WORK enabled
   systems there is a livelock issue independent of RT.

   CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y moves the expiry of POSIX CPU
   timers out from hard interrupt context to task work, which is handled
   before returning to user space or to a VM. The expiry mechanism moves
   the expired timers to a stack local list head with sighand lock held.
   Once sighand is dropped the task can be preempted and a task which
   wants to delete a timer will spin-wait until the expiry task is
   scheduled back in. In the worst case this will end up in a livelock
   when the preempting task and the expiry task are pinned on the same
   CPU.

   The timer wheel has a timer_wait_running() mechanism for RT, which
   uses a per CPU timer-base expiry lock which is held by the expiry
   code and the task waiting for the timer function to complete blocks
   on that lock.

   This does not work in the same way for posix CPU timers as there is
   no timer base and expiry for process wide timers can run on any task
   belonging to that process, but the concept of waiting on an expiry
   lock can be used too in a slightly different way.

   Add a per task mutex to struct posix_cputimers_work, let the expiry
   task hold it accross the expiry function and let the deleting task
   which waits for the expiry to complete block on the mutex.

   In the non-contended case this results in an extra
   mutex_lock()/unlock() pair on both sides.

   This avoids spin-waiting on a task which is scheduled out, prevents
   the livelock and cures the problem for RT and !RT systems

* tag 'timers-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-cpu-timers: Implement the missing timer_wait_running callback
  selftests/proc: Assert clock_gettime(CLOCK_BOOTTIME) VS /proc/uptime monotonicity
  selftests/proc: Remove idle time monotonicity assertions
  MAINTAINERS: Remove stale email address
  timers/nohz: Remove middle-function __tick_nohz_idle_stop_tick()
  timers/nohz: Add a comment about broken iowait counter update race
  timers/nohz: Protect idle/iowait sleep time under seqcount
  timers/nohz: Only ever update sleeptime from idle exit
  timers/nohz: Restructure and reshuffle struct tick_sched
  tick/common: Align tick period with the HZ tick.
  selftests/timers/posix_timers: Test delivery of signals across threads
  posix-timers: Prefer delivery of signals to the current thread
  vdso: Improve cmd_vdso_check to check all dynamic relocations
2023-04-25 11:22:46 -07:00
Linus Torvalds
3f614ab563 Merge tag 'irq-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull interrupt updates from Thomas Gleixner:
 "Core:

   - Add tracepoints for tasklet callbacks which makes it possible to
     analyze individual tasklet functions instead of guess working from
     the overall duration of tasklet processing

   - Ensure that secondary interrupt threads have their affinity
     adjusted correctly

  Drivers:

   - A large rework of the RISC-V IPI management to prepare for a new
     RISC-V interrupt architecture

   - Small fixes and enhancements all over the place

   - Removal of support for various obsolete hardware platforms and the
     related code"

* tag 'irq-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  irqchip/st: Remove stih415/stih416 and stid127 platforms support
  irqchip/gic-v3: Add Rockchip 3588001 erratum workaround
  genirq: Update affinity of secondary threads
  softirq: Add trace points for tasklet entry/exit
  irqchip/loongson-pch-pic: Fix pch_pic_acpi_init calling
  irqchip/loongson-pch-pic: Fix registration of syscore_ops
  irqchip/loongson-eiointc: Fix registration of syscore_ops
  irqchip/loongson-eiointc: Fix incorrect use of acpi_get_vec_parent
  irqchip/loongson-eiointc: Fix returned value on parsing MADT
  irqchip/riscv-intc: Add empty irq_eoi() for chained irq handlers
  RISC-V: Use IPIs for remote icache flush when possible
  RISC-V: Use IPIs for remote TLB flush when possible
  RISC-V: Allow marking IPIs as suitable for remote FENCEs
  RISC-V: Treat IPIs as normal Linux IRQs
  irqchip/riscv-intc: Allow drivers to directly discover INTC hwnode
  RISC-V: Clear SIP bit only when using SBI IPI operations
  irqchip/irq-sifive-plic: Add syscore callbacks for hibernation
  irqchip: Use of_property_read_bool() for boolean properties
  irqchip/bcm-6345-l1: Request memory region
  irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4
  ...
2023-04-25 11:16:08 -07:00
Linus Torvalds
15bbeec0fe Merge tag 'core-entry-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core entry/ptrace update from Thomas Gleixner:
 "Provide a ptrace set/get interface for syscall user dispatch. The main
  purpose is to enable checkpoint/restore (CRIU) to handle processes
  which utilize syscall user dispatch correctly"

* tag 'core-entry-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftest, ptrace: Add selftest for syscall user dispatch config api
  ptrace: Provide set/get interface for syscall user dispatch
  syscall_user_dispatch: Untag selector address before access_ok()
  syscall_user_dispatch: Split up set_syscall_user_dispatch()
2023-04-25 11:05:04 -07:00
Linus Torvalds
29e95a4b26 Merge tag 'core-debugobjects-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core debugobjects update from Thomas Gleixner:
 "A single update to debugobjects:

  Prevent a race vs statically initialized objects. Such objects are
  usually not initialized via an init() function. They are special cased
  and detected on first use under the assumption that they are already
  correctly initialized via the static initializer.

  This works correctly unless there are two concurrent debug object
  operations on such an object.

  The first one detects that the object is not yet tracked and tries to
  establish a tracking object after dropping the debug objects hash
  bucket lock. The concurrent operation does the same. The one which
  wins the race ends up modifying the state of the object which makes
  the other one fail resulting in a bogus debug objects warning.

  Prevent this by making the detection of a static object and the
  allocation of a tracking object atomic under the hash bucket lock. So
  the first one to acquire the hash bucket lock will succeed and the
  second one will observe the correct tracking state.

  This race existed forever but was only exposed when the timer wheel
  code added a debug_object_assert_init() call outside of the timer base
  locked region. This replaced the previous warning about
  timer::function being NULL which had to be removed when the
  timer_shutdown() mechanics were added"

* tag 'core-debugobjects-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobject: Prevent init race with static objects
2023-04-25 11:00:45 -07:00
Linus Torvalds
bc1bb2a49b Merge tag 'x86_sev_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV updates from Borislav Petkov:

 - Add the necessary glue so that the kernel can run as a confidential
   SEV-SNP vTOM guest on Hyper-V. A vTOM guest basically splits the
   address space in two parts: encrypted and unencrypted. The use case
   being running unmodified guests on the Hyper-V confidential computing
   hypervisor

 - Double-buffer messages between the guest and the hardware PSP device
   so that no partial buffers are copied back'n'forth and thus potential
   message integrity and leak attacks are possible

 - Name the return value the sev-guest driver returns when the hw PSP
   device hasn't been called, explicitly

 - Cleanups

* tag 'x86_sev_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/hyperv: Change vTOM handling to use standard coco mechanisms
  init: Call mem_encrypt_init() after Hyper-V hypercall init is done
  x86/mm: Handle decryption/re-encryption of bss_decrypted consistently
  Drivers: hv: Explicitly request decrypted in vmap_pfn() calls
  x86/hyperv: Reorder code to facilitate future work
  x86/ioremap: Add hypervisor callback for private MMIO mapping in coco VM
  x86/sev: Change snp_guest_issue_request()'s fw_err argument
  virt/coco/sev-guest: Double-buffer messages
  crypto: ccp: Get rid of __sev_platform_init_locked()'s local function pointer
  crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL
2023-04-25 10:48:08 -07:00
Linus Torvalds
c42b59bfaa Merge tag 'x86_paravirt_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 paravirt updates from Borislav Petkov:

 - Convert a couple of paravirt callbacks to asm to prevent
   '-fzero-call-used-regs' builds from zeroing live registers because
   paravirt hides the CALLs from the compiler so latter doesn't know
   there's a CALL in the first place

 - Merge two paravirt callbacks into one, as their functionality is
   identical

* tag 'x86_paravirt_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/paravirt: Convert simple paravirt functions to asm
  x86/paravirt: Merge activate_mm() and dup_mmap() callbacks
2023-04-25 10:32:51 -07:00
Linus Torvalds
4a4a28fca6 Merge tag 'x86_misc_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 updates from Borislav Petkov:

 - Add a x86 hw vulnerabilities section to MAINTAINERS so that the folks
   involved in it can get CCed on patches

 - Add some more CPUID leafs to the kcpuid tool and extend its
   functionality to be more useful when grepping for CPUID bits

* tag 'x86_misc_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Add x86 hardware vulnerabilities section
  tools/x86/kcpuid: Dump the CPUID function in detailed view
  tools/x86/kcpuid: Update AMD leaf Fn80000001
  tools/x86/kcpuid: Fix avx512bw and avx512lvl fields in Fn00000007
2023-04-25 10:27:02 -07:00
Linus Torvalds
e3420f98f8 Merge tag 'x86_cpu_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu model updates from Borislav Petkov:

 - Add Emerald Rapids to the list of Intel models supporting PPIN

 - Finally use a CPUID bit for split lock detection instead of
   enumerating every model

 - Make sure automatic IBRS is set on AMD, even though the AP bringup
   code does that now by replicating the MSR which contains the switch

* tag 'x86_cpu_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Add Xeon Emerald Rapids to list of CPUs that support PPIN
  x86/split_lock: Enumerate architectural split lock disable bit
  x86/CPU/AMD: Make sure EFER[AIBRSE] is set
2023-04-25 10:20:52 -07:00
Linus Torvalds
1699dbebf3 Merge tag 'x86_acpi_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 ACPI update from Borislav Petkov:

 - Improve code generation in ACPI's global lock's acquisition function

* tag 'x86_acpi_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ACPI/boot: Improve __acpi_acquire_global_lock
2023-04-25 10:05:00 -07:00
Linus Torvalds
d3464152e5 Merge tag 'ras_core_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:

 - Just cleanups and fixes this time around: make threshold_ktype const,
   an objtool fix and use proper size for a bitmap

* tag 'ras_core_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/MCE/AMD: Use an u64 for bank_map
  x86/mce: Always inline old MCA stubs
  x86/MCE/AMD: Make kobj_type structure constant
2023-04-25 09:56:33 -07:00
Linus Torvalds
e94ee641f9 Merge tag 'edac_updates_for_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:

 - skx_edac: Fix overflow when decoding 32G DIMM ranks

 - i10nm_edac: Add Sierra Forest support

 - amd64_edac: Split driver code between legacy and SMCA systems. The
   final goal is adding support for more hw, like GPUs

 - The usual minor cleanups and fixes

* tag 'edac_updates_for_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: (25 commits)
  EDAC/i10nm: Add Intel Sierra Forest server support
  EDAC/amd64: Fix indentation in umc_determine_edac_cap()
  EDAC/altera: Remove MODULE_LICENSE in non-module
  EDAC: Sanitize MODULE_AUTHOR strings
  EDAC/amd81[13]1: Remove trailing newline from MODULE_AUTHOR
  EDAC/amd64: Add get_err_info() to pvt->ops
  EDAC/amd64: Split dump_misc_regs() into dct/umc functions
  EDAC/amd64: Split init_csrows() into dct/umc functions
  EDAC/amd64: Split determine_edac_cap() into dct/umc functions
  EDAC/amd64: Rename f17h_determine_edac_ctl_cap()
  EDAC/amd64: Split setup_mci_misc_attrs() into dct/umc functions
  EDAC/amd64: Split ecc_enabled() into dct/umc functions
  EDAC/amd64: Split read_mc_regs() into dct/umc functions
  EDAC/amd64: Split determine_memory_type() into dct/umc functions
  EDAC/amd64: Split read_base_mask() into dct/umc functions
  EDAC/amd64: Split prep_chip_selects() into dct/umc functions
  EDAC/amd64: Rework hw_info_{get,put}
  EDAC/amd64: Merge struct amd64_family_type into struct amd64_pvt
  EDAC/amd64: Do not discover ECC symbol size for Family 17h and later
  EDAC/amd64: Drop dbam_to_cs() for Family 17h and later
  ...
2023-04-25 09:44:07 -07:00
Linus Torvalds
f7301270a2 Merge tag 'm68k-for-v6.4-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:

 - defconfig updates

 - miscellaneous fixes and improvements

* tag 'm68k-for-v6.4-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: kexec: Include <linux/reboot.h>
  m68k: defconfig: Update defconfigs for v6.3-rc1
  m68k: Remove obsolete config NO_KERNEL_MSG
  nubus: Drop noop match function
2023-04-25 09:37:43 -07:00
Dhruva Gole
cc5f6fa4f6 spi: bcm63xx: use macro DEFINE_SIMPLE_DEV_PM_OPS
Using this macro makes the code more readable.
It also inits the members of dev_pm_ops in the following manner
without us explicitly needing to:

.suspend = bcm63xx_spi_suspend, \
.resume = bcm63xx_spi_resume, \
.freeze = bcm63xx_spi_suspend, \
.thaw = bcm63xx_spi_resume, \
.poweroff = bcm63xx_spi_suspend, \
.restore = bcm63xx_spi_resume

Signed-off-by: Dhruva Gole <d-gole@ti.com>
Link: https://lore.kernel.org/r/20230424102546.1604484-1-d-gole@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2023-04-25 16:55:45 +01:00
Chaitanya Kulkarni
3f89ac587b block/drivers: remove dead clear of random flag
QUEUE_FLAG_ADD_RANDOM is not set before we clear it for "null_blk",
"brd", "nbd", "zram", and "bcache" since by default we don't set
"QUEUE_FLAG_ADD_RANDOM" to MQ ops.

Remove dead clear of QUEUE_FLAG_ADD_RANDOM in above listed drivers.

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> #zram
Link: https://lore.kernel.org/r/20230424234628.45544-2-kch@nvidia.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-25 08:02:11 -06:00
Ming Lei
38c8e3dfb2 block: sync part's ->bd_has_submit_bio with disk's
submit_bio() always uses bio->bi_bdev->bd_has_submit_bio to decide if
disk's ->submit_bio() is called, and bio->bi_bdev could point to one
partition device.

So we have to sync part bdev's ->bd_has_submit_bio with disk's.

Reported-by: Changhui Zhong <czhong@redhat.com>
Link: https://lore.kernel.org/linux-block/ZEdItaPqif8fp85H@ovpn-8-24.pek2.redhat.com/T/#t
Fixes: 9f4107b07b ("block: store bdev->bd_disk->fops->submit_bio state in bdev")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230425034154.110099-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-25 07:36:02 -06:00
Juergen Gross
cbfac7707b xen/blkback: move blkif_get_x86_*_req() into blkback.c
There is no need to have the functions blkif_get_x86_32_req() and
blkif_get_x86_64_req() in a header file, as they are used in one place
only.

So move them into the using source file and drop the inline qualifier.

While at it fix some style issues, and simplify the code by variable
reusing and using min() instead of open coding it.

Instead of using barrier() use READ_ONCE() for avoiding multiple reads
of nr_segments.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-04-25 11:09:30 +02:00
Juergen Gross
e7b4c07d4b xen/blkback: simplify free_persistent_gnts() interface
The interface of free_persistent_gnts() can be simplified, as there is
only a single caller of free_persistent_gnts() and the 2nd and 3rd
parameters are easily obtainable via the ring pointer, which is passed
as the first parameter anyway.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-04-25 11:09:27 +02:00
Juergen Gross
656f3c1d79 xen/blkback: remove stale prototype
There is no function xen_blkif_purge_persistent(), so remove its
prototype from common.h.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-04-25 11:09:22 +02:00
Juergen Gross
6935321ecc xen/blkback: fix white space code style issues
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-04-25 11:09:19 +02:00
Bob Peterson
644f6bf762 gfs2: gfs2_ail_empty_gl no log flush on error
Before this patch, function gfs2_ail_empty_gl called gfs2_log_flush even
in cases where it encountered an error. It should probably skip the log
flush and leave the file system in an inconsistent state, letting a
subsequent withdraw force the journal to be replayed to reestablish
metadata consistency.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-04-25 11:07:16 +02:00
Bob Peterson
b97e583caa gfs2: Issue message when revokes cannot be written
Before this patch, function gfs2_ail_empty_gl would silently return an
error to the caller. This would get silently set into sd_log_error which
would cause a withdraw, but there was no indication why the file system
was withdrawn. This patch adds a fs_err to log the appropriate error
message.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-04-25 11:06:30 +02:00
Bob Peterson
68ca088dc1 gfs2: Perform second log flush in gfs2_make_fs_ro
Before this patch, function gfs2_make_fs_ro called gfs2_log_flush once to
finalize the log. However, if there's dirty metadata, log flushes tend
to sync the metadata and formulate revokes. Before this patch, those
revokes may not be written out to the journal immediately, which meant
unresolved glocks could still have revokes in their ail lists. When the
glock worker runs, it tries to transition the glock, but the unresolved
revokes in the ail still need to be written, so it tries to start a
transaction. It's impossible to start a transaction because at that
point, the SDF_JOURNAL_LIVE flag has been cleared by gfs2_make_fs_ro.
That causes the glock worker to fail, unable to write the revokes. The
calling sequence looked something like this:

gfs2_make_fs_ro
   gfs2_log_flush - with GFS2_LOG_HEAD_FLUSH_SHUTDOWN flag set
	if (flags & GFS2_LOG_HEAD_FLUSH_SHUTDOWN)
		clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
...meanwhile...
glock_work_func
   do_xmote
      rgrp_go_sync (or possibly inode_go_sync)
         ...
         gfs2_ail_empty_gl
            __gfs2_trans_begin
               if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) {
               ...
                  return -EROFS;

The previous patch in the series ("gfs2: return errors from
gfs2_ail_empty_gl") now causes the transaction error to no longer be
ignored, so it causes a warning from MOST of the xfstests:

WARNING: CPU: 11 PID: X at fs/gfs2/super.c:603 gfs2_put_super [gfs2]

which corresponds to:

WARN_ON(gfs2_withdrawing(sdp));

The withdraw was triggered silently from do_xmote by:

	if (unlikely(sdp->sd_log_error && !gfs2_withdrawn(sdp)))
		gfs2_withdraw_delayed(sdp);

This patch adds a second log_flush to gfs2_make_fs_ro: one to sync the
data and one to sync any outstanding revokes and finalize the journal.
Note that both of these log flushes need to be "special," in other
words, not GFS2_LOG_HEAD_FLUSH_NORMAL.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-04-25 11:01:41 +02:00
Bob Peterson
24ab158298 gfs2: return errors from gfs2_ail_empty_gl
Before this patch, function gfs2_ail_empty_gl did not return errors it
encountered from __gfs2_trans_begin. Those errors usually came from the
fact that the file system was made read-only, often due to unmount
(but theoretically could be due to -o remount,ro), which prevented
the transaction from starting.

The inability to start a transaction prevented its revokes from being
properly written to the journal for glocks during unmount (and
transition to ro).

That meant glocks could be unlocked without the metadata properly
revoked in the journal. So other nodes could grab the glock thinking
that their lvb values were correct, but in fact corresponded to the
glock without its revokes properly synced. That presented as lvb
mismatch errors.

This patch allows gfs2_ail_empty_gl to return the error properly to
the caller.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-04-25 11:00:21 +02:00
Basavaraj Natikar
3738666988 HID: amd_sfh: Fix max supported HID devices
commit 4bd763568d ("HID: amd_sfh: Support for additional light sensor")
adds additional sensor devices, but forgets to add the number of HID
devices to match. Thus, the number of HID devices does not match the
actual number of sensors.

In order to prevent corruption and system hangs when more than the
allowed number of HID devices are accessed, the number of HID devices is
increased accordingly.

Fixes: 4bd763568d ("HID: amd_sfh: Support for additional light sensor")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217354
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Link: https://lore.kernel.org/r/20230424160406.2579888-1-Basavaraj.Natikar@amd.com
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
2023-04-25 10:58:28 +02:00
wuych
28b17f6270 net: phy: marvell-88x2222: remove unnecessary (void*) conversions
Pointer variables of void * type do not require type cast.

Signed-off-by: wuych <yunchuan@nfschina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-25 09:43:50 +01:00
Kuniyuki Iwashima
50749f2dd6 tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.
syzkaller reported [0] memory leaks of an UDP socket and ZEROCOPY
skbs.  We can reproduce the problem with these sequences:

  sk = socket(AF_INET, SOCK_DGRAM, 0)
  sk.setsockopt(SOL_SOCKET, SO_TIMESTAMPING, SOF_TIMESTAMPING_TX_SOFTWARE)
  sk.setsockopt(SOL_SOCKET, SO_ZEROCOPY, 1)
  sk.sendto(b'', MSG_ZEROCOPY, ('127.0.0.1', 53))
  sk.close()

sendmsg() calls msg_zerocopy_alloc(), which allocates a skb, sets
skb->cb->ubuf.refcnt to 1, and calls sock_hold().  Here, struct
ubuf_info_msgzc indirectly holds a refcnt of the socket.  When the
skb is sent, __skb_tstamp_tx() clones it and puts the clone into
the socket's error queue with the TX timestamp.

When the original skb is received locally, skb_copy_ubufs() calls
skb_unclone(), and pskb_expand_head() increments skb->cb->ubuf.refcnt.
This additional count is decremented while freeing the skb, but struct
ubuf_info_msgzc still has a refcnt, so __msg_zerocopy_callback() is
not called.

The last refcnt is not released unless we retrieve the TX timestamped
skb by recvmsg().  Since we clear the error queue in inet_sock_destruct()
after the socket's refcnt reaches 0, there is a circular dependency.
If we close() the socket holding such skbs, we never call sock_put()
and leak the count, sk, and skb.

TCP has the same problem, and commit e0c8bccd40 ("net: stream:
purge sk_error_queue in sk_stream_kill_queues()") tried to fix it
by calling skb_queue_purge() during close().  However, there is a
small chance that skb queued in a qdisc or device could be put
into the error queue after the skb_queue_purge() call.

In __skb_tstamp_tx(), the cloned skb should not have a reference
to the ubuf to remove the circular dependency, but skb_clone() does
not call skb_copy_ubufs() for zerocopy skb.  So, we need to call
skb_orphan_frags_rx() for the cloned skb to call skb_copy_ubufs().

[0]:
BUG: memory leak
unreferenced object 0xffff88800c6d2d00 (size 1152):
  comm "syz-executor392", pid 264, jiffies 4294785440 (age 13.044s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 cd af e8 81 00 00 00 00  ................
    02 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00  ...@............
  backtrace:
    [<0000000055636812>] sk_prot_alloc+0x64/0x2a0 net/core/sock.c:2024
    [<0000000054d77b7a>] sk_alloc+0x3b/0x800 net/core/sock.c:2083
    [<0000000066f3c7e0>] inet_create net/ipv4/af_inet.c:319 [inline]
    [<0000000066f3c7e0>] inet_create+0x31e/0xe40 net/ipv4/af_inet.c:245
    [<000000009b83af97>] __sock_create+0x2ab/0x550 net/socket.c:1515
    [<00000000b9b11231>] sock_create net/socket.c:1566 [inline]
    [<00000000b9b11231>] __sys_socket_create net/socket.c:1603 [inline]
    [<00000000b9b11231>] __sys_socket_create net/socket.c:1588 [inline]
    [<00000000b9b11231>] __sys_socket+0x138/0x250 net/socket.c:1636
    [<000000004fb45142>] __do_sys_socket net/socket.c:1649 [inline]
    [<000000004fb45142>] __se_sys_socket net/socket.c:1647 [inline]
    [<000000004fb45142>] __x64_sys_socket+0x73/0xb0 net/socket.c:1647
    [<0000000066999e0e>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    [<0000000066999e0e>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
    [<0000000017f238c1>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

BUG: memory leak
unreferenced object 0xffff888017633a00 (size 240):
  comm "syz-executor392", pid 264, jiffies 4294785440 (age 13.044s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 2d 6d 0c 80 88 ff ff  .........-m.....
  backtrace:
    [<000000002b1c4368>] __alloc_skb+0x229/0x320 net/core/skbuff.c:497
    [<00000000143579a6>] alloc_skb include/linux/skbuff.h:1265 [inline]
    [<00000000143579a6>] sock_omalloc+0xaa/0x190 net/core/sock.c:2596
    [<00000000be626478>] msg_zerocopy_alloc net/core/skbuff.c:1294 [inline]
    [<00000000be626478>] msg_zerocopy_realloc+0x1ce/0x7f0 net/core/skbuff.c:1370
    [<00000000cbfc9870>] __ip_append_data+0x2adf/0x3b30 net/ipv4/ip_output.c:1037
    [<0000000089869146>] ip_make_skb+0x26c/0x2e0 net/ipv4/ip_output.c:1652
    [<00000000098015c2>] udp_sendmsg+0x1bac/0x2390 net/ipv4/udp.c:1253
    [<0000000045e0e95e>] inet_sendmsg+0x10a/0x150 net/ipv4/af_inet.c:819
    [<000000008d31bfde>] sock_sendmsg_nosec net/socket.c:714 [inline]
    [<000000008d31bfde>] sock_sendmsg+0x141/0x190 net/socket.c:734
    [<0000000021e21aa4>] __sys_sendto+0x243/0x360 net/socket.c:2117
    [<00000000ac0af00c>] __do_sys_sendto net/socket.c:2129 [inline]
    [<00000000ac0af00c>] __se_sys_sendto net/socket.c:2125 [inline]
    [<00000000ac0af00c>] __x64_sys_sendto+0xe1/0x1c0 net/socket.c:2125
    [<0000000066999e0e>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    [<0000000066999e0e>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
    [<0000000017f238c1>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: f214f915e7 ("tcp: enable MSG_ZEROCOPY")
Fixes: b5947e5d1e ("udp: msg_zerocopy")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-25 09:42:35 +01:00
Gencen Gan
d325c34d9e net: amd: Fix link leak when verifying config failed
After failing to verify configuration, it returns directly without
releasing link, which may cause memory leak.

Paolo Abeni thinks that the whole code of this driver is quite
"suboptimal" and looks unmainatained since at least ~15y, so he
suggests that we could simply remove the whole driver, please
take it into consideration.

Simon Horman suggests that the fix label should be set to
"Linux-2.6.12-rc2" considering that the problem has existed
since the driver was introduced and the commit above doesn't
seem to exist in net/net-next.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Gan Gecen <gangecen@hust.edu.cn>
Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-25 09:41:18 +01:00
Sakari Ailus
73b41dc51f media: ov5670: Fix probe on ACPI
devm_clk_get() will return either an error or NULL, which the driver
handles, continuing to use the clock of reading the value of the
clock-frequency property.

However, the value of ov5670->xvclk is left as-is and the other clock
framework functions aren't capable of handling error values.

Use devm_clk_get_optional() to obtain NULL instead of -ENOENT.

Fixes: 8004c91e20 ("media: i2c: ov5670: Use common clock framework")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2023-04-25 08:30:18 +01:00
Geert Uytterhoeven
e5c23bec0f sh: Replace <uapi/asm/types.h> by <asm-generic/int-ll64.h>
As arch/sh/include/uapi/asm/types.h doesn't exist, sh doesn't provide
any sh-specific uapi definitions, and it can just include
<asm-generic/int-ll64.h>, like most other architectures.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/r/26932016c83c2ad350db59f5daf96117a38bbbd8.1679566927.git.geert@linux-m68k.org
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2023-04-25 09:16:51 +02:00
Geert Uytterhoeven
8bc6666f13 sh: Use generic GCC library routines
The C implementations of __ashldi3(), __ashrdi3__(), and __lshrdi3() in
arch/sh/lib/ are identical to the generic C implementations in lib/.
Reduce duplication by switching SH to the generic versions.

Update the include path in arch/sh/boot/compressed accordingly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/r/74dbe68dc8e2ffb6180092f73723fe21ab692c7a.1679566500.git.geert+renesas@glider.be
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2023-04-25 09:16:47 +02:00
Zheng Wang
c5749639f2 scsi: qedi: Fix use after free bug in qedi_remove()
In qedi_probe() we call __qedi_probe() which initializes
&qedi->recovery_work with qedi_recovery_handler() and
&qedi->board_disable_work with qedi_board_disable_work().

When qedi_schedule_recovery_handler() is called, schedule_delayed_work()
will finally start the work.

In qedi_remove(), which is called to remove the driver, the following
sequence may be observed:

Fix this by finishing the work before cleanup in qedi_remove().

CPU0                  CPU1

                     |qedi_recovery_handler
qedi_remove          |
  __qedi_remove      |
iscsi_host_free      |
scsi_host_put        |
//free shost         |
                     |iscsi_host_for_each_session
                     |//use qedi->shost

Cancel recovery_work and board_disable_work in __qedi_remove().

Fixes: 4b1068f5d7 ("scsi: qedi: Add MFW error recovery process")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Link: https://lore.kernel.org/r/20230413033422.28003-1-zyytlz.wz@163.com
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-04-24 23:32:13 -04:00
Alice Chao
948afc6961 scsi: ufs: core: mcq: Fix &hwq->cq_lock deadlock issue
When ufshcd_err_handler() is executed, CQ event interrupt can enter waiting
for the same lock. This can happen in ufshcd_handle_mcq_cq_events() and
also in ufs_mtk_mcq_intr(). The following warning message will be generated
when &hwq->cq_lock is used in IRQ context with IRQ enabled. Use
ufshcd_mcq_poll_cqe_lock() with spin_lock_irqsave instead of spin_lock to
resolve the deadlock issue.

[name:lockdep&]WARNING: inconsistent lock state
[name:lockdep&]--------------------------------
[name:lockdep&]inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[name:lockdep&]kworker/u16:4/260 [HC0[0]:SC0[0]:HE1:SE1] takes:
  ffffff8028444600 (&hwq->cq_lock){?.-.}-{2:2}, at:
ufshcd_mcq_poll_cqe_lock+0x30/0xe0
[name:lockdep&]{IN-HARDIRQ-W} state was registered at:
  lock_acquire+0x17c/0x33c
  _raw_spin_lock+0x5c/0x7c
  ufshcd_mcq_poll_cqe_lock+0x30/0xe0
  ufs_mtk_mcq_intr+0x60/0x1bc [ufs_mediatek_mod]
  __handle_irq_event_percpu+0x140/0x3ec
  handle_irq_event+0x50/0xd8
  handle_fasteoi_irq+0x148/0x2b0
  generic_handle_domain_irq+0x4c/0x6c
  gic_handle_irq+0x58/0x134
  call_on_irq_stack+0x40/0x74
  do_interrupt_handler+0x84/0xe4
  el1_interrupt+0x3c/0x78
<snip>

Possible unsafe locking scenario:
       CPU0
       ----
  lock(&hwq->cq_lock);
  <Interrupt>
    lock(&hwq->cq_lock);
  *** DEADLOCK ***
2 locks held by kworker/u16:4/260:

[name:lockdep&]
 stack backtrace:
CPU: 7 PID: 260 Comm: kworker/u16:4 Tainted: G S      W  OE
6.1.17-mainline-android14-2-g277223301adb #1
Workqueue: ufs_eh_wq_0 ufshcd_err_handler

 Call trace:
  dump_backtrace+0x10c/0x160
  show_stack+0x20/0x30
  dump_stack_lvl+0x98/0xd8
  dump_stack+0x20/0x60
  print_usage_bug+0x584/0x76c
  mark_lock_irq+0x488/0x510
  mark_lock+0x1ec/0x25c
  __lock_acquire+0x4d8/0xffc
  lock_acquire+0x17c/0x33c
  _raw_spin_lock+0x5c/0x7c
  ufshcd_mcq_poll_cqe_lock+0x30/0xe0
  ufshcd_poll+0x68/0x1b0
  ufshcd_transfer_req_compl+0x9c/0xc8
  ufshcd_err_handler+0x3bc/0xea0
  process_one_work+0x2f4/0x7e8
  worker_thread+0x234/0x450
  kthread+0x110/0x134
  ret_from_fork+0x10/0x20

Fixes: ed975065c3 ("scsi: ufs: core: mcq: Add completion support in poll")
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Alice Chao <alice.chao@mediatek.com>
Link: https://lore.kernel.org/r/20230424080400.8955-1-alice.chao@mediatek.com
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-04-24 23:18:41 -04:00
Tom Rix
392e4daa8a scsi: ipr: Remove several unused variables
gcc with W=1 reports
drivers/scsi/ipr.c: In function ‘ipr_init_res_entry’:
drivers/scsi/ipr.c:1104:22: error: variable ‘proto’
  set but not used [-Werror=unused-but-set-variable]
 1104 |         unsigned int proto;
      |                      ^~~~~
drivers/scsi/ipr.c: In function ‘ipr_update_res_entry’:
drivers/scsi/ipr.c:1261:22: error: variable ‘proto’
  set but not used [-Werror=unused-but-set-variable]
 1261 |         unsigned int proto;
      |                      ^~~~~
drivers/scsi/ipr.c: In function ‘ipr_change_queue_depth’:
drivers/scsi/ipr.c:4417:36: error: variable ‘res’
  set but not used [-Werror=unused-but-set-variable]
 4417 |         struct ipr_resource_entry *res;
      |                                    ^~~

These variables are not used, so remove them. The lock around res is not
needed so remove that. This makes ioa_cfg and lock_flags unneeded so remove
them as well.

Fixes: 65a15d6560 ("scsi: ipr: Remove SATA support")
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20230420125035.3888188-1-trix@redhat.com
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-04-24 23:11:47 -04:00
Akshat Jain
81221ab764 scsi: pm80xx: Log device registration
Log combination of phy_id and device_id in device registration response.

Signed-off-by: Akshat Jain <akshatzen@google.com>
Signed-off-by: Pranav Prasad <pranavpp@google.com>
Link: https://lore.kernel.org/r/20230411230650.1760757-1-pranavpp@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-04-24 23:10:51 -04:00
Linus Torvalds
173ea743bf Merge tag 'pull-nios2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull trivial nios2 cleanup from Al Viro.

* tag 'pull-nios2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  nios2: _TIF_ALLWORK_MASK is unused
2023-04-24 19:43:32 -07:00
Linus Torvalds
181b69dd6e Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs pile from Al Viro.

Random minor cleanups.

* tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: Fix description of vfs_tmpfile()
  sysv: switch to put_and_unmap_page()
  fs/sysv: Don't round down address for kunmap_flush_on_unmap()
2023-04-24 19:38:34 -07:00
Linus Torvalds
11b32219cb Merge tag 'pull-old-dio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull legacy dio cleanup from Al Viro.

* tag 'pull-old-dio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  __blockdev_direct_IO(): get rid of submit_io callback
2023-04-24 19:28:49 -07:00
Linus Torvalds
0e497ad525 Merge tag 'pull-write-one-page' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs write_one_page removal from Al Viro:
 "write_one_page series"

* tag 'pull-write-one-page' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  mm,jfs: move write_one_page/folio_write_one to jfs
  ocfs2: don't use write_one_page in ocfs2_duplicate_clusters_by_page
  ufs: don't flush page immediately for DIRSYNC directories
2023-04-24 19:20:27 -07:00
Linus Torvalds
ef36b9afc2 Merge tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fget updates from Al Viro:
 "fget() to fdget() conversions"

* tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fuse_dev_ioctl(): switch to fdget()
  cgroup_get_from_fd(): switch to fdget_raw()
  bpf: switch to fdget_raw()
  build_mount_idmapped(): switch to fdget()
  kill the last remaining user of proc_ns_fget()
  SVM-SEV: convert the rest of fget() uses to fdget() in there
  convert sgx_set_attribute() to fdget()/fdput()
  convert setns(2) to fdget()/fdput()
2023-04-24 19:14:20 -07:00
Christian Marangi
4774ad841b net: phy: marvell: Fix inconsistent indenting in led_blink_set
Fix inconsistent indeinting in m88e1318_led_blink_set reported by kernel
test robot, probably done by the presence of an if condition dropped in
later revision of the same code.

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202304240007.0VEX8QYG-lkp@intel.com/
Fixes: ea9e86485d ("net: phy: marvell: Implement led_blink_set()")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230423172800.3470-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-24 19:01:47 -07:00
Horatiu Vultur
700f11eb2c lan966x: Don't use xdp_frame when action is XDP_TX
When the action of an xdp program was XDP_TX, lan966x was creating
a xdp_frame and use this one to send the frame back. But it is also
possible to send back the frame without needing a xdp_frame, because
it is possible to send it back using the page.
And then once the frame is transmitted is possible to use directly
page_pool_recycle_direct as lan966x is using page pools.
This would save some CPU usage on this path, which results in higher
number of transmitted frames. Bellow are the statistics:
Frame size:    Improvement:
64                ~8%
256              ~11%
512               ~8%
1000              ~0%
1500              ~0%

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230422142344.3630602-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-24 18:58:04 -07:00
Jakub Kicinski
ee3392ed16 Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2023-04-24

We've added 5 non-merge commits during the last 3 day(s) which contain
a total of 7 files changed, 87 insertions(+), 44 deletions(-).

The main changes are:

1) Workaround for bpf iter selftest due to lack of subprog support
   in precision tracking, from Andrii.

2) Disable bpf_refcount_acquire kfunc until races are fixed, from Dave.

3) One more test_verifier test converted from asm macro to asm in C,
   from Eduard.

4) Fix build with NETFILTER=y INET=n config, from Florian.

5) Add __rcu_read_{lock,unlock} into deny list, from Yafang.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: avoid mark_all_scalars_precise() trigger in one of iter tests
  bpf: Add __rcu_read_{lock,unlock} into btf id deny list
  bpf: Disable bpf_refcount_acquire kfunc calls until race conditions are fixed
  selftests/bpf: verifier/prevent_map_lookup converted to inline assembly
  bpf: fix link failure with NETFILTER=y INET=n
====================

Link: https://lore.kernel.org/r/20230425005648.86714-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-24 18:45:12 -07:00
Jakub Kicinski
9610a8dc0a Merge branch 'tsnep-xdp-socket-zero-copy-support'
Gerhard Engleder says:

====================
tsnep: XDP socket zero-copy support

Implement XDP socket zero-copy support for tsnep driver. I tried to
follow existing drivers like igc as far as possible. But one main
difference is that tsnep does not need any reconfiguration for XDP BPF
program setup. So I decided to keep this behavior no matter if a XSK
pool is used or not. As a result, tsnep starts using the XSK pool even
if no XDP BPF program is available.

Another difference is that I tried to prevent potentially failing
allocations during XSK pool setup. E.g. both memory models for page pool
and XSK pool are registered all the time. Thus, XSK pool setup cannot
end up with not working queues.

Some prework is done to reduce the last two XSK commits to actual XSK
changes.
====================

Link: https://lore.kernel.org/r/20230421194656.48063-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-24 18:22:40 -07:00
Gerhard Engleder
cd275c236b tsnep: Add XDP socket zero-copy TX support
Send and complete XSK pool frames within TX NAPI context. NAPI context
is triggered by ndo_xsk_wakeup.

Test results with A53 1.2GHz:

xdpsock txonly copy mode, 64 byte frames:
                   pps            pkts           1.00
tx                 284,409        11,398,144
Two CPUs with 100% and 10% utilization.

xdpsock txonly zero-copy mode, 64 byte frames:
                   pps            pkts           1.00
tx                 511,929        5,890,368
Two CPUs with 100% and 1% utilization.

xdpsock l2fwd copy mode, 64 byte frames:
                   pps            pkts           1.00
rx                 248,985        7,315,885
tx                 248,921        7,315,885
Two CPUs with 100% and 10% utilization.

xdpsock l2fwd zero-copy mode, 64 byte frames:
                   pps            pkts           1.00
rx                 254,735        3,039,456
tx                 254,735        3,039,456
Two CPUs with 100% and 4% utilization.

Packet rate increases and CPU utilization is reduced in both cases.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-24 18:22:38 -07:00
Gerhard Engleder
3fc2333933 tsnep: Add XDP socket zero-copy RX support
Add support for XSK zero-copy to RX path. The setup of the XSK pool can
be done at runtime. If the netdev is running, then the queue must be
disabled and enabled during reconfiguration. This can be done easily
with functions introduced in previous commits.

A more important property is that, if the netdev is running, then the
setup of the XSK pool shall not stop the netdev in case of errors. A
broken netdev after a failed XSK pool setup is bad behavior. Therefore,
the allocation and setup of resources during XSK pool setup is done only
before any queue is disabled. Additionally, freeing and later allocation
of resources is eliminated in some cases. Page pool entries are kept for
later use. Two memory models are registered in parallel. As a result,
the XSK pool setup cannot fail during queue reconfiguration.

In contrast to other drivers, XSK pool setup and XDP BPF program setup
are separate actions. XSK pool setup can be done without any XDP BPF
program. The XDP BPF program can be added, removed or changed without
any reconfiguration of the XSK pool.

Test results with A53 1.2GHz:

xdpsock rxdrop copy mode, 64 byte frames:
                   pps            pkts           1.00
rx                 856,054        10,625,775
Two CPUs with both 100% utilization.

xdpsock rxdrop zero-copy mode, 64 byte frames:
                   pps            pkts           1.00
rx                 889,388        4,615,284
Two CPUs with 100% and 20% utilization.

Packet rate increases and CPU utilization is reduced.

100% CPU load seems to the base load. This load is consumed by ksoftirqd
just for dropping the generated packets without xdpsock running.

Using batch API reduced CPU utilization slightly, but measurements are
not stable enough to provide meaningful numbers.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-24 18:22:38 -07:00