Commit Graph

76217 Commits

Author SHA1 Message Date
Linus Torvalds
f5a376edde Merge tag 'x86_entry_for_v5.11_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
 "A single fix for objtool to generate proper unwind info for newer
  toolchains which do not generate section symbols anymore. And a
  cleanup ontop.

  This was originally going to go during the next merge window but
  people can already trigger a build error with binutils-2.36 which
  doesn't emit section symbols - something which objtool relies on - so
  let's expedite it"

* tag 'x86_entry_for_v5.11_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry: Remove put_ret_addr_in_rdi THUNK macro argument
  x86/entry: Emit a symbol for register restoring thunk
2021-01-31 11:48:12 -08:00
Linus Torvalds
c178fae3a9 Merge tag 'nfs-for-5.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client fixes from Trond Myklebust:

 - SUNRPC: Handle 0 length opaque XDR object data properly

 - Fix a layout segment leak in pnfs_layout_process()

 - pNFS/NFSv4: Update the layout barrier when we schedule a layoutreturn

 - pNFS/NFSv4: Improve rejection of out-of-order layouts

 - pNFS/NFSv4: Try to return invalid layout in pnfs_layout_process()

* tag 'nfs-for-5.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: Handle 0 length opaque XDR object data properly
  SUNRPC: Move simple_get_bytes and simple_get_netobj into private header
  pNFS/NFSv4: Improve rejection of out-of-order layouts
  pNFS/NFSv4: Update the layout barrier when we schedule a layoutreturn
  pNFS/NFSv4: Try to return invalid layout in pnfs_layout_process()
  pNFS/NFSv4: Fix a layout segment leak in pnfs_layout_process()
2021-01-31 11:19:12 -08:00
Linus Torvalds
b0dfa64dcd Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "Several recent regressions and some bug fixes:

   - Typo corrupting the max_recv_sge for cxgb4

   - Regression from re-using kernel enums as a HW AbI in vmw_pvrdma

   - Sleeping inside a spinlock in hns

   - Revert the attempt to fix devlink deadlocks as the fix is more buggy

   - Typo in sysfs_emit_at conversions

   - Revert the removal of VLAN support in rxe"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  Revert "RDMA/rxe: Remove VLAN code leftovers from RXE"
  RDMA/usnic: Fix misuse of sysfs_emit_at
  Revert "RDMA/mlx5: Fix devlink deadlock on net namespace deletion"
  RDMA/hns: Use mutex instead of spinlock for ida allocation
  RDMA/vmw_pvrdma: Fix network_hdr_type reported in WC
  RDMA/cxgb4: Fix the reported max_recv_sge value
2021-01-28 09:27:26 -08:00
Linus Torvalds
5bec2487ff Merge tag 'regulator-fix-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
 "The main thing here is a change to make sure that we don't try to
  double resolve the supply of a regulator if we have two probes going
  on simultaneously, plus an incremental fix on top of that to resolve a
  lockdep issue it introduced.

  There's also a patch from Dmitry Osipenko adding stubs for some
  functions to avoid build issues in consumers in some configurations"

* tag 'regulator-fix-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: Fix lockdep warning resolving supplies
  regulator: consumer: Add missing stubs to regulator/consumer.h
  regulator: core: avoid regulator_resolve_supply() race condition
2021-01-26 10:59:01 -08:00
Dave Wysochanski
ba6dfce47c SUNRPC: Move simple_get_bytes and simple_get_netobj into private header
Remove duplicated helper functions to parse opaque XDR objects
and place inside new file net/sunrpc/auth_gss/auth_gss_internal.h.
In the new file carry the license and copyright from the source file
net/sunrpc/auth_gss/auth_gss.c.  Finally, update the comment inside
include/linux/sunrpc/xdr.h since lockd is not the only user of
struct xdr_netobj.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-01-25 15:59:12 -05:00
Sami Tolvanen
9f12e37cae Commit 9bb48c82ac ("tty: implement write_iter") converted the tty
layer to use write_iter. Fix the redirected_tty_write declaration
also in n_tty and change the comparisons to use write_iter instead of
write.

[ Also moved the declaration of redirected_tty_write() to the proper
  location in a header file. The reason for the bug was the bogus extern
  declaration in n_tty.c silently not matching the changed definition in
  tty_io.c, and because it wasn't in a shared header file, there was no
  cross-checking of the declaration.

  Sami noticed because Clang's Control Flow Integrity checking ended up
  incidentally noticing the inconsistent declaration.    - Linus ]

Fixes: 9bb48c82ac ("tty: implement write_iter")
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-01-25 12:08:07 -08:00
Linus Torvalds
a692a610d7 Merge tag 'block-5.11-2021-01-24' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - NVMe pull request from Christoph:
      - fix a status code in nvmet (Chaitanya Kulkarni)
      - avoid double completions in nvme-rdma/nvme-tcp (Chao Leng)
      - fix the CMB support to cope with NVMe 1.4 controllers (Klaus Jensen)
      - fix PRINFO handling in the passthrough ioctl (Revanth Rajashekar)
      - fix a double DMA unmap in nvme-pci

 - lightnvm error path leak fix (Pan)

 - MD pull request from Song:
      - Flush request fix (Xiao)

* tag 'block-5.11-2021-01-24' of git://git.kernel.dk/linux-block:
  lightnvm: fix memory leak when submit fails
  nvme-pci: fix error unwind in nvme_map_data
  nvme-pci: refactor nvme_unmap_data
  md: Set prev_flush_start and flush_bio in an atomic way
  nvmet: set right status on error in id-ns handler
  nvme-pci: allow use of cmb on v1.4 controllers
  nvme-tcp: avoid request double completion for concurrent nvme_tcp_timeout
  nvme-rdma: avoid request double completion for concurrent nvme_rdma_timeout
  nvme: check the PRINFO bit before deciding the host buffer length
2021-01-24 12:24:35 -08:00
Linus Torvalds
443d11297b Merge tag 'driver-core-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
 "Here are some small driver core fixes for 5.11-rc5 that resolve some
  reported problems:

   - revert of a -rc1 patch that was causing problems with some machines

   - device link device name collision problem fix (busses only have to
     name devices unique to their bus, not unique to all busses)

   - kernfs splice bugfixes to resolve firmware loading problems for
     Qualcomm systems.

   - other tiny driver core fixes for minor issues reported.

  All of these have been in linux-next with no reported problems"

* tag 'driver-core-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core: Fix device link device name collision
  driver core: Extend device_is_dependent()
  kernfs: wire up ->splice_read and ->splice_write
  kernfs: implement ->write_iter
  kernfs: implement ->read_iter
  Revert "driver core: Reorder devices on successful probe"
  Driver core: platform: Add extra error check in devm_platform_get_irqs_affinity()
  drivers core: Free dma_range_map when driver probe failed
2021-01-24 11:05:48 -08:00
Linus Torvalds
24c56ee06c Merge tag 'sched_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:

 - Correct the marking of kthreads which are supposed to run on a
   specific, single CPU vs such which are affine to only one CPU, mark
   per-cpu workqueue threads as such and make sure that marking
   "survives" CPU hotplug. Fix CPU hotplug issues with such kthreads.

 - A fix to not push away tasks on CPUs coming online.

 - Have workqueue CPU hotplug code use cpu_possible_mask when breaking
   affinity on CPU offlining so that pending workers can finish on newly
   arrived onlined CPUs too.

 - Dump tasks which haven't vacated a CPU which is currently being
   unplugged.

 - Register a special scale invariance callback which gets called on
   resume from RAM to read out APERF/MPERF after resume and thus make
   the schedutil scaling governor more precise.

* tag 'sched_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Relax the set_cpus_allowed_ptr() semantics
  sched: Fix CPU hotplug / tighten is_per_cpu_kthread()
  sched: Prepare to use balance_push in ttwu()
  workqueue: Restrict affinity change to rescuer
  workqueue: Tag bound workers with KTHREAD_IS_PER_CPU
  kthread: Extract KTHREAD_IS_PER_CPU
  sched: Don't run cpu-online with balance_push() enabled
  workqueue: Use cpu_possible_mask instead of cpu_active_mask to break affinity
  sched/core: Print out straggler tasks in sched_cpu_dying()
  x86: PM: Register syscore_ops for scale invariance
2021-01-24 10:09:20 -08:00
Linus Torvalds
025929f468 Merge tag 'timers_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Borislav Petkov:

 - Fix an integer overflow in the NTP RTC synchronization which led to
   the latter happening every 2 seconds instead of the intended every 11
   minutes.

 - Get rid of now unused get_seconds().

* tag 'timers_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ntp: Fix RTC synchronization on 32-bit platforms
  timekeeping: Remove unused get_seconds()
2021-01-24 09:58:38 -08:00
Peter Zijlstra
ac687e6e8c kthread: Extract KTHREAD_IS_PER_CPU
There is a need to distinguish geniune per-cpu kthreads from kthreads
that happen to have a single CPU affinity.

Geniune per-cpu kthreads are kthreads that are CPU affine for
correctness, these will obviously have PF_KTHREAD set, but must also
have PF_NO_SETAFFINITY set, lest userspace modify their affinity and
ruins things.

However, these two things are not sufficient, PF_NO_SETAFFINITY is
also set on other tasks that have their affinities controlled through
other means, like for instance workqueues.

Therefore another bit is needed; it turns out kthread_create_per_cpu()
already has such a bit: KTHREAD_IS_PER_CPU, which is used to make
kthread_park()/kthread_unpark() work correctly.

Expose this flag and remove the implicit setting of it from
kthread_create_on_cpu(); the io_uring usage of it seems dubious at
best.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210121103506.557620262@infradead.org
2021-01-22 15:09:42 +01:00
Saravana Kannan
e020ff611b driver core: Fix device link device name collision
The device link device's name was of the form:
<supplier-dev-name>--<consumer-dev-name>

This can cause name collision as reported here [1] as device names are
not globally unique. Since device names have to be unique within the
bus/class, add the bus/class name as a prefix to the device names used to
construct the device link device name.

So the devuce link device's name will be of the form:
<supplier-bus-name>:<supplier-dev-name>--<consumer-bus-name>:<consumer-dev-name>

[1] - https://lore.kernel.org/lkml/20201229033440.32142-1-michael@walle.cc/

Fixes: 287905e68d ("driver core: Expose device link details in sysfs")
Cc: stable@vger.kernel.org
Reported-by: Michael Walle <michael@walle.cc>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210110175408.1465657-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-21 20:12:40 +01:00
Dmitry Osipenko
51dfb6ca37 regulator: consumer: Add missing stubs to regulator/consumer.h
Add missing stubs to regulator/consumer.h in order to fix COMPILE_TEST
of the kernel. In particular this should fix compile-testing of OPP core
because of a missing stub for regulator_sync_voltage().

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20210120205844.12658-1-digetx@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-01-21 12:35:42 +00:00
Linus Torvalds
75439bc439 Merge tag 'net-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Networking fixes for 5.11-rc5, including fixes from bpf, wireless, and
  can trees.

  Current release - regressions:

   - nfc: nci: fix the wrong NCI_CORE_INIT parameters

  Current release - new code bugs:

   - bpf: allow empty module BTFs

  Previous releases - regressions:

   - bpf: fix signed_{sub,add32}_overflows type handling

   - tcp: do not mess with cloned skbs in tcp_add_backlog()

   - bpf: prevent double bpf_prog_put call from bpf_tracing_prog_attach

   - bpf: don't leak memory in bpf getsockopt when optlen == 0

   - tcp: fix potential use-after-free due to double kfree()

   - mac80211: fix encryption issues with WEP

   - devlink: use right genl user_ptr when handling port param get/set

   - ipv6: set multicast flag on the multicast route

   - tcp: fix TCP_USER_TIMEOUT with zero window

  Previous releases - always broken:

   - bpf: local storage helpers should check nullness of owner ptr passed

   - mac80211: fix incorrect strlen of .write in debugfs

   - cls_flower: call nla_ok() before nla_next()

   - skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too"

* tag 'net-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
  net: systemport: free dev before on error path
  net: usb: cdc_ncm: don't spew notifications
  net: mscc: ocelot: Fix multicast to the CPU port
  tcp: Fix potential use-after-free due to double kfree()
  bpf: Fix signed_{sub,add32}_overflows type handling
  can: peak_usb: fix use after free bugs
  can: vxcan: vxcan_xmit: fix use after free bug
  can: dev: can_restart: fix use after free bug
  tcp: fix TCP socket rehash stats mis-accounting
  net: dsa: b53: fix an off by one in checking "vlan->vid"
  tcp: do not mess with cloned skbs in tcp_add_backlog()
  selftests: net: fib_tests: remove duplicate log test
  net: nfc: nci: fix the wrong NCI_CORE_INIT parameters
  sh_eth: Fix power down vs. is_opened flag ordering
  net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled
  netfilter: rpfilter: mask ecn bits before fib lookup
  udp: mask TOS bits in udp_v4_early_demux()
  xsk: Clear pool even for inactive queues
  bpf: Fix helper bpf_map_peek_elem_proto pointing to wrong callback
  sh_eth: Make PHY access aware of Runtime PM to fix reboot crash
  ...
2021-01-20 11:52:21 -08:00
Grant Grundler
de658a195e net: usb: cdc_ncm: don't spew notifications
RTL8156 sends notifications about every 32ms.
Only display/log notifications when something changes.

This issue has been reported by others:
	https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832472
	https://lkml.org/lkml/2020/8/27/1083

...
[785962.779840] usb 1-1: new high-speed USB device number 5 using xhci_hcd
[785962.929944] usb 1-1: New USB device found, idVendor=0bda, idProduct=8156, bcdDevice=30.00
[785962.929949] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[785962.929952] usb 1-1: Product: USB 10/100/1G/2.5G LAN
[785962.929954] usb 1-1: Manufacturer: Realtek
[785962.929956] usb 1-1: SerialNumber: 000000001
[785962.991755] usbcore: registered new interface driver cdc_ether
[785963.017068] cdc_ncm 1-1:2.0: MAC-Address: 00:24:27:88:08:15
[785963.017072] cdc_ncm 1-1:2.0: setting rx_max = 16384
[785963.017169] cdc_ncm 1-1:2.0: setting tx_max = 16384
[785963.017682] cdc_ncm 1-1:2.0 usb0: register 'cdc_ncm' at usb-0000:00:14.0-1, CDC NCM, 00:24:27:88:08:15
[785963.019211] usbcore: registered new interface driver cdc_ncm
[785963.023856] usbcore: registered new interface driver cdc_wdm
[785963.025461] usbcore: registered new interface driver cdc_mbim
[785963.038824] cdc_ncm 1-1:2.0 enx002427880815: renamed from usb0
[785963.089586] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
[785963.121673] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
[785963.153682] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
...

This is about 2KB per second and will overwrite all contents of a 1MB
dmesg buffer in under 10 minutes rendering them useless for debugging
many kernel problems.

This is also an extra 180 MB/day in /var/logs (or 1GB per week) rendering
the majority of those logs useless too.

When the link is up (expected state), spew amount is >2x higher:
...
[786139.600992] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
[786139.632997] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
[786139.665097] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
[786139.697100] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
[786139.729094] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
[786139.761108] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
...

Chrome OS cannot support RTL8156 until this is fixed.

Signed-off-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20210120011208.3768105-1-grundler@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-20 09:01:55 -08:00
Parav Pandit
de641d74fb Revert "RDMA/mlx5: Fix devlink deadlock on net namespace deletion"
This reverts commit fbdd0049d9.

Due to commit in fixes tag, netdevice events were received only in one net
namespace of mlx5_core_dev. Due to this when netdevice events arrive in
net namespace other than net namespace of mlx5_core_dev, they are missed.

This results in empty GID table due to RDMA device being detached from its
net device.

Hence, revert back to receive netdevice events in all net namespaces to
restore back RDMA functionality in non init_net net namespace. The
deadlock will have to be addressed in another patch.

Fixes: fbdd0049d9 ("RDMA/mlx5: Fix devlink deadlock on net namespace deletion")
Link: https://lore.kernel.org/r/20210117092633.10690-1-leon@kernel.org
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19 20:22:59 -04:00
Geert Uytterhoeven
8eed01b5ca mdio-bitbang: Export mdiobb_{read,write}()
Export mdiobb_read() and mdiobb_write(), so Ethernet controller drivers
can call them from their MDIO read/write wrappers.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-19 12:02:20 -08:00
Klaus Jensen
20d3bb92e8 nvme-pci: allow use of cmb on v1.4 controllers
Since NVMe v1.4 the Controller Memory Buffer must be explicitly enabled
by the host.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
[hch: avoid a local variable and add a comment]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-01-18 18:58:18 +01:00
Linus Torvalds
1d94330a43 Merge tag 'for-5.11/dm-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:

 - Fix DM-raid's raid1 discard limits so discards work.

 - Select missing Kconfig dependencies for DM integrity and zoned
   targets.

 - Four fixes for DM crypt target's support to optionally bypass kcryptd
   workqueues.

 - Fix DM snapshot merge supports missing data flushes before committing
   metadata.

 - Fix DM integrity data device flushing when external metadata is used.

 - Fix DM integrity's maximum number of supported constructor arguments
   that user can request when creating an integrity device.

 - Eliminate DM core ioctl logging noise when an ioctl is issued without
   required CAP_SYS_RAWIO permission.

* tag 'for-5.11/dm-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm crypt: defer decryption to a tasklet if interrupts disabled
  dm integrity: fix the maximum number of arguments
  dm crypt: do not call bio_endio() from the dm-crypt tasklet
  dm integrity: fix flush with external metadata device
  dm: eliminate potential source of excessive kernel log noise
  dm snapshot: flush merged data before committing metadata
  dm crypt: use GFP_ATOMIC when allocating crypto requests from softirq
  dm crypt: do not wait for backlogged crypto request completion in softirq
  dm zoned: select CONFIG_CRC32
  dm integrity: select CRYPTO_SKCIPHER
  dm raid: fix discard limits for raid1
2021-01-15 18:01:17 -08:00
Linus Torvalds
b45e2da6e4 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "10 patches.

  Subsystems affected by this patch series: MAINTAINERS and mm (slub,
  pagealloc, memcg, kasan, vmalloc, migration, hugetlb, memory-failure,
  and process_vm_access)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/process_vm_access.c: include compat.h
  mm,hwpoison: fix printing of page flags
  MAINTAINERS: add Vlastimil as slab allocators maintainer
  mm/hugetlb: fix potential missing huge page size info
  mm: migrate: initialize err in do_migrate_pages
  mm/vmalloc.c: fix potential memory leak
  arm/kasan: fix the array size of kasan_early_shadow_pte[]
  mm/memcontrol: fix warning in mem_cgroup_page_lruvec()
  mm/page_alloc: add a missing mm_page_alloc_zone_locked() tracepoint
  mm, slub: consider rest of partial list if acquire_slab() fails
2021-01-15 15:25:45 -08:00
Linus Torvalds
82821be8a2 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:

 - Set the minimum GCC version to 5.1 for arm64 due to earlier compiler
   bugs.

 - Make atomic helpers __always_inline to avoid a section mismatch when
   compiling with clang.

 - Fix the CMA and crashkernel reservations to use ZONE_DMA (remove the
   arm64_dma32_phys_limit variable, no longer needed with a dynamic
   ZONE_DMA sizing in 5.11).

 - Remove redundant IRQ flag tracing that was leaving lockdep
   inconsistent with the hardware state.

 - Revert perf events based hard lockup detector that was causing
   smp_processor_id() to be called in preemptible context.

 - Some trivial cleanups - spelling fix, renaming S_FRAME_SIZE to
   PT_REGS_SIZE, function prototypes added.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: selftests: Fix spelling of 'Mismatch'
  arm64: syscall: include prototype for EL0 SVC functions
  compiler.h: Raise minimum version of GCC to 5.1 for arm64
  arm64: make atomic helpers __always_inline
  arm64: rename S_FRAME_SIZE to PT_REGS_SIZE
  Revert "arm64: Enable perf events based hard lockup detector"
  arm64: entry: remove redundant IRQ flag tracing
  arm64: Remove arm64_dma32_phys_limit and its uses
2021-01-15 13:11:51 -08:00
Will Deacon
dca5244d2f compiler.h: Raise minimum version of GCC to 5.1 for arm64
GCC versions >= 4.9 and < 5.1 have been shown to emit memory references
beyond the stack pointer, resulting in memory corruption if an interrupt
is taken after the stack pointer has been adjusted but before the
reference has been executed. This leads to subtle, infrequent data
corruption such as the EXT4 problems reported by Russell King at the
link below.

Life is too short for buggy compilers, so raise the minimum GCC version
required by arm64 to 5.1.

Reported-by: Russell King <linux@armlinux.org.uk>
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20210105154726.GD1551@shell.armlinux.org.uk
Link: https://lore.kernel.org/r/20210112224832.10980-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-01-15 10:04:49 +00:00
Linus Torvalds
e8c13a6bc8 Merge tag 'net-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "We have a few fixes for long standing issues, in particular Eric's fix
  to not underestimate the skb sizes, and my fix for brokenness of
  register_netdevice() error path. They may uncover other bugs so we
  will keep an eye on them. Also included are Willem's fixes for
  kmap(_atomic).

  Looking at the "current release" fixes, it seems we are about one rc
  behind a normal cycle. We've previously seen an uptick of "people had
  run their test suites" / "humans actually tried to use new features"
  fixes between rc2 and rc3.

  Summary:

  Current release - regressions:

   - fix feature enforcement to allow NETIF_F_HW_TLS_TX if IP_CSUM &&
     IPV6_CSUM

   - dcb: accept RTM_GETDCB messages carrying set-like DCB commands if
     user is admin for backward-compatibility

   - selftests/tls: fix selftests build after adding ChaCha20-Poly1305

  Current release - always broken:

   - ppp: fix refcount underflow on channel unbridge

   - bnxt_en: clear DEFRAG flag in firmware message when retry flashing

   - smc: fix out of bound access in the new netlink interface

  Previous releases - regressions:

   - fix use-after-free with UDP GRO by frags

   - mptcp: better msk-level shutdown

   - rndis_host: set proper input size for OID_GEN_PHYSICAL_MEDIUM
     request

   - i40e: xsk: fix potential NULL pointer dereferencing

  Previous releases - always broken:

   - skb frag: kmap_atomic fixes

   - avoid 32 x truesize under-estimation for tiny skbs

   - fix issues around register_netdevice() failures

   - udp: prevent reuseport_select_sock from reading uninitialized socks

   - dsa: unbind all switches from tree when DSA master unbinds

   - dsa: clear devlink port type before unregistering slave netdevs

   - can: isotp: isotp_getname(): fix kernel information leak

   - mlxsw: core: Thermal control fixes

   - ipv6: validate GSO SKB against MTU before finish IPv6 processing

   - stmmac: use __napi_schedule() for PREEMPT_RT

   - net: mvpp2: remove Pause and Asym_Pause support

  Misc:

   - remove from MAINTAINERS folks who had been inactive for >5yrs"

* tag 'net-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (58 commits)
  mptcp: fix locking in mptcp_disconnect()
  net: Allow NETIF_F_HW_TLS_TX if IP_CSUM && IPV6_CSUM
  MAINTAINERS: dccp: move Gerrit Renker to CREDITS
  MAINTAINERS: ipvs: move Wensong Zhang to CREDITS
  MAINTAINERS: tls: move Aviad to CREDITS
  MAINTAINERS: ena: remove Zorik Machulsky from reviewers
  MAINTAINERS: vrf: move Shrijeet to CREDITS
  MAINTAINERS: net: move Alexey Kuznetsov to CREDITS
  MAINTAINERS: altx: move Jay Cliburn to CREDITS
  net: avoid 32 x truesize under-estimation for tiny skbs
  nt: usb: USB_RTL8153_ECM should not default to y
  net: stmmac: fix taprio configuration when base_time is in the past
  net: stmmac: fix taprio schedule configuration
  net: tip: fix a couple kernel-doc markups
  net: sit: unregister_netdevice on newlink's error path
  net: stmmac: Fixed mtu channged by cache aligned
  cxgb4/chtls: Fix tid stuck due to wrong update of qid
  i40e: fix potential NULL pointer dereferencing
  net: stmmac: use __napi_schedule() for PREEMPT_RT
  can: mcp251xfd: mcp251xfd_handle_rxif_one(): fix wrong NULL pointer check
  ...
2021-01-14 13:31:07 -08:00
Nick Desaulniers
5e6dca82bc x86/entry: Emit a symbol for register restoring thunk
Arnd found a randconfig that produces the warning:

  arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at
  offset 0x3e

when building with LLVM_IAS=1 (Clang's integrated assembler). Josh
notes:

  With the LLVM assembler not generating section symbols, objtool has no
  way to reference this code when it generates ORC unwinder entries,
  because this code is outside of any ELF function.

  The limitation now being imposed by objtool is that all code must be
  contained in an ELF symbol.  And .L symbols don't create such symbols.

  So basically, you can use an .L symbol *inside* a function or a code
  segment, you just can't use the .L symbol to contain the code using a
  SYM_*_START/END annotation pair.

Fangrui notes that this optimization is helpful for reducing image size
when compiling with -ffunction-sections and -fdata-sections. I have
observed on the order of tens of thousands of symbols for the kernel
images built with those flags.

A patch has been authored against GNU binutils to match this behavior
of not generating unused section symbols ([1]), so this will
also become a problem for users of GNU binutils once they upgrade to 2.36.

Omit the .L prefix on a label so that the assembler will emit an entry
into the symbol table for the label, with STB_LOCAL binding. This
enables objtool to generate proper unwind info here with LLVM_IAS=1 or
GNU binutils 2.36+.

 [ bp: Massage commit message. ]

Reported-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Suggested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20210112194625.4181814-1-ndesaulniers@google.com
Link: https://github.com/ClangBuiltLinux/linux/issues/1209
Link: https://reviews.llvm.org/D93783
Link: https://sourceware.org/binutils/docs/as/Symbol-Names.html
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1408485ce69f844dcd7ded093a8 [1]
2021-01-14 17:18:25 +01:00
Will Deacon
b90d72a6bf Revert "arm64: Enable perf events based hard lockup detector"
This reverts commit 367c820ef0.

lockup_detector_init() makes heavy use of per-cpu variables and must be
called with preemption disabled. Usually, it's handled early during boot
in kernel_init_freeable(), before SMP has been initialised.

Since we do not know whether or not our PMU interrupt can be signalled
as an NMI until considerably later in the boot process, the Arm PMU
driver attempts to re-initialise the lockup detector off the back of a
device_initcall(). Unfortunately, this is called from preemptible
context and results in the following splat:

  | BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
  | caller is debug_smp_processor_id+0x20/0x2c
  | CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.10.0+ #276
  | Hardware name: linux,dummy-virt (DT)
  | Call trace:
  |   dump_backtrace+0x0/0x3c0
  |   show_stack+0x20/0x6c
  |   dump_stack+0x2f0/0x42c
  |   check_preemption_disabled+0x1cc/0x1dc
  |   debug_smp_processor_id+0x20/0x2c
  |   hardlockup_detector_event_create+0x34/0x18c
  |   hardlockup_detector_perf_init+0x2c/0x134
  |   watchdog_nmi_probe+0x18/0x24
  |   lockup_detector_init+0x44/0xa8
  |   armv8_pmu_driver_init+0x54/0x78
  |   do_one_initcall+0x184/0x43c
  |   kernel_init_freeable+0x368/0x380
  |   kernel_init+0x1c/0x1cc
  |   ret_from_fork+0x10/0x30

Rather than bodge this with raw_smp_processor_id() or randomly disabling
preemption, simply revert the culprit for now until we figure out how to
do this properly.

Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20201221162249.3119-1-lecopzer.chen@mediatek.com
Link: https://lore.kernel.org/r/20210112221855.10666-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-01-13 15:08:41 +00:00
Hailong Liu
29970dc24f arm/kasan: fix the array size of kasan_early_shadow_pte[]
The size of kasan_early_shadow_pte[] now is PTRS_PER_PTE which defined
to 512 for arm.  This means that it only covers the prev Linux pte
entries, but not the HWTABLE pte entries for arm.

The reason it currently works is that the symbol kasan_early_shadow_page
immediately following kasan_early_shadow_pte in memory is page aligned,
which makes kasan_early_shadow_pte look like a 4KB size array.  But we
can't ensure the order is always right with different compiler/linker,
or if more bss symbols are introduced.

We had a test with QEMU + vexpress:put a 512KB-size symbol with
attribute __section(".bss..page_aligned") after kasan_early_shadow_pte,
and poisoned it after kasan_early_init().  Then enabled CONFIG_KASAN, it
failed to boot up.

Link: https://lkml.kernel.org/r/20210109044622.8312-1-hailongliiu@yeah.net
Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>
Signed-off-by: Ziliang Guo <guo.ziliang@zte.com.cn>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-01-12 18:12:54 -08:00
Hugh Dickins
7ea510b92c mm/memcontrol: fix warning in mem_cgroup_page_lruvec()
Boot a CONFIG_MEMCG=y kernel with "cgroup_disabled=memory" and you are
met by a series of warnings from the VM_WARN_ON_ONCE_PAGE(!memcg, page)
recently added to the inline mem_cgroup_page_lruvec().

An earlier attempt to place that warning, in mem_cgroup_lruvec(), had
been careful to do so after weeding out the mem_cgroup_disabled() case;
but was itself invalid because of the mem_cgroup_lruvec(NULL, pgdat) in
clear_pgdat_congested() and age_active_anon().

Warning in mem_cgroup_page_lruvec() was once useful in detecting a KSM
charge bug, so may be worth keeping: but skip if mem_cgroup_disabled().

Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2101032056260.1093@eggly.anvils
Fixes: 9a1ac2288c ("mm/memcontrol:rewrite mem_cgroup_page_lruvec()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hui Su <sh_def@163.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-01-12 18:12:54 -08:00
Chunguang Xu
aba428a0c6 timekeeping: Remove unused get_seconds()
The get_seconds() cleanup seems to have been completed, now it is
time to delete the legacy interface to avoid misuse later.

Signed-off-by: Chunguang Xu <brookxu@tencent.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/1606816351-26900-1-git-send-email-brookxu@tencent.com
2021-01-12 21:13:01 +01:00
Willem de Bruijn
97550f6fa5 net: compound page support in skb_seq_read
skb_seq_read iterates over an skb, returning pointer and length of
the next data range with each call.

It relies on kmap_atomic to access highmem pages when needed.

An skb frag may be backed by a compound page, but kmap_atomic maps
only a single page. There are not enough kmap slots to always map all
pages concurrently.

Instead, if kmap_atomic is needed, iterate over each page.

As this increases the number of calls, avoid this unless needed.
The necessary condition is captured in skb_frag_must_loop.

I tried to make the change as obvious as possible. It should be easy
to verify that nothing changes if skb_frag_must_loop returns false.

Tested:
  On an x86 platform with
    CONFIG_HIGHMEM=y
    CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP=y
    CONFIG_NETFILTER_XT_MATCH_STRING=y

  Run
    ip link set dev lo mtu 1500
    iptables -A OUTPUT -m string --string 'badstring' -algo bm -j ACCEPT
    dd if=/dev/urandom of=in bs=1M count=20
    nc -l -p 8000 > /dev/null &
    nc -w 1 -q 0 localhost 8000 < in

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 18:20:09 -08:00
Willem de Bruijn
29766bcffa net: support kmap_local forced debugging in skb_frag_foreach
Skb frags may be backed by highmem and/or compound pages. Highmem
pages need kmap_atomic mappings to access. But kmap_atomic maps a
single page, not the entire compound page.

skb_foreach_page iterates over an skb frag, in one step in the common
case, page by page only if kmap_atomic must be called for each page.
The decision logic is captured in skb_frag_must_loop.

CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP extends kmap from highmem to all
pages, to increase code coverage.

Extend skb_frag_must_loop to this new condition.

Link: https://lore.kernel.org/linux-mm/20210106180132.41dc249d@gandalf.local.home/
Fixes: 0e91a0c698 ("mm/highmem: Provide CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP")
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-11 18:20:09 -08:00
Linus Torvalds
28318f5350 Merge tag 'usb-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are a number of small USB driver fixes for 5.11-rc3.

  Include in here are:

   - USB gadget driver fixes for reported issues

   - new usb-serial driver ids

   - dma from stack bugfixes

   - typec bugfixes

   - dwc3 bugfixes

   - xhci driver bugfixes

   - other small misc usb driver bugfixes

  All of these have been in linux-next with no reported issues"

* tag 'usb-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (35 commits)
  usb: dwc3: gadget: Clear wait flag on dequeue
  usb: typec: Send uevent for num_altmodes update
  usb: typec: Fix copy paste error for NVIDIA alt-mode description
  usb: gadget: enable super speed plus
  kcov, usb: hide in_serving_softirq checks in __usb_hcd_giveback_urb
  usb: uas: Add PNY USB Portable SSD to unusual_uas
  usb: gadget: configfs: Preserve function ordering after bind failure
  usb: gadget: select CONFIG_CRC32
  usb: gadget: core: change the comment for usb_gadget_connect
  usb: gadget: configfs: Fix use-after-free issue with udc_name
  usb: dwc3: gadget: Restart DWC3 gadget when enabling pullup
  usb: usbip: vhci_hcd: protect shift size
  USB: usblp: fix DMA to stack
  USB: serial: iuu_phoenix: fix DMA from stack
  USB: serial: option: add LongSung M5710 module support
  USB: serial: option: add Quectel EM160R-GL
  USB: Gadget: dummy-hcd: Fix shift-out-of-bounds bug
  usb: gadget: f_uac2: reset wMaxPacketSize
  usb: dwc3: ulpi: Fix USB2.0 HS/FS/LS PHY suspend regression
  usb: dwc3: ulpi: Replace CPU-based busyloop with Protocol-based one
  ...
2021-01-10 12:33:19 -08:00
Linus Torvalds
a440e4d761 Merge tag 'x86_urgent_for_v5.11_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
 "As expected, fixes started trickling in after the holidays so here is
  the accumulated pile of x86 fixes for 5.11:

   - A fix for fanotify_mark() missing the conversion of x86_32 native
     syscalls which take 64-bit arguments to the compat handlers due to
     former having a general compat handler. (Brian Gerst)

   - Add a forgotten pmd page destructor call to pud_free_pmd_page()
     where a pmd page is freed. (Dan Williams)

   - Make IN/OUT insns with an u8 immediate port operand handling for
     SEV-ES guests more precise by using only the single port byte and
     not the whole s32 value of the insn decoder. (Peter Gonda)

   - Correct a straddling end range check before returning the proper
     MTRR type, when the end address is the same as top of memory.
     (Ying-Tsun Huang)

   - Change PQR_ASSOC MSR update scheme when moving a task to a resctrl
     resource group to avoid significant performance overhead with some
     resctrl workloads. (Fenghua Yu)

   - Avoid the actual task move overhead when the task is already in the
     resource group. (Fenghua Yu)"

* tag 'x86_urgent_for_v5.11_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Don't move a task to the same resource group
  x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR
  x86/mtrr: Correct the range check before performing MTRR type lookups
  x86/sev-es: Fix SEV-ES OUT/IN immediate opcode vc handling
  x86/mm: Fix leak of pmd ptlock
  fanotify: Fix sys_fanotify_mark() on native x86-32
2021-01-10 11:31:17 -08:00
Linus Torvalds
fb9ca0be63 Merge tag 'acpi-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
 "These address two build issues and drop confusing text from a couple
  of Kconfig entries.

  Specifics:

   - Drop two local variables that are never read and the code updating
     their values from the x86 suspend-to-idle code (Rafael Wysocki)

   - Add empty stub of an ACPI helper function to avoid build issues
     when CONFIG_ACPI is not set (Shawn Guo)

   - Remove confusing text regarding modules from Kconfig entries that
     correspond to non-modular code (Peter Robinson)"

* tag 'acpi-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: Update Kconfig help text for items that are no longer modular
  ACPI: scan: add stub acpi_create_platform_device() for !CONFIG_ACPI
  ACPI: PM: s2idle: Drop unused local variables and related code
2021-01-08 15:42:56 -08:00
Linus Torvalds
3e2a590acb Merge tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull iommu fixes from Will Deacon:
 "This is mainly all Intel VT-D stuff, but there are some fixes for AMD
  and ARM as well.

  We've also got the revert I promised during the merge window, which
  removes a temporary hack to accomodate i915 while we transitioned the
  Intel IOMMU driver over to the common DMA-IOMMU API.

  Finally, there are still a couple of other VT-D fixes floating around,
  so I expect to send you another batch of fixes next week.

  Summary:

   - Fix VT-D TLB invalidation for subdevices

   - Fix VT-D use-after-free on subdevice detach

   - Fix VT-D locking so that IRQs are disabled during SVA bind/unbind

   - Fix VT-D address alignment when flushing IOTLB

   - Fix memory leak in VT-D IRQ remapping failure path

   - Revert temporary i915 sglist hack now that it is no longer required

   - Fix sporadic boot failure with Arm SMMU on Qualcomm SM8150

   - Fix NULL dereference in AMD IRQ remapping code with remapping disabled

   - Fix accidental enabling of irqs on AMD resume-from-suspend path

   - Fix some typos in comments"

* tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  iommu/vt-d: Fix ineffective devTLB invalidation for subdevices
  iommu/vt-d: Fix general protection fault in aux_detach_device()
  iommu/vt-d: Move intel_iommu info from struct intel_svm to struct intel_svm_dev
  iommu/arm-smmu-qcom: Initialize SCTLR of the bypass context
  iommu/vt-d: Fix lockdep splat in sva bind()/unbind()
  Revert "iommu: Add quirk for Intel graphic devices in map_sg"
  iommu/vt-d: Fix misuse of ALIGN in qi_flush_piotlb()
  iommu/amd: Stop irq_remapping_select() matching when remapping is disabled
  iommu/amd: Set iommu->int_enabled consistently when interrupts are set up
  iommu/intel: Fix memleak in intel_irq_remapping_alloc
  iommu/iova: fix 'domain' typos
2021-01-08 14:55:41 -08:00
Mikulas Patocka
9b5948267a dm integrity: fix flush with external metadata device
With external metadata device, flush requests are not passed down to the
data device.

Fix this by submitting the flush request in dm_integrity_flush_buffers. In
order to not degrade performance, we overlap the data device flush with
the metadata device flush.

Reported-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-01-08 15:57:29 -05:00
Linus Torvalds
6279d812ea Merge tag 'net-5.11-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull more networking fixes from Jakub Kicinski:
 "Slightly lighter pull request to get back into the Thursday cadence.

  Current release - always broken:

   - can: mcp251xfd: fix Tx/Rx ring buffer driver race conditions

   - dsa: hellcreek: fix led_classdev build errors

  Previous releases - regressions:

   - ipv6: fib: flush exceptions when purging route to avoid netdev
     reference leak

   - ip_tunnels: fix pmtu check in nopmtudisc mode

   - ip: always refragment ip defragmented packets to avoid MTU issues
     when forwarding through tunnels, correct "packet too big" message
     is prohibitively tricky to generate

   - s390/qeth: fix locking for discipline setup / removal and during
     recovery to prevent both deadlocks and races

   - mlx5: Use port_num 1 instead of 0 when delete a RoCE address

  Previous releases - always broken:

   - cdc_ncm: correct overhead calculation in delayed_ndp_size to
     prevent out of bound accesses with Huawei 909s-120 LTE module

   - fix stmmac dwmac-sun8i suspend/resume:
           - PHY being left powered off
           - MAC syscon configuration being reset
           - reference to the reset controller being improperly dropped

   - qrtr: fix null-ptr-deref in qrtr_ns_remove

   - can: tcan4x5x: fix bittiming const, use common bittiming from m_can
     driver

   - mlx5e: CT: Use per flow counter when CT flow accounting is enabled

   - mlx5e: Fix SWP offsets when vlan inserted by driver

  Misc:

   - bpf: Fix a task_iter bug caused by a bpf -> net merge conflict
     resolution

  And the usual many fixes to various error paths"

* tag 'net-5.11-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
  net: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbE
  s390/qeth: fix L2 header access in qeth_l3_osa_features_check()
  s390/qeth: fix locking for discipline setup / removal
  s390/qeth: fix deadlock during recovery
  selftests: fib_nexthops: Fix wrong mausezahn invocation
  nexthop: Bounce NHA_GATEWAY in FDB nexthop groups
  nexthop: Unlink nexthop group entry in error path
  nexthop: Fix off-by-one error in error path
  octeontx2-af: fix memory leak of lmac and lmac->name
  chtls: Fix chtls resources release sequence
  chtls: Added a check to avoid NULL pointer dereference
  chtls: Replace skb_dequeue with skb_peek
  chtls: Avoid unnecessary freeing of oreq pointer
  chtls: Fix panic when route to peer not configured
  chtls: Remove invalid set_tcb call
  chtls: Fix hardware tid leak
  net: ip: always refragment ip defragmented packets
  net: fix pmtu check in nopmtudisc mode
  selftests: netfilter: add selftest for ipip pmtu discovery with enabled connection tracking
  docs: octeontx2: tune rst markup
  ...
2021-01-08 12:12:30 -08:00
Petr Mladek
a91bd6223e Revert "init/console: Use ttynull as a fallback when there is no console"
This reverts commit 757055ae8d.

The commit caused that ttynull was used as the default console
on several systems[1][2][3]. As a result, the console was
blank even when a better alternative existed.

It happened when there was no console configured
on the command line and ttynull_init() was the first initcall
calling register_console().

Or it happened when /dev/ did not exist when console_on_rootfs()
was called. It was not able to open /dev/console even though
a console driver was registered. It tried to add ttynull console
but it obviously did not help. But ttynull became the preferred
console and was used by /dev/console when it was available later.

The commit tried to fix a historical problem that have been there
for ages. The primary motivation was the commit 3cffa06aee
("printk/console: Allow to disable console output by using console=""
 or console=null"). It provided a clean solution for a workaround
 that was widely used and worked only by chance.

This revert causes that the console="" or console=null command line
options will again work only by chance. These options will cause that
a particular console will be preferred and the default (tty) ones
will not get enabled. There will be no console registered at
all. As a result there won't be stdin, stdout, and stderr for
the init process. But it worked exactly this way even before.

The proper solution has to fulfill many conditions:

  + Register ttynull only when explicitly required or as
    the ultimate fallback.

  + ttynull should get associated with /dev/console but it must
    not become preferred console when used as a fallback.
    Especially, it must still be possible to replace it
    by a better console later.

Such a change requires clean up of the register_console() code.
Otherwise, it would be even harder to follow. Especially, the use
of has_preferred_console and CON_CONSDEV flag is tricky. The clean
up is risky. The ordering of consoles is not well defined. And
any changes tend to break existing user settings.

Do the revert at the least risky solution for now.

[1] https://lore.kernel.org/linux-kselftest/20201221144302.GR4077@smile.fi.intel.com/
[2] https://lore.kernel.org/lkml/d2a3b3c0-e548-7dd1-730f-59bc5c04e191@synopsys.com/
[3] https://patchwork.ozlabs.org/project/linux-um/patch/20210105120128.10854-1-thomas@m3y3r.de/

Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reported-by: Vineet Gupta <vgupta@synopsys.com>
Reported-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-01-08 11:02:18 -08:00
Aya Levin
9c9be85f6b net/mlx5e: Add missing capability check for uplink follow
Expose firmware indication that it supports setting eswitch uplink state
to follow (follow the physical link). Condition setting the eswitch
uplink admin-state with this capability bit. Older FW may not support
the uplink state setting.

Fixes: 7d0314b11c ("net/mlx5e: Modify uplink state on interface up/down")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-07 12:22:48 -08:00
Shawn Guo
ee61cfd955 ACPI: scan: add stub acpi_create_platform_device() for !CONFIG_ACPI
It adds a stub acpi_create_platform_device() for !CONFIG_ACPI build, so
that caller doesn't have to deal with !CONFIG_ACPI build issue.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-01-07 17:44:00 +01:00
Liu Yi L
18abda7a2d iommu/vt-d: Fix general protection fault in aux_detach_device()
The aux-domain attach/detach are not tracked, some data structures might
be used after free. This causes general protection faults when multiple
subdevices are created and assigned to a same guest machine:

  | general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] SMP NOPTI
  | RIP: 0010:intel_iommu_aux_detach_device+0x12a/0x1f0
  | [...]
  | Call Trace:
  |  iommu_aux_detach_device+0x24/0x70
  |  vfio_mdev_detach_domain+0x3b/0x60
  |  ? vfio_mdev_set_domain+0x50/0x50
  |  iommu_group_for_each_dev+0x4f/0x80
  |  vfio_iommu_detach_group.isra.0+0x22/0x30
  |  vfio_iommu_type1_detach_group.cold+0x71/0x211
  |  ? find_exported_symbol_in_section+0x4a/0xd0
  |  ? each_symbol_section+0x28/0x50
  |  __vfio_group_unset_container+0x4d/0x150
  |  vfio_group_try_dissolve_container+0x25/0x30
  |  vfio_group_put_external_user+0x13/0x20
  |  kvm_vfio_group_put_external_user+0x27/0x40 [kvm]
  |  kvm_vfio_destroy+0x45/0xb0 [kvm]
  |  kvm_put_kvm+0x1bb/0x2e0 [kvm]
  |  kvm_vm_release+0x22/0x30 [kvm]
  |  __fput+0xcc/0x260
  |  ____fput+0xe/0x10
  |  task_work_run+0x8f/0xb0
  |  do_exit+0x358/0xaf0
  |  ? wake_up_state+0x10/0x20
  |  ? signal_wake_up_state+0x1a/0x30
  |  do_group_exit+0x47/0xb0
  |  __x64_sys_exit_group+0x18/0x20
  |  do_syscall_64+0x57/0x1d0
  |  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix the crash by tracking the subdevices when attaching and detaching
aux-domains.

Fixes: 67b8e02b5e ("iommu/vt-d: Aux-domain specific domain attach/detach")
Co-developed-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-3-git-send-email-yi.l.liu@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-07 14:35:14 +00:00
Liu Yi L
9ad9f45b3b iommu/vt-d: Move intel_iommu info from struct intel_svm to struct intel_svm_dev
'struct intel_svm' is shared by all devices bound to a give process,
but records only a single pointer to a 'struct intel_iommu'. Consequently,
cache invalidations may only be applied to a single DMAR unit, and are
erroneously skipped for the other devices.

In preparation for fixing this, rework the structures so that the iommu
pointer resides in 'struct intel_svm_dev', allowing 'struct intel_svm'
to track them in its device list.

Fixes: 1c4f88b7f1 ("iommu/vt-d: Shared virtual address in scalable mode")
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Raj Ashok <ashok.raj@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Reported-by: Guo Kaijie <Kaijie.Guo@intel.com>
Reported-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Guo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Tested-by: Guo Kaijie <Kaijie.Guo@intel.com>
Cc: stable@vger.kernel.org # v5.0+
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-2-git-send-email-yi.l.liu@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-07 14:34:36 +00:00
Andrey Konovalov
e89eed02a5 kcov, usb: hide in_serving_softirq checks in __usb_hcd_giveback_urb
Done opencode in_serving_softirq() checks in in_serving_softirq() to avoid
cluttering the code, hide them in kcov helpers instead.

Fixes: aee9ddb1d3 ("kcov, usb: only collect coverage from __usb_hcd_giveback_urb in softirq")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Link: https://lore.kernel.org/r/aeb430c5bb90b0ccdf1ec302c70831c1a47b9c45.1609876340.git.andreyknvl@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-07 14:17:29 +01:00
Linus Torvalds
36bbbd0e23 Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU fix from Paul McKenney:
 "This is a fix for a regression in the v5.10 merge window, but it was
  reported quite late in the v5.10 process, plus generating and testing
  the fix took some time.

  The regression is due to commit 36dadef23f ("kprobes: Init kprobes
  in early_initcall") which on powerpc can use RCU Tasks before
  initialization, resulting in boot failures.

  The fix is straightforward, simply moving initialization of RCU Tasks
  before the early_initcall()s. The fix has been exposed to -next and
  kbuild test robot testing, and has been tested by the PowerPC guys"

* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  rcu-tasks: Move RCU-tasks initialization to before early_initcall()
2021-01-04 10:55:19 -08:00
Linus Torvalds
f4f6a2e329 Merge tag 'compiler-attributes-for-linus-v5.11' of git://github.com/ojeda/linux
Pull ENABLE_MUST_CHECK removal from Miguel Ojeda:
 "Remove CONFIG_ENABLE_MUST_CHECK (Masahiro Yamada)"

Note that this removes the config option by making the must-check
unconditional, not by removing must check itself.

* tag 'compiler-attributes-for-linus-v5.11' of git://github.com/ojeda/linux:
  Compiler Attributes: remove CONFIG_ENABLE_MUST_CHECK
2021-01-04 10:47:38 -08:00
Linus Torvalds
eda809aef5 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "This is a load of driver fixes (12 ufs, 1 mpt3sas, 1 cxgbi).

  The big core two fixes are for power management ("block: Do not accept
  any requests while suspended" and "block: Fix a race in the runtime
  power management code") which finally sorts out the resume problems
  we've occasionally been having.

  To make the resume fix, there are seven necessary precursors which
  effectively renames REQ_PREEMPT to REQ_PM, so every "special" request
  in block is automatically a power management exempt one.

  All of the non-PM preempt cases are removed except for the one in the
  SCSI Parallel Interface (spi) domain validation which is a genuine
  case where we have to run requests at high priority to validate the
  bus so this becomes an autopm get/put protected request"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (22 commits)
  scsi: cxgb4i: Fix TLS dependency
  scsi: ufs: Un-inline ufshcd_vops_device_reset function
  scsi: ufs: Re-enable WriteBooster after device reset
  scsi: ufs-mediatek: Use correct path to fix compile error
  scsi: mpt3sas: Signedness bug in _base_get_diag_triggers()
  scsi: block: Do not accept any requests while suspended
  scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT
  scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE
  scsi: scsi_transport_spi: Set RQF_PM for domain validation commands
  scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT
  scsi: ide: Do not set the RQF_PREEMPT flag for sense requests
  scsi: block: Introduce BLK_MQ_REQ_PM
  scsi: block: Fix a race in the runtime power management code
  scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers
  scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers
  scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()
  scsi: ufs-pci: Fix restore from S4 for Intel controllers
  scsi: ufs-mediatek: Keep VCC always-on for specific devices
  scsi: ufs: Allow regulators being always-on
  scsi: ufs: Clear UAC for RPMB after ufshcd resets
  ...
2021-01-01 12:58:07 -08:00
Linus Torvalds
f6e1ea1964 Merge tag 'ceph-for-5.11-rc2' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
 "A fix for an edge case in MClientRequest encoding and a couple of
  trivial fixups for the new msgr2 support"

* tag 'ceph-for-5.11-rc2' of git://github.com/ceph/ceph-client:
  libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE
  libceph: align session_key and con_secret to 16 bytes
  libceph: fix auth_signature buffer allocation in secure mode
  ceph: reencode gid_list when reconnecting
2020-12-30 12:02:12 -08:00
Josh Poimboeuf
aa8c7db494 kdev_t: always inline major/minor helper functions
Silly GCC doesn't always inline these trivial functions.

Fixes the following warning:

  arch/x86/kernel/sys_ia32.o: warning: objtool: cp_stat64()+0xd8: call to new_encode_dev() with UACCESS enabled

Link: https://lkml.kernel.org/r/984353b44a4484d86ba9f73884b7306232e25e30.1608737428.git.jpoimboe@redhat.com
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>	[build-tested]
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29 15:36:49 -08:00
Huang Shijie
8b0fac44bd sizes.h: add SZ_8G/SZ_16G/SZ_32G macros
Add these macros, since we can use them in drivers.

Link: https://lkml.kernel.org/r/20201229072819.11183-1-sjhuang@iluvatar.ai
Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29 15:36:49 -08:00
Baoquan He
dc2da7b45f mm: memmap defer init doesn't work as expected
VMware observed a performance regression during memmap init on their
platform, and bisected to commit 73a6e474cb ("mm: memmap_init:
iterate over memblock regions rather that check each PFN") causing it.

Before the commit:

  [0.033176] Normal zone: 1445888 pages used for memmap
  [0.033176] Normal zone: 89391104 pages, LIFO batch:63
  [0.035851] ACPI: PM-Timer IO Port: 0x448

With commit

  [0.026874] Normal zone: 1445888 pages used for memmap
  [0.026875] Normal zone: 89391104 pages, LIFO batch:63
  [2.028450] ACPI: PM-Timer IO Port: 0x448

The root cause is the current memmap defer init doesn't work as expected.

Before, memmap_init_zone() was used to do memmap init of one whole zone,
to initialize all low zones of one numa node, but defer memmap init of
the last zone in that numa node.  However, since commit 73a6e474cb,
function memmap_init() is adapted to iterater over memblock regions
inside one zone, then call memmap_init_zone() to do memmap init for each
region.

E.g, on VMware's system, the memory layout is as below, there are two
memory regions in node 2.  The current code will mistakenly initialize the
whole 1st region [mem 0xab00000000-0xfcffffffff], then do memmap defer to
iniatialize only one memmory section on the 2nd region [mem
0x10000000000-0x1033fffffff].  In fact, we only expect to see that there's
only one memory section's memmap initialized.  That's why more time is
costed at the time.

[    0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
[    0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0xbfffffff]
[    0.008843] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x55ffffffff]
[    0.008844] ACPI: SRAT: Node 1 PXM 1 [mem 0x5600000000-0xaaffffffff]
[    0.008844] ACPI: SRAT: Node 2 PXM 2 [mem 0xab00000000-0xfcffffffff]
[    0.008845] ACPI: SRAT: Node 2 PXM 2 [mem 0x10000000000-0x1033fffffff]

Now, let's add a parameter 'zone_end_pfn' to memmap_init_zone() to pass
down the real zone end pfn so that defer_init() can use it to judge
whether defer need be taken in zone wide.

Link: https://lkml.kernel.org/r/20201223080811.16211-1-bhe@redhat.com
Link: https://lkml.kernel.org/r/20201223080811.16211-2-bhe@redhat.com
Fixes: commit 73a6e474cb ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
Signed-off-by: Baoquan He <bhe@redhat.com>
Reported-by: Rahul Gopakumar <gopakumarr@vmware.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29 15:36:49 -08:00
Souptick Joarder
6d87d0ece5 mm: add prototype for __add_to_page_cache_locked()
Otherwise it causes a gcc warning:

  mm/filemap.c:830:14: warning: no previous prototype for `__add_to_page_cache_locked' [-Wmissing-prototypes]

A previous attempt to make this function static led to compilation
errors when CONFIG_DEBUG_INFO_BTF is enabled because
__add_to_page_cache_locked() is referred to by BPF code.

Adding a prototype will silence the warning.

Link: https://lkml.kernel.org/r/1608693702-4665-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29 15:36:49 -08:00