At least sparc64 requires I/O-specific access to framebuffers. This
patch updates the fbdev console accordingly.
For drivers with direct access to the framebuffer memory, the callback
functions in struct fb_ops test for the type of memory and call the rsp
fb_sys_ of fb_cfb_ functions. Read and write operations are implemented
internally by DRM's fbdev helper.
For drivers that employ a shadow buffer, fbdev's blit function retrieves
the framebuffer address as struct dma_buf_map, and uses dma_buf_map
interfaces to access the buffer.
The bochs driver on sparc64 uses a workaround to flag the framebuffer as
I/O memory and avoid a HW exception. With the introduction of struct
dma_buf_map, this is not required any longer. The patch removes the rsp
code from both, bochs and fbdev.
v7:
* use min_t(size_t,) (kernel test robot)
* return the number of bytes read/written, if any (fbdev testcase)
v5:
* implement fb_read/fb_write internally (Daniel, Sam)
v4:
* move dma_buf_map changes into separate patch (Daniel)
* TODO list: comment on fbdev updates (Daniel)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Tested-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20201103093015.1063-11-tzimmermann@suse.de
Kernel DRM clients now store their framebuffer address in an instance
of struct dma_buf_map. Depending on the buffer's location, the address
refers to system or I/O memory.
Callers of drm_client_buffer_vmap() receive a copy of the value in
the call's supplied arguments. It can be accessed and modified with
dma_buf_map interfaces.
v6:
* don't call page_to_phys() on framebuffers in I/O memory;
warn instead (Daniel)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20201103093015.1063-9-tzimmermann@suse.de
This patch replaces the vmap/vunmap's use of raw pointers in GEM object
functions with instances of struct dma_buf_map. GEM backends are
converted as well. For most of them, this simply changes the returned type.
TTM-based drivers now return information about the location of the memory,
either system or I/O memory. GEM VRAM helpers and qxl now use ttm_bo_vmap()
et al. Amdgpu, nouveau and radeon use drm_gem_ttm_vmap() et al instead of
implementing their own vmap callbacks.
v7:
* init QXL cursor to mapped BO buffer (kernel test robot)
v5:
* update vkms after switch to shmem
v4:
* use ttm_bo_vmap(), drm_gem_ttm_vmap(), et al. (Daniel, Christian)
* fix a trailing { in drm_gem_vmap()
* remove several empty functions instead of converting them (Daniel)
* comment uses of raw pointers with a TODO (Daniel)
* TODO list: convert more helpers to use struct dma_buf_map
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20201103093015.1063-7-tzimmermann@suse.de
The new functions ttm_bo_{vmap,vunmap}() map and unmap a TTM BO in kernel
address space. The mapping's address is returned as struct dma_buf_map.
Each function is a simplified version of TTM's existing kmap code. Both
functions respect the memory's location ani/or writecombine flags.
On top TTM's functions, GEM TTM helpers got drm_gem_ttm_{vmap,vunmap}(),
two helpers that convert a GEM object into the TTM BO and forward the call
to TTM's vmap/vunmap. These helpers can be dropped into the rsp GEM object
callbacks.
v5:
* use size_t for storing mapping size (Christian)
* ignore premapped memory areas correctly in ttm_bo_vunmap()
* rebase onto latest TTM interfaces (Christian)
* remove BUG() from ttm_bo_vmap() (Christian)
v4:
* drop ttm_kmap_obj_to_dma_buf() in favor of vmap helpers (Daniel,
Christian)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20201103093015.1063-6-tzimmermann@suse.de
Add the new vma_set_file() function to allow changing
vma->vm_file with the necessary refcount dance.
v2: add more users of this.
v3: add missing EXPORT_SYMBOL, rebase on mmap cleanup,
add comments why we drop the reference on two occasions.
v4: make it clear that changing an anonymous vma is illegal.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v2)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/394773/
Since commit 9a40401cfa ("lib/scatterlist: Do not limit max_segment to
PAGE_ALIGNED values") the max_segment input to sg_alloc_table_from_pages()
does not have to be any special value. The new algorithm will always
create something less than what the user provides. Thus eliminate this
confusing constant.
- vmwgfx should use the HW capability, not mix in the OS page size for
calling dma_set_max_seg_size()
- i915 uses i915_sg_segment_size() both for sg_alloc_table_from_pages
and for some open coded sgl construction. This doesn't change the value
since rounddown(size, UINT_MAX) == SCATTERLIST_MAX_SEGMENT
- drm_prime_pages_to_sg uses it as a default if max_segment is zero,
UINT_MAX is fine to use directly.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Qian Cai <cai@lca.pw>
Cc: "Ursulin, Tvrtko" <tvrtko.ursulin@intel.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/0-v1-44733fccd781+13d-rm_scatterlist_max_jgg@nvidia.com
Pull timer fixes from Thomas Gleixner:
"A few fixes for timers/timekeeping:
- Prevent undefined behaviour in the timespec64_to_ns() conversion
which is used for converting user supplied time input to
nanoseconds. It lacked overflow protection.
- Mark sched_clock_read_begin/retry() to prevent recursion in the
tracer
- Remove unused debug functions in the hrtimer and timerlist code"
* tag 'timers-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Prevent undefined behaviour in timespec64_to_ns()
timers: Remove unused inline funtion debug_timer_free()
hrtimer: Remove unused inline function debug_hrtimer_free()
time/sched_clock: Mark sched_clock_read_begin/retry() as notrace
Pull char/misc fixes/removals from Greg KH:
"Here's some small fixes for 5.10-rc2 and a big driver removal.
The fixes are for some reported issues in the interconnect and
coresight drivers, nothing major.
The "big" driver removal is the MIC drivers have been asked to be
removed as the hardware never shipped and Intel no longer wants to
maintain something that no one can use. This is welcomed by many as
the DMA usage of these drivers was "interesting" and the security
people were starting to question some issues that were starting to be
found in the codebase.
Note, one of the subsystems for this driver, the "VOP" code, will
probably come back in future kernel versions as it was looking to
potentially solve some PCIe virtualization issues that a number of
other vendors were wanting to solve. But as-is, this codebase didn't
work for anyone else so no actual functionality is being removed.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
coresight: cti: Initialize dynamic sysfs attributes
coresight: Fix uninitialised pointer bug in etm_setup_aux()
coresight: add module license
misc: mic: remove the MIC drivers
interconnect: qcom: use icc_sync state for sm8[12]50
interconnect: qcom: Ensure that the floor bandwidth value is enforced
interconnect: qcom: sc7180: Init BCMs before creating the nodes
interconnect: qcom: sdm845: Init BCMs before creating the nodes
interconnect: Aggregate before setting initial bandwidth
interconnect: qcom: sdm845: Enable keepalive for the MM1 BCM
Pull driver core and documentation fixes from Greg KH:
"Here is one tiny debugfs change to fix up an API where the last user
was successfully fixed up in 5.10-rc1 (so it couldn't be merged
earlier), and a much larger Documentation/ABI/ update to the files so
they can be automatically parsed by our tools.
The Documentation/ABI/ updates are just formatting issues, small ones
to bring the files into parsable format, and have been acked by
numerous subsystem maintainers and the documentation maintainer. I
figured it was good to get this into 5.10-rc2 to help wih the merge
issues that would arise if these were to stick in linux-next until
5.11-rc1.
The debugfs change has been in linux-next for a long time, and the
Documentation updates only for the last linux-next release"
* tag 'driver-core-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (40 commits)
scripts: get_abi.pl: assume ReST format by default
docs: ABI: sysfs-class-led-trigger-pattern: remove hw_pattern duplication
docs: ABI: sysfs-class-backlight: unify ABI documentation
docs: ABI: sysfs-c2port: remove a duplicated entry
docs: ABI: sysfs-class-power: unify duplicated properties
docs: ABI: unify /sys/class/leds/<led>/brightness documentation
docs: ABI: stable: remove a duplicated documentation
docs: ABI: change read/write attributes
docs: ABI: cleanup several ABI documents
docs: ABI: sysfs-bus-nvdimm: use the right format for ABI
docs: ABI: vdso: use the right format for ABI
docs: ABI: fix syntax to be parsed using ReST notation
docs: ABI: convert testing/configfs-acpi to ReST
docs: Kconfig/Makefile: add a check for broken ABI files
docs: abi-testing.rst: enable --rst-sources when building docs
docs: ABI: don't escape ReST-incompatible chars from obsolete and removed
docs: ABI: create a 2-depth index for ABI
docs: ABI: make it parse ABI/stable as ReST-compatible files
docs: ABI: sysfs-uevent: make it compatible with ReST output
docs: ABI: testing: make the files compatible with ReST output
...
Pull USB driver fixes from Greg KH:
"Here are a number of small bugfixes for reported issues in some USB
drivers. They include:
- typec bugfixes
- xhci bugfixes and lockdep warning fixes
- cdc-acm driver regression fix
- kernel doc fixes
- cdns3 driver bugfixes for a bunch of reported issues
- other tiny USB driver fixes
All have been in linux-next with no reported issues"
* tag 'usb-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: cdns3: gadget: own the lock wrongly at the suspend routine
usb: cdns3: Fix on-chip memory overflow issue
usb: cdns3: gadget: suspicious implicit sign extension
xhci: Don't create stream debugfs files with spinlock held.
usb: xhci: Workaround for S3 issue on AMD SNPS 3.0 xHC
xhci: Fix sizeof() mismatch
usb: typec: stusb160x: fix signedness comparison issue with enum variables
usb: typec: add missing MODULE_DEVICE_TABLE() to stusb160x
USB: apple-mfi-fastcharge: don't probe unhandled devices
usbcore: Check both id_table and match() when both available
usb: host: ehci-tegra: Fix error handling in tegra_ehci_probe()
usb: typec: stusb160x: fix an IS_ERR() vs NULL check in probe
usb: typec: tcpm: reset hard_reset_count for any disconnect
usb: cdc-acm: fix cooldown mechanism
usb: host: fsl-mph-dr-of: check return of dma_set_mask()
usb: fix kernel-doc markups
usb: typec: stusb160x: fix some signedness bugs
usb: cdns3: Variable 'length' set but not used
Pull vhost fixes from Michael Tsirkin:
"Fixes all over the place.
A new UAPI is borderline: can also be considered a new feature but
also seems to be the only way we could come up with to fix addressing
for userspace - and it seems important to switch to it now before
userspace making assumptions about addressing ability of devices is
set in stone"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpasim: allow to assign a MAC address
vdpasim: fix MAC address configuration
vdpa: handle irq bypass register failure case
vdpa_sim: Fix DMA mask
Revert "vhost-vdpa: fix page pinning leakage in error path"
vdpa/mlx5: Fix error return in map_direct_mr()
vhost_vdpa: Return -EFAULT if copy_from_user() fails
vdpa_sim: implement get_iova_range()
vhost: vdpa: report iova range
vdpa: introduce config op to get valid iova range
Pull more flexible-array member conversions from Gustavo A. R. Silva:
"Replace zero-length arrays with flexible-array members"
* tag 'flexible-array-conversions-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
printk: ringbuffer: Replace zero-length array with flexible-array member
net/smc: Replace zero-length array with flexible-array member
net/mlx5: Replace zero-length array with flexible-array member
mei: hw: Replace zero-length array with flexible-array member
gve: Replace zero-length array with flexible-array member
Bluetooth: btintel: Replace zero-length array with flexible-array member
scsi: target: tcmu: Replace zero-length array with flexible-array member
ima: Replace zero-length array with flexible-array member
enetc: Replace zero-length array with flexible-array member
fs: Replace zero-length array with flexible-array member
Bluetooth: Replace zero-length array with flexible-array member
params: Replace zero-length array with flexible-array member
tracepoint: Replace zero-length array with flexible-array member
platform/chrome: cros_ec_proto: Replace zero-length array with flexible-array member
platform/chrome: cros_ec_commands: Replace zero-length array with flexible-array member
mailbox: zynqmp-ipi-message: Replace zero-length array with flexible-array member
dmaengine: ti-cppi5: Replace zero-length array with flexible-array member
Pull arm64 fixes from Will Deacon:
"The diffstat is a bit spread out thanks to an invasive CPU erratum
workaround which missed the merge window and also a bunch of fixes to
the recently added MTE selftests.
- Fixes to MTE kselftests
- Fix return code from KVM Spectre-v2 hypercall
- Build fixes for ld.lld and Clang's infamous integrated assembler
- Ensure RCU is up and running before we use printk()
- Workaround for Cortex-A77 erratum 1508412
- Fix linker warnings from unexpected ELF sections
- Ensure PE/COFF sections are 64k aligned"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Change .weak to SYM_FUNC_START_WEAK_PI for arch/arm64/lib/mem*.S
arm64/smp: Move rcu_cpu_starting() earlier
arm64: Add workaround for Arm Cortex-A77 erratum 1508412
arm64: Add part number for Arm Cortex-A77
arm64: mte: Document that user PSTATE.TCO is ignored by kernel uaccess
module: use hidden visibility for weak symbol references
arm64: efi: increase EFI PE/COFF header padding to 64 KB
arm64: vmlinux.lds: account for spurious empty .igot.plt sections
kselftest/arm64: Fix check_user_mem test
kselftest/arm64: Fix check_ksm_options test
kselftest/arm64: Fix check_mmap_options test
kselftest/arm64: Fix check_child_memory test
kselftest/arm64: Fix check_tags_inclusion test
kselftest/arm64: Fix check_buffer_fill test
arm64: avoid -Woverride-init warning
KVM: arm64: ARM_SMCCC_ARCH_WORKAROUND_1 doesn't return SMCCC_RET_NOT_REQUIRED
arm64: vdso32: Allow ld.lld to properly link the VDSO
Pull asm-generic fix from Arnd Bergmann:
"One small bugfix, fixing a build regression for RISC-V"
* tag 'asm-generic-fixes-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: mark __{get,put}_user_fn as __always_inline
Pull power management fixes from Rafael Wysocki:
"These fix a few issues related to running intel_pstate in the passive
mode with HWP enabled, correct the handling of the max_cstate module
parameter in intel_idle and make a few janitorial changes.
Specifics:
- Modify Kconfig to prevent configuring either the "conservative" or
the "ondemand" governor as the default cpufreq governor if
intel_pstate is selected, in which case "schedutil" is the default
choice for the default governor setting (Rafael Wysocki).
- Modify the cpufreq core, intel_pstate and the schedutil governor to
avoid missing updates of the HWP max limit when intel_pstate
operates in the passive mode with HWP enabled (Rafael Wysocki).
- Fix max_cstate module parameter handling in intel_idle for
processor models with C-state tables coming from ACPI (Chen Yu).
- Clean up assorted pieces of power management code (Jackie Zamow,
Tom Rix, Zhang Qilong)"
* tag 'pm-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: schedutil: Always call driver if CPUFREQ_NEED_UPDATE_LIMITS is set
cpufreq: Introduce cpufreq_driver_test_flags()
cpufreq: speedstep: remove unneeded semicolon
PM: sleep: fix typo in kernel/power/process.c
intel_idle: Fix max_cstate for processor models without C-state tables
cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode
cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag
cpufreq: Avoid configuring old governors as default with intel_pstate
cpufreq: e_powersaver: remove unreachable break
Pull drm fixes from Dave Airlie:
"A busier rc2 than normal, have larger sets of fixes for amdgpu +
nouveau, along with some i915, docs, core, panel, sun4i, v3d, vc4
fixes.
Nothing spooky though or pumpkin related.
docs:
- kernel doc fixes
core:
- fix shmem helpers dma-buf mmap bug
amdgpu:
- Add new navi1x PCI ID
- GPUVM reserved area fixes
- Misc display fixes
- Fix bad interactions between display code and CONFIG_KGDB
- Fixes for SMU manual fan control and i2c
nouveau:
- endian regression fix for old gpus
- buffer object refcount fix
- uapi start/end alignment fix
- display notifier fix
- display clock checking fixes
i915:
- Fix max memory region size calculation
- Restore ILK-M RPS support, restoring performance
- Reject 90/270 degreerotated initial fbs
panel:
- mantix reset fixes
sun4i:
- scalar fix
vc4:
- hdmi audio fixes
v3d:
- fix double free"
* tag 'drm-fixes-2020-10-30-1' of git://anongit.freedesktop.org/drm/drm: (42 commits)
drm/nouveau/kms/nv50-: Fix clock checking algorithm in nv50_dp_mode_valid()
drm/nouveau/kms/nv50-: Get rid of bogus nouveau_conn_mode_valid()
drm/nouveau/device: fix changing endianess code to work on older GPUs
drm/nouveau/gem: fix "refcount_t: underflow; use-after-free"
drm/nouveau/kms/nv50-: Program notifier offset before requesting disp caps
drm/nouveau/nouveau: fix the start/end range for migration
drm/i915: Reject 90/270 degree rotated initial fbs
drm/i915: Restore ILK-M RPS support
drm/i915/region: fix max size calculation
drm/vc4: Rework the structure conversion functions
drm/vc4: hdmi: Add a name to the codec DAI component
drm/shme-helpers: Fix dma_buf_mmap forwarding bug
drm/vc4: hdmi: Avoid sleeping in atomic context
drm/amdgpu/pm: fix the fan speed in fan1_input in manual mode for navi1x
drm/amd/pm: fix the wrong fan speed in fan1_input
drm/amdgpu/swsmu: drop smu i2c bus on navi1x
drm/vc4: drv: Add error handding for bind
drm: drm_print.h: fix kernel-doc markups
drm: kernel-doc: drm_dp_helper.h: fix a typo
drm: kernel-doc: add description for a new function parameter
...
Pull fallthrough fix from Gustavo A. R. Silva:
"This fixes a ton of fall-through warnings when building with Clang
12.0.0 and -Wimplicit-fallthrough"
* tag 'fallthrough-fixes-clang-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
include: jhash/signal: Fix fall-through warnings for Clang
Pull rdma fixes from Jason Gunthorpe:
"The good news is people are testing rc1 in the RDMA world - the bad
news is testing of the for-next area is not as good as I had hoped, as
we really should have caught at least the rdma_connect_locked() issue
before now.
Notable merge window regressions that didn't get caught/fixed in time
for rc1:
- Fix in kernel users of rxe, they were broken by the rapid fix to
undo the uABI breakage in rxe from another patch
- EFA userspace needs to read the GID table but was broken with the
new GID table logic
- Fix user triggerable deadlock in mlx5 using devlink reload
- Fix deadlock in several ULPs using rdma_connect from the CM handler
callbacks
- Memory leak in qedr"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/qedr: Fix memory leak in iWARP CM
RDMA: Add rdma_connect_locked()
RDMA/uverbs: Fix false error in query gid IOCTL
RDMA/mlx5: Fix devlink deadlock on net namespace deletion
RDMA/rxe: Fix small problem in network_type patch
In preparation to enable -Wimplicit-fallthrough for Clang, explicitly
add break statements instead of letting the code fall through to the
next case.
This patch adds four break statements that, together, fix almost 40,000
warnings when building Linux 5.10-rc1 with Clang 12.0.0 and this[1] change
reverted. Notice that in order to enable -Wimplicit-fallthrough for Clang,
such change[1] is meant to be reverted at some point. So, this patch helps
to move in that direction.
Something important to mention is that there is currently a discrepancy
between GCC and Clang when dealing with switch fall-through to empty case
statements or to cases that only contain a break/continue/return
statement[2][3][4].
Now that the -Wimplicit-fallthrough option has been globally enabled[5],
any compiler should really warn on missing either a fallthrough annotation
or any of the other case-terminating statements (break/continue/return/
goto) when falling through to the next case statement. Making exceptions
to this introduces variation in case handling which may continue to lead
to bugs, misunderstandings, and a general lack of robustness. The point
of enabling options like -Wimplicit-fallthrough is to prevent human error
and aid developers in spotting bugs before their code is even built/
submitted/committed, therefore eliminating classes of bugs. So, in order
to really accomplish this, we should, and can, move in the direction of
addressing any error-prone scenarios and get rid of the unintentional
fallthrough bug-class in the kernel, entirely, even if there is some minor
redundancy. Better to have explicit case-ending statements than continue to
have exceptions where one must guess as to the right result. The compiler
will eliminate any actual redundancy.
[1] commit e2079e93f5 ("kbuild: Do not enable -Wimplicit-fallthrough for clang for now")
[2] https://github.com/ClangBuiltLinux/linux/issues/636
[3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91432
[4] https://godbolt.org/z/xgkvIh
[5] commit a035d552a9 ("Makefile: Globally enable fall-through warning")
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Pull AFS fixes from David Howells:
- Fix copy_file_range() to an afs file now returning EINVAL if the
splice_write file op isn't supplied.
- Fix a deref-before-check in afs_unuse_cell().
- Fix a use-after-free in afs_xattr_get_acl().
- Fix afs to not try to clear PG_writeback when laundering a page.
- Fix afs to take a ref on a page that it sets PG_private on and to
drop that ref when clearing PG_private. This is done through recently
added helpers.
- Fix a page leak if write_begin() fails.
- Fix afs_write_begin() to not alter the dirty region info stored in
page->private, but rather do this in afs_write_end() instead when we
know what we actually changed.
- Fix afs_invalidatepage() to alter the dirty region info on a page
when partial page invalidation occurs so that we don't inadvertantly
include a span of zeros that will get written back if a page gets
laundered due to a remote 3rd-party induced invalidation.
We mustn't, however, reduce the dirty region if the page has been
seen to be mapped (ie. we got called through the page_mkwrite vector)
as the page might still be mapped and we might lose data if the file
is extended again.
- Fix the dirty region info to have a lower resolution if the size of
the page is too large for this to be encoded (e.g. powerpc32 with 64K
pages).
Note that this might not be the ideal way to handle this, since it
may allow some leakage of undirtied zero bytes to the server's copy
in the case of a 3rd-party conflict.
To aid the last two fixes, two additional changes:
- Wrap the manipulations of the dirty region info stored in
page->private into helper functions.
- Alter the encoding of the dirty region so that the region bounds can
be stored with one fewer bit, making a bit available for the
indication of mappedness.
* tag 'afs-fixes-20201029' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Fix dirty-region encoding on ppc32 with 64K pages
afs: Fix afs_invalidatepage to adjust the dirty region
afs: Alter dirty range encoding in page->private
afs: Wrap page->private manipulations in inline functions
afs: Fix where page->private is set during write
afs: Fix page leak on afs_write_begin() failure
afs: Fix to take ref on page when PG_private is set
afs: Fix afs_launder_page to not clear PG_writeback
afs: Fix a use after free in afs_xattr_get_acl()
afs: Fix tracing deref-before-check
afs: Fix copy_file_range()
Pull ext4 fixes from Ted Ts'o:
"Bug fixes for the new ext4 fast commit feature, plus a fix for the
'data=journal' bug fix.
Also use the generic casefolding support which has now landed in
fs/libfs.c for 5.10"
* tag 'ext4_for_linus_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: indicate that fast_commit is available via /sys/fs/ext4/feature/...
ext4: use generic casefolding support
ext4: do not use extent after put_bh
ext4: use IS_ERR() for error checking of path
ext4: fix mmap write protection for data=journal mode
jbd2: fix a kernel-doc markup
ext4: use s_mount_flags instead of s_mount_state for fast commit state
ext4: make num of fast commit blocks configurable
ext4: properly check for dirty state in ext4_inode_datasync_dirty()
ext4: fix double locking in ext4_fc_commit_dentry_updates()
This replaces the spaghetti code in the two existing page pools.
First of all depending on the allocation size it is between 3 (1GiB) and
5 (1MiB) times faster than the old implementation.
It makes better use of buddy pages to allow for larger physical contiguous
allocations which should result in better TLB utilization at least for
amdgpu.
Instead of a completely braindead approach of filling the pool with one
CPU while another one is trying to shrink it we only give back freed
pages.
This also results in much less locking contention and a trylock free MM
shrinker callback, so we can guarantee that pages are given back to the
system when needed.
Downside of this is that it takes longer for many small allocations until
the pool is filled up. We could address this, but I couldn't find an use
case where this actually matters. We also don't bother freeing large
chunks of pages any more since the CPU overhead in that path isn't really
that important.
The sysfs files are replaced with a single module parameter, allowing
users to override how many pages should be globally pooled in TTM. This
unfortunately breaks the UAPI slightly, but as far as we know nobody ever
depended on this.
Zeroing memory coming from the pool was handled inconsistently. The
alloc_pages() based pool was zeroing it, the dma_alloc_attr() based one
wasn't. For now the new implementation isn't zeroing pages from the pool
either and only sets the __GFP_ZERO flag when necessary.
The implementation has only 768 lines of code compared to the over 2600
of the old one, and also allows for saving quite a bunch of code in the
drivers since we don't need specialized handling there any more based on
kernel config.
Additional to all of that there was a neat bug with IOMMU, coherent DMA
mappings and huge pages which is now fixed in the new code as well.
v2: make ttm_pool_apply_caching static as reported by the kernel bot, add
some more checks
v3: fix some more checkpatch.pl warnings
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com>
Tested-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/397080/?series=83051&rev=1
Fix afs_invalidatepage() to adjust the dirty region recorded in
page->private when truncating a page. If the dirty region is entirely
removed, then the private data is cleared and the page dirty state is
cleared.
Without this, if the page is truncated and then expanded again by truncate,
zeros from the expanded, but no-longer dirty region may get written back to
the server if the page gets laundered due to a conflicting 3rd-party write.
It mustn't, however, shorten the dirty region of the page if that page is
still mmapped and has been marked dirty by afs_page_mkwrite(), so a flag is
stored in page->private to record this.
Fixes: 4343d00872 ("afs: Get rid of the afs_writeback record")
Signed-off-by: David Howells <dhowells@redhat.com>
The afs filesystem uses page->private to store the dirty range within a
page such that in the event of a conflicting 3rd-party write to the server,
we write back just the bits that got changed locally.
However, there are a couple of problems with this:
(1) I need a bit to note if the page might be mapped so that partial
invalidation doesn't shrink the range.
(2) There aren't necessarily sufficient bits to store the entire range of
data altered (say it's a 32-bit system with 64KiB pages or transparent
huge pages are in use).
So wrap the accesses in inline functions so that future commits can change
how this works.
Also move them out of the tracing header into the in-directory header.
There's not really any need for them to be in the tracing header.
Signed-off-by: David Howells <dhowells@redhat.com>
Add a helper function to test the flags of the cpufreq driver in use
againt a given flags mask.
In particular, this will be needed to test the
CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag in the schedutil
governor.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>