42 Commits

Author SHA1 Message Date
Yongpeng Yang
67d85b062d Documentation: admin-guide: blockdev: replace zone_capacity with zone_capacity_mb when creating devices
The "zone_capacity=%umb" option is no longer used. The effective option
is now "zone_capacity_mb=%u", so update the documentation accordingly.

Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-15 09:56:06 -07:00
Damien Le Moal
ade260ca85 Documentation: admin-guide: blockdev: update zloop parameters
In Documentation/admin-guide/blockdev/zoned_loop.rst, add the
description of the zone_append and ordered_zone_append configuration
arguments of zloop "add" command (device creation).

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-17 09:40:09 -07:00
Linus Torvalds
ee2fe81cdc Merge tag 'docs-6.18' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
 "It has been a relatively busy cycle in docsland, with changes all
  over:

   - Bring the kernel memory-model docs into the Sphinx build in the
     "literal include" mode.

   - Lots of build-infrastructure work, further cleaning up long-term
     kernel-doc technical debt. The sphinx-pre-install tool has been
     converted to Python and updated for current systems.

   - A new tool to detect when documents have been moved and generate
     HTML redirects; this can be used on kernel.org (or any other site
     hosting the rendered docs) to avoid breaking links.

   - Automated processing of the YAML files describing the netlink
     protocol.

   - A significant update of the maintainer's PGP guide.

  ... and a seemingly endless series of typo fixes, build-problem fixes,
  etc"

* tag 'docs-6.18' of git://git.lwn.net/linux: (193 commits)
  Documentation/features: Update feature lists for 6.17-rc7
  docs: remove cdomain.py
  Documentation/process: submitting-patches: fix typo in "were do"
  docs: dev-tools/lkmm: Fix typo of missing file extension
  Documentation: trace: histogram: Convert ftrace docs cross-reference
  Documentation: trace: histogram-design: Wrap introductory note in note:: directive
  Documentation: trace: historgram-design: Separate sched_waking histogram section heading and the following diagram
  Documentation: trace: histogram-design: Trim trailing vertices in diagram explanation text
  Documentation: trace: histogram: Fix histogram trigger subsection number order
  docs: driver-api: fix spelling of "buses".
  Documentation: fbcon: Use admonition directives
  Documentation: fbcon: Reindent 8th step of attach/detach/unload
  Documentation: fbcon: Add boot options and attach/detach/unload section headings
  docs: filesystems: sysfs: add remaining top level sysfs directory descriptions
  docs: filesystems: sysfs: clarify symlink destinations in dev and bus/devices descriptions
  docs: filesystems: sysfs: remove top level sysfs net directory
  docs: maintainer: Fix ambiguous subheading formatting
  docs: kdoc: a few more dump_typedef() tweaks
  docs: kdoc: remove redundant comment stripping in dump_typedef()
  docs: kdoc: remove some dead code in dump_typedef()
  ...
2025-10-03 17:16:13 -07:00
Bjorn Helgaas
c349216707 Documentation: Fix admin-guide typos
Fix typos.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250813200526.290420-4-helgaas@kernel.org
2025-08-18 10:31:19 -06:00
Erick Karanja
f7a2e1c087 Docs: admin-guide: Correct spelling mistake
Fix spelling mistake directoy to directory

Reported-by: codespell

Signed-off-by: Erick Karanja <karanja99erick@gmail.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250813071837.668613-1-karanja99erick@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-13 05:33:09 -06:00
Linus Torvalds
00c010e130 Merge tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:

 - "Add folio_mk_pte()" from Matthew Wilcox simplifies the act of
   creating a pte which addresses the first page in a folio and reduces
   the amount of plumbing which architecture must implement to provide
   this.

 - "Misc folio patches for 6.16" from Matthew Wilcox is a shower of
   largely unrelated folio infrastructure changes which clean things up
   and better prepare us for future work.

 - "memory,x86,acpi: hotplug memory alignment advisement" from Gregory
   Price adds early-init code to prevent x86 from leaving physical
   memory unused when physical address regions are not aligned to memory
   block size.

 - "mm/compaction: allow more aggressive proactive compaction" from
   Michal Clapinski provides some tuning of the (sadly, hard-coded (more
   sadly, not auto-tuned)) thresholds for our invokation of proactive
   compaction. In a simple test case, the reduction of a guest VM's
   memory consumption was dramatic.

 - "Minor cleanups and improvements to swap freeing code" from Kemeng
   Shi provides some code cleaups and a small efficiency improvement to
   this part of our swap handling code.

 - "ptrace: introduce PTRACE_SET_SYSCALL_INFO API" from Dmitry Levin
   adds the ability for a ptracer to modify syscalls arguments. At this
   time we can alter only "system call information that are used by
   strace system call tampering, namely, syscall number, syscall
   arguments, and syscall return value.

   This series should have been incorporated into mm.git's "non-MM"
   branch, but I goofed.

 - "fs/proc: extend the PAGEMAP_SCAN ioctl to report guard regions" from
   Andrei Vagin extends the info returned by the PAGEMAP_SCAN ioctl
   against /proc/pid/pagemap. This permits CRIU to more efficiently get
   at the info about guard regions.

 - "Fix parameter passed to page_mapcount_is_type()" from Gavin Shan
   implements that fix. No runtime effect is expected because
   validate_page_before_insert() happens to fix up this error.

 - "kernel/events/uprobes: uprobe_write_opcode() rewrite" from David
   Hildenbrand basically brings uprobe text poking into the current
   decade. Remove a bunch of hand-rolled implementation in favor of
   using more current facilities.

 - "mm/ptdump: Drop assumption that pxd_val() is u64" from Anshuman
   Khandual provides enhancements and generalizations to the pte dumping
   code. This might be needed when 128-bit Page Table Descriptors are
   enabled for ARM.

 - "Always call constructor for kernel page tables" from Kevin Brodsky
   ensures that the ctor/dtor is always called for kernel pgtables, as
   it already is for user pgtables.

   This permits the addition of more functionality such as "insert hooks
   to protect page tables". This change does result in various
   architectures performing unnecesary work, but this is fixed up where
   it is anticipated to occur.

 - "Rust support for mm_struct, vm_area_struct, and mmap" from Alice
   Ryhl adds plumbing to permit Rust access to core MM structures.

 - "fix incorrectly disallowed anonymous VMA merges" from Lorenzo
   Stoakes takes advantage of some VMA merging opportunities which we've
   been missing for 15 years.

 - "mm/madvise: batch tlb flushes for MADV_DONTNEED and MADV_FREE" from
   SeongJae Park optimizes process_madvise()'s TLB flushing.

   Instead of flushing each address range in the provided iovec, we
   batch the flushing across all the iovec entries. The syscall's cost
   was approximately halved with a microbenchmark which was designed to
   load this particular operation.

 - "Track node vacancy to reduce worst case allocation counts" from
   Sidhartha Kumar makes the maple tree smarter about its node
   preallocation.

   stress-ng mmap performance increased by single-digit percentages and
   the amount of unnecessarily preallocated memory was dramaticelly
   reduced.

 - "mm/gup: Minor fix, cleanup and improvements" from Baoquan He removes
   a few unnecessary things which Baoquan noted when reading the code.

 - ""Enhance sysfs handling for memory hotplug in weighted interleave"
   from Rakie Kim "enhances the weighted interleave policy in the memory
   management subsystem by improving sysfs handling, fixing memory
   leaks, and introducing dynamic sysfs updates for memory hotplug
   support". Fixes things on error paths which we are unlikely to hit.

 - "mm/damon: auto-tune DAMOS for NUMA setups including tiered memory"
   from SeongJae Park introduces new DAMOS quota goal metrics which
   eliminate the manual tuning which is required when utilizing DAMON
   for memory tiering.

 - "mm/vmalloc.c: code cleanup and improvements" from Baoquan He
   provides cleanups and small efficiency improvements which Baoquan
   found via code inspection.

 - "vmscan: enforce mems_effective during demotion" from Gregory Price
   changes reclaim to respect cpuset.mems_effective during demotion when
   possible. because presently, reclaim explicitly ignores
   cpuset.mems_effective when demoting, which may cause the cpuset
   settings to violated.

   This is useful for isolating workloads on a multi-tenant system from
   certain classes of memory more consistently.

 - "Clean up split_huge_pmd_locked() and remove unnecessary folio
   pointers" from Gavin Guo provides minor cleanups and efficiency gains
   in in the huge page splitting and migrating code.

 - "Use kmem_cache for memcg alloc" from Huan Yang creates a slab cache
   for `struct mem_cgroup', yielding improved memory utilization.

 - "add max arg to swappiness in memory.reclaim and lru_gen" from
   Zhongkun He adds a new "max" argument to the "swappiness=" argument
   for memory.reclaim MGLRU's lru_gen.

   This directs proactive reclaim to reclaim from only anon folios
   rather than file-backed folios.

 - "kexec: introduce Kexec HandOver (KHO)" from Mike Rapoport is the
   first step on the path to permitting the kernel to maintain existing
   VMs while replacing the host kernel via file-based kexec. At this
   time only memblock's reserve_mem is preserved.

 - "mm: Introduce for_each_valid_pfn()" from David Woodhouse provides
   and uses a smarter way of looping over a pfn range. By skipping
   ranges of invalid pfns.

 - "sched/numa: Skip VMA scanning on memory pinned to one NUMA node via
   cpuset.mems" from Libo Chen removes a lot of pointless VMA scanning
   when a task is pinned a single NUMA mode.

   Dramatic performance benefits were seen in some real world cases.

 - "JFS: Implement migrate_folio for jfs_metapage_aops" from Shivank
   Garg addresses a warning which occurs during memory compaction when
   using JFS.

 - "move all VMA allocation, freeing and duplication logic to mm" from
   Lorenzo Stoakes moves some VMA code from kernel/fork.c into the more
   appropriate mm/vma.c.

 - "mm, swap: clean up swap cache mapping helper" from Kairui Song
   provides code consolidation and cleanups related to the folio_index()
   function.

 - "mm/gup: Cleanup memfd_pin_folios()" from Vishal Moola does that.

 - "memcg: Fix test_memcg_min/low test failures" from Waiman Long
   addresses some bogus failures which are being reported by the
   test_memcontrol selftest.

 - "eliminate mmap() retry merge, add .mmap_prepare hook" from Lorenzo
   Stoakes commences the deprecation of file_operations.mmap() in favor
   of the new file_operations.mmap_prepare().

   The latter is more restrictive and prevents drivers from messing with
   things in ways which, amongst other problems, may defeat VMA merging.

 - "memcg: decouple memcg and objcg stocks"" from Shakeel Butt decouples
   the per-cpu memcg charge cache from the objcg's one.

   This is a step along the way to making memcg and objcg charging
   NMI-safe, which is a BPF requirement.

 - "mm/damon: minor fixups and improvements for code, tests, and
   documents" from SeongJae Park is yet another batch of miscellaneous
   DAMON changes. Fix and improve minor problems in code, tests and
   documents.

 - "memcg: make memcg stats irq safe" from Shakeel Butt converts memcg
   stats to be irq safe. Another step along the way to making memcg
   charging and stats updates NMI-safe, a BPF requirement.

 - "Let unmap_hugepage_range() and several related functions take folio
   instead of page" from Fan Ni provides folio conversions in the
   hugetlb code.

* tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (285 commits)
  mm: pcp: increase pcp->free_count threshold to trigger free_high
  mm/hugetlb: convert use of struct page to folio in __unmap_hugepage_range()
  mm/hugetlb: refactor __unmap_hugepage_range() to take folio instead of page
  mm/hugetlb: refactor unmap_hugepage_range() to take folio instead of page
  mm/hugetlb: pass folio instead of page to unmap_ref_private()
  memcg: objcg stock trylock without irq disabling
  memcg: no stock lock for cpu hot-unplug
  memcg: make __mod_memcg_lruvec_state re-entrant safe against irqs
  memcg: make count_memcg_events re-entrant safe against irqs
  memcg: make mod_memcg_state re-entrant safe against irqs
  memcg: move preempt disable to callers of memcg_rstat_updated
  memcg: memcg_rstat_updated re-entrant safe against irqs
  mm: khugepaged: decouple SHMEM and file folios' collapse
  selftests/eventfd: correct test name and improve messages
  alloc_tag: check mem_profiling_support in alloc_tag_init
  Docs/damon: update titles and brief introductions to explain DAMOS
  selftests/damon/_damon_sysfs: read tried regions directories in order
  mm/damon/tests/core-kunit: add a test for damos_set_filters_default_reject()
  mm/damon/paddr: remove unused variable, folio_list, in damon_pa_stat()
  mm/damon/sysfs-schemes: fix wrong comment on damons_sysfs_quota_goal_metric_strs
  ...
2025-05-31 15:44:16 -07:00
Sergey Senozhatsky
b05f8d7e07 Documentation: zram: update IDLE pages tracking documentation
Move IDLE pages tracking into a separate chapter because there are
multiple features that use (or depend on) it either in built-in variant
("mark all") or in extended variant (ac-time tracking).

In addition, recompression doesn't require memory tracking to be enabled
in order to be able to perform idle recompression.

Link: https://lkml.kernel.org/r/20250416042833.3858827-1-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reported-by: Shin Kawamura <kawasin@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:33 -07:00
Sergey Senozhatsky
cf42d4cccf zram: modernize writeback interface
The writeback interface supports a page_index=N parameter which performs
writeback of the given page.  Since we rarely need to writeback just one
single page, the typical use case involves a number of writeback calls,
each performing writeback of one page:

  echo page_index=100 > zram0/writeback
  ...
  echo page_index=200 > zram0/writeback
  echo page_index=500 > zram0/writeback
  ...
  echo page_index=700 > zram0/writeback

One obvious downside of this is that it increases the number of syscalls. 
Less obvious, but a significantly more important downside, is that when
given only one page to post-process zram cannot perform an optimal target
selection.  This becomes a critical limitation when writeback_limit is
enabled, because under writeback_limit we want to guarantee the highest
memory savings hence we first need to writeback pages that release the
highest amount of zsmalloc pool memory.

This patch adds page_indexes=LOW-HIGH parameter to the writeback
interface:

  echo page_indexes=100-200 page_indexes=500-700 > zram0/writeback

This gives zram a chance to apply an optimal target selection strategy on
each iteration of the writeback loop.

We also now permit multiple page_index parameters per call (previously
zram would recognize only one page_index) and a mix or single pages and
page ranges:

  echo page_index=42 page_index=99 page_indexes=100-200 \
       page_indexes=500-700 > zram0/writeback

Apart from that the patch also unifies parameters passing and resembles
other "modern" zram device attributes (e.g.  recompression), while the old
interface used a mixed scheme: values-less parameters for mode and a
key=value format for page_index.  We still support the "old" value-less
format for compatibility reasons.

[senozhatsky@chromium.org: simplify parse_page_index() range checks, per Brian]
  nk: https://lkml.kernel.org/r/20250404015327.2427684-1-senozhatsky@chromium.org
[sozhatsky@chromium.org: fix uninitialized variable in zram_writeback_slots(), per Dan]
  nk: https://lkml.kernel.org/r/20250409112611.1154282-1-senozhatsky@chromium.org
Link: https://lkml.kernel.org/r/20250327015818.4148660-1-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Brian Geffon <bgeffon@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Richard Chang <richardycc@google.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:09 -07:00
Damien Le Moal
9e4f11c122 Documentation: Document the new zoned loop block device driver
Introduce the zoned_loop.rst documentation file under
admin-guide/blockdev to document the zoned loop block device driver.
An overview of the driver is provided and its usage to create and delete
zoned devices described.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20250407075222.170336-3-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-01 17:03:56 -06:00
Sergey Senozhatsky
4127e13c93 zram: remove max_comp_streams device attr
max_comp_streams device attribute has been defunct since May 2016 when
zram switched to per-CPU compression streams, remove it.

Link: https://lkml.kernel.org/r/20250303022425.285971-5-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16 22:06:33 -07:00
Saru2003
5c14b68596 Documentation: zram: fix dictionary spelling
Fixes a typo in the ZRAM documentation where 'dictioary' was
misspelled. Corrected it to 'dictionary' in the example usage
of 'algorithm_params'.

Signed-off-by: Sarveshwaar SS <sarvesh20123@gmail.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20241125165122.17521-1-sarvesh20123@gmail.com
2024-12-13 08:52:22 -07:00
Sergey Senozhatsky
58652f2b6d zram: permit only one post-processing operation at a time
Both recompress and writeback soon will unlock slots during processing,
which makes things too complex wrt possible race-conditions.  We still
want to clear PP_SLOT in slot_free, because this is how we figure out that
slot that was selected for post-processing has been released under us and
when we start post-processing we check if slot still has PP_SLOT set.  At
the same time, theoretically, we can have something like this:

CPU0			    CPU1

recompress
scan slots
set PP_SLOT
unlock slot
			slot_free
			clear PP_SLOT

			allocate PP_SLOT
			writeback
			scan slots
			set PP_SLOT
			unlock slot
select PP-slot
test PP_SLOT

So recompress will not detect that slot has been re-used and re-selected
for concurrent writeback post-processing.

Make sure that we only permit on post-processing operation at a time.  So
now recompress and writeback post-processing don't race against each
other, we only need to handle slot re-use (slot_free and write), which is
handled individually by each pp operation.

Having recompress and writeback competing for the same slots is not
exactly good anyway (can't imagine anyone doing that).

Link: https://lkml.kernel.org/r/20240917021020.883356-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 16:56:22 -08:00
Sergey Senozhatsky
e899007a5e zram: support priority parameter in recompression
recompress device attribute supports alg=NAME parameter so that we can
specify only one particular algorithm we want to perform recompression
with.  However, with algo params we now can have several exactly same
secondary algorithms but each with its own params tuning (e.g.  priority 1
configured to use more aggressive level, and priority 2 configured to use
a pre-trained dictionary).  Support priority=NUM parameter so that we can
correctly determine which secondary algorithm we want to use.

Link: https://lkml.kernel.org/r/20240902105656.1383858-25-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:12 -07:00
Sergey Senozhatsky
97ee4842f2 Documentation/zram: add documentation for algorithm parameters
Document brief description of compression algorithms' parameters:
compression level and pre-trained dictionary.

[senozhatsky@chromium.org: trivial fixup]
  Link: https://lkml.kernel.org/r/20240903063722.1603592-1-senozhatsky@chromium.org
Link: https://lkml.kernel.org/r/20240902105656.1383858-24-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:11 -07:00
Sergey Senozhatsky
4eac932103 zram: introduce algorithm_params device attribute
This attribute is used to setup compression algorithms' parameters, so we
can tweak algorithms' characteristics.  At this point only 'level' is
supported (to be extended in the future).

Each call sets up parameters for one particular algorithm, which should be
specified either by the algorithm's priority or algo name.  This is
expected to be called after corresponding algorithm is selected via
comp_algorithm or recomp_algorithm.

 echo "priority=0 level=1" > /sys/block/zram0/algorithm_params
or
 echo "algo=zstd level=1" > /sys/block/zram0/algorithm_params

Link: https://lkml.kernel.org/r/20240902105656.1383858-16-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:09 -07:00
Sergey Senozhatsky
917a59e81c zram: introduce custom comp backends API
Moving to custom backends implementation gives us ability to have our own
minimalistic and extendable API, and algorithms tunings becomes possible.

The list of compression backends is empty at this point, we will add
backends in the followup patches.

Link: https://lkml.kernel.org/r/20240902105656.1383858-5-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:07 -07:00
Sergey Senozhatsky
34efe1c3b6 zram: add max_pages param to recompression
Introduce "max_pages" param to recompress device attribute which sets an
upper limit on the number of entries (pages) zram attempts to recompress
(in this particular recompression call).  S/W recompression can be quite
expensive so limiting the number of pages recompress touches can be quite
helpful.

Link: https://lkml.kernel.org/r/20240329094050.2815699-1-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Brian Geffon <bgeffon@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:56:30 -07:00
Sergey Senozhatsky
a7a0350583 zram: split memory-tracking and ac-time tracking
ZRAM_MEMORY_TRACKING enables two features:
- per-entry ac-time tracking
- debugfs interface

The latter one is the reason why memory-tracking depends on DEBUG_FS,
while the former one is used far beyond debugging these days.  Namely
ac-time is used for fine grained writeback of idle entries (pages).

Move ac-time tracking under its own config option so that it can be
enabled (along with writeback) on systems without DEBUG_FS.

[senozhatsky@chromium.org: ifdef fixup, per Dmytro]
  Link: https://lkml.kernel.org/r/20231117013543.540280-1-senozhatsky@chromium.org
Link: https://lkml.kernel.org/r/20231115024223.4133148-1-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dmytro Maluka <dmaluka@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:40 -08:00
Eric Blake
952aa344bf docs nbd: userspace NBD now favors github over sourceforge
While the sourceforge site for userspace NBD still exists, the code
repository moved to github several years ago.  Then with a recent
patch[1], the github landing page contains just as much information as
the sourceforge page, so we might as well point to a single location
that also provides the code.

[1] https://lists.debian.org/nbd/2023/03/msg00051.html

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20230410180611.1051618-5-eblake@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-27 19:15:11 -06:00
Ondrej Zary
7750d8b510 drivers/block: Remove PARIDE core and high-level protocols
Remove PARIDE core and high level protocols, taking care not to break
low-level drivers (used by pata_parport). Also update documentation.

Signed-off-by: Ondrej Zary <linux@zary.sk>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2023-01-31 10:41:32 +09:00
Ondrej Zary
246a1c4c6b ata: pata_parport: add driver (PARIDE replacement)
The pata_parport is a libata-based replacement of the old PARIDE
subsystem - driver for parallel port IDE devices.
It uses the original paride low-level protocol drivers but does not
need the high-level drivers (pd, pcd, pf, pt, pg). The IDE devices
behind parallel port adapters are handled by the ATA layer.

This will allow paride and its high-level drivers to be removed.

Unfortunately, libata drivers cannot sleep so pata_parport claims
parport before activating the ata host and keeps it claimed (and
protocol connected) until the ata host is removed. This means that
no devices can be chained (neither other pata_parport devices nor
a printer).

paride and pata_parport are mutually exclusive because the compiled
protocol drivers are incompatible.

Tested with:
 - Imation SuperDisk LS-120 and HP C4381A (EPAT)
 - Freecom Parallel CD (FRPW)
 - Toshiba Mobile CD-RW 2793008 w/Freecom Parallel Cable rev.903 (FRIQ)
 - Backpack CD-RW 222011 and CD-RW 19350 (BPCK6)

The following bugs in low-level protocol drivers were found and will
be fixed later:

Note: EPP-32 mode is buggy in EPAT - and also in all other protocol
drivers - they don't handle non-multiple-of-4 block transfers
correctly. This causes problems with LS-120 drive.
There is also another bug in EPAT: EPP modes don't work unless a 4-bit
or 8-bit mode is used first (probably some initialization missing?).
Once the device is initialized, EPP works until power cycle.

So after device power on, you have to:
echo "parport0 epat 0" >/sys/bus/pata_parport/new_device
echo pata_parport.0 >/sys/bus/pata_parport/delete_device
echo "parport0 epat 4" >/sys/bus/pata_parport/new_device
(autoprobe will initialize correctly as it tries the slowest modes
first but you'll get the broken EPP-32 mode)

Note: EPP modes are buggy in FRPW, only modes 0 and 1 work.

Signed-off-by: Ondrej Zary <linux@zary.sk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2023-01-31 09:34:41 +09:00
Sergey Senozhatsky
77db7bb56b zram: add incompressible flag to read_block_state()
Add a new flag to zram block state that shows if the page is
incompressible: that none of the algorithm (including secondary ones)
could compress it.

Link: https://lkml.kernel.org/r/20221109115047.2921851-14-senozhatsky@chromium.org
Suggested-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30 15:58:53 -08:00
Sergey Senozhatsky
b46f9ea3cb zram: add incompressible writeback
Add support for incompressible pages writeback:

  echo incompressible > /sys/block/zramX/writeback

Link: https://lkml.kernel.org/r/20221109115047.2921851-13-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30 15:58:53 -08:00
Sergey Senozhatsky
443dd79806 documentation: add zram recompression documentation
Document user-space visible device attributes that are enabled by
ZRAM_MULTI_COMP.

Link: https://lkml.kernel.org/r/20221109115047.2921851-12-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30 15:58:52 -08:00
Sergey Senozhatsky
60e9b39ebe zram: add recompress flag to read_block_state()
Add a new flag to zram block state that shows if the page was recompressed
(using alternative compression algorithm).

Link: https://lkml.kernel.org/r/20221109115047.2921851-6-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30 15:58:51 -08:00
Linus Torvalds
50fd82b3a9 Merge tag 'docs-5.19-2' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
 "A handful of late-arriving documentation fixes and the addition of an
  SVG tux logo which, I'm assured, we're going to want"

* tag 'docs-5.19-2' of git://git.lwn.net/linux:
  documentation: Format button_dev as a pointer.
  docs: add SVG version of the Linux logo
  docs: move Linux logo into a new `images` folder
  docs: blockdev: change title to match section content
  docs/conf.py: Cope with removal of language=None in Sphinx 5.0.0
2022-06-02 15:36:06 -07:00
Joel Colledge
938285a130 docs: blockdev: change title to match section content
This index.rst was added in commit
39443104c7 docs: blockdev: convert to ReST

It appears that the title from the RapidIO index page was copied. This
title does not match the content of this directory. Change it to match.

Signed-off-by: Joel Colledge <joel.colledge@linbit.com>
Link: https://lore.kernel.org/r/20220530142849.717-1-joel.colledge@linbit.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-06-01 09:30:53 -06:00
Brian Geffon
30226b69f8 zram: add a huge_idle writeback mode
Today it's only possible to write back as a page, idle, or huge.  A user
might want to writeback pages which are huge and idle first as these idle
pages do not require decompression and make a good first pass for
writeback.

Idle writeback specifically has the advantage that a refault is unlikely
given that the page has been swapped for some amount of time without being
refaulted.

Huge writeback has the advantage that you're guaranteed to get the maximum
benefit from a single page writeback, that is, you're reclaiming one full
page of memory.  Pages which are compressed in zram being written back
result in some benefit which is always less than a page size because of
the fact that it was compressed.

The primary use of this is for minimizing refaults in situations where the
device has to be sensitive to storage endurance.  On ChromeOS we have
devices with slow eMMC and repeated writes and refaults can negatively
affect performance and endurance.

Link: https://lkml.kernel.org/r/20220322215821.1196994-1-bgeffon@google.com
Signed-off-by: Brian Geffon <bgeffon@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29 14:36:59 -07:00
Ethan Dye
4fbe7b19a9 docs: Fix wording in optional zram feature docs
This fixes some simple grammar errors in the documentation for zram,
specifically errors in the optional feature section of the zram
documentation.

Signed-off-by: Ethan Dye <mrtops03@gmail.com>
Link: https://lore.kernel.org/r/20220207235442.95090-1-mrtops03@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-02-15 16:32:27 -07:00
Akira Yokosawa
5c81691bb6 docs: admin-guide/blockdev: Remove digraph of node-states
While node-states-8.dot has two digraphs, the dot(1) command can
not properly handle multiple graphs in a DOT file and the
kernel-doc page at

    https://www.kernel.org/doc/html/latest/admin-guide/blockdev/drbd/figures.html

fails to render the graphs.

It turned out that the digraph of node_states can be removed.

Quote from Joel's reflection:

    On reflection, the digraph node_states can be removed entirely.
    It is too basic to contain any useful information. In addition
    it references "ioctl_set_state". The ioctl configuration
    interface for DRBD has long been removed. In fact, it was never
    in the upstream version of DRBD.

Remove node_states and rename the DOT file peer_states-8.dot.

Suggested-by: Joel Colledge <joel.colledge@linbit.com>
Acked-by: Joel Colledge <joel.colledge@linbit.com>
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Link: https://lore.kernel.org/r/7df04f45-8746-e666-1a9d-a998f1ab1f91@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-11-29 14:39:43 -07:00
Brian Geffon
755804d169 zram: introduce an aged idle interface
This change introduces an aged idle interface to the existing idle sysfs
file for zram.

When CONFIG_ZRAM_MEMORY_TRACKING is enabled the idle file now also
accepts an integer argument.  This integer is the age (in seconds) of
pages to mark as idle.  The idle file still supports 'all' as it always
has.  This new approach allows for much more control over which pages
get marked as idle.

[bgeffon@google.com: use IS_ENABLED and cleanup comment]
  Link: https://lkml.kernel.org/r/20210924161128.1508015-1-bgeffon@google.com
[bgeffon@google.com: Sergey's cleanup suggestions]
  Link: https://lkml.kernel.org/r/20210929143056.13067-1-bgeffon@google.com

Link: https://lkml.kernel.org/r/20210923130115.1344361-1-bgeffon@google.com
Signed-off-by: Brian Geffon <bgeffon@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Jesse Barnes <jsbarnes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:43 -07:00
Linus Torvalds
ac73e3dc8a Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - a few random little subsystems

 - almost all of the MM patches which are staged ahead of linux-next
   material. I'll trickle to post-linux-next work in as the dependents
   get merged up.

Subsystems affected by this patch series: kthread, kbuild, ide, ntfs,
ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache,
gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation,
kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction,
oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc,
uaccess, zram, and cleanups).

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (200 commits)
  mm: cleanup kstrto*() usage
  mm: fix fall-through warnings for Clang
  mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at
  mm: shmem: convert shmem_enabled_show to use sysfs_emit_at
  mm:backing-dev: use sysfs_emit in macro defining functions
  mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening
  mm: use sysfs_emit for struct kobject * uses
  mm: fix kernel-doc markups
  zram: break the strict dependency from lzo
  zram: add stat to gather incompressible pages since zram set up
  zram: support page writeback
  mm/process_vm_access: remove redundant initialization of iov_r
  mm/zsmalloc.c: rework the list_add code in insert_zspage()
  mm/zswap: move to use crypto_acomp API for hardware acceleration
  mm/zswap: fix passing zero to 'PTR_ERR' warning
  mm/zswap: make struct kernel_param_ops definitions const
  userfaultfd/selftests: hint the test runner on required privilege
  userfaultfd/selftests: fix retval check for userfaultfd_open()
  userfaultfd/selftests: always dump something in modes
  userfaultfd: selftests: make __{s,u}64 format specifiers portable
  ...
2020-12-15 12:53:37 -08:00
Minchan Kim
194e28da1a zram: add stat to gather incompressible pages since zram set up
Currently, zram supports the stat via /sys/block/zram/mm_stat to represent
how many of incompressible pages are stored at the moment but it couldn't
show how many times incompressible pages were wrote down since zram set
up.  It's also good indication to see how zram is effective in the system.

Link: https://lkml.kernel.org/r/20201130201907.1284910-1-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15 12:13:47 -08:00
Minchan Kim
0d8359620d zram: support page writeback
There is demand to writeback specific process pages to backing store
instead of all idles pages in the system due to storage wear out concerns
and to launching latency of apps which are most of the time idle but are
critical for resume latency.

This patch extends the writeback knob to support a specific page
writeback.

Link: https://lkml.kernel.org/r/20201020190506.3758660-1-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15 12:13:47 -08:00
Andrew Klychkov
b2105aa2c6 Documentation: fix typos found in admin-guide subdirectory
Fixed twelve typos in cppc_sysfs.rst, binderfs.rst, paride.rst,
zram.rst, bug-hunting.rst, introduction.rst, usage.rst, dm-crypt.rst

Signed-off-by: Andrew Klychkov <andrew.a.klychkov@gmail.com>
Reviewed-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20201204070235.GA48631@spblnx124.lan
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-12-08 10:25:42 -07:00
Randy Dunlap
6b99e6e6aa Documentation/admin-guide: blockdev/ramdisk: remove use of "rdev"
Remove use of "rdev" from blockdev/ramdisk.rst and update
admin-guide/kernel-parameters.txt.

"rdev" is considered antiquated, ancient, archaic, obsolete, deprecated
{choose any or all}.

"rdev" was removed from util-linux in 2010:
  https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/commit/?id=a3e40c14651fccf18e7954f081e601389baefe3f

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Karel Zak <kzak@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Martin Mares <mj@ucw.cz>
Cc: linux-video@atrey.karlin.mff.cuni.cz
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/20200918015640.8439-3-rdunlap@infradead.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-09-24 10:50:31 -06:00
Alexander A. Klimov
c0ad0befed Replace HTTP links with HTTPS ones: DRBD driver
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If both the HTTP and HTTPS versions
          return 200 OK and serve the same content:
            Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Link: https://lore.kernel.org/r/20200627103111.71771-1-grandmaster@al2klimov.de
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-07-05 14:16:44 -06:00
Yue Hu
56e6b3b0b3 Documentation: zram: fix the description about orig_data_size of mm_stat
orig_data_size counted the same_pages by commit 51f9f82c85 ("zram:
count same page write as page_stored"), so let's fix it.

Signed-off-by: Yue Hu <zbestahu@163.com>
Link: https://lore.kernel.org/r/20200206111031.9524-1-zbestahu@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-02-13 11:33:32 -07:00
Yue Hu
5871023c3a zram: correct documentation about sysfs node of huge page writeback
sysfs node for huge page writeback is writeback rather than write.

Signed-off-by: Yue Hu <huyue2@yulong.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Link: https://lore.kernel.org/r/20200120102949.12132-1-zbestahu@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-01-24 09:52:05 -07:00
Randy Dunlap
a3e1c56a0b Documentation: zram: various fixes in zram.rst
Fix various items in zram.rst:
- typos/spellos
- punctuation
- grammar
- shell syntax
- indentation
- sysfs file names

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Link: https://lore.kernel.org/r/77000e12-677a-62f6-9f78-343be5bd6630@infradead.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-01-24 09:50:29 -07:00
Mauro Carvalho Chehab
7e042736fa docs: add SPDX tags to new index files
All those new files I added are under GPL v2.0 license.

Add the corresponding SPDX headers to them.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15 11:03:03 -03:00
Mauro Carvalho Chehab
e7751617dd docs: blockdev: add it to the admin-guide
The blockdev book basically contains user-faced documentation.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15 11:03:01 -03:00