Commit Graph

6576 Commits

Author SHA1 Message Date
Linus Torvalds
8883957b3c Merge tag 'fsnotify_hsm_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify pre-content notification support from Jan Kara:
 "This introduces a new fsnotify event (FS_PRE_ACCESS) that gets
  generated before a file contents is accessed.

  The event is synchronous so if there is listener for this event, the
  kernel waits for reply. On success the execution continues as usual,
  on failure we propagate the error to userspace. This allows userspace
  to fill in file content on demand from slow storage. The context in
  which the events are generated has been picked so that we don't hold
  any locks and thus there's no risk of a deadlock for the userspace
  handler.

  The new pre-content event is available only for users with global
  CAP_SYS_ADMIN capability (similarly to other parts of fanotify
  functionality) and it is an administrator responsibility to make sure
  the userspace event handler doesn't do stupid stuff that can DoS the
  system.

  Based on your feedback from the last submission, fsnotify code has
  been improved and now file->f_mode encodes whether pre-content event
  needs to be generated for the file so the fast path when nobody wants
  pre-content event for the file just grows the additional file->f_mode
  check. As a bonus this also removes the checks whether the old
  FS_ACCESS event needs to be generated from the fast path. Also the
  place where the event is generated during page fault has been moved so
  now filemap_fault() generates the event if and only if there is no
  uptodate folio in the page cache.

  Also we have dropped FS_PRE_MODIFY event as current real-world users
  of the pre-content functionality don't really use it so let's start
  with the minimal useful feature set"

* tag 'fsnotify_hsm_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (21 commits)
  fanotify: Fix crash in fanotify_init(2)
  fs: don't block write during exec on pre-content watched files
  fs: enable pre-content events on supported file systems
  ext4: add pre-content fsnotify hook for DAX faults
  btrfs: disable defrag on pre-content watched files
  xfs: add pre-content fsnotify hook for DAX faults
  fsnotify: generate pre-content permission event on page fault
  mm: don't allow huge faults for files with pre content watches
  fanotify: disable readahead if we have pre-content watches
  fanotify: allow to set errno in FAN_DENY permission response
  fanotify: report file range info with pre-content events
  fanotify: introduce FAN_PRE_ACCESS permission event
  fsnotify: generate pre-content permission event on truncate
  fsnotify: pass optional file access range in pre-content event
  fsnotify: introduce pre-content permission events
  fanotify: reserve event bit of deprecated FAN_DIR_MODIFY
  fanotify: rename a misnamed constant
  fanotify: don't skip extra event info if no info_mode is set
  fsnotify: check if file is actually being watched for pre-content events on open
  fsnotify: opt-in for permission events at file open time
  ...
2025-01-23 13:36:06 -08:00
Linus Torvalds
d0d106a2bd Merge tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
 "A smaller than usual release cycle.

  The main changes are:

   - Prepare selftest to run with GCC-BPF backend (Ihor Solodrai)

     In addition to LLVM-BPF runs the BPF CI now runs GCC-BPF in compile
     only mode. Half of the tests are failing, since support for
     btf_decl_tag is still WIP, but this is a great milestone.

   - Convert various samples/bpf to selftests/bpf/test_progs format
     (Alexis Lothoré and Bastien Curutchet)

   - Teach verifier to recognize that array lookup with constant
     in-range index will always succeed (Daniel Xu)

   - Cleanup migrate disable scope in BPF maps (Hou Tao)

   - Fix bpf_timer destroy path in PREEMPT_RT (Hou Tao)

   - Always use bpf_mem_alloc in bpf_local_storage in PREEMPT_RT (Martin
     KaFai Lau)

   - Refactor verifier lock support (Kumar Kartikeya Dwivedi)

     This is a prerequisite for upcoming resilient spin lock.

   - Remove excessive 'may_goto +0' instructions in the verifier that
     LLVM leaves when unrolls the loops (Yonghong Song)

   - Remove unhelpful bpf_probe_write_user() warning message (Marco
     Elver)

   - Add fd_array_cnt attribute for prog_load command (Anton Protopopov)

     This is a prerequisite for upcoming support for static_branch"

* tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (125 commits)
  selftests/bpf: Add some tests related to 'may_goto 0' insns
  bpf: Remove 'may_goto 0' instruction in opt_remove_nops()
  bpf: Allow 'may_goto 0' instruction in verifier
  selftests/bpf: Add test case for the freeing of bpf_timer
  bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT
  bpf: Free element after unlock in __htab_map_lookup_and_delete_elem()
  bpf: Bail out early in __htab_map_lookup_and_delete_elem()
  bpf: Free special fields after unlock in htab_lru_map_delete_node()
  tools: Sync if_xdp.h uapi tooling header
  libbpf: Work around kernel inconsistently stripping '.llvm.' suffix
  bpf: selftests: verifier: Add nullness elision tests
  bpf: verifier: Support eliding map lookup nullness
  bpf: verifier: Refactor helper access type tracking
  bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write
  bpf: verifier: Add missing newline on verbose() call
  selftests/bpf: Add distilled BTF test about marking BTF_IS_EMBEDDED
  libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED
  libbpf: Fix return zero when elf_begin failed
  selftests/bpf: Fix btf leak on new btf alloc failure in btf_distill test
  veristat: Load struct_ops programs only once
  ...
2025-01-23 08:04:07 -08:00
Linus Torvalds
754916d4a2 Merge tag 'caps-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux
Pull capabilities updates from Serge Hallyn:

 - remove the cap_mmap_file() hook, as it simply returned the default
   return value and so doesn't need to exist (Paul Moore)

 - add a trace event for cap_capable() (Jordan Rome)

* tag 'caps-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux:
  security: add trace event for cap_capable
  capabilities: remove cap_mmap_file()
2025-01-23 08:00:16 -08:00
Linus Torvalds
21266b8df5 Merge tag 'AT_EXECVE_CHECK-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull AT_EXECVE_CHECK from Kees Cook:

 - Implement AT_EXECVE_CHECK flag to execveat(2) (Mickaël Salaün)

 - Implement EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits
   (Mickaël Salaün)

 - Add selftests and samples for AT_EXECVE_CHECK (Mickaël Salaün)

* tag 'AT_EXECVE_CHECK-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  ima: instantiate the bprm_creds_for_exec() hook
  samples/check-exec: Add an enlighten "inc" interpreter and 28 tests
  selftests: ktap_helpers: Fix uninitialized variable
  samples/check-exec: Add set-exec
  selftests/landlock: Add tests for execveat + AT_EXECVE_CHECK
  selftests/exec: Add 32 tests for AT_EXECVE_CHECK and exec securebits
  security: Add EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits
  exec: Add a new AT_EXECVE_CHECK flag to execveat(2)
2025-01-22 20:34:42 -08:00
Linus Torvalds
5ab889facc Merge tag 'hardening-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:

 - stackleak: Use str_enabled_disabled() helper (Thorsten Blum)

 - Document GCC INIT_STACK_ALL_PATTERN behavior (Geert Uytterhoeven)

 - Add task_prctl_unknown tracepoint (Marco Elver)

* tag 'hardening-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  hardening: Document INIT_STACK_ALL_PATTERN behavior with GCC
  stackleak: Use str_enabled_disabled() helper in stack_erasing_sysctl()
  tracing: Remove pid in task_rename tracing output
  tracing: Add task_prctl_unknown tracepoint
2025-01-22 20:29:53 -08:00
Linus Torvalds
ad2aec7c96 Merge tag 'tomoyo-pr-20250123' of git://git.code.sf.net/p/tomoyo/tomoyo
Pull tomoyo updates from Tetsuo Handa:
 "Small changes to improve usability"

* tag 'tomoyo-pr-20250123' of git://git.code.sf.net/p/tomoyo/tomoyo:
  tomoyo: automatically use patterns for several situations in learning mode
  tomoyo: use realpath if symlink's pathname refers to procfs
  tomoyo: don't emit warning in tomoyo_write_control()
2025-01-22 20:25:00 -08:00
Linus Torvalds
de5817bbfb Merge tag 'landlock-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
 "This mostly factors out some Landlock code and prepares for upcoming
  audit support.

  Because files with invalid modes might be visible after filesystem
  corruption, Landlock now handles those weird files too.

  A few sample and test issues are also fixed"

* tag 'landlock-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  selftests/landlock: Add layout1.umount_sandboxer tests
  selftests/landlock: Add wrappers.h
  selftests/landlock: Fix error message
  landlock: Optimize file path walks and prepare for audit support
  selftests/landlock: Add test to check partial access in a mount tree
  landlock: Align partial refer access checks with final ones
  landlock: Simplify initially denied access rights
  landlock: Move access types
  landlock: Factor out check_access_path()
  selftests/landlock: Fix build with non-default pthread linking
  landlock: Use scoped guards for ruleset in landlock_add_rule()
  landlock: Use scoped guards for ruleset
  landlock: Constify get_mode_access()
  landlock: Handle weird files
  samples/landlock: Fix possible NULL dereference in parse_path()
  selftests/landlock: Remove unused macros in ptrace_test.c
2025-01-22 20:20:55 -08:00
Linus Torvalds
9cb2bf599b Merge tag 'keys-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull keys updates from Jarkko Sakkinen.

Avoid using stack addresses for sg lists. And a cleanup.

* tag 'keys-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  KEYS: trusted: dcp: fix improper sg use with CONFIG_VMAP_STACK=y
  keys: drop shadowing dead prototype
2025-01-22 19:47:17 -08:00
Linus Torvalds
690ffcd817 Merge tag 'selinux-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:

 - Extended permissions supported in conditional policy

   The SELinux extended permissions, aka "xperms", allow security admins
   to target individuals ioctls, and recently netlink messages, with
   their SELinux policy. Adding support for conditional policies allows
   admins to toggle the granular xperms using SELinux booleans, helping
   pave the way for greater use of xperms in general purpose SELinux
   policies. This change bumps the maximum SELinux policy version to 34.

 - Fix a SCTP/SELinux error return code inconsistency

   Depending on the loaded SELinux policy, specifically it's
   EXTSOCKCLASS support, the bind(2) LSM/SELinux hook could return
   different error codes due to the SELinux code checking the socket's
   SELinux object class (which can vary depending on EXTSOCKCLASS) and
   not the socket's sk_protocol field. We fix this by doing the obvious,
   and looking at the sock->sk_protocol field instead of the object
   class.

 - Makefile fixes to properly cleanup av_permissions.h

   Add av_permissions.h to "targets" so that it is properly cleaned up
   using the kbuild infrastructure.

 - A number of smaller improvements by Christian Göttsche

   A variety of straightforward changes to reduce code duplication,
   reduce pointer lookups, migrate void pointers to defined types,
   simplify code, constify function parameters, and correct iterator
   types.

* tag 'selinux-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: make more use of str_read() when loading the policy
  selinux: avoid unnecessary indirection in struct level_datum
  selinux: use known type instead of void pointer
  selinux: rename comparison functions for clarity
  selinux: rework match_ipv6_addrmask()
  selinux: constify and reconcile function parameter names
  selinux: avoid using types indicating user space interaction
  selinux: supply missing field initializers
  selinux: add netlink nlmsg_type audit message
  selinux: add support for xperms in conditional policies
  selinux: Fix SCTP error inconsistency in selinux_socket_bind()
  selinux: use native iterator types
  selinux: add generated av_permissions.h to targets
2025-01-21 20:09:14 -08:00
Linus Torvalds
f96a974170 Merge tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore:

 - Improved handling of LSM "secctx" strings through lsm_context struct

   The LSM secctx string interface is from an older time when only one
   LSM was supported, migrate over to the lsm_context struct to better
   support the different LSMs we now have and make it easier to support
   new LSMs in the future.

   These changes explain the Rust, VFS, and networking changes in the
   diffstat.

 - Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are
   enabled

   Small tweak to be a bit smarter about when we build the LSM's common
   audit helpers.

 - Check for absurdly large policies from userspace in SafeSetID

   SafeSetID policies rules are fairly small, basically just "UID:UID",
   it easy to impose a limit of KMALLOC_MAX_SIZE on policy writes which
   helps quiet a number of syzbot related issues. While work is being
   done to address the syzbot issues through other mechanisms, this is a
   trivial and relatively safe fix that we can do now.

 - Various minor improvements and cleanups

   A collection of improvements to the kernel selftests, constification
   of some function parameters, removing redundant assignments, and
   local variable renames to improve readability.

* tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  lockdown: initialize local array before use to quiet static analysis
  safesetid: check size of policy writes
  net: corrections for security_secid_to_secctx returns
  lsm: rename variable to avoid shadowing
  lsm: constify function parameters
  security: remove redundant assignment to return variable
  lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set
  selftests: refactor the lsm `flags_overset_lsm_set_self_attr` test
  binder: initialize lsm_context structure
  rust: replace lsm context+len with lsm_context
  lsm: secctx provider check on release
  lsm: lsm_context in security_dentry_init_security
  lsm: use lsm_context in security_inode_getsecctx
  lsm: replace context+len with lsm_context
  lsm: ensure the correct LSM context releaser
2025-01-21 20:03:04 -08:00
Linus Torvalds
678ca9f78e Merge tag 'Smack-for-6.14' of https://github.com/cschaufler/smack-next
Pull smack update from Casey Schaufler:
 "One minor code improvement for v6.14"

* tag 'Smack-for-6.14' of https://github.com/cschaufler/smack-next:
  smack: deduplicate access to string conversion
2025-01-21 19:59:46 -08:00
Linus Torvalds
0ca0cf9f8c Merge tag 'integrity-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity updates from Mimi Zohar:
 "There's just a couple of changes: two kernel messages addressed, a
  measurement policy collision addressed, and one policy cleanup.

  Please note that the contents of the IMA measurement list is
  potentially affected. The builtin tmpfs IMA policy rule change might
  introduce additional measurements, while detecting a reboot might
  eliminate some measurements"

* tag 'integrity-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: ignore suffixed policy rule comments
  ima: limit the builtin 'tcb' dont_measure tmpfs policy rule
  ima: kexec: silence RCU list traversal warning
  ima: Suspend PCR extends and log appends when rebooting
2025-01-21 19:54:32 -08:00
David Gstir
e8d9fab39d KEYS: trusted: dcp: fix improper sg use with CONFIG_VMAP_STACK=y
With vmalloc stack addresses enabled (CONFIG_VMAP_STACK=y) DCP trusted
keys can crash during en- and decryption of the blob encryption key via
the DCP crypto driver. This is caused by improperly using sg_init_one()
with vmalloc'd stack buffers (plain_key_blob).

Fix this by always using kmalloc() for buffers we give to the DCP crypto
driver.

Cc: stable@vger.kernel.org # v6.10+
Fixes: 0e28bf61a5 ("KEYS: trusted: dcp: fix leak of blob encryption key")
Signed-off-by: David Gstir <david@sigma-star.at>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-01-21 11:25:23 +02:00
Linus Torvalds
4b84a4c8d4 Merge tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
 "Features:

   - Support caching symlink lengths in inodes

     The size is stored in a new union utilizing the same space as
     i_devices, thus avoiding growing the struct or taking up any more
     space

     When utilized it dodges strlen() in vfs_readlink(), giving about
     1.5% speed up when issuing readlink on /initrd.img on ext4

   - Add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag

     If a file system supports uncached buffered IO, it may set
     FOP_DONTCACHE and enable support for RWF_DONTCACHE.

     If RWF_DONTCACHE is attempted without the file system supporting
     it, it'll get errored with -EOPNOTSUPP

   - Enable VBOXGUEST and VBOXSF_FS on ARM64

     Now that VirtualBox is able to run as a host on arm64 (e.g. the
     Apple M3 processors) we can enable VBOXSF_FS (and in turn
     VBOXGUEST) for this architecture.

     Tested with various runs of bonnie++ and dbench on an Apple MacBook
     Pro with the latest Virtualbox 7.1.4 r165100 installed

  Cleanups:

   - Delay sysctl_nr_open check in expand_files()

   - Use kernel-doc includes in fiemap docbook

   - Use page->private instead of page->index in watch_queue

   - Use a consume fence in mnt_idmap() as it's heavily used in
     link_path_walk()

   - Replace magic number 7 with ARRAY_SIZE() in fc_log

   - Sort out a stale comment about races between fd alloc and dup2()

   - Fix return type of do_mount() from long to int

   - Various cosmetic cleanups for the lockref code

  Fixes:

   - Annotate spinning as unlikely() in __read_seqcount_begin

     The annotation already used to be there, but got lost in commit
     52ac39e5db ("seqlock: seqcount_t: Implement all read APIs as
     statement expressions")

   - Fix proc_handler for sysctl_nr_open

   - Flush delayed work in delayed fput()

   - Fix grammar and spelling in propagate_umount()

   - Fix ESP not readable during coredump

     In /proc/PID/stat, there is the kstkesp field which is the stack
     pointer of a thread. While the thread is active, this field reads
     zero. But during a coredump, it should have a valid value

     However, at the moment, kstkesp is zero even during coredump

   - Don't wake up the writer if the pipe is still full

   - Fix unbalanced user_access_end() in select code"

* tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  gfs2: use lockref_init for qd_lockref
  erofs: use lockref_init for pcl->lockref
  dcache: use lockref_init for d_lockref
  lockref: add a lockref_init helper
  lockref: drop superfluous externs
  lockref: use bool for false/true returns
  lockref: improve the lockref_get_not_zero description
  lockref: remove lockref_put_not_zero
  fs: Fix return type of do_mount() from long to int
  select: Fix unbalanced user_access_end()
  vbox: Enable VBOXGUEST and VBOXSF_FS on ARM64
  pipe_read: don't wake up the writer if the pipe is still full
  selftests: coredump: Add stackdump test
  fs/proc: do_task_stat: Fix ESP not readable during coredump
  fs: add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag
  fs: sort out a stale comment about races between fd alloc and dup2
  fs: Fix grammar and spelling in propagate_umount()
  fs: fc_log replace magic number 7 with ARRAY_SIZE()
  fs: use a consume fence in mnt_idmap()
  file: flush delayed work in delayed fput()
  ...
2025-01-20 09:40:49 -08:00
John Johansen
e6b0876769 apparmor: fix dbus permission queries to v9 ABI
dbus permission queries need to be synced with fine grained unix
mediation to avoid potential policy regressions. To ensure that
dbus queries don't result in a case where fine grained unix mediation
is not being applied but dbus mediation is check the loaded policy
support ABI and abort the query if policy doesn't support the
v9 ABI.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:13 -08:00
John Johansen
dcd7a55941 apparmor: gate make fine grained unix mediation behind v9 abi
Fine grained unix mediation in Ubuntu used ABI v7, and policy using
this has propogated onto systems where fine grained unix mediation was
not supported. The userspace policy compiler supports downgrading
policy so the policy could be shared without changes.

Unfortunately this had the side effect that policy was not updated for
the none Ubuntu systems and enabling fine grained unix mediation on
those systems means that a new kernel can break a system with existing
policy that worked with the previous kernel. With fine grained af_unix
mediation this regression can easily break the system causing boot to
fail, as it affect unix socket files, non-file based unix sockets, and
dbus communication.

To aoid this regression move fine grained af_unix mediation behind
a new abi. This means that the system's userspace and policy must
be updated to support the new policy before it takes affect and
dropping a new kernel on existing system will not result in a
regression.

The abi bump is done in such a way as existing policy can be activated
on the system by changing the policy abi declaration and existing unix
policy rules will apply. Policy then only needs to be incrementally
updated, can even be backported to existing Ubuntu policy.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:13 -08:00
John Johansen
c05e705812 apparmor: add fine grained af_unix mediation
Extend af_unix mediation to support fine grained controls based on
the type (abstract, anonymous, fs), the address, and the labeling
on the socket.

This allows for using socket addresses to label and the socket and
control which subjects can communicate.

The unix rule format follows standard apparmor rules except that fs
based unix sockets can be mediated by existing file rules. None fs
unix sockets can be mediated by a unix socket rule. Where The address
of an abstract unix domain socket begins with the @ character, similar
to how they are reported (as paths) by netstat -x.  The address then
follows and may contain pattern matching and any characters including
the null character. In apparmor null characters must be specified by
using an escape sequence \000 or \x00. The pattern matching is the
same as is used by file path matching so * will not match / even
though it has no special meaning with in an abstract socket name. Eg.

     allow unix addr=@*,

Autobound unix domain sockets have a unix sun_path assigned to them by
the kernel, as such specifying a policy based address is not possible.
The autobinding of sockets can be controlled by specifying the special
auto keyword. Eg.

     allow unix addr=auto,

To indicate that the rule only applies to auto binding of unix domain
sockets.  It is important to note this only applies to the bind
permission as once the socket is bound to an address it is
indistinguishable from a socket that have an addr bound with a
specified name. When the auto keyword is used with other permissions
or as part of a peer addr it will be replaced with a pattern that can
match an autobound socket. Eg. For some kernels

    allow unix rw addr=auto,

It is important to note, this pattern may match abstract sockets that
were not autobound but have an addr that fits what is generated by the
kernel when autobinding a socket.

Anonymous unix domain sockets have no sun_path associated with the
socket address, however it can be specified with the special none
keyword to indicate the rule only applies to anonymous unix domain
sockets. Eg.

    allow unix addr=none,

If the address component of a rule is not specified then the rule
applies to autobind, abstract and anonymous sockets.

The label on the socket can be compared using the standard label=
rule conditional. Eg.

    allow unix addr=@foo peer=(label=bar),

see man apparmor.d for full syntax description.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
b4940d913c apparmor: in preparation for finer networking rules rework match_prot
Rework match_prot into a common fn that can be shared by all the
networking rules. This will provide compatibility with current socket
mediation, via the early bailout permission encoding.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
6cc6a0523d apparmor: lift kernel socket check out of critical section
There is no need for the kern check to be in the critical section,
it only complicates the code and slows down the case where the
socket is being created by the kernel.

Lifting it out will also allow socket_create to share common template
code, with other socket_permission checks.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
9045aa25d1 apparmor: remove af_select macro
The af_select macro just adds a layer of unnecessary abstraction that
makes following what the code is doing harder.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
ce9e3b3fa2 apparmor: add ability to mediate caps with policy state machine
Currently the caps encoding is very limited and can't be used with
conditionals. Allow capabilities to be mediated by the state
machine. This will allow us to add conditionals to capabilities that
aren't possible with the current encoding.

This patch only adds support for using the state machine and retains
the old encoding lookup as part of the runtime mediation code to
support older policy abis. A follow on patch will move backwards
compatibility to a mapping function done at policy load time.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
a9eb185be8 apparmor: fix x_table_lookup when stacking is not the first entry
x_table_lookup currently does stacking during label_parse() if the
target specifies a stack but its only caller ensures that it will
never be used with stacking.

Refactor to slightly simplify the code in x_to_label(), this
also fixes a long standing problem where x_to_labels check on stacking
is only on the first element to the table option list, instead of
the element that is found and used.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
84c455decf apparmor: add support for profiles to define the kill signal
Previously apparmor has only sent SIGKILL but there are cases where
it can be useful to send a different signal. Allow the profile
to optionally specify a different value.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
2e12c5f060 apparmor: add additional flags to extended permission.
This is a step towards merging the file and policy state machines.

With the switch to extended permissions the state machine's ACCEPT2
table became unused freeing it up to store state specific flags. The
first flags to be stored are FLAG_OWNER and FLAG other which paves the
way towards merging the file and policydb perms into a single
permission table.

Currently Lookups based on the objects ownership conditional will
still need separate fns, this will be address in a following patch.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
de4754c801 apparmor: carry mediation check on label
In order to speed up the mediated check, precompute and store the
result as a bit per class type. This will not only allow us to
speed up the mediation check but is also a step to removing the
unconfined special cases as the unconfined check can be replaced
with the generic label_mediates() check.

Note: label check does not currently work for capabilities and resources
      which need to have their mediation updated first.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
34d31f2338 apparmor: cleanup: refactor file_perm() to doc semantics of some checks
Provide semantics, via fn names, for some checks being done in
file_perm(). This is a preparatory patch for improvements to both
permission caching and delegation, where the check will become more
involved.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
35fad5b462 apparmor: remove explicit restriction that unconfined cannot use change_hat
There does not need to be an explicit restriction that unconfined
can't use change_hat. Traditionally unconfined doesn't have hats
so change_hat could not be used. But newer unconfined profiles have
the potential of having hats, and even system unconfined will be
able to be replaced with a profile that allows for hats.

To remain backwards compitible with expected return codes, continue
to return -EPERM if the unconfined profile does not have any hats.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
cd769b05cc apparmor: ensure labels with more than one entry have correct flags
labels containing more than one entry need to accumulate flag info
from profiles that the label is constructed from. This is done
correctly for labels created by a merge but is not being done for
labels created by an update or directly created via a parse.

This technically is a bug fix, however the effect in current code is
to cause early unconfined bail out to not happen (ie. without the fix
it is slower) on labels that were created via update or a parse.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
0bc8c6862f apparmor: switch signal mediation to use RULE_MEDIATES
Currently signal mediation is using a hard coded form of the
RULE_MEDIATES check. This hides the intended semantics, and means this
specific check won't pickup any changes or improvements made in the
RULE_MEDIATES check. Switch to using RULE_MEDIATES().

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
46b9b994dd apparmor: remove redundant unconfined check.
profile_af_perm and profile_af_sk_perm are only ever called after
checking that the profile is not unconfined. So we can drop these
redundant checks.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
280799f724 apparmor: cleanup: attachment perm lookup to use lookup_perms()
Remove another case of code duplications. Switch to using the generic
routine instead of the current custom checks.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:11 -08:00
John Johansen
71e6cff3e0 apparmor: Improve debug print infrastructure
Make it so apparmor debug output can be controlled by class flags
as well as the debug flag on labels. This provides much finer
control at what is being output so apparmor doesn't flood the
logs with information that is not needed, making it hard to find
what is important.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:11 -08:00
Thorsten Blum
c602537de3 apparmor: Use str_yes_no() helper function
Remove hard-coded strings by using the str_yes_no() helper function.

Fix a typo in a comment: s/unpritable/unprintable/

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:11 -08:00
Mickaël Salaün
d617f0d72d landlock: Optimize file path walks and prepare for audit support
Always synchronize access_masked_parent* with access_request_parent*
according to allowed_parent*.  This is required for audit support to be
able to get back to the reason of denial.

In a rename/link action, instead of always checking a rule two times for
the same parent directory of the source and the destination files, only
check it when an action on a child was not already allowed.  This also
enables us to keep consistent allowed_parent* status, which is required
to get back to the reason of denial.

For internal mount points, only upgrade allowed_parent* to true but do
not wrongfully set both of them to false otherwise.  This is also
required to get back to the reason of denial.

This does not impact the current behavior but slightly optimize code and
prepare for audit support that needs to know the exact reason why an
access was denied.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250108154338.1129069-14-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17 19:05:37 +01:00
Mickaël Salaün
058518c209 landlock: Align partial refer access checks with final ones
Fix a logical issue that could have been visible if the source or the
destination of a rename/link action was allowed for either the source or
the destination but not both.  However, this logical bug is unreachable
because either:
- the rename/link action is allowed by the access rights tied to the
  same mount point (without relying on access rights in a parent mount
  point) and the access request is allowed (i.e. allow_parent1 and
  allow_parent2 are true in current_check_refer_path),
- or a common rule in a parent mount point updates the access check for
  the source and the destination (cf. is_access_to_paths_allowed).

See the following layout1.refer_part_mount_tree_is_allowed test that
work with and without this fix.

This fix does not impact current code but it is required for the audit
support.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250108154338.1129069-12-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17 19:05:35 +01:00
Mickaël Salaün
d6c7cf84a2 landlock: Simplify initially denied access rights
Upgrade domain's handled access masks when creating a domain from a
ruleset, instead of converting them at runtime.  This is more consistent
and helps with audit support.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250108154338.1129069-7-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17 19:05:35 +01:00
Mickaël Salaün
622e2f5954 landlock: Move access types
Move LANDLOCK_ACCESS_FS_INITIALLY_DENIED, access_mask_t, struct
access_mask, and struct access_masks_all to a dedicated access.h file.

Rename LANDLOCK_ACCESS_FS_INITIALLY_DENIED to
_LANDLOCK_ACCESS_FS_INITIALLY_DENIED to make it clear that it's not part
of UAPI.  Add some newlines when appropriate.

This file will be extended with following commits, and it will help to
avoid dependency loops.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250108154338.1129069-6-mic@digikod.net
[mic: Fix rebase conflict because of the new cleanup headers]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17 19:05:34 +01:00
Mickaël Salaün
924f4403d8 landlock: Factor out check_access_path()
Merge check_access_path() into current_check_access_path() and make
hook_path_mknod() use it.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250108154338.1129069-4-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17 19:05:33 +01:00
Mickaël Salaün
16a6f4d3b5 landlock: Use scoped guards for ruleset in landlock_add_rule()
Simplify error handling by replacing goto statements with automatic
calls to landlock_put_ruleset() when going out of scope.

This change depends on the TCP support.

Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Cc: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250113161112.452505-3-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14 11:57:45 +01:00
Mickaël Salaün
d32f79a59a landlock: Use scoped guards for ruleset
Simplify error handling by replacing goto statements with automatic
calls to landlock_put_ruleset() when going out of scope.

This change will be easy to backport to v6.6 if needed, only the
kernel.h include line conflicts.  As for any other similar changes, we
should be careful when backporting without goto statements.

Add missing include file.

Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250113161112.452505-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14 11:57:45 +01:00
Mickaël Salaün
25ccc75f5d landlock: Constify get_mode_access()
Use __attribute_const__ for get_mode_access().

Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20250110153918.241810-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14 11:57:44 +01:00
Mickaël Salaün
49440290a0 landlock: Handle weird files
A corrupted filesystem (e.g. bcachefs) might return weird files.
Instead of throwing a warning and allowing access to such file, treat
them as regular files.

Cc: Dave Chinner <david@fromorbit.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Paul Moore <paul@paul-moore.com>
Reported-by: syzbot+34b68f850391452207df@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/000000000000a65b35061cffca61@google.com
Reported-by: syzbot+360866a59e3c80510a62@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/67379b3f.050a0220.85a0.0001.GAE@google.com
Reported-by: Ubisectech Sirius <bugreport@ubisectech.com>
Closes: https://lore.kernel.org/r/c426821d-8380-46c4-a494-7008bbd7dd13.bugreport@ubisectech.com
Fixes: cb2c7d1a17 ("landlock: Support filesystem access-control")
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20250110153918.241810-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14 11:57:39 +01:00
Yafang Shao
7c58ed44bd security: remove get_task_comm() and print task comm directly
Since task->comm is guaranteed to be NUL-terminated, we can print it
directly without the need to copy it into a separate buffer.  This
simplifies the code and avoids unnecessary operations.

Link: https://lkml.kernel.org/r/20241219023452.69907-5-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Acked-by: Kees Cook <kees@kernel.org>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: "André Almeida" <andrealmeid@igalia.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-12 20:21:16 -08:00
Geert Uytterhoeven
a9a5e0bdc5 hardening: Document INIT_STACK_ALL_PATTERN behavior with GCC
The help text for INIT_STACK_ALL_PATTERN documents the patterns used by
Clang, but lacks documentation for GCC.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/293d29d6a0d1823165be97285c1bc73e90ee9db8.1736239070.git.geert+renesas@glider.be
Signed-off-by: Kees Cook <kees@kernel.org>
2025-01-08 14:17:33 -08:00
Christian Göttsche
01c2253a0f selinux: make more use of str_read() when loading the policy
Simplify the call sites, and enable future string validation in a single
place.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:40 -05:00
Christian Göttsche
7491536366 selinux: avoid unnecessary indirection in struct level_datum
Store the owned member of type struct mls_level directly in the parent
struct instead of an extra heap allocation.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:40 -05:00
Christian Göttsche
f07586160f selinux: use known type instead of void pointer
Improve type safety and readability by using the known type.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:39 -05:00
Christian Göttsche
83e7e18eed selinux: rename comparison functions for clarity
The functions context_cmp(), mls_context_cmp() and ebitmap_cmp() are not
traditional C style compare functions returning -1, 0, and 1 for less
than, equal, and greater than; they only return whether their arguments
are equal.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:39 -05:00
Christian Göttsche
5e99b81f48 selinux: rework match_ipv6_addrmask()
Constify parameters, add size hints, and simplify control flow.

According to godbolt the same assembly is generated.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:38 -05:00
Christian Göttsche
9090308510 selinux: constify and reconcile function parameter names
Align the parameter names between declarations and definitions, and
constify read-only parameters.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: tweak the subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:38 -05:00