| Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fixes from Paul Moore:
"Three SELinux patches for v6.18 to fix issues around accessing the
per-task decision cache that we introduced in v6.16 to help reduce
SELinux overhead on path walks. The problem was that despite the cache
being located in the SELinux "task_security_struct", the parent struct
wasn't actually tied to the task, it was tied to a cred.
Historically SELinux did locate the task_security_struct in the
task_struct's security blob, but it was later relocated to the cred
struct when the cred work happened, as it made the most sense at the
time.
Unfortunately we never did the task_security_struct to
cred_security_struct rename work (avoid code churn maybe? who knows)
because it didn't really matter at the time. However, it suddenly
became a problem when we added a per-task cache to a per-cred object
and didn't notice because of the old, no-longer-correct struct naming.
Thanks to KCSAN for flagging this, as the silly humans running things
forgot that the task_security_struct was a big lie.
This contains three patches, only one of which actually fixes the
problem described above and moves the SELinux decision cache from the
per-cred struct to a newly (re)created per-task struct.
The other two patches, which form the bulk of the diffstat, take care
of the associated renaming tasks so we can hopefully avoid making the
same stupid mistake in the future.
For the record, I did contemplate sending just a fix for the cache,
leaving the renaming patches for the upcoming merge window, but the
type/variable naming ended up being pretty awful and would have made
v6.18 an outlier stuck between the "old" names and the "new" names in
v6.19. The renaming patches are also fairly mechanical/trivial and
shouldn't pose much risk despite their size.
TLDR; naming things may be hard, but if you mess it up bad things
happen"
* tag 'selinux-pr-20251121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: rename the cred_security_struct variables to "crsec"
selinux: move avdcache to per-task security struct
selinux: rename task_security_struct to cred_security_struct
|
|
Along with the renaming from task_security_struct to cred_security_struct,
rename the local variables to "crsec" from "tsec". This both fits with
existing conventions and helps distinguish between task and cred related
variables.
No functional changes.
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
The avdcache is meant to be per-task; move it to a new
task_security_struct that is duplicated per-task.
Cc: stable@vger.kernel.org
Fixes: 5d7ddc59b3d89b724a5aa8f30d0db94ff8d2d93f ("selinux: reduce path walk overhead")
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
[PM: line length fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Before Linux had cred structures, the SELinux task_security_struct was
per-task and although the structure was switched to being per-cred
long ago, the name was never updated. This change renames it to
cred_security_struct to avoid confusion and pave the way for the
introduction of an actual per-task security structure for SELinux. No
functional change.
Cc: stable@vger.kernel.org
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
At this point it is guaranteed this is not the last reference.
However, a recent addition of might_sleep() at top of iput() started
generating false-positives as it was executing for all values.
Remedy the problem by using the newly introduced iput_not_last().
Reported-by: syzbot+12479ae15958fc3f54ec@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d32659.a70a0220.4f78.0012.GAE@google.com/
Fixes: 2ef435a872ab ("fs: add might_sleep() annotation to iput() and more")
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://patch.msgid.link/20251105212025.807549-2-mjguzik@gmail.com
Reviewed-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity updates from Mimi Zohar:
"Just a couple of changes: crypto code cleanup and a IMA xattr bug fix"
* tag 'integrity-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: don't clear IMA_DIGSIG flag when setting or removing non-IMA xattr
lib/digsig: Use SHA-1 library instead of crypto_shash
integrity: Select CRYPTO from INTEGRITY_ASYMMETRIC_KEYS
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull keys updates from Jarkko Sakkinen:
"A few minor updates/fixes for keys"
* tag 'keys-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
security: keys: use menuconfig for KEYS symbol
KEYS: encrypted: Use SHA-256 library instead of crypto_shash
KEYS: trusted_tpm1: Move private functionality out of public header
KEYS: trusted_tpm1: Use SHA-1 library instead of crypto_shash
KEYS: trusted_tpm1: Compare HMAC values in constant time
|
|
Give the KEYS kconfig symbol and its associated symbols a separate menu
space under Security options by using "menuconfig" instead of "config".
This also makes it easier to find the security and LSM options.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Instead of the "sha256" crypto_shash, just use sha256(). Similarly,
instead of the "hmac(sha256)" crypto_shash, just use
hmac_sha256_usingrawkey(). This is simpler and faster.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull file->f_path constification from Al Viro:
"Only one thing was modifying ->f_path of an opened file - acct(2).
Massaging that away and constifying a bunch of struct path * arguments
in functions that might be given &file->f_path ends up with the
situation where we can turn ->f_path into an anon union of const
struct path f_path and struct path __f_path, the latter modified only
in a few places in fs/{file_table,open,namei}.c, all for struct file
instances that are yet to be opened"
* tag 'pull-f_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (23 commits)
Have cc(1) catch attempts to modify ->f_path
kernel/acct.c: saner struct file treatment
configfs:get_target() - release path as soon as we grab configfs_item reference
apparmor/af_unix: constify struct path * arguments
ovl_is_real_file: constify realpath argument
ovl_sync_file(): constify path argument
ovl_lower_dir(): constify path argument
ovl_get_verity_digest(): constify path argument
ovl_validate_verity(): constify {meta,data}path arguments
ovl_ensure_verity_loaded(): constify datapath argument
ksmbd_vfs_set_init_posix_acl(): constify path argument
ksmbd_vfs_inherit_posix_acl(): constify path argument
ksmbd_vfs_kern_path_unlock(): constify path argument
ksmbd_vfs_path_lookup_locked(): root_share_path can be const struct path *
check_export(): constify path argument
export_operations->open(): constify path argument
rqst_exp_get_by_name(): constify path argument
nfs: constify path argument of __vfs_getattr()
bpf...d_path(): constify path argument
done_path_create(): constify path argument
...
|
|
Pull d_name audit update from Al Viro:
"Simplifying ->d_name audits, easy part.
Turn dentry->d_name into an anon union of const struct qsrt (d_name
itself) and a writable alias (__d_name).
With constification of some struct qstr * arguments of functions that
get &dentry->d_name passed to them, that ends up with all
modifications provably done only in fs/dcache.c (and a fairly small
part of it).
Any new places doing modifications will be easy to find - grep for
__d_name will suffice"
* tag 'pull-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
make it easier to catch those who try to modify ->d_name
generic_ci_validate_strict_name(): constify name argument
afs_dir_search: constify qstr argument
afs_edit_dir_{add,remove}(): constify qstr argument
exfat_find(): constify qstr argument
security_dentry_init_security(): constify qstr argument
|
|
Currently when both IMA and EVM are in fix mode, the IMA signature will
be reset to IMA hash if a program first stores IMA signature in
security.ima and then writes/removes some other security xattr for the
file.
For example, on Fedora, after booting the kernel with "ima_appraise=fix
evm=fix ima_policy=appraise_tcb" and installing rpm-plugin-ima,
installing/reinstalling a package will not make good reference IMA
signature generated. Instead IMA hash is generated,
# getfattr -m - -d -e hex /usr/bin/bash
# file: usr/bin/bash
security.ima=0x0404...
This happens because when setting security.selinux, the IMA_DIGSIG flag
that had been set early was cleared. As a result, IMA hash is generated
when the file is closed.
Similarly, IMA signature can be cleared on file close after removing
security xattr like security.evm or setting/removing ACL.
Prevent replacing the IMA file signature with a file hash, by preventing
the IMA_DIGSIG flag from being reset.
Here's a minimal C reproducer which sets security.selinux as the last
step which can also replaced by removing security.evm or setting ACL,
#include <stdio.h>
#include <sys/xattr.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
int main() {
const char* file_path = "/usr/sbin/test_binary";
const char* hex_string = "030204d33204490066306402304";
int length = strlen(hex_string);
char* ima_attr_value;
int fd;
fd = open(file_path, O_WRONLY|O_CREAT|O_EXCL, 0644);
if (fd == -1) {
perror("Error opening file");
return 1;
}
ima_attr_value = (char*)malloc(length / 2 );
for (int i = 0, j = 0; i < length; i += 2, j++) {
sscanf(hex_string + i, "%2hhx", &ima_attr_value[j]);
}
if (fsetxattr(fd, "security.ima", ima_attr_value, length/2, 0) == -1) {
perror("Error setting extended attribute");
close(fd);
return 1;
}
const char* selinux_value= "system_u:object_r:bin_t:s0";
if (fsetxattr(fd, "security.selinux", selinux_value, strlen(selinux_value), 0) == -1) {
perror("Error setting extended attribute");
close(fd);
return 1;
}
close(fd);
return 0;
}
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Select CRYPTO from INTEGRITY_ASYMMETRIC_KEYS, since
INTEGRITY_ASYMMETRIC_KEYS selects several options that depend on CRYPTO.
This unblocks the removal of the CRYPTO selection from SIGNATURE.
SIGNATURE (lib/digsig.c) itself will no longer need CRYPTO, but
INTEGRITY_ASYMMETRIC_KEYS was depending on it indirectly via the chain
SIGNATURE => INTEGRITY_SIGNATURE => INTEGRITY_ASYMMETRIC_KEYS.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Pull bitmap updates from Yury Norov:
- FIELD_PREP_WM16() consolidation (Nicolas)
- bitmaps for Rust (Burak)
- __fls() fix for arc (Kees)
* tag 'bitmap-for-6.18' of https://github.com/norov/linux: (25 commits)
rust: add dynamic ID pool abstraction for bitmap
rust: add find_bit_benchmark_rust module.
rust: add bitmap API.
rust: add bindings for bitops.h
rust: add bindings for bitmap.h
phy: rockchip-pcie: switch to FIELD_PREP_WM16 macro
clk: sp7021: switch to FIELD_PREP_WM16 macro
PCI: dw-rockchip: Switch to FIELD_PREP_WM16 macro
PCI: rockchip: Switch to FIELD_PREP_WM16* macros
net: stmmac: dwmac-rk: switch to FIELD_PREP_WM16 macro
ASoC: rockchip: i2s-tdm: switch to FIELD_PREP_WM16_CONST macro
drm/rockchip: dw_hdmi: switch to FIELD_PREP_WM16* macros
phy: rockchip-usb: switch to FIELD_PREP_WM16 macro
drm/rockchip: inno-hdmi: switch to FIELD_PREP_WM16 macro
drm/rockchip: dw_hdmi_qp: switch to FIELD_PREP_WM16 macro
phy: rockchip-samsung-dcphy: switch to FIELD_PREP_WM16 macro
drm/rockchip: vop2: switch to FIELD_PREP_WM16 macro
drm/rockchip: dsi: switch to FIELD_PREP_WM16* macros
phy: rockchip-emmc: switch to FIELD_PREP_WM16 macro
drm/rockchip: lvds: switch to FIELD_PREP_WM16 macro
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore:
- Move the management of the LSM BPF security blobs into the framework
In order to enable multiple LSMs we need to allocate and free the
various security blobs in the LSM framework and not the individual
LSMs as they would end up stepping all over each other.
- Leverage the lsm_bdev_alloc() helper in lsm_bdev_alloc()
Make better use of our existing helper functions to reduce some code
duplication.
- Update the Rust cred code to use 'sync::aref'
Part of a larger effort to move the Rust code over to the 'sync'
module.
- Make CONFIG_LSM dependent on CONFIG_SECURITY
As the CONFIG_LSM Kconfig setting is an ordered list of the LSMs to
enable a boot, it obviously doesn't make much sense to enable this
when CONFIG_SECURITY is disabled.
- Update the LSM and CREDENTIALS sections in MAINTAINERS with Rusty
bits
Add the Rust helper files to the associated LSM and CREDENTIALS
entries int the MAINTAINERS file. We're trying to improve the
communication between the two groups and making sure we're all aware
of what is going on via cross-posting to the relevant lists is a good
way to start.
* tag 'lsm-pr-20250926' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
lsm: CONFIG_LSM can depend on CONFIG_SECURITY
MAINTAINERS: add the associated Rust helper to the CREDENTIALS section
MAINTAINERS: add the associated Rust helper to the LSM section
rust,cred: update AlwaysRefCounted import to sync::aref
security: use umax() to improve code
lsm,selinux: Add LSM blob support for BPF objects
lsm: use lsm_blob_alloc() in lsm_bdev_alloc()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
- Support per-file labeling for functionfs
Both genfscon and user defined labeling methods are supported. This
should help users who want to provide separation between the control
endpoint file, "ep0", and other endpoints.
- Remove our use of get_zeroed_page() in sel_read_bool()
Update sel_read_bool() to use a four byte stack buffer instead of a
memory page fetched via get_zeroed_page(), and fix a memory in the
process.
Needless to say we should have done this a long time ago, but it was
in a very old chunk of code that "just worked" and I don't think
anyone had taken a real look at it in many years.
- Better use of the netdev skb/sock helper functions
Convert a sk_to_full_sk(skb->sk) into a skb_to_full_sk(skb) call.
- Remove some old, dead, and/or redundant code
* tag 'selinux-pr-20250926' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: enable per-file labeling for functionfs
selinux: fix sel_read_bool() allocation and error handling
selinux: Remove redundant __GFP_NOWARN
selinux: use a consistent method to get full socket from skb
selinux: Remove unused function selinux_policycap_netif_wildcard()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
- Proper audit support for multiple LSMs
As the audit subsystem predated the work to enable multiple LSMs,
some additional work was needed to support logging the different LSM
labels for the subjects/tasks and objects on the system. Casey's
patches add new auxillary records for subjects and objects that
convey the additional labels.
- Ensure fanotify audit events are always generated
Generally speaking security relevant subsystems always generate audit
events, unless explicitly ignored. However, up to this point fanotify
events had been ignored by default, but starting with this pull
request fanotify follows convention and generates audit events by
default.
- Replace an instance of strcpy() with strscpy()
- Minor indentation, style, and comment fixes
* tag 'audit-pr-20250926' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: fix skb leak when audit rate limit is exceeded
audit: init ab->skb_list earlier in audit_buffer_alloc()
audit: add record for multiple object contexts
audit: add record for multiple task security contexts
lsm: security_lsmblob_to_secctx module selection
audit: create audit_stamp structure
audit: add a missing tab
audit: record fanotify event regardless of presence of rules
audit: fix typo in auditfilter.c comment
audit: Replace deprecated strcpy() with strscpy()
audit: fix indentation in audit_log_exit()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull copy_process updates from Christian Brauner:
"This contains the changes to enable support for clone3() on nios2
which apparently is still a thing.
The more exciting part of this is that it cleans up the inconsistency
in how the 64-bit flag argument is passed from copy_process() into the
various other copy_*() helpers"
[ Fixed up rv ltl_monitor 32-bit support as per Sasha Levin in the merge ]
* tag 'kernel-6.18-rc1.clone3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
nios2: implement architecture-specific portion of sys_clone3
arch: copy_thread: pass clone_flags as u64
copy_process: pass clone_flags as u64 across calltree
copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64)
|
|
Move functionality used only by trusted_tpm1.c out of the public header
<keys/trusted_tpm.h>. Specifically, change the exported functions into
static functions, since they are not used outside trusted_tpm1.c, and
move various other definitions and inline functions to trusted_tpm1.c.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Use the SHA-1 and HMAC-SHA1 library functions instead of crypto_shash.
This is simpler and faster.
Replace the selection of CRYPTO, CRYPTO_HMAC, and CRYPTO_SHA1 with
CRYPTO_LIB_SHA1 and CRYPTO_LIB_UTILS. The latter is needed for
crypto_memneq() which was previously being pulled in via CRYPTO.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
To prevent timing attacks, HMAC value comparison needs to be constant
time. Replace the memcmp() with the correct function, crypto_memneq().
[For the Fixes commit I used the commit that introduced the memcmp().
It predates the introduction of crypto_memneq(), but it was still a bug
at the time even though a helper function didn't exist yet.]
Fixes: d00a1c72f7f4 ("keys: add new trusted key-type")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Provides an abstraction for C bitmap API and bitops operations.
This commit enables a Rust implementation of an Android Binder
data structure from commit 15d9da3f818c ("binder: use bitmap for faster
descriptor lookup"), which can be found in drivers/android/dbitmap.h.
It is a step towards upstreaming the Rust port of Android Binder driver.
We follow the C Bitmap API closely in naming and semantics, with
a few differences that take advantage of Rust language facilities
and idioms. The main types are `BitmapVec` for owned bitmaps and
`Bitmap` for references to C bitmaps.
* We leverage Rust type system guarantees as follows:
* all (non-atomic) mutating operations require a &mut reference which
amounts to exclusive access.
* the `BitmapVec` type implements Send. This enables transferring
ownership between threads and is needed for Binder.
* the `BitmapVec` type implements Sync, which enables passing shared
references &Bitmap between threads. Atomic operations can be
used to safely modify from multiple threads (interior
mutability), though without ordering guarantees.
* The Rust API uses `{set,clear}_bit` vs `{set,clear}_bit_atomic` as
names for clarity, which differs from the C naming convention
`set_bit` for atomic vs `__set_bit` for non-atomic.
* we include enough operations for the API to be useful. Not all
operations are exposed yet in order to avoid dead code. The missing
ones can be added later.
* We take a fine-grained approach to safety:
* Low-level bit-ops get a safe API with bounds checks. Calling with
an out-of-bounds arguments to {set,clear}_bit becomes a no-op and
get logged as errors.
* We also introduce a RUST_BITMAP_HARDENED config, which
causes invocations with out-of-bounds arguments to panic.
* methods correspond to find_* C methods tolerate out-of-bounds
since the C implementation does. Also here, out-of-bounds
arguments are logged as errors, or panic in RUST_BITMAP_HARDENED
mode.
* We add a way to "borrow" bitmaps from C in Rust, to make C bitmaps
that were allocated in C directly usable in Rust code (`Bitmap`).
* the Rust API is optimized to represent the bitmap inline if it would
fit into a pointer. This saves allocations which is
relevant in the Binder use case.
The underlying C bitmap is *not* exposed for raw access in Rust. Doing so
would permit bypassing the Rust API and lose static guarantees.
An alternative route of vendoring an existing Rust bitmap package was
considered but suboptimal overall. Reusing the C implementation is
preferable for a basic data structure like bitmaps. It enables Rust
code to be a lot more similar and predictable with respect to C code
that uses the same data structures and enables the use of code that
has been tried-and-tested in the kernel, with the same performance
characteristics whenever possible.
We use the `usize` type for sizes and indices into the bitmap,
because Rust generally always uses that type for indices and lengths
and it will be more convenient if the API accepts that type. This means
that we need to perform some casts to/from u32 and usize, since the C
headers use unsigned int instead of size_t/unsigned long for these
numbers in some places.
Adds new MAINTAINERS section BITMAP API [RUST].
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Burak Emir <bqe@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
|
|
unix_sk(sock)->path should never be modified, least of all by LSM...
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Nothing outside of fs/dcache.c has any business modifying
dentry names; passing &dentry->d_name as an argument should
have that argument declared as a const pointer.
Acked-by: Casey Schaufler <casey@schaufler-ca.com> # smack part
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
When CONFIG_SECURITY is not set, CONFIG_LSM (builtin_lsm_order) does
not need to be visible and settable since builtin_lsm_order is defined in
security.o, which is only built when CONFIG_SECURITY=y.
So make CONFIG_LSM depend on CONFIG_SECURITY.
Fixes: 13e735c0e953 ("LSM: Introduce CONFIG_LSM")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
[PM: subj tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
This patch adds support for genfscon per-file labeling of functionfs
files as well as support for userspace to apply labels after new
functionfs endpoints are created.
This allows for separate labels and therefore access control on a
per-endpoint basis. An example use case would be for the default
endpoint EP0 used as a restricted control endpoint, and additional
usb endpoints to be used by other more permissive domains.
It should be noted that if there are multiple functionfs mounts on a
system, genfs file labels will apply to all mounts, and therefore will not
likely be as useful as the userspace relabeling portion of this patch -
the addition to selinux_is_genfs_special_handling().
This patch introduces the functionfs_seclabel policycap to maintain
existing functionfs genfscon behavior unless explicitly enabled.
Signed-off-by: Neill Kapron <nkapron@google.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
[PM: trim changelog, apply boolean logic fixup]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Switch sel_read_bool() from using get_zeroed_page() and free_page()
to a stack-allocated buffer. This also fixes a memory leak in the
error path when security_get_bool_value() returns an error.
Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
With the introduction of clone3 in commit 7f192e3cd316 ("fork: add
clone3") the effective bit width of clone_flags on all architectures was
increased from 32-bit to 64-bit, with a new type of u64 for the flags.
However, for most consumers of clone_flags the interface was not
changed from the previous type of unsigned long.
While this works fine as long as none of the new 64-bit flag bits
(CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still
undesirable in terms of the principle of least surprise.
Thus, this commit fixes all relevant interfaces of callees to
sys_clone3/copy_process (excluding the architecture-specific
copy_thread) to consistently pass clone_flags as u64, so that
no truncation to 32-bit integers occurs on 32-bit architectures.
Signed-off-by: Simon Schuster <schuster.simon@siemens-energy.com>
Link: https://lore.kernel.org/20250901-nios2-implement-clone3-v2-2-53fcf5577d57@siemens-energy.com
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Instead of doing direct access to ->i_count, add a helper to handle
this. This will make it easier to convert i_count to a refcount later.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/9bc62a84c6b9d6337781203f60837bd98fbc4a96.1756222464.git.josef@toxicpanda.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Create a new audit record AUDIT_MAC_OBJ_CONTEXTS.
An example of the MAC_OBJ_CONTEXTS record is:
type=MAC_OBJ_CONTEXTS
msg=audit(1601152467.009:1050):
obj_selinux=unconfined_u:object_r:user_home_t:s0
When an audit event includes a AUDIT_MAC_OBJ_CONTEXTS record
the "obj=" field in other records in the event will be "obj=?".
An AUDIT_MAC_OBJ_CONTEXTS record is supplied when the system has
multiple security modules that may make access decisions based
on an object security context.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj tweak, audit example readability indents]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Replace the single skb pointer in an audit_buffer with a list of
skb pointers. Add the audit_stamp information to the audit_buffer as
there's no guarantee that there will be an audit_context containing
the stamp associated with the event. At audit_log_end() time create
auxiliary records as have been added to the list. Functions are
created to manage the skb list in the audit_buffer.
Create a new audit record AUDIT_MAC_TASK_CONTEXTS.
An example of the MAC_TASK_CONTEXTS record is:
type=MAC_TASK_CONTEXTS
msg=audit(1600880931.832:113)
subj_apparmor=unconfined
subj_smack=_
When an audit event includes a AUDIT_MAC_TASK_CONTEXTS record the
"subj=" field in other records in the event will be "subj=?".
An AUDIT_MAC_TASK_CONTEXTS record is supplied when the system has
multiple security modules that may make access decisions based on a
subject security context.
Refactor audit_log_task_context(), creating a new audit_log_subj_ctx().
This is used in netlabel auditing to provide multiple subject security
contexts as necessary.
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj tweak, audit example readability indents]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Add a parameter lsmid to security_lsmblob_to_secctx() to identify which
of the security modules that may be active should provide the security
context. If the value of lsmid is LSM_ID_UNDEF the first LSM providing
a hook is used. security_secid_to_secctx() is unchanged, and will
always report the first LSM providing a hook.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Use umax() to reduce the code in update_mmap_min_addr() and improve its
readability.
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
[PM: subj line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Commit 16f5dfbc851b ("gfp: include __GFP_NOWARN in GFP_NOWAIT")
made GFP_NOWAIT implicitly include __GFP_NOWARN.
Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT
(e.g., `GFP_NOWAIT | __GFP_NOWARN`) is now redundant. Let's clean
up these redundant flags across subsystems.
No functional changes.
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
[PM: fixed horizontal spacing / alignment, line wraps]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
This patch introduces LSM blob support for BPF maps, programs, and
tokens to enable LSM stacking and multiplexing of LSM modules that
govern BPF objects. Additionally, the existing BPF hooks used by
SELinux have been updated to utilize the new blob infrastructure,
removing the assumption of exclusive ownership of the security
pointer.
Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
[PM: dropped local variable init, style fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Convert the lsm_bdev_alloc() function to use the lsm_blob_alloc() helper
like all of the other LSM security blob allocators.
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
In order to maintain code consistency and readability,
skb_to_full_sk() is used to get full socket from skb.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
This is unused since commit a3d3043ef24a ("selinux: get netif_wildcard
policycap from policy instead of cache").
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor updates from John Johansen:
"This has one major feature, it pulls in a cleaned up version of
af_unix mediation that Ubuntu has been carrying for years. It is
placed behind a new abi to ensure that it does cause policy
regressions. With pulling in the af_unix mediation there have been
cleanups and some refactoring of network socket mediation. This
accounts for the majority of the changes in the diff.
In addition there are a few improvements providing minor code
optimizations. several code cleanups, and bug fixes.
Features:
- improve debug printing
- carry mediation check on label (optimization)
- improve ability for compiler to optimize
__begin_current_label_crit_section
- transition for a linked list of rulesets to a vector of rulesets
- don't hardcode profile signal, allow it to be set by policy
- ability to mediate caps via the state machine instead of lut
- Add Ubuntu af_unix mediation, put it behind new v9 abi
Cleanups:
- fix typos and spelling errors
- cleanup kernel doc and code inconsistencies
- remove redundant checks/code
- remove unused variables
- Use str_yes_no() helper function
- mark tables static where appropriate
- make all generated string array headers const char *const
- refactor to doc semantics of file_perm checks
- replace macro calls to network/socket fns with explicit calls
- refactor/cleanup socket mediation code preparing for finer grained
mediation of different network families
- several updates to kernel doc comments
Bug fixes:
- fix incorrect profile->signal range check
- idmap mount fixes
- policy unpack unaligned access fixes
- kfree_sensitive() where appropriate
- fix oops when freeing policy
- fix conflicting attachment resolution
- fix exec table look-ups when stacking isn't first
- fix exec auditing
- mitigate userspace generating overly large xtables"
* tag 'apparmor-pr-2025-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (60 commits)
apparmor: fix: oops when trying to free null ruleset
apparmor: fix Regression on linux-next (next-20250721)
apparmor: fix test error: WARNING in apparmor_unix_stream_connect
apparmor: Remove the unused variable rules
apparmor: fix: accept2 being specifie even when permission table is presnt
apparmor: transition from a list of rules to a vector of rules
apparmor: fix documentation mismatches in val_mask_to_str and socket functions
apparmor: remove redundant perms.allow MAY_EXEC bitflag set
apparmor: fix kernel doc warnings for kernel test robot
apparmor: Fix unaligned memory accesses in KUnit test
apparmor: Fix 8-byte alignment for initial dfa blob streams
apparmor: shift uid when mediating af_unix in userns
apparmor: shift ouid when mediating hard links in userns
apparmor: make sure unix socket labeling is correctly updated.
apparmor: fix regression in fs based unix sockets when using old abi
apparmor: fix AA_DEBUG_LABEL()
apparmor: fix af_unix auditing to include all address information
apparmor: Remove use of the double lock
apparmor: update kernel doc comments for xxx_label_crit_section
apparmor: make __begin_current_label_crit_section() indicate whether put is needed
...
|
|
profile allocation is wrongly setting the number of entries on the
rules vector before any ruleset is assigned. If profile allocation
fails between ruleset allocation and assigning the first ruleset,
free_ruleset() will be called with a null pointer resulting in an
oops.
[ 107.350226] kernel BUG at mm/slub.c:545!
[ 107.350912] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 107.351447] CPU: 1 UID: 0 PID: 27 Comm: ksoftirqd/1 Not tainted 6.14.6-hwe-rlee287-dev+ #5
[ 107.353279] Hardware name:[ 107.350218] -QE-----------[ cutMU here ]--------- Ub---
[ 107.3502untu26] kernel BUG a 24t mm/slub.c:545.!04 P
[ 107.350912]C ( Oops: invalid oi4pcode: 0000 [#1]40 PREEMPT SMP NOPFXTI
+ PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 107.356054] RIP: 0010:__slab_free+0x152/0x340
[ 107.356444] Code: 00 4c 89 ff e8 0f ac df 00 48 8b 14 24 48 8b 4c 24 20 48 89 44 24 08 48 8b 03 48 c1 e8 09 83 e0 01 88 44 24 13 e9 71 ff ff ff <0f> 0b 41 f7 44 24 08 87 04 00 00 75 b2 eb a8 41 f7 44 24 08 87 04
[ 107.357856] RSP: 0018:ffffad4a800fbbb0 EFLAGS: 00010246
[ 107.358937] RAX: ffff97ebc2a88e70 RBX: ffffd759400aa200 RCX: 0000000000800074
[ 107.359976] RDX: ffff97ebc2a88e60 RSI: ffffd759400aa200 RDI: ffffad4a800fbc20
[ 107.360600] RBP: ffffad4a800fbc50 R08: 0000000000000001 R09: ffffffff86f02cf2
[ 107.361254] R10: 0000000000000000 R11: 0000000000000000 R12: ffff97ecc0049400
[ 107.361934] R13: ffff97ebc2a88e60 R14: ffff97ecc0049400 R15: 0000000000000000
[ 107.362597] FS: 0000000000000000(0000) GS:ffff97ecfb200000(0000) knlGS:0000000000000000
[ 107.363332] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 107.363784] CR2: 000061c9545ac000 CR3: 0000000047aa6000 CR4: 0000000000750ef0
[ 107.364331] PKRU: 55555554
[ 107.364545] Call Trace:
[ 107.364761] <TASK>
[ 107.364931] ? local_clock+0x15/0x30
[ 107.365219] ? srso_alias_return_thunk+0x5/0xfbef5
[ 107.365593] ? kfree_sensitive+0x32/0x70
[ 107.365900] kfree+0x29d/0x3a0
[ 107.366144] ? srso_alias_return_thunk+0x5/0xfbef5
[ 107.366510] ? local_clock_noinstr+0xe/0xd0
[ 107.366841] ? srso_alias_return_thunk+0x5/0xfbef5
[ 107.367209] kfree_sensitive+0x32/0x70
[ 107.367502] aa_free_profile.part.0+0xa2/0x400
[ 107.367850] ? rcu_do_batch+0x1e6/0x5e0
[ 107.368148] aa_free_profile+0x23/0x60
[ 107.368438] label_free_switch+0x4c/0x80
[ 107.368751] label_free_rcu+0x1c/0x50
[ 107.369038] rcu_do_batch+0x1e8/0x5e0
[ 107.369324] ? rcu_do_batch+0x157/0x5e0
[ 107.369626] rcu_core+0x1b0/0x2f0
[ 107.369888] rcu_core_si+0xe/0x20
[ 107.370156] handle_softirqs+0x9b/0x3d0
[ 107.370460] ? smpboot_thread_fn+0x26/0x210
[ 107.370790] run_ksoftirqd+0x3a/0x70
[ 107.371070] smpboot_thread_fn+0xf9/0x210
[ 107.371383] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 107.371746] kthread+0x10d/0x280
[ 107.372010] ? __pfx_kthread+0x10/0x10
[ 107.372310] ret_from_fork+0x44/0x70
[ 107.372655] ? __pfx_kthread+0x10/0x10
[ 107.372974] ret_from_fork_asm+0x1a/0x30
[ 107.373316] </TASK>
[ 107.373505] Modules linked in: af_packet_diag mptcp_diag tcp_diag udp_diag raw_diag inet_diag snd_seq_dummy snd_hrtimer snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore qrtr binfmt_misc intel_rapl_msr intel_rapl_common kvm_amd ccp kvm irqbypass polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd i2c_piix4 i2c_smbus input_leds joydev sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci dmi_sysfs qemu_fw_cfg ip_tables x_tables autofs4 hid_generic usbhid hid psmouse serio_raw floppy bochs pata_acpi
[ 107.379086] ---[ end trace 0000000000000000 ]---
Don't set the count until a ruleset is actually allocated and
guard against free_ruleset() being called with a null pointer.
Reported-by: Ryan Lee <ryan.lee@canonical.com>
Fixes: 217af7e2f4de ("apparmor: refactor profile rules and attachments")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity update from Mimi Zohar:
"A single commit to permit disabling IMA from the boot command line for
just the kdump kernel.
The exception itself sort of makes sense. My concern is that
exceptions do not remain as exceptions, but somehow morph to become
the norm"
* tag 'integrity-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: add a knob ima= to allow disabling IMA in kdump kernel
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux
Pull capabilities update from Serge Hallyn:
- Fix broken link in documentation in capability.h
- Correct the permission check for unsafe exec
During exec, different effective and real credentials were assumed to
mean changed credentials, making it impossible in the no-new-privs
case to keep different uid and euid
* tag 'caps-pr-20250729' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux:
uapi: fix broken link in linux/capability.h
exec: Correct the permission check for unsafe exec
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe
Pull ipe update from Fan Wu:
"A single commit from Eric Biggers to simplify the IPE (Integrity
Policy Enforcement) policy audit with the SHA-256 library API"
* tag 'ipe-pr-20250728' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe:
ipe: use SHA-256 library API instead of crypto_shash API
|
|
sk lock initialization was incorrectly removed, from
apparmor_file_alloc_security() while testing changes to changes to
apparmor_sk_alloc_security()
resulting in the following regression.
[ 48.056654] INFO: trying to register non-static key.
[ 48.057480] The code is fine but needs lockdep annotation, or maybe
[ 48.058416] you didn't initialize this object before use?
[ 48.059209] turning off the locking correctness validator.
[ 48.060040] CPU: 0 UID: 0 PID: 648 Comm: chronyd Not tainted 6.16.0-rc7-test-next-20250721-11410-g1ee809985e11-dirty #577 NONE
[ 48.060049] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 48.060055] Call Trace:
[ 48.060059] <TASK>
[ 48.060063] dump_stack_lvl (lib/dump_stack.c:122)
[ 48.060075] register_lock_class (kernel/locking/lockdep.c:988 kernel/locking/lockdep.c:1302)
[ 48.060084] ? path_name (security/apparmor/file.c:159)
[ 48.060093] __lock_acquire (kernel/locking/lockdep.c:5116)
[ 48.060103] lock_acquire (kernel/locking/lockdep.c:473 (discriminator 4) kernel/locking/lockdep.c:5873 (discriminator 4) kernel/locking/lockdep.c:5828 (discriminator 4))
[ 48.060109] ? update_file_ctx (security/apparmor/file.c:464)
[ 48.060115] ? __pfx_profile_path_perm (security/apparmor/file.c:247)
[ 48.060121] _raw_spin_lock (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 48.060130] ? update_file_ctx (security/apparmor/file.c:464)
[ 48.060134] update_file_ctx (security/apparmor/file.c:464)
[ 48.060140] aa_file_perm (security/apparmor/file.c:532 (discriminator 1) security/apparmor/file.c:642 (discriminator 1))
[ 48.060147] ? __pfx_aa_file_perm (security/apparmor/file.c:607)
[ 48.060152] ? do_mmap (mm/mmap.c:558)
[ 48.060160] ? __pfx_userfaultfd_unmap_complete (fs/userfaultfd.c:841)
[ 48.060170] ? __lock_acquire (kernel/locking/lockdep.c:4677 (discriminator 1) kernel/locking/lockdep.c:5194 (discriminator 1))
[ 48.060176] ? common_file_perm (security/apparmor/lsm.c:535 (discriminator 1))
[ 48.060185] security_mmap_file (security/security.c:3012 (discriminator 2))
[ 48.060192] vm_mmap_pgoff (mm/util.c:574 (discriminator 1))
[ 48.060200] ? find_held_lock (kernel/locking/lockdep.c:5353 (discriminator 1))
[ 48.060206] ? __pfx_vm_mmap_pgoff (mm/util.c:568)
[ 48.060212] ? lock_release (kernel/locking/lockdep.c:5539 kernel/locking/lockdep.c:5892 kernel/locking/lockdep.c:5878)
[ 48.060219] ? __fget_files (arch/x86/include/asm/preempt.h:85 (discriminator 13) include/linux/rcupdate.h:100 (discriminator 13) include/linux/rcupdate.h:873 (discriminator 13) fs/file.c:1072 (discriminator 13))
[ 48.060229] ksys_mmap_pgoff (mm/mmap.c:604)
[ 48.060239] do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
[ 48.060248] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 48.060254] RIP: 0033:0x7fb6920e30a2
[ 48.060265] Code: 08 00 04 00 00 eb e2 90 41 f7 c1 ff 0f 00 00 75 27 55 89 cd 53 48 89 fb 48 85 ff 74 33 41 89 ea 48 89 df b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5e 5b 5d c3 0f 1f 00 c7 05 e6 41 01 00 16 00
All code
========
0: 08 00 or %al,(%rax)
2: 04 00 add $0x0,%al
4: 00 eb add %ch,%bl
6: e2 90 loop 0xffffffffffffff98
8: 41 f7 c1 ff 0f 00 00 test $0xfff,%r9d
f: 75 27 jne 0x38
11: 55 push %rbp
12: 89 cd mov %ecx,%ebp
14: 53 push %rbx
15: 48 89 fb mov %rdi,%rbx
18: 48 85 ff test %rdi,%rdi
1b: 74 33 je 0x50
1d: 41 89 ea mov %ebp,%r10d
20: 48 89 df mov %rbx,%rdi
23: b8 09 00 00 00 mov $0x9,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 5e ja 0x90
32: 5b pop %rbx
33: 5d pop %rbp
34: c3 ret
35: 0f 1f 00 nopl (%rax)
38: c7 .byte 0xc7
39: 05 e6 41 01 00 add $0x141e6,%eax
3e: 16 (bad)
...
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 5e ja 0x66
8: 5b pop %rbx
9: 5d pop %rbp
a: c3 ret
b: 0f 1f 00 nopl (%rax)
e: c7 .byte 0xc7
f: 05 e6 41 01 00 add $0x141e6,%eax
14: 16 (bad)
...
[ 48.060270] RSP: 002b:00007ffd2c0d3528 EFLAGS: 00000206 ORIG_RAX: 0000000000000009
[ 48.060279] RAX: ffffffffffffffda RBX: 00007fb691fc8000 RCX: 00007fb6920e30a2
[ 48.060283] RDX: 0000000000000005 RSI: 000000000007d000 RDI: 00007fb691fc8000
[ 48.060287] RBP: 0000000000000812 R08: 0000000000000003 R09: 0000000000011000
[ 48.060290] R10: 0000000000000812 R11: 0000000000000206 R12: 00007ffd2c0d3578
[ 48.060293] R13: 00007fb6920b6160 R14: 00007ffd2c0d39f0 R15: 00000fffa581a6a8
Fixes: 88fec3526e84 ("apparmor: make sure unix socket labeling is correctly updated.")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
commit 88fec3526e84 ("apparmor: make sure unix socket labeling is correctly updated.")
added the use of security_sk_alloc() which ensures the sk label is
initialized.
This means that the AA_BUG in apparmor_unix_stream_connect() is no
longer correct, because while the sk is still not being initialized
by going through post_create, it is now initialize in sk_alloc().
Remove the now invalid check.
Reported-by: syzbot+cd38ee04bcb3866b0c6d@syzkaller.appspotmail.com
Fixes: 88fec3526e84 ("apparmor: make sure unix socket labeling is correctly updated.")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Variable rules is not effectively used, so delete it.
security/apparmor/lsm.c:182:23: warning: variable ‘rules’ set but not used.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=22942
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Madhavan Srinivasan:
- CONFIG_HZ changes to move the base_slice from 10ms to 1ms
- Patchset to move some of the mutex handling to lock guard
- Expose secvars relevant to the key management mode
- Misc cleanups and fixes
Thanks to Ankit Chauhan, Christophe Leroy, Donet Tom, Gautam Menghani,
Haren Myneni, Johan Korsnes, Madadi Vineeth Reddy, Paul Mackerras,
Shrikanth Hegde, Srish Srinivasan, Thomas Fourier, Thomas Huth, Thomas
Weißschuh, Souradeep, Amit Machhiwal, R Nageswara Sastry, Venkat Rao
Bagalkote, Andrew Donnellan, Greg Kroah-Hartman, Mimi Zohar, Mukesh
Kumar Chaurasiya, Nayna Jain, Ritesh Harjani (IBM), Sourabh Jain, Srikar
Dronamraju, Stefan Berger, Tyrel Datwyler, and Kowshik Jois.
* tag 'powerpc-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (23 commits)
arch/powerpc: Remove .interp section in vmlinux
powerpc: Drop GPL boilerplate text with obsolete FSF address
powerpc: Don't use %pK through printk
arch: powerpc: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
misc: ocxl: Replace scnprintf() with sysfs_emit() in sysfs show functions
integrity/platform_certs: Allow loading of keys in the static key management mode
powerpc/secvar: Expose secvars relevant to the key management mode
powerpc/pseries: Correct secvar format representation for static key management
(powerpc/512) Fix possible `dma_unmap_single()` on uninitialized pointer
powerpc: floppy: Add missing checks after DMA map
book3s64/radix : Optimize vmemmap start alignment
book3s64/radix : Handle error conditions properly in radix_vmemmap_populate
powerpc/pseries/dlpar: Search DRC index from ibm,drc-indexes for IO add
KVM: PPC: Book3S HV: Add H_VIRT mapping for tracing exits
powerpc: sysdev: use lock guard for mutex
powerpc: powernv: ocxl: use lock guard for mutex
powerpc: book3s: vas: use lock guard for mutex
powerpc: fadump: use lock guard for mutex
powerpc: rtas: use lock guard for mutex
powerpc: eeh: use lock guard for mutex
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock update from Mickaël Salaün:
"Fix test issues, improve build compatibility, and add new tests"
* tag 'landlock-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
landlock: Fix cosmetic change
samples/landlock: Fix building on musl libc
landlock: Fix warning from KUnit tests
selftests/landlock: Add test to check rule tied to covered mount point
selftests/landlock: Fix build of audit_test
selftests/landlock: Fix readlink check
|
|
audit_policy() does not support any other algorithm, so the crypto_shash
abstraction provides no value. Just use the SHA-256 library API
instead, which is much simpler and easier to use.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Fan Wu <wufan@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
- Introduce the concept of a SELinux "neveraudit" type which prevents
all auditing of the given type/domain.
Taken by itself, the benefit of marking a SELinux domain with the
"neveraudit" tag is likely not very interesting, especially given the
significant overlap with the "dontaudit" tag.
However, given that the "neveraudit" tag applies to *all* auditing of
the tagged domain, we can do some fairly interesting optimizations
when a SELinux domain is marked as both "permissive" and "dontaudit"
(think of the unconfined_t domain).
While this pull request includes optimized inode permission and
getattr hooks, these optimizations require SELinux policy changes,
therefore the improvements may not be visible on standard downstream
Linux distos for a period of time.
- Continue the deprecation process of /sys/fs/selinux/user.
After removing the associated userspace code in 2020, we marked the
/sys/fs/selinux/user interface as deprecated in Linux v6.13 with
pr_warn() and the usual documention update.
This adds a five second sleep after the pr_warn(), following a
previous deprecation process pattern that has worked well for us in
the past in helping identify any existing users that we haven't yet
reached.
- Add a __GFP_NOWARN flag to our initial hash table allocation.
Fuzzers such a syzbot often attempt abnormally large SELinux policy
loads, which the SELinux code gracefully handles by checking for
allocation failures, but not before the allocator emits a warning
which causes the automated fuzzing to flag this as an error and
report it to the list. While we want to continue to support the work
done by the fuzzing teams, we want to focus on proper issues and not
an error case that is already handled safely. Add a NOWARN flag to
quiet the allocator and prevent syzbot from tripping on this again.
- Remove some unnecessary selinuxfs cleanup code, courtesy of Al.
- Update the SELinux in-kernel documentation with pointers to
additional information.
* tag 'selinux-pr-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: don't bother with selinuxfs_info_free() on failures
selinux: add __GFP_NOWARN to hashtab_init() allocations
selinux: optimize selinux_inode_getattr/permission() based on neveraudit|permissive
selinux: introduce neveraudit types
documentation: add links to SELinux resources
selinux: add a 5 second sleep to /sys/fs/selinux/user
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore:
- Add Nicolas Bouchinet and Xiu Jianfeng as Lockdown maintainers
The Lockdown LSM has been without a dedicated mantainer since its
original acceptance upstream, and it has suffered as a result.
Thankfully we have two new volunteers who together I believe have the
background and desire to help ensure Lockdown is properly supported.
- Remove the unused cap_mmap_file() declaration
* tag 'lsm-pr-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
MAINTAINERS: Add Xiu and myself as Lockdown maintainers
security: Remove unused declaration cap_mmap_file()
lsm: trivial comment fix
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library conversions from Eric Biggers:
"Convert fsverity and apparmor to use the SHA-2 library functions
instead of crypto_shash. This is simpler and also slightly faster"
* tag 'libcrypto-conversions-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
fsverity: Switch from crypto_shash to SHA-2 library
fsverity: Explicitly include <linux/export.h>
apparmor: use SHA-256 library API instead of crypto_shash API
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- Introduce and start using TRAILING_OVERLAP() helper for fixing
embedded flex array instances (Gustavo A. R. Silva)
- mux: Convert mux_control_ops to a flex array member in mux_chip
(Thorsten Blum)
- string: Group str_has_prefix() and strstarts() (Andy Shevchenko)
- Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
Kees Cook)
- Refactor and rename stackleak feature to support Clang
- Add KUnit test for seq_buf API
- Fix KUnit fortify test under LTO
* tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
sched/task_stack: Add missing const qualifier to end_of_stack()
kstack_erase: Support Clang stack depth tracking
kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
init.h: Disable sanitizer coverage for __init and __head
kstack_erase: Disable kstack_erase for all of arm compressed boot code
x86: Handle KCOV __init vs inline mismatches
arm64: Handle KCOV __init vs inline mismatches
s390: Handle KCOV __init vs inline mismatches
arm: Handle KCOV __init vs inline mismatches
mips: Handle KCOV __init vs inline mismatch
powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section
configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON
configs/hardening: Enable CONFIG_KSTACK_ERASE
stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS
stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
stackleak: Rename STACKLEAK to KSTACK_ERASE
seq_buf: Introduce KUnit tests
string: Group str_has_prefix() and strstarts()
kunit/fortify: Add back "volatile" for sizeof() constants
acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull fileattr updates from Christian Brauner:
"This introduces the new file_getattr() and file_setattr() system calls
after lengthy discussions.
Both system calls serve as successors and extensible companions to
the FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR system calls which have
started to show their age in addition to being named in a way that
makes it easy to conflate them with extended attribute related
operations.
These syscalls allow userspace to set filesystem inode attributes on
special files. One of the usage examples is the XFS quota projects.
XFS has project quotas which could be attached to a directory. All new
inodes in these directories inherit project ID set on parent
directory.
The project is created from userspace by opening and calling
FS_IOC_FSSETXATTR on each inode. This is not possible for special
files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
with empty project ID. Those inodes then are not shown in the quota
accounting but still exist in the directory. This is not critical but
in the case when special files are created in the directory with
already existing project quota, these new inodes inherit extended
attributes. This creates a mix of special files with and without
attributes. Moreover, special files with attributes don't have a
possibility to become clear or change the attributes. This, in turn,
prevents userspace from re-creating quota project on these existing
files.
In addition, these new system calls allow the implementation of
additional attributes that we couldn't or didn't want to fit into the
legacy ioctls anymore"
* tag 'vfs-6.17-rc1.fileattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: tighten a sanity check in file_attr_to_fileattr()
tree-wide: s/struct fileattr/struct file_kattr/g
fs: introduce file_getattr and file_setattr syscalls
fs: prepare for extending file_get/setattr()
fs: make vfs_fileattr_[get|set] return -EOPNOTSUPP
selinux: implement inode_file_[g|s]etattr hooks
lsm: introduce new hooks for setting/getting inode fsxattr
fs: split fileattr related helpers into separate file
|
|
Pull misc VFS updates from Al Viro:
"VFS-related cleanups in various places (mostly of the "that really
can't happen" or "there's a better way to do it" variety)"
* tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
gpib: use file_inode()
binder_ioctl_write_read(): simplify control flow a bit
secretmem: move setting O_LARGEFILE and bumping users' count to the place where we create the file
apparmor: file never has NULL f_path.mnt
landlock: opened file never has a negative dentry
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull securityfs updates from Al Viro:
"Securityfs cleanups and fixes:
- one extra reference is enough to pin a dentry down; no need for
two. Switch to regular scheme, similar to shmem, debugfs, etc. This
fixes a securityfs_recursive_remove() dentry leak, among other
things.
- we need to have the filesystem pinned to prevent the contents
disappearing; what we do not need is pinning it for each file.
Doing that only for files and directories in the root is enough.
- the previous two changes allow us to get rid of the racy kludges in
efi_secret_unlink(), where we can use simple_unlink() instead of
securityfs_remove(). Which does not require unlocking and relocking
the parent, with all deadlocks that invites.
- Make securityfs_remove() take the entire subtree out, turning
securityfs_recursive_remove() into its alias. Makes a lot more
sense for callers and fixes a mount leak, while we are at it.
- Making securityfs_remove() remove the entire subtree allows for
much simpler life in most of the users - efi_secret, ima_fs, evm,
ipe, tmp get cleaner. I hadn't touched apparmor use of securityfs,
but I suspect that it would be useful there as well"
* tag 'pull-securityfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
tpm: don't bother with removal of files in directory we'll be removing
ipe: don't bother with removal of files in directory we'll be removing
evm_secfs: clear securityfs interactions
ima_fs: get rid of lookup-by-dentry stuff
ima_fs: don't bother with removal of files in directory we'll be removing
efi_secret: clean securityfs use up
make securityfs_remove() remove the entire subtree
fix locking in efi_secret_unlink()
securityfs: pin filesystem only for objects directly in root
securityfs: don't pin dentries twice, once is enough...
|
|
Wire up CONFIG_KSTACK_ERASE to Clang 21's new stack depth tracking
callback[1] option.
Link: https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-stack-depth [1]
Acked-by: Nicolas Schier <n.schier@avm.de>
Link: https://lore.kernel.org/r/20250724055029.3623499-4-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The Clang stack depth tracking implementation has a fixed name for
the stack depth tracking callback, "__sanitizer_cov_stack_depth", so
rename the GCC plugin function to match since the plugin has no external
dependencies on naming.
Link: https://lore.kernel.org/r/20250717232519.2984886-2-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
In preparation for adding Clang sanitizer coverage stack depth tracking
that can support stack depth callbacks:
- Add the new top-level CONFIG_KSTACK_ERASE option which will be
implemented either with the stackleak GCC plugin, or with the Clang
stack depth callback support.
- Rename CONFIG_GCC_PLUGIN_STACKLEAK as needed to CONFIG_KSTACK_ERASE,
but keep it for anything specific to the GCC plugin itself.
- Rename all exposed "STACKLEAK" names and files to "KSTACK_ERASE" (named
for what it does rather than what it protects against), but leave as
many of the internals alone as possible to avoid even more churn.
While here, also split "prev_lowest_stack" into CONFIG_KSTACK_ERASE_METRICS,
since that's the only place it is referenced from.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250717232519.2984886-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The transition to the perms32 permission table dropped the need for
the accept2 table as permissions. However accept2 can be used for
flags and may be present even when the perms32 table is present. So
instead of checking on version, check whether the table is present.
Fixes: 2e12c5f06017 ("apparmor: add additional flags to extended permission.")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The set of rules on a profile is not dynamically extended, instead
if a new ruleset is needed a new version of the profile is created.
This allows us to use a vector of rules instead of a list, slightly
reducing memory usage and simplifying the code.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
This patch fixes kernel-doc warnings:
1. val_mask_to_str:
- Added missing descriptions for `size` and `table` parameters.
- Removed outdated str_size and chrs references.
2. Socket Functions:
- Makes non-null requirements clear for socket/address args.
- Standardizes return values per kernel conventions.
- Adds Unix domain socket protocol details.
These changes silence doc validation warnings and improve accuracy for
AppArmor LSM docs.
Signed-off-by: Peng Jiang <jiang.peng9@zte.com.cn>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
This section of profile_transition that occurs after x_to_label only
happens if perms.allow already has the MAY_EXEC bit set, so we don't need
to set it again.
Fixes: 16916b17b4f8 ("apparmor: force auditing of conflicting attachment execs from confined")
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Fix kernel doc warnings for the functions
- apparmor_socket_bind
- apparmor_unix_may_send
- apparmor_unix_stream_connect
- val_mask_to_str
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506070127.B1bc3da4-lkp@intel.com/
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The testcase triggers some unnecessary unaligned memory accesses on the
parisc architecture:
Kernel: unaligned access to 0x12f28e27 in policy_unpack_test_init+0x180/0x374 (iir 0x0cdc1280)
Kernel: unaligned access to 0x12f28e67 in policy_unpack_test_init+0x270/0x374 (iir 0x64dc00ce)
Use the existing helper functions put_unaligned_le32() and
put_unaligned_le16() to avoid such warnings on architectures which
prefer aligned memory accesses.
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 98c0cc48e27e ("apparmor: fix policy_unpack_test on big endian systems")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The dfa blob stream for the aa_dfa_unpack() function is expected to be aligned
on a 8 byte boundary.
The static nulldfa_src[] and stacksplitdfa_src[] arrays store the initial
apparmor dfa blob streams, but since they are declared as an array-of-chars
the compiler and linker will only ensure a "char" (1-byte) alignment.
Add an __aligned(8) annotation to the arrays to tell the linker to always
align them on a 8-byte boundary. This avoids runtime warnings at startup on
alignment-sensitive platforms like parisc such as:
Kernel: unaligned access to 0x7f2a584a in aa_dfa_unpack+0x124/0x788 (iir 0xca0109f)
Kernel: unaligned access to 0x7f2a584e in aa_dfa_unpack+0x210/0x788 (iir 0xca8109c)
Kernel: unaligned access to 0x7f2a586a in aa_dfa_unpack+0x278/0x788 (iir 0xcb01090)
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Fixes: 98b824ff8984 ("apparmor: refcount the pdb")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Avoid unshifted ouids for socket file operations as observed when using
AppArmor profiles in unprivileged containers with LXD or Incus.
For example, root inside container and uid 1000000 outside, with
`owner /root/sock rw,` profile entry for nc:
/root$ nc -lkU sock & nc -U sock
==> dmesg
apparmor="DENIED" operation="connect" class="file"
namespace="root//lxd-podia_<var-snap-lxd-common-lxd>" profile="sockit"
name="/root/sock" pid=3924 comm="nc" requested_mask="wr" denied_mask="wr"
fsuid=1000000 ouid=0 [<== should be 1000000]
Fix by performing uid mapping as per common_perm_cond() in lsm.c
Signed-off-by: Gabriel Totev <gabriel.totev@zetier.com>
Fixes: c05e705812d1 ("apparmor: add fine grained af_unix mediation")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
When using AppArmor profiles inside an unprivileged container,
the link operation observes an unshifted ouid.
(tested with LXD and Incus)
For example, root inside container and uid 1000000 outside, with
`owner /root/link l,` profile entry for ln:
/root$ touch chain && ln chain link
==> dmesg
apparmor="DENIED" operation="link" class="file"
namespace="root//lxd-feet_<var-snap-lxd-common-lxd>" profile="linkit"
name="/root/link" pid=1655 comm="ln" requested_mask="l" denied_mask="l"
fsuid=1000000 ouid=0 [<== should be 1000000] target="/root/chain"
Fix by mapping inode uid of old_dentry in aa_path_link() rather than
using it directly, similarly to how it's mapped in __file_path_perm()
later in the file.
Signed-off-by: Gabriel Totev <gabriel.totev@zetier.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
When a unix socket is passed into a different confinement domain make
sure its cached mediation labeling is updated to correctly reflect
which domains are using the socket.
Fixes: c05e705812d1 ("apparmor: add fine grained af_unix mediation")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
This line removal should not be there and it makes it more difficult to
backport the following patch.
Cc: Günther Noack <gnoack@google.com>
Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Fixes: 7a11275c3787 ("landlock: Refactor layer helpers")
Link: https://lore.kernel.org/r/20250719104204.545188-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Policy loaded using abi 7 socket mediation was not being applied
correctly in all cases. In some cases with fs based unix sockets a
subset of permissions where allowed when they should have been denied.
This was happening because the check for if the socket was an fs based
unix socket came before the abi check. But the abi check is where the
correct path is selected, so having the fs unix socket check occur
early would cause the wrong code path to be used.
Fix this by pushing the fs unix to be done after the abi check.
Fixes: dcd7a559411e ("apparmor: gate make fine grained unix mediation behind v9 abi")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
AA_DEBUG_LABEL() was not specifying it vargs, which is needed so it can
output debug parameters.
Fixes: 71e6cff3e0dd ("apparmor: Improve debug print infrastructure")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The auditing of addresses currently doesn't include the source address
and mixes source and foreign/peer under the same audit name. Fix this
so source is always addr, and the foreign/peer is peer_addr.
Fixes: c05e705812d1 ("apparmor: add fine grained af_unix mediation")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The use of the double lock is not necessary and problematic. Instead
pull the bits that need locks into their own sections and grab the
needed references.
Fixes: c05e705812d1 ("apparmor: add fine grained af_unix mediation")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Add a kernel doc header for __end_current_label_crit_section(), and
update the header for __begin_current_label_crit_section().
Fixes: b42ecc5f58ef ("apparmor: make __begin_current_label_crit_section() indicate whether put is needed")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
needed
Same as aa_get_newest_cred_label_condref().
This avoids a bunch of work overall and allows the compiler to note when no
clean up is necessary, allowing for tail calls.
This in particular happens in apparmor_file_permission(), which manages to
tail call aa_file_perm() 105 bytes in (vs a regular call 112 bytes in
followed by branches to figure out if clean up is needed).
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
This reverts commit e9ed1eb8f6217e53843d82ecf2d50f8d1a93e77c.
Eric has requested that this patch be taken through the libcrypto-next
tree, instead.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Some versions of the parser are generating an xtable transition per
state in the state machine, even when the state machine isn't using
the transition table.
The parser bug is triggered by
commit 2e12c5f06017 ("apparmor: add additional flags to extended permission.")
In addition to fixing this in userspace, mitigate this in the kernel
as part of the policy verification checks by detecting this situation
and adjusting to what is actually used, or if not used at all freeing
it, so we are not wasting unneeded memory on policy.
Fixes: 2e12c5f06017 ("apparmor: add additional flags to extended permission.")
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
This user of SHA-256 does not support any other algorithm, so the
crypto_shash abstraction provides no value. Just use the SHA-256
library API instead, which is much simpler and easier to use.
Acked-by: John Johansen <john.johansen@canonical.com>
Link: https://lore.kernel.org/r/20250630174805.59010-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
mode
On PLPKS enabled PowerVM LPAR, there is no provision to load signed
third-party kernel modules when the key management mode is static. This
is because keys from secure boot secvars are only loaded when the key
management mode is dynamic.
Allow loading of the trustedcadb and moduledb keys even in the static
key management mode, where the secvar format string takes the form
"ibm,plpks-sb-v0".
Signed-off-by: Srish Srinivasan <ssrish@linux.ibm.com>
Tested-by: R Nageswara Sastry <rnsastry@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250610211907.101384-4-ssrish@linux.ibm.com
|
|
Now that we expose struct file_attr as our uapi struct rename all the
internal struct to struct file_kattr to clearly communicate that it is a
kernel internal struct. This is similar to struct mount_{k}attr and
others.
Link: https://lore.kernel.org/20250703-restlaufzeit-baurecht-9ed44552b481@brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
These hooks are called on inode extended attribute retrieval/change.
Cc: selinux@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-3-c4e3bc35227b@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Introduce new hooks for setting and getting filesystem extended
attributes on inode (FS_IOC_FSGETXATTR).
Cc: selinux@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-2-c4e3bc35227b@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
get_id_range() expects a positive value as first argument but
get_random_u8() can return 0. Fix this by clamping it.
Validated by running the test in a for loop for 1000 times.
Note that MAX() is wrong as it is only supposed to be used for
constants, but max() is good here.
[..] ok 9 test_range2_rand1
[..] ok 10 test_range2_rand2
[..] ok 11 test_range2_rand15
[..] ------------[ cut here ]------------
[..] WARNING: CPU: 6 PID: 104 at security/landlock/id.c:99 test_range2_rand16 (security/landlock/id.c:99 (discriminator 1) security/landlock/id.c:234 (discriminator 1))
[..] Modules linked in:
[..] CPU: 6 UID: 0 PID: 104 Comm: kunit_try_catch Tainted: G N 6.16.0-rc1-dev-00001-g314a2f98b65f #1 PREEMPT(undef)
[..] Tainted: [N]=TEST
[..] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[..] RIP: 0010:test_range2_rand16 (security/landlock/id.c:99 (discriminator 1) security/landlock/id.c:234 (discriminator 1))
[..] Code: 49 c7 c0 10 70 30 82 4c 89 ff 48 c7 c6 a0 63 1e 83 49 c7 45 a0 e0 63 1e 83 e8 3f 95 17 00 e9 1f ff ff ff 0f 0b e9 df fd ff ff <0f> 0b ba 01 00 00 00 e9 68 fe ff ff 49 89 45 a8 49 8d 4d a0 45 31
[..] RSP: 0000:ffff888104eb7c78 EFLAGS: 00010246
[..] RAX: 0000000000000000 RBX: 000000000870822c RCX: 0000000000000000
^^^^^^^^^^^^^^^^
[..]
[..] Call Trace:
[..]
[..] ---[ end trace 0000000000000000 ]---
[..] ok 12 test_range2_rand16
[..] # landlock_id: pass:12 fail:0 skip:0 total:12
[..] # Totals: pass:12 fail:0 skip:0 total:12
[..] ok 1 landlock_id
Fixes: d9d2a68ed44b ("landlock: Add unique ID generator")
Signed-off-by: Tingmao Wang <m@maowtm.org>
Link: https://lore.kernel.org/r/73e28efc5b8cc394608b99d5bc2596ca917d7c4a.1750003733.git.m@maowtm.org
[mic: Minor cosmetic improvements]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Failures in sel_fill_super() will be followed by sel_kill_sb(), which
will call selinuxfs_info_free() anyway.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Christian Brauner <brauner@kernel.org>
[PM: subj and description tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Max Kellerman recently experienced a problem[1] when calling exec with
differing uid and euid's and he triggered the logic that is supposed
to only handle setuid executables.
When exec isn't changing anything in struct cred it doesn't make sense
to go into the code that is there to handle the case when the
credentials change.
When looking into the history of the code I discovered that this issue
was not present in Linux-2.4.0-test12 and was introduced in
Linux-2.4.0-prerelease when the logic for handling this case was moved
from prepare_binprm to compute_creds in fs/exec.c.
The bug introdused was to comparing euid in the new credentials with
uid instead of euid in the old credentials, when testing if setuid
had changed the euid.
Since triggering the keep ptrace limping along case for setuid
executables makes no sense when it was not a setuid exec revert back
to the logic present in Linux-2.4.0-test12.
This removes the confusingly named and subtlety incorrect helpers
is_setuid and is_setgid, that helped this bug to persist.
The varaiable is_setid is renamed to id_changed (it's Linux-2.4.0-test12)
as the old name describes what matters rather than it's cause.
The code removed in Linux-2.4.0-prerelease was:
- /* Set-uid? */
- if (mode & S_ISUID) {
- bprm->e_uid = inode->i_uid;
- if (bprm->e_uid != current->euid)
- id_change = 1;
- }
-
- /* Set-gid? */
- /*
- * If setgid is set but no group execute bit then this
- * is a candidate for mandatory locking, not a setgid
- * executable.
- */
- if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
- bprm->e_gid = inode->i_gid;
- if (!in_group_p(bprm->e_gid))
- id_change = 1;
Linux-2.4.0-prerelease added the current logic as:
+ if (bprm->e_uid != current->uid || bprm->e_gid != current->gid ||
+ !cap_issubset(new_permitted, current->cap_permitted)) {
+ current->dumpable = 0;
+
+ lock_kernel();
+ if (must_not_trace_exec(current)
+ || atomic_read(¤t->fs->count) > 1
+ || atomic_read(¤t->files->count) > 1
+ || atomic_read(¤t->sig->count) > 1) {
+ if(!capable(CAP_SETUID)) {
+ bprm->e_uid = current->uid;
+ bprm->e_gid = current->gid;
+ }
+ if(!capable(CAP_SETPCAP)) {
+ new_permitted = cap_intersect(new_permitted,
+ current->cap_permitted);
+ }
+ }
+ do_unlock = 1;
+ }
I have condenced the logic from Linux-2.4.0-test12 to just:
id_changed = !uid_eq(new->euid, old->euid) || !in_group_p(new->egid);
This change is userspace visible, but I don't expect anyone to care.
For the bug that is being fixed to trigger bprm->unsafe has to be set.
The variable bprm->unsafe is set when ptracing an executable, when
sharing a working directory, or when no_new_privs is set. Properly
testing for cases that are safe even in those conditions and doing
nothing special should not affect anyone. Especially if they were
previously ok with their credentials getting munged
To minimize behavioural changes the code continues to set secureexec
when euid != uid or when egid != gid.
[1] https://lkml.kernel.org/r/20250306082615.174777-1-max.kellermann@ionos.com
Reported-by: Max Kellermann <max.kellermann@ionos.com>
Fixes: 64444d3d0d7f ("Linux version 2.4.0-prerelease")
v1: https://lkml.kernel.org/r/878qmxsuy8.fsf@email.froward.int.ebiederm.org
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Kees Cook <kees@kernel.org>
|
|
As reported by syzbot, hashtab_init() can be affected by abnormally
large policy loads which would cause the kernel's allocator to emit
a warning in some configurations. Since the SELinux hashtab_init()
code handles the case where the allocation fails, due to a large
request or some other reason, we can safely add the __GFP_NOWARN flag
to squelch these abnormally large allocation warnings.
Reported-by: syzbot+bc2c99c2929c3d219fb3@syzkaller.appspotmail.com
Tested-by: syzbot+bc2c99c2929c3d219fb3@syzkaller.appspotmail.com
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
neveraudit|permissive
Extend the task avdcache to also cache whether the task SID is both
permissive and neveraudit, and return immediately if so in both
selinux_inode_getattr() and selinux_inode_permission().
The same approach could be applied to many of the hook functions
although the avdcache would need to be updated for more than directory
search checks in order for this optimization to be beneficial for checks
on objects other than directories.
To test, apply https://github.com/SELinuxProject/selinux/pull/473 to
your selinux userspace, build and install libsepol, and use the following
CIL policy module:
$ cat neverauditpermissive.cil
(typeneveraudit unconfined_t)
(typepermissive unconfined_t)
Without this module inserted, running the following commands:
perf record make -jN # on an already built allmodconfig tree
perf report --sort=symbol,dso
yields the following percentages (only showing __d_lookup_rcu for
reference and only showing relevant SELinux functions):
1.65% [k] __d_lookup_rcu
0.53% [k] selinux_inode_permission
0.40% [k] selinux_inode_getattr
0.15% [k] avc_lookup
0.05% [k] avc_has_perm
0.05% [k] avc_has_perm_noaudit
0.02% [k] avc_policy_seqno
0.02% [k] selinux_file_permission
0.01% [k] selinux_inode_alloc_security
0.01% [k] selinux_file_alloc_security
for a total of 1.24% for SELinux compared to 1.65% for
__d_lookup_rcu().
After running the following command to insert this module:
semodule -i neverauditpermissive.cil
and then re-running the same perf commands from above yields
the following non-zero percentages:
1.74% [k] __d_lookup_rcu
0.31% [k] selinux_inode_permission
0.03% [k] selinux_inode_getattr
0.03% [k] avc_policy_seqno
0.01% [k] avc_lookup
0.01% [k] selinux_file_permission
0.01% [k] selinux_file_open
for a total of 0.40% for SELinux compared to 1.74% for
__d_lookup_rcu().
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Introduce neveraudit types i.e. types that should never trigger
audit messages. This allows the AVC to skip all audit-related
processing for such types. Note that neveraudit differs from
dontaudit not only wrt being applied for all checks with a given
source type but also in that it disables all auditing, not just
permission denials.
When a type is both a permissive type and a neveraudit type,
the security server can short-circuit the security_compute_av()
logic, allowing all permissions and not auditing any permissions.
This change just introduces the basic support but does not yet
further optimize the AVC or hook function logic when a type
is both a permissive type and a dontaudit type.
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
If the end result of a security_compute_sid() computation matches the
ssid or tsid, return that SID rather than looking it up again. This
avoids the problem of multiple initial SIDs that map to the same
context.
Cc: stable@vger.kernel.org
Reported-by: Guido Trentalancia <guido@trentalancia.com>
Fixes: ae254858ce07 ("selinux: introduce an initial SID for early boot processes")
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Tested-by: Guido Trentalancia <guido@trentalancia.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
... and use securityfs_remove() instead of securityfs_recursive_remove()
Acked-by: Fan Wu <wufan@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
1) creation never returns NULL; error is reported as ERR_PTR()
2) no need to remove file before removing its parent
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
lookup_template_data_hash_algo() machinery is used to locate the
matching ima_algo_array[] element at read time; securityfs
allows to stash that into inode->i_private at object creation
time, so there's no need to bother
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
removal of parent takes all children out
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Reviewed-by: Christian Brauner <brauner@kernel.org>
Acked-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We should count the terminating NUL byte as part of the ctx_len.
Otherwise, UBSAN logs a warning:
UBSAN: array-index-out-of-bounds in security/selinux/xfrm.c:99:14
index 60 is out of range for type 'char [*]'
The allocation itself is correct so there is no actual out of bounds
indexing, just a warning.
Cc: stable@vger.kernel.org
Suggested-by: Christian Göttsche <cgzones@googlemail.com>
Link: https://lore.kernel.org/selinux/CAEjxPJ6tA5+LxsGfOJokzdPeRomBHjKLBVR6zbrg+_w3ZZbM3A@mail.gmail.com/
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Commit d7b6918e22c7 ("selinux: Deprecate /sys/fs/selinux/user") started
the deprecation process for /sys/fs/selinux/user:
The selinuxfs "user" node allows userspace to request a list
of security contexts that can be reached for a given SELinux
user from a given starting context. This was used by libselinux
when various login-style programs requested contexts for
users, but libselinux stopped using it in 2020.
Kernel support will be removed no sooner than Dec 2025.
A pr_warn() message has been in place since Linux v6.13, this patch
adds a five second sleep to /sys/fs/selinux/user to help make the
deprecation and upcoming removal more noticeable.
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Fix a typo in the security_inode_mkdir() comment block.
Signed-off-by: Kalevi Kolttonen <kalevi@kolttonen.fi>
[PM: subject tweak, add description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Kdump kernel doesn't need IMA functionality, and enabling IMA will cost
extra memory. It would be very helpful to allow IMA to be disabled for
kdump kernel.
Hence add a knob ima=on|off here to allow turning IMA off in kdump
kernel if needed.
Note that this IMA disabling is limited to kdump kernel, please don't
abuse it in other kernel and thus serious consequences are caused.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
... and fix the mount leak when anything's mounted there.
securityfs_recursive_remove becomes an alias for securityfs_remove -
we'll probably need to remove it in a cycle or two.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Nothing on securityfs ever changes parents, so we don't need
to pin the internal mount if it's already pinned for parent.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
incidentally, securityfs_recursive_remove() is broken without that -
it leaks dentries, since simple_recursive_removal() does not expect
anything of that sort. It could be worked around by dput() in
remove_one() callback, but it's easier to just drop that double-get
stuff.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Invert the FINAL_PUT bit so that test_bit_acquire and clear_bit_unlock
can be used instead of smp_mb.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull compiler version requirement update from Arnd Bergmann:
"Require gcc-8 and binutils-2.30
x86 already uses gcc-8 as the minimum version, this changes all other
architectures to the same version. gcc-8 is used is Debian 10 and Red
Hat Enterprise Linux 8, both of which are still supported, and
binutils 2.30 is the oldest corresponding version on those.
Ubuntu Pro 18.04 and SUSE Linux Enterprise Server 15 both use gcc-7 as
the system compiler but additionally include toolchains that remain
supported.
With the new minimum toolchain versions, a number of workarounds for
older versions can be dropped, in particular on x86_64 and arm64.
Importantly, the updated compiler version allows removing two of the
five remaining gcc plugins, as support for sancov and structeak
features is already included in modern compiler versions.
I tried collecting the known changes that are possible based on the
new toolchain version, but expect that more cleanups will be possible.
Since this touches multiple architectures, I merged the patches
through the asm-generic tree."
* tag 'gcc-minimum-version-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
Makefile.kcov: apply needed compiler option unconditionally in CFLAGS_KCOV
Documentation: update binutils-2.30 version reference
gcc-plugins: remove SANCOV gcc plugin
Kbuild: remove structleak gcc plugin
arm64: drop binutils version checks
raid6: skip avx512 checks
kbuild: require gcc-8 and binutils-2.30
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe
Pull IPE update from Fan Wu:
"A single commit from Jasjiv Singh, that adds an errno field to IPE
policy load auditing to log failures with error details, not just
successes.
This improves the security audit trail and helps diagnose policy
deployment issues"
* tag 'ipe-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe:
ipe: add errno field to IPE policy load auditing
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Implement the Device Memory TCP transmit path, allowing zero-copy
data transmission on top of TCP from e.g. GPU memory to the wire.
- Move all the IPv6 routing tables management outside the RTNL scope,
under its own lock and RCU. The route control path is now 3x times
faster.
- Convert queue related netlink ops to instance lock, reducing again
the scope of the RTNL lock. This improves the control plane
scalability.
- Refactor the software crc32c implementation, removing unneeded
abstraction layers and improving significantly the related
micro-benchmarks.
- Optimize the GRO engine for UDP-tunneled traffic, for a 10%
performance improvement in related stream tests.
- Cover more per-CPU storage with local nested BH locking; this is a
prep work to remove the current per-CPU lock in local_bh_disable()
on PREMPT_RT.
- Introduce and use nlmsg_payload helper, combining buffer bounds
verification with accessing payload carried by netlink messages.
Netfilter:
- Rewrite the procfs conntrack table implementation, improving
considerably the dump performance. A lot of user-space tools still
use this interface.
- Implement support for wildcard netdevice in netdev basechain and
flowtables.
- Integrate conntrack information into nft trace infrastructure.
- Export set count and backend name to userspace, for better
introspection.
BPF:
- BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops
programs and can be controlled in similar way to traditional qdiscs
using the "tc qdisc" command.
- Refactor the UDP socket iterator, addressing long standing issues
WRT duplicate hits or missed sockets.
Protocols:
- Improve TCP receive buffer auto-tuning and increase the default
upper bound for the receive buffer; overall this improves the
single flow maximum thoughput on 200Gbs link by over 60%.
- Add AFS GSSAPI security class to AF_RXRPC; it provides transport
security for connections to the AFS fileserver and VL server.
- Improve TCP multipath routing, so that the sources address always
matches the nexthop device.
- Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS,
and thus preventing DoS caused by passing around problematic FDs.
- Retire DCCP socket. DCCP only receives updates for bugs, and major
distros disable it by default. Its removal allows for better
organisation of TCP fields to reduce the number of cache lines hit
in the fast path.
- Extend TCP drop-reason support to cover PAWS checks.
Driver API:
- Reorganize PTP ioctl flag support to require an explicit opt-in for
the drivers, avoiding the problem of drivers not rejecting new
unsupported flags.
- Converted several device drivers to timestamping APIs.
- Introduce per-PHY ethtool dump helpers, improving the support for
dump operations targeting PHYs.
Tests and tooling:
- Add support for classic netlink in user space C codegen, so that
ynl-c can now read, create and modify links, routes addresses and
qdisc layer configuration.
- Add ynl sub-types for binary attributes, allowing ynl-c to output
known struct instead of raw binary data, clarifying the classic
netlink output.
- Extend MPTCP selftests to improve the code-coverage.
- Add tests for XDP tail adjustment in AF_XDP.
New hardware / drivers:
- OpenVPN virtual driver: offload OpenVPN data channels processing to
the kernel-space, increasing the data transfer throughput WRT the
user-space implementation.
- Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.
- Broadcom asp-v3.0 ethernet driver.
- AMD Renoir ethernet device.
- ReakTek MT9888 2.5G ethernet PHY driver.
- Aeonsemi 10G C45 PHYs driver.
Drivers:
- Ethernet high-speed NICs:
- nVidia/Mellanox (mlx5):
- refactor the steering table handling to significantly
reduce the amount of memory used
- add support for complex matches in H/W flow steering
- improve flow streeing error handling
- convert to netdev instance locking
- Intel (100G, ice, igb, ixgbe, idpf):
- ice: add switchdev support for LLDP traffic over VF
- ixgbe: add firmware manipulation and regions devlink support
- igb: introduce support for frame transmission premption
- igb: adds persistent NAPI configuration
- idpf: introduce RDMA support
- idpf: add initial PTP support
- Meta (fbnic):
- extend hardware stats coverage
- add devlink dev flash support
- Broadcom (bnxt):
- add support for RX-side device memory TCP
- Wangxun (txgbe):
- implement support for udp tunnel offload
- complete PTP and SRIOV support for AML 25G/10G devices
- Ethernet NICs embedded and virtual:
- Google (gve):
- add device memory TCP TX support
- Amazon (ena):
- support persistent per-NAPI config
- Airoha:
- add H/W support for L2 traffic offload
- add per flow stats for flow offloading
- RealTek (rtl8211): add support for WoL magic packet
- Synopsys (stmmac):
- dwmac-socfpga 1000BaseX support
- add Loongson-2K3000 support
- introduce support for hardware-accelerated VLAN stripping
- Broadcom (bcmgenet):
- expose more H/W stats
- Freescale (enetc, dpaa2-eth):
- enetc: add MAC filter, VLAN filter RSS and loopback support
- dpaa2-eth: convert to H/W timestamping APIs
- vxlan: convert FDB table to rhashtable, for better scalabilty
- veth: apply qdisc backpressure on full ring to reduce TX drops
- Ethernet switches:
- Microchip (kzZ88x3): add ETS scheduler support
- Ethernet PHYs:
- RealTek (rtl8211):
- add support for WoL magic packet
- add support for PHY LEDs
- CAN:
- Adds RZ/G3E CANFD support to the rcar_canfd driver.
- Preparatory work for CAN-XL support.
- Add self-tests framework with support for CAN physical interfaces.
- WiFi:
- mac80211:
- scan improvements with multi-link operation (MLO)
- Qualcomm (ath12k):
- enable AHB support for IPQ5332
- add monitor interface support to QCN9274
- add multi-link operation support to WCN7850
- add 802.11d scan offload support to WCN7850
- monitor mode for WCN7850, better 6 GHz regulatory
- Qualcomm (ath11k):
- restore hibernation support
- MediaTek (mt76):
- WiFi-7 improvements
- implement support for mt7990
- Intel (iwlwifi):
- enhanced multi-link single-radio (EMLSR) support on 5 GHz links
- rework device configuration
- RealTek (rtw88):
- improve throughput for RTL8814AU
- RealTek (rtw89):
- add multi-link operation support
- STA/P2P concurrency improvements
- support different SAR configs by antenna
- Bluetooth:
- introduce HCI Driver protocol
- btintel_pcie: do not generate coredump for diagnostic events
- btusb: add HCI Drv commands for configuring altsetting
- btusb: add RTL8851BE device 0x0bda:0xb850
- btusb: add new VID/PID 13d3/3584 for MT7922
- btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
- btnxpuart: implement host-wakeup feature"
* tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits)
selftests/bpf: Fix bpf selftest build warning
selftests: netfilter: Fix skip of wildcard interface test
net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames
net: openvswitch: Fix the dead loop of MPLS parse
calipso: Don't call calipso functions for AF_INET sk.
selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem
net_sched: hfsc: Address reentrant enqueue adding class to eltree twice
octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback
octeontx2-pf: QOS: Perform cache sync on send queue teardown
net: mana: Add support for Multi Vports on Bare metal
net: devmem: ncdevmem: remove unused variable
net: devmem: ksft: upgrade rx test to send 1K data
net: devmem: ksft: add 5 tuple FS support
net: devmem: ksft: add exit_wait to make rx test pass
net: devmem: ksft: add ipv4 support
net: devmem: preserve sockc_err
page_pool: fix ugly page_pool formatting
net: devmem: move list_add to net_devmem_bind_dmabuf.
selftests: netfilter: nft_queue.sh: include file transfer duration in log message
net: phy: mscc: Fix memory leak when using one step timestamping
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
- Reduce the SELinux impact on path walks.
Add a small directory access cache to the per-task SELinux state.
This cache allows SELinux to cache the most recently used directory
access decisions in order to avoid repeatedly querying the AVC on
path walks where the majority of the directories have similar
security contexts/labels.
My performance measurements are crude, but prior to this patch the
time spent in SELinux code on a 'make allmodconfig' run was 103% that
of __d_lookup_rcu(), and with this patch the time spent in SELinux
code dropped to 63% of __d_lookup_rcu(), a ~40% improvement.
Additional improvments can be expected in the future, but those will
require additional SELinux policy/toolchain support.
- Add support for wildcards in genfscon policy statements.
This patch allows for wildcards in the genfscon patch matching logic
as opposed to the prefix matching that was used prior to this change.
Adding wilcard support allows for more expressive and efficient path
matching in the policy which is especially helpful for sysfs, and has
resulted in a ~15% boot time reduction in Android.
SELinux policies can opt into wilcard matching by using the
"genfs_seclabel_wildcard" policy capability.
- Unify the error/OOM handling of the SELinux network caches.
A failure to allocate memory for the SELinux network caches isn't
fatal as the object label can still be safely returned to the caller,
it simply means that we cannot add the new data to the cache, at
least temporarily. This patch corrects this behavior for the
InfiniBand cache and does some minor cleanup.
- Minor improvements around constification, 'likely' annotations, and
removal of bogus comments.
* tag 'selinux-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix the kdoc header for task_avdcache_update
selinux: remove a duplicated include
selinux: reduce path walk overhead
selinux: support wildcard match in genfscon
selinux: drop copy-paste comment
selinux: unify OOM handling in network hashtables
selinux: add likely hints for fast paths
selinux: contify network namespace pointer
selinux: constify network address pointer
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm update from Paul Moore:
"One minor LSM framework patch to move the selinux_netlink_send() hook
under the CONFIG_SECURITY_NETWORK Kconfig knob"
* tag 'lsm-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
lsm: Move security_netlink_send to under CONFIG_SECURITY_NETWORK
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity updates from Mimi Zohar:
"Carrying the IMA measurement list across kexec is not a new feature,
but is updated to address a couple of issues:
- Carrying the IMA measurement list across kexec required knowing
apriori all the file measurements between the "kexec load" and
"kexec execute" in order to measure them before the "kexec load".
Any delay between the "kexec load" and "kexec exec" exacerbated the
problem.
- Any file measurements post "kexec load" were not carried across
kexec, resulting in the measurement list being out of sync with the
TPM PCR.
With these changes, the buffer for the IMA measurement list is still
allocated at "kexec load", but copying the IMA measurement list is
deferred to after quiescing the TPM.
Two new kexec critical data records are defined"
* tag 'integrity-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: do not copy measurement list to kdump kernel
ima: measure kexec load and exec events as critical data
ima: make the kexec extra memory configurable
ima: verify if the segment size has changed
ima: kexec: move IMA log copy from kexec load to execute
ima: kexec: define functions to copy IMA log at soft boot
ima: kexec: skip IMA segment validation after kexec soft reboot
kexec: define functions to map and unmap segments
ima: define and call ima_alloc_kexec_file_buf()
ima: rename variable the seq_file "file" to "ima_kexec_file"
|
|
Pull smack update from Casey Schaufler:
"One trivial kernel doc fix"
* tag 'Smack-for-6.16' of https://github.com/cschaufler/smack-next:
security/smack/smackfs: small kernel-doc fixes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- Update overflow helpers to ease refactoring of on-stack flex array
instances (Gustavo A. R. Silva, Kees Cook)
- lkdtm: Use SLAB_NO_MERGE instead of constructors (Harry Yoo)
- Simplify CONFIG_CC_HAS_COUNTED_BY (Jan Hendrik Farr)
- Disable u64 usercopy KUnit test on 32-bit SPARC (Thomas Weißschuh)
- Add missed designated initializers now exposed by fixed randstruct
(Nathan Chancellor, Kees Cook)
- Document compilers versions for __builtin_dynamic_object_size
- Remove ARM_SSP_PER_TASK GCC plugin
- Fix GCC plugin randstruct, add selftests, and restore COMPILE_TEST
builds
- Kbuild: induce full rebuilds when dependencies change with GCC
plugins, the Clang sanitizer .scl file, or the randstruct seed.
- Kbuild: Switch from -Wvla to -Wvla-larger-than=1
- Correct several __nonstring uses for -Wunterminated-string-initialization
* tag 'hardening-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
Revert "hardening: Disable GCC randstruct for COMPILE_TEST"
lib/tests: randstruct: Add deep function pointer layout test
lib/tests: Add randstruct KUnit test
randstruct: gcc-plugin: Remove bogus void member
net: qede: Initialize qede_ll_ops with designated initializer
scsi: qedf: Use designated initializer for struct qed_fcoe_cb_ops
md/bcache: Mark __nonstring look-up table
integer-wrap: Force full rebuild when .scl file changes
randstruct: Force full rebuild when seed changes
gcc-plugins: Force full rebuild when plugins change
kbuild: Switch from -Wvla to -Wvla-larger-than=1
hardening: simplify CONFIG_CC_HAS_COUNTED_BY
overflow: Fix direct struct member initialization in _DEFINE_FLEX()
kunit/overflow: Add tests for STACK_FLEX_ARRAY_SIZE() helper
overflow: Add STACK_FLEX_ARRAY_SIZE() helper
input/joystick: magellan: Mark __nonstring look-up table const
watchdog: exar: Shorten identity name to fit correctly
mod_devicetable: Enlarge the maximum platform_device_id name length
overflow: Clarify expectations for getting DEFINE_FLEX variable sizes
compiler_types: Identify compiler versions for __builtin_dynamic_object_size
...
|
|
Users of IPE require a way to identify when and why an operation fails,
allowing them to both respond to violations of policy and be notified
of potentially malicious actions on their systems with respect to IPE.
This patch introduces a new error field to the AUDIT_IPE_POLICY_LOAD event
to log policy loading failures. Currently, IPE only logs successful policy
loads, but not failures. Tracking failures is crucial to detect malicious
attempts and ensure a complete audit trail for security events.
The new error field will capture the following error codes:
* -ENOKEY: Key used to sign the IPE policy not found in the keyring
* -ESTALE: Attempting to update an IPE policy with an older version
* -EKEYREJECTED: IPE signature verification failed
* -ENOENT: Policy was deleted while updating
* -EEXIST: Same name policy already deployed
* -ERANGE: Policy version number overflow
* -EINVAL: Policy version parsing error
* -EPERM: Insufficient permission
* -ENOMEM: Out of memory (OOM)
* -EBADMSG: Policy is invalid
Here are some examples of the updated audit record types:
AUDIT_IPE_POLICY_LOAD(1422):
audit: AUDIT1422 policy_name="Test_Policy" policy_version=0.0.1
policy_digest=sha256:84EFBA8FA71E62AE0A537FAB962F8A2BD1053964C4299DCA
92BFFF4DB82E86D3 auid=1000 ses=3 lsm=ipe res=1 errno=0
The above record shows a new policy has been successfully loaded into
the kernel with the policy name, version, and hash with the errno=0.
AUDIT_IPE_POLICY_LOAD(1422) with error:
audit: AUDIT1422 policy_name=? policy_version=? policy_digest=?
auid=1000 ses=3 lsm=ipe res=0 errno=-74
The above record shows a policy load failure due to an invalid policy
(-EBADMSG).
By adding this error field, we ensure that all policy load attempts,
whether successful or failed, are logged, providing a comprehensive
audit trail for IPE policy management.
Signed-off-by: Jasjiv Singh <jasjivsingh@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs directory lookup updates from Christian Brauner:
"This contains cleanups for the lookup_one*() family of helpers.
We expose a set of functions with names containing "lookup_one_len"
and others without the "_len". This difference has nothing to do with
"len". It's rater a historical accident that can be confusing.
The functions without "_len" take a "mnt_idmap" pointer. This is found
in the "vfsmount" and that is an important question when choosing
which to use: do you have a vfsmount, or are you "inside" the
filesystem. A related question is "is permission checking relevant
here?".
nfsd and cachefiles *do* have a vfsmount but *don't* use the non-_len
functions. They pass nop_mnt_idmap and refuse to work on filesystems
which have any other idmap.
This work changes nfsd and cachefile to use the lookup_one family of
functions and to explictily pass &nop_mnt_idmap which is consistent
with all other vfs interfaces used where &nop_mnt_idmap is explicitly
passed.
The remaining uses of the "_one" functions do not require permission
checks so these are renamed to be "_noperm" and the permission
checking is removed.
This series also changes these lookup function to take a qstr instead
of separate name and len. In many cases this simplifies the call"
* tag 'vfs-6.16-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
VFS: change lookup_one_common and lookup_noperm_common to take a qstr
Use try_lookup_noperm() instead of d_hash_and_lookup() outside of VFS
VFS: rename lookup_one_len family to lookup_noperm and remove permission check
cachefiles: Use lookup_one() rather than lookup_one_len()
nfsd: Use lookup_one() rather than lookup_one_len()
VFS: improve interface for lookup_one functions
|
|
The label struct is variable length. While its use in struct aa_profile
is fixed length at 2 entries the variable length member needs to be
the last member in the structure.
The code already does this but the comment has it in the wrong location.
Also add a comment to ensure it stays at the end of the structure.
While we are at it, update the documentation for other profile members
as well.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The debug_values_table is only referenced from lib.c so it should
be static.
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Conflicting attachment paths are an error state that result in the
binary in question executing under an unexpected ix/ux fallback. As such,
it should be audited to record the occurrence of conflicting attachments.
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Instead of silently overwriting the conflicting profile attachment string,
include that information in the ix/ux fallback string that gets set as info
instead. Also add a warning print if some other info is set that would be
overwritten by the ix/ux fallback string or by the profile not found error.
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
declaration
Instead of having a literal, making this a constant will allow for (hacky)
detection of conflicting profile attachments from inspection of the info
pointer. This is used in the next patch to augment the information provided
through domain.c:x_to_label for ix/ux fallback.
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
find_attach may set info if something unusual happens during that process
(currently only used to signal conflicting attachments, but this could be
expanded in the future). This is information that should be propagated to
userspace via an audit message.
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
address_family_names and sock_type_names were created as const char *a[],
which declares them as (non-const) pointers to const chars. Since the
pointers themselves would not be changed, they should be generated as
const char *const a[].
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Conflicting attachment resolution is based on the number of states
traversed to reach an accepting state in the attachment DFA, accounting
for DFA loops traversed during the matching process. However, the loop
counting logic had multiple bugs:
- The inc_wb_pos macro increments both position and length, but length
is supposed to saturate upon hitting buffer capacity, instead of
wrapping around.
- If no revisited state is found when traversing the history, is_loop
would still return true, as if there was a loop found the length of
the history buffer, instead of returning false and signalling that
no loop was found. As a result, the adjustment step of
aa_dfa_leftmatch would sometimes produce negative counts with loop-
free DFAs that traversed enough states.
- The iteration in the is_loop for loop is supposed to stop before
i = wb->len, so the conditional should be < instead of <=.
This patch fixes the above bugs as well as the following nits:
- The count and size fields in struct match_workbuf were not used,
so they can be removed.
- The history buffer in match_workbuf semantically stores aa_state_t
and not unsigned ints, even if aa_state_t is currently unsigned int.
- The local variables in is_loop are counters, and thus should be
unsigned ints instead of aa_state_t's.
Fixes: 21f606610502 ("apparmor: improve overlapping domain attachment resolution")
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Co-developed-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc8).
Conflicts:
80f2ab46c2ee ("irdma: free iwdev->rf after removing MSI-X")
4bcc063939a5 ("ice, irdma: fix an off by one in error handling code")
c24a65b6a27c ("iidc/ice/irdma: Update IDC to support multiple consumers")
https://lore.kernel.org/20250513130630.280ee6c5@canb.auug.org.au
No extra adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add function short descriptions to the kernel-doc where missing.
Correct a verb and add ending periods to sentences.
smackfs.c:1080: warning: missing initial short description on line:
* smk_net4addr_insert
smackfs.c:1343: warning: missing initial short description on line:
* smk_net6addr_insert
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: linux-security-module@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
|
|
WB_HISTORY_SIZE was defined to be a value not a power of 2, despite a
comment in the declaration of struct match_workbuf stating it is and a
modular arithmetic usage in the inc_wb_pos macro assuming that it is. Bump
WB_HISTORY_SIZE's value up to 32 and add a BUILD_BUG_ON_NOT_POWER_OF_2
line to ensure that any future changes to the value of WB_HISTORY_SIZE
respect this requirement.
Fixes: 136db994852a ("apparmor: increase left match history buffer size")
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Fix kernel-doc warnings in apparmor header files as reported by
scripts/kernel-doc:
cred.h:128: warning: expecting prototype for end_label_crit_section(). Prototype was for end_current_label_crit_section() instead
file.h:108: warning: expecting prototype for aa_map_file_perms(). Prototype was for aa_map_file_to_perms() instead
lib.h:159: warning: Function parameter or struct member 'hname' not described in 'basename'
lib.h:159: warning: Excess function parameter 'name' description in 'basename'
match.h:21: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* The format used for transition tables is based on the GNU flex table
perms.h:109: warning: Function parameter or struct member 'accum' not described in 'aa_perms_accum_raw'
perms.h:109: warning: Function parameter or struct member 'addend' not described in 'aa_perms_accum_raw'
perms.h:136: warning: Function parameter or struct member 'accum' not described in 'aa_perms_accum'
perms.h:136: warning: Function parameter or struct member 'addend' not described in 'aa_perms_accum'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ryan Lee <ryan.lee@canonical.com>
Cc: John Johansen <john.johansen@canonical.com>
Cc: John Johansen <john@apparmor.net>
Cc: apparmor@lists.ubuntu.com
Cc: linux-security-module@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The check on profile->signal is always false, the value can never be
less than 1 *and* greater than MAXMAPPED_SIG. Fix this by replacing
the logical operator && with ||.
Fixes: 84c455decf27 ("apparmor: add support for profiles to define the kill signal")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
This user of SHA-256 does not support any other algorithm, so the
crypto_shash abstraction provides no value. Just use the SHA-256
library API instead, which is much simpler and easier to use.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
The unpack_secmark() function currently uses kfree() to release memory
allocated for secmark structures and their labels. However, if a failure
occurs after partially parsing secmark, sensitive data may remain in
memory, posing a security risk.
To mitigate this, replace kfree() with kfree_sensitive() for freeing
secmark structures and their labels, aligning with the approach used
in free_ruleset().
I am submitting this as an RFC to seek freedback on whether this change
is appropriate and aligns with the subsystem's expectations. If
confirmed to be helpful, I will send a formal patch.
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: John Johansen <john.johansen@canonical.com>
|
|
Kdump kernel doesn't need IMA to do integrity measurement.
Hence the measurement list in 1st kernel doesn't need to be copied to
kdump kernel.
Here skip allocating buffer for measurement list copying if loading
kdump kernel. Then there won't be the later handling related to
ima_kexec_buffer.
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Tested-by: Baoquan He <bhe@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Use the BIT() and BIT_ULL() macros in the new audit code instead of
explicit shifts to improve readability. Use bitmask instead of modulo
operation to simplify code.
Add test_range1_rand15() and test_range2_rand15() KUnit tests to improve
get_id_range() coverage.
Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20250512093732.1408485-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
This reverts commit f5c68a4e84f9feca3be578199ec648b676db2030.
It is again possible to build "allmodconfig" with the randstruct GCC
plugin, so enable it for COMPILE_TEST to catch future bugs.
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
A KUnit test checking boundaries triggers a canary warning, which may be
disturbing. Let's remove this test for now. Hopefully, KUnit will soon
get support for suppressing warning backtraces [1].
Cc: Alessandro Carminati <acarmina@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Günther Noack <gnoack@google.com>
Reported-by: Tingmao Wang <m@maowtm.org>
Closes: https://lore.kernel.org/r/20250327213807.12964-1-m@maowtm.org
Link: https://lore.kernel.org/r/20250425193249.78b45d2589575c15f483c3d8@linux-foundation.org [1]
Link: https://lore.kernel.org/r/20250503065359.3625407-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc5).
No conflicts or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
gcc-12 and higher support the -ftrivial-auto-var-init= flag, after
gcc-8 is the minimum version, this is half of the supported ones, and
the vast majority of the versions that users are actually likely to
have, so it seems like a good time to stop having the fallback
plugin implementation
Older toolchains are still able to build kernels normally without
this plugin, but won't be able to use variable initialization..
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The amount of memory allocated at kexec load, even with the extra memory
allocated, might not be large enough for the entire measurement list. The
indeterminate interval between kexec 'load' and 'execute' could exacerbate
this problem.
Define two new IMA events, 'kexec_load' and 'kexec_execute', to be
measured as critical data at kexec 'load' and 'execute' respectively.
Report the allocated kexec segment size, IMA binary log size and the
runtime measurements count as part of those events.
These events, and the values reported through them, serve as markers in
the IMA log to verify the IMA events are captured during kexec soft
reboot. The presence of a 'kexec_load' event in between the last two
'boot_aggregate' events in the IMA log implies this is a kexec soft
reboot, and not a cold-boot. And the absence of 'kexec_execute' event
after kexec soft reboot implies missing events in that window which
results in inconsistency with TPM PCR quotes, necessitating a cold boot
for a successful remote attestation.
These critical data events are displayed as hex encoded ascii in the
ascii_runtime_measurement_list. Verifying the critical data hash requires
calculating the hash of the decoded ascii string.
For example, to verify the 'kexec_load' data hash:
sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements
| grep kexec_load | cut -d' ' -f 6 | xxd -r -p | sha256sum
To verify the 'kexec_execute' data hash:
sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements
| grep kexec_execute | cut -d' ' -f 6 | xxd -r -p | sha256sum
Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
The extra memory allocated for carrying the IMA measurement list across
kexec is hard-coded as half a PAGE. Make it configurable.
Define a Kconfig option, IMA_KEXEC_EXTRA_MEMORY_KB, to configure the
extra memory (in kb) to be allocated for IMA measurements added during
kexec soft reboot. Ensure the default value of the option is set such
that extra half a page of memory for additional measurements is allocated
for the additional measurements.
Update ima_add_kexec_buffer() function to allocate memory based on the
Kconfig option value, rather than the currently hard-coded one.
Suggested-by: Stefan Berger <stefanb@linux.ibm.com>
Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
kexec 'load' may be called multiple times. Free and realloc the buffer
only if the segment_size is changed from the previous kexec 'load' call.
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
The IMA log is currently copied to the new kernel during kexec 'load' using
ima_dump_measurement_list(). However, the IMA measurement list copied at
kexec 'load' may result in loss of IMA measurements records that only
occurred after the kexec 'load'. Move the IMA measurement list log copy
from kexec 'load' to 'execute'
Make the kexec_segment_size variable a local static variable within the
file, so it can be accessed during both kexec 'load' and 'execute'.
Define kexec_post_load() as a wrapper for calling ima_kexec_post_load() and
machine_kexec_post_load(). Replace the existing direct call to
machine_kexec_post_load() with kexec_post_load().
When there is insufficient memory to copy all the measurement logs, copy as
much of the measurement list as possible.
Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
The IMA log is currently copied to the new kernel during kexec 'load'
using ima_dump_measurement_list(). However, the log copied at kexec
'load' may result in loss of IMA measurements that only occurred after
kexec "load'. Setup the needed infrastructure to move the IMA log copy
from kexec 'load' to 'execute'.
Define a new IMA hook ima_update_kexec_buffer() as a stub function.
It will be used to call ima_dump_measurement_list() during kexec 'execute'.
Implement ima_kexec_post_load() function to be invoked after the new
Kernel image has been loaded for kexec. ima_kexec_post_load() maps the
IMA buffer to a segment in the newly loaded Kernel. It also registers
the reboot notifier_block to trigger ima_update_kexec_buffer() at
kexec 'execute'.
Set the priority of register_reboot_notifier to INT_MIN to ensure that the
IMA log copy operation will happen at the end of the operation chain, so
that all the IMA measurement records extended into the TPM are copied
Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Currently, the function kexec_calculate_store_digests() calculates and
stores the digest of the segment during the kexec_file_load syscall,
where the IMA segment is also allocated.
Later, the IMA segment will be updated with the measurement log at the
kexec execute stage when a kexec reboot is initiated. Therefore, the
digests should be updated for the IMA segment in the normal case. The
problem is that the content of memory segments carried over to the new
kernel during the kexec systemcall can be changed at kexec 'execute'
stage, but the size and the location of the memory segments cannot be
changed at kexec 'execute' stage.
To address this, skip the calculation and storage of the digest for the
IMA segment in kexec_calculate_store_digests() so that it is not added
to the purgatory_sha_regions.
With this change, the IMA segment is not included in the digest
calculation, storage, and verification.
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
[zohar@linux.ibm.com: Fixed Signed-off-by tag to match author's email ]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
In the current implementation, the ima_dump_measurement_list() API is
called during the kexec "load" phase, where a buffer is allocated and
the measurement records are copied. Due to this, new events added after
kexec load but before kexec execute are not carried over to the new kernel
during kexec operation
Carrying the IMA measurement list across kexec requires allocating a
buffer and copying the measurement records. Separate allocating the
buffer and copying the measurement records into separate functions in
order to allocate the buffer at kexec 'load' and copy the measurements
at kexec 'execute'.
After moving the vfree() here at this stage in the patch set, the IMA
measurement list fails to verify when doing two consecutive "kexec -s -l"
with/without a "kexec -s -u" in between. Only after "ima: kexec: move
IMA log copy from kexec load to execute" the IMA measurement list verifies
properly with the vfree() here.
Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Before making the function local seq_file "file" variable file static
global, rename it to "ima_kexec_file".
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock fixes from Mickaël Salaün:
"Fix some Landlock audit issues, add related tests, and updates
documentation"
* tag 'landlock-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
landlock: Update log documentation
landlock: Fix documentation for landlock_restrict_self(2)
landlock: Fix documentation for landlock_create_ruleset(2)
selftests/landlock: Add PID tests for audit records
selftests/landlock: Factor out audit fixture in audit_test
landlock: Log the TGID of the domain creator
landlock: Remove incorrect warning
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc4).
This pull includes wireless and a fix to vxlan which isn't
in Linus's tree just yet. The latter creates with a silent conflict
/ build breakage, so merging it now to avoid causing problems.
drivers/net/vxlan/vxlan_vnifilter.c
094adad91310 ("vxlan: Use a single lock to protect the FDB table")
087a9eb9e597 ("vxlan: vnifilter: Fix unlocked deletion of default FDB entry")
https://lore.kernel.org/20250423145131.513029-1-idosch@nvidia.com
No "normal" conflicts, or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
security_netlink_send() is a networking hook, so it fits better under
CONFIG_SECURITY_NETWORK.
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
On IMA policy update, if a measure rule exists in the policy,
IMA_MEASURE is set for ima_policy_flags which makes the violation_check
variable always true. Coupled with a no-action on MAY_READ for a
FILE_CHECK call, we're always taking the inode_lock().
This becomes a performance problem for extremely heavy read-only workloads.
Therefore, prevent this only in the case there's no action to be taken.
Signed-off-by: Frederick Lawler <fred@cloudflare.com>
Acked-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Fix, deduplicate, and improve rendering of landlock_restrict_self(2)'s
flags documentation.
The flags are now rendered like the syscall's parameters and
description.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250416154716.1799902-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Move and fix the flags documentation, and improve formatting.
It makes more sense and it eases maintenance to document syscall flags
in landlock.h, where they are defined. This is already the case for
landlock_restrict_self(2)'s flags.
The flags are now rendered like the syscall's parameters and
description.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250416154716.1799902-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
There is a GCC crash bug in the randstruct for latest GCC versions that
is being tickled by landlock[1]. Temporarily disable GCC randstruct for
COMPILE_TEST builds to unbreak CI systems for the coming -rc2. This can
be restored once the bug is fixed.
Suggested-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/all/20250407-kbuild-disable-gcc-plugins-v1-1-5d46ae583f5e@kernel.org/ [1]
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250409151154.work.872-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The kdoc header incorrectly references an older parameter, update it
to reference what is currently used in the function.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504122308.Ch8PzJdD-lkp@intel.com/
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
The "linux/parser.h" header was included twice, we only need it once.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504121945.Q0GDD0sG-lkp@intel.com/
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
DCCP was orphaned in 2021 by commit 054c4610bd05 ("MAINTAINERS: dccp:
move Gerrit Renker to CREDITS"), which noted that the last maintainer
had been inactive for five years.
In recent years, it has become a playground for syzbot, and most changes
to DCCP have been odd bug fixes triggered by syzbot. Apart from that,
the only changes have been driven by treewide or networking API updates
or adjustments related to TCP.
Thus, in 2023, we announced we would remove DCCP in 2025 via commit
b144fcaf46d4 ("dccp: Print deprecation notice.").
Since then, only one individual has contacted the netdev mailing list. [0]
There is ongoing research for Multipath DCCP. The repository is hosted
on GitHub [1], and development is not taking place through the upstream
community. While the repository is published under the GPLv2 license,
the scheduling part remains proprietary, with a LICENSE file [2] stating:
"This is not Open Source software."
The researcher mentioned a plan to address the licensing issue, upstream
the patches, and step up as a maintainer, but there has been no further
communication since then.
Maintaining DCCP for a decade without any real users has become a burden.
Therefore, it's time to remove it.
Removing DCCP will also provide significant benefits to TCP. It allows
us to freely reorganize the layout of struct inet_connection_sock, which
is currently shared with DCCP, and optimize it to reduce the number of
cachelines accessed in the TCP fast path.
Note that we keep DCCP netfilter modules as requested. [3]
Link: https://lore.kernel.org/netdev/20230710182253.81446-1-kuniyu@amazon.com/T/#u #[0]
Link: https://github.com/telekom/mp-dccp #[1]
Link: https://github.com/telekom/mp-dccp/blob/mpdccp_v03_k5.10/net/dccp/non_gpl_scheduler/LICENSE #[2]
Link: https://lore.kernel.org/netdev/Z_VQ0KlCRkqYWXa-@calendula/ #[3]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paul Moore <paul@paul-moore.com> (LSM and SELinux)
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Link: https://patch.msgid.link/20250410023921.11307-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Reduce the SELinux performance overhead during path walks through the
use of a per-task directory access cache and some minor code
optimizations. The directory access cache is per-task because it allows
for a lockless cache while also fitting well with a common application
pattern of heavily accessing a relatively small number of SELinux
directory labels. The cache is inherited by child processes when the
child runs with the same SELinux domain as the parent, and invalidated
on changes to the task's SELinux domain or the loaded SELinux policy.
A cache of four entries was chosen based on testing with the Fedora
"targeted" policy, a SELinux Reference Policy variant, and
'make allmodconfig' on Linux v6.14.
Code optimizations include better use of inline functions to reduce
function calls in the common case, especially in the inode revalidation
code paths, and elimination of redundant checks between the LSM and
SELinux layers.
As mentioned briefly above, aside from general use and regression
testing with the selinux-testsuite, performance was measured using
'make allmodconfig' with Linux v6.14 as a base reference. As expected,
there were variations from one test run to another, but the measurements
below are a good representation of the test results seen on my test
system.
* Linux v6.14
REF
1.26% [k] __d_lookup_rcu
SELINUX (1.31%)
0.58% [k] selinux_inode_permission
0.29% [k] avc_lookup
0.25% [k] avc_has_perm_noaudit
0.19% [k] __inode_security_revalidate
* Linux v6.14 + patch
REF
1.41% [k] __d_lookup_rcu
SELINUX (0.89%)
0.65% [k] selinux_inode_permission
0.15% [k] avc_lookup
0.05% [k] avc_has_perm_noaudit
0.04% [k] avc_policy_seqno
X.XX% [k] __inode_security_revalidate (now inline)
In both cases the __d_lookup_rcu() function was used as a reference
point to establish a context for the SELinux related functions. On a
unpatched Linux v6.14 system we see the time spent in the combined
SELinux functions exceeded that of __d_lookup_rcu(), 1.31% compared to
1.26%. However, with this patch applied the time spent in the combined
SELinux functions dropped to roughly 65% of the time spent in
__d_lookup_rcu(), 0.89% compared to 1.41%. Aside from the significant
decrease in time spent in the SELinux AVC, it appears that any additional
time spent searching and updating the cache is offset by other code
improvements, e.g. time spent in selinux_inode_permission() +
__inode_security_revalidate() + avc_policy_seqno() is less on the
patched kernel than the unpatched kernel.
It is worth noting that in this patch the use of the per-task cache is
limited to the security_inode_permission() LSM callback,
selinux_inode_permission(), but future work could expand the cache into
inode_has_perm(), likely through consolidation of the two functions.
While this would likely have little to no impact on path walks, it
may benefit other operations.
Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Tested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Currently, genfscon only supports string prefix match to label files.
Thus, labeling numerous dynamic sysfs entries requires many specific
path rules. For example, labeling device paths such as
`/sys/devices/pci0000:00/0000:00:03.1/<...>/0000:04:00.1/wakeup`
requires listing all specific PCI paths, which is challenging to
maintain. While user-space restorecon can handle these paths with
regular expression rules, relabeling thousands of paths under sysfs
after it is mounted is inefficient compared to using genfscon.
This commit adds wildcard matching to genfscon to make rules more
efficient and expressive. This new behavior is enabled by
genfs_seclabel_wildcard capability. With this capability, genfscon does
wildcard matching instead of prefix matching. When multiple wildcard
rules match against a path, then the longest rule (determined by the
length of the rule string) will be applied. If multiple rules of the
same length match, the first matching rule encountered in the given
genfscon policy will be applied. Users are encouraged to write longer,
more explicit path rules to avoid relying on this behavior.
This change resulted in nice real-world performance improvements. For
example, boot times on test Android devices were reduced by 15%. This
improvement is due to the elimination of the "restorecon -R /sys" step
during boot, which takes more than two seconds in the worst case.
Signed-off-by: Takaya Saeki <takayas@chromium.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Port labeling is based on port number and protocol (TCP/UDP/...) but not
based on network family (IPv4/IPv6).
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
For network objects, like interfaces, nodes, port and InfiniBands, the
object to SID lookup is cached in hashtables. OOM during such hashtable
additions of new objects is considered non-fatal and the computed SID is
simply returned without adding the compute result into the hash table.
Actually ignore OOM in the InfiniBand code, despite the comment already
suggesting to do so. This reverts commit c350f8bea271 ("selinux: Fix
error return code in sel_ib_pkey_sid_slow()").
Add comments in the other places.
Use kmalloc() instead of kzalloc(), since all members are initialized on
success and the data is only used in internbal hash tables, so no risk
of information leakage to userspace.
Fixes: c350f8bea271 ("selinux: Fix error return code in sel_ib_pkey_sid_slow()")
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
In the network hashtable lookup code add likely() compiler hints in the
fast path, like already done in sel_netif_sid().
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
The network namespace is not modified.
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
The network address, either an IPv4 or IPv6 one, is not modified.
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
As for other Audit's "pid" fields, Landlock should use the task's TGID
instead of its TID. Fix this issue by keeping a reference to the TGID
of the domain creator.
Existing tests already check for the PID but only with the thread group
leader, so always the TGID. A following patch adds dedicated tests for
non-leader thread.
Remove the current_real_cred() check which does not make sense because
we only reference a struct pid, whereas a previous version did reference
a struct cred instead.
Cc: Christian Brauner <brauner@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20250410171725.1265860-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
landlock_put_hierarchy() can be called when an error occurs in
landlock_merge_ruleset() due to insufficient memory. In this case, the
domain's audit details might not have been allocated yet, which would
cause landlock_free_hierarchy_details() to print a warning (but still
safely handle this case).
We could keep the WARN_ON_ONCE(!hierarchy) but it's not worth it for
this kind of function, so let's remove it entirely.
Cc: Paul Moore <paul@paul-moore.com>
Reported-by: syzbot+8bca99e91de7e060e4ea@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20250331104709.897062-1-mic@digikod.net
Reviewed-by: Günther Noack <gnoack@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
try_lookup_noperm() and d_hash_and_lookup() are nearly identical. The
former does some validation of the name where the latter doesn't.
Outside of the VFS that validation is likely valuable, and having only
one exported function for this task is certainly a good idea.
So make d_hash_and_lookup() local to VFS files and change all other
callers to try_lookup_noperm(). Note that the arguments are swapped.
Signed-off-by: NeilBrown <neilb@suse.de>
Link: https://lore.kernel.org/r/20250319031545.2999807-6-neil@brown.name
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The lookup_one_len family of functions is (now) only used internally by
a filesystem on itself either
- in a context where permission checking is irrelevant such as by a
virtual filesystem populating itself, or xfs accessing its ORPHANAGE
or dquota accessing the quota file; or
- in a context where a permission check (MAY_EXEC on the parent) has just
been performed such as a network filesystem finding in "silly-rename"
file in the same directory. This is also the context after the
_parentat() functions where currently lookup_one_qstr_excl() is used.
So the permission check is pointless.
The name "one_len" is unhelpful in understanding the purpose of these
functions and should be changed. Most of the callers pass the len as
"strlen()" so using a qstr and QSTR() can simplify the code.
This patch renames these functions (include lookup_positive_unlocked()
which is part of the family despite the name) to have a name based on
"lookup_noperm". They are changed to receive a 'struct qstr' instead
of separate name and len. In a few cases the use of QSTR() results in a
new call to strlen().
try_lookup_noperm() takes a pointer to a qstr instead of the whole
qstr. This is consistent with d_hash_and_lookup() (which is nearly
identical) and useful for lookup_noperm_unlocked().
The new lookup_noperm_common() doesn't take a qstr yet. That will be
tidied up in a subsequent patch.
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://lore.kernel.org/r/20250319031545.2999807-5-neil@brown.name
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Patch series "mseal system mappings", v9.
As discussed during mseal() upstream process [1], mseal() protects the
VMAs of a given virtual memory range against modifications, such as the
read/write (RW) and no-execute (NX) bits. For complete descriptions of
memory sealing, please see mseal.rst [2].
The mseal() is useful to mitigate memory corruption issues where a
corrupted pointer is passed to a memory management system. For example,
such an attacker primitive can break control-flow integrity guarantees
since read-only memory that is supposed to be trusted can become writable
or .text pages can get remapped.
The system mappings are readonly only, memory sealing can protect them
from ever changing to writable or unmmap/remapped as different attributes.
System mappings such as vdso, vvar, vvar_vclock, vectors (arm
compat-mode), sigpage (arm compat-mode), are created by the kernel during
program initialization, and could be sealed after creation.
Unlike the aforementioned mappings, the uprobe mapping is not established
during program startup. However, its lifetime is the same as the
process's lifetime [3]. It could be sealed from creation.
The vsyscall on x86-64 uses a special address (0xffffffffff600000), which
is outside the mm managed range. This means mprotect, munmap, and mremap
won't work on the vsyscall. Since sealing doesn't enhance the vsyscall's
security, it is skipped in this patch. If we ever seal the vsyscall, it
is probably only for decorative purpose, i.e. showing the 'sl' flag in
the /proc/pid/smaps. For this patch, it is ignored.
It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may
alter the system mappings during restore operations. UML(User Mode Linux)
and gVisor, rr are also known to change the vdso/vvar mappings.
Consequently, this feature cannot be universally enabled across all
systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default.
To support mseal of system mappings, architectures must define
CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS and update their special
mappings calls to pass mseal flag. Additionally, architectures must
confirm they do not unmap/remap system mappings during the process
lifetime. The existence of this flag for an architecture implies that it
does not require the remapping of thest system mappings during process
lifetime, so sealing these mappings is safe from a kernel perspective.
This version covers x86-64 and arm64 archiecture as minimum viable feature.
While no specific CPU hardware features are required for enable this
feature on an archiecture, memory sealing requires a 64-bit kernel. Other
architectures can choose whether or not to adopt this feature. Currently,
I'm not aware of any instances in the kernel code that actively
munmap/mremap a system mapping without a request from userspace. The PPC
does call munmap when _install_special_mapping fails for vdso; however,
it's uncertain if this will ever fail for PPC - this needs to be
investigated by PPC in the future [4]. The UML kernel can add this
support when KUnit tests require it [5].
In this version, we've improved the handling of system mapping sealing
from previous versions, instead of modifying the _install_special_mapping
function itself, which would affect all architectures, we now call
_install_special_mapping with a sealing flag only within the specific
architecture that requires it. This targeted approach offers two key
advantages: 1) It limits the code change's impact to the necessary
architectures, and 2) It aligns with the software architecture by keeping
the core memory management within the mm layer, while delegating the
decision of sealing system mappings to the individual architecture, which
is particularly relevant since 32-bit architectures never require sealing.
Prior to this patch series, we explored sealing special mappings from
userspace using glibc's dynamic linker. This approach revealed several
issues:
- The PT_LOAD header may report an incorrect length for vdso, (smaller
than its actual size). The dynamic linker, which relies on PT_LOAD
information to determine mapping size, would then split and partially
seal the vdso mapping. Since each architecture has its own vdso/vvar
code, fixing this in the kernel would require going through each
archiecture. Our initial goal was to enable sealing readonly mappings,
e.g. .text, across all architectures, sealing vdso from kernel since
creation appears to be simpler than sealing vdso at glibc.
- The [vvar] mapping header only contains address information, not
length information. Similar issues might exist for other special
mappings.
- Mappings like uprobe are not covered by the dynamic linker, and there
is no effective solution for them.
This feature's security enhancements will benefit ChromeOS, Android, and
other high security systems.
Testing:
This feature was tested on ChromeOS and Android for both x86-64 and ARM64.
- Enable sealing and verify vdso/vvar, sigpage, vector are sealed properly,
i.e. "sl" shown in the smaps for those mappings, and mremap is blocked.
- Passing various automation tests (e.g. pre-checkin) on ChromeOS and
Android to ensure the sealing doesn't affect the functionality of
Chromebook and Android phone.
I also tested the feature on Ubuntu on x86-64:
- With config disabled, vdso/vvar is not sealed,
- with config enabled, vdso/vvar is sealed, and booting up Ubuntu is OK,
normal operations such as browsing the web, open/edit doc are OK.
Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1]
Link: Documentation/userspace-api/mseal.rst [2]
Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3]
Link: https://lore.kernel.org/all/CABi2SkV6JJwJeviDLsq9N4ONvQ=EFANsiWkgiEOjyT9TQSt+HA@mail.gmail.com/ [4]
Link: https://lore.kernel.org/all/202502251035.239B85A93@keescook/ [5]
This patch (of 7):
Provide infrastructure to mseal system mappings. Establish two kernel
configs (CONFIG_MSEAL_SYSTEM_MAPPINGS,
ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS) and VM_SEALED_SYSMAP macro for future
patches.
Link: https://lkml.kernel.org/r/20250305021711.3867874-1-jeffxu@google.com
Link: https://lkml.kernel.org/r/20250305021711.3867874-2-jeffxu@google.com
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Benjamin Berg <benjamin@sipsolutions.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Elliot Hughes <enh@google.com>
Cc: Florian Faineli <f.fainelli@gmail.com>
Cc: Greg Ungerer <gerg@kernel.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Linus Waleij <linus.walleij@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Rapoport <mike.rapoport@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Röttger <sroettger@google.com>
Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updatesk from Greg KH:
"Here is the big set of driver core updates for 6.15-rc1. Lots of stuff
happened this development cycle, including:
- kernfs scaling changes to make it even faster thanks to rcu
- bin_attribute constify work in many subsystems
- faux bus minor tweaks for the rust bindings
- rust binding updates for driver core, pci, and platform busses,
making more functionaliy available to rust drivers. These are all
due to people actually trying to use the bindings that were in
6.14.
- make Rafael and Danilo full co-maintainers of the driver core
codebase
- other minor fixes and updates"
* tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (52 commits)
rust: platform: require Send for Driver trait implementers
rust: pci: require Send for Driver trait implementers
rust: platform: impl Send + Sync for platform::Device
rust: pci: impl Send + Sync for pci::Device
rust: platform: fix unrestricted &mut platform::Device
rust: pci: fix unrestricted &mut pci::Device
rust: device: implement device context marker
rust: pci: use to_result() in enable_device_mem()
MAINTAINERS: driver core: mark Rafael and Danilo as co-maintainers
rust/kernel/faux: mark Registration methods inline
driver core: faux: only create the device if probe() succeeds
rust/faux: Add missing parent argument to Registration::new()
rust/faux: Drop #[repr(transparent)] from faux::Registration
rust: io: fix devres test with new io accessor functions
rust: io: rename `io::Io` accessors
kernfs: Move dput() outside of the RCU section.
efi: rci2: mark bin_attribute as __ro_after_init
rapidio: constify 'struct bin_attribute'
firmware: qemu_fw_cfg: constify 'struct bin_attribute'
powerpc/perf/hv-24x7: Constify 'struct bin_attribute'
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
"For this merge window we're splitting BPF pull request into three for
higher visibility: main changes, res_spin_lock, try_alloc_pages.
These are the main BPF changes:
- Add DFA-based live registers analysis to improve verification of
programs with loops (Eduard Zingerman)
- Introduce load_acquire and store_release BPF instructions and add
x86, arm64 JIT support (Peilin Ye)
- Fix loop detection logic in the verifier (Eduard Zingerman)
- Drop unnecesary lock in bpf_map_inc_not_zero() (Eric Dumazet)
- Add kfunc for populating cpumask bits (Emil Tsalapatis)
- Convert various shell based tests to selftests/bpf/test_progs
format (Bastien Curutchet)
- Allow passing referenced kptrs into struct_ops callbacks (Amery
Hung)
- Add a flag to LSM bpf hook to facilitate bpf program signing
(Blaise Boscaccy)
- Track arena arguments in kfuncs (Ihor Solodrai)
- Add copy_remote_vm_str() helper for reading strings from remote VM
and bpf_copy_from_user_task_str() kfunc (Jordan Rome)
- Add support for timed may_goto instruction (Kumar Kartikeya
Dwivedi)
- Allow bpf_get_netns_cookie() int cgroup_skb programs (Mahe Tardy)
- Reduce bpf_cgrp_storage_busy false positives when accessing cgroup
local storage (Martin KaFai Lau)
- Introduce bpf_dynptr_copy() kfunc (Mykyta Yatsenko)
- Allow retrieving BTF data with BTF token (Mykyta Yatsenko)
- Add BPF kfuncs to set and get xattrs with 'security.bpf.' prefix
(Song Liu)
- Reject attaching programs to noreturn functions (Yafang Shao)
- Introduce pre-order traversal of cgroup bpf programs (Yonghong
Song)"
* tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (186 commits)
selftests/bpf: Add selftests for load-acquire/store-release when register number is invalid
bpf: Fix out-of-bounds read in check_atomic_load/store()
libbpf: Add namespace for errstr making it libbpf_errstr
bpf: Add struct_ops context information to struct bpf_prog_aux
selftests/bpf: Sanitize pointer prior fclose()
selftests/bpf: Migrate test_xdp_vlan.sh into test_progs
selftests/bpf: test_xdp_vlan: Rename BPF sections
bpf: clarify a misleading verifier error message
selftests/bpf: Add selftest for attaching fexit to __noreturn functions
bpf: Reject attaching fexit/fmod_ret to __noreturn functions
bpf: Only fails the busy counter check in bpf_cgrp_storage_get if it creates storage
bpf: Make perf_event_read_output accessible in all program types.
bpftool: Using the right format specifiers
bpftool: Add -Wformat-signedness flag to detect format errors
selftests/bpf: Test freplace from user namespace
libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID
bpf: Return prog btf_id without capable check
bpf: BPF token support for BPF_BTF_GET_FD_BY_ID
bpf, x86: Fix objtool warning for timed may_goto
bpf: Check map->record at the beginning of check_and_free_fields()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Remove legacy compression interface
- Improve scatterwalk API
- Add request chaining to ahash and acomp
- Add virtual address support to ahash and acomp
- Add folio support to acomp
- Remove NULL dst support from acomp
Algorithms:
- Library options are fuly hidden (selected by kernel users only)
- Add Kerberos5 algorithms
- Add VAES-based ctr(aes) on x86
- Ensure LZO respects output buffer length on compression
- Remove obsolete SIMD fallback code path from arm/ghash-ce
Drivers:
- Add support for PCI device 0x1134 in ccp
- Add support for rk3588's standalone TRNG in rockchip
- Add Inside Secure SafeXcel EIP-93 crypto engine support in eip93
- Fix bugs in tegra uncovered by multi-threaded self-test
- Fix corner cases in hisilicon/sec2
Others:
- Add SG_MITER_LOCAL to sg miter
- Convert ubifs, hibernate and xfrm_ipcomp from legacy API to acomp"
* tag 'v6.15-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (187 commits)
crypto: testmgr - Add multibuffer acomp testing
crypto: acomp - Fix synchronous acomp chaining fallback
crypto: testmgr - Add multibuffer hash testing
crypto: hash - Fix synchronous ahash chaining fallback
crypto: arm/ghash-ce - Remove SIMD fallback code path
crypto: essiv - Replace memcpy() + NUL-termination with strscpy()
crypto: api - Call crypto_alg_put in crypto_unregister_alg
crypto: scompress - Fix incorrect stream freeing
crypto: lib/chacha - remove unused arch-specific init support
crypto: remove obsolete 'comp' compression API
crypto: compress_null - drop obsolete 'comp' implementation
crypto: cavium/zip - drop obsolete 'comp' implementation
crypto: zstd - drop obsolete 'comp' implementation
crypto: lzo - drop obsolete 'comp' implementation
crypto: lzo-rle - drop obsolete 'comp' implementation
crypto: lz4hc - drop obsolete 'comp' implementation
crypto: lz4 - drop obsolete 'comp' implementation
crypto: deflate - drop obsolete 'comp' implementation
crypto: 842 - drop obsolete 'comp' implementation
crypto: nx - Migrate to scomp API
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
"This brings two main changes to Landlock:
- A signal scoping fix with a new interface for user space to know if
it is compatible with the running kernel.
- Audit support to give visibility on why access requests are denied,
including the origin of the security policy, missing access rights,
and description of object(s). This was designed to limit log spam
as much as possible while still alerting about unexpected blocked
access.
With these changes come new and improved documentation, and a lot of
new tests"
* tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (36 commits)
landlock: Add audit documentation
selftests/landlock: Add audit tests for network
selftests/landlock: Add audit tests for filesystem
selftests/landlock: Add audit tests for abstract UNIX socket scoping
selftests/landlock: Add audit tests for ptrace
selftests/landlock: Test audit with restrict flags
selftests/landlock: Add tests for audit flags and domain IDs
selftests/landlock: Extend tests for landlock_restrict_self(2)'s flags
selftests/landlock: Add test for invalid ruleset file descriptor
samples/landlock: Enable users to log sandbox denials
landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF
landlock: Add LANDLOCK_RESTRICT_SELF_LOG_*_EXEC_* flags
landlock: Log scoped denials
landlock: Log TCP bind and connect denials
landlock: Log truncate and IOCTL denials
landlock: Factor out IOCTL hooks
landlock: Log file-related denials
landlock: Log mount-related denials
landlock: Add AUDIT_LANDLOCK_DOMAIN and log domain status
landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux
Pull capabilities update from Serge Hallyn:
"This contains just one patch that removes a helper function whose last
user (smack) stopped using it in 2018"
* tag 'caps-pr-20250327' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux:
capability: Remove unused has_capability
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull ima updates from Mimi Zohar:
"Two performance improvements, which minimize the number of integrity
violations"
* tag 'integrity-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: limit the number of ToMToU integrity violations
ima: limit the number of open-writers integrity violations
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe
Pull ipe update from Fan Wu:
"This contains just one commit from Randy Dunlap, which fixes
kernel-doc warnings in the IPE subsystem"
* tag 'ipe-pr-20250324' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe:
ipe: policy_fs: fix kernel-doc warnings
|
|
Each time a file in policy, that is already opened for read, is opened
for write, a Time-of-Measure-Time-of-Use (ToMToU) integrity violation
audit message is emitted and a violation record is added to the IMA
measurement list. This occurs even if a ToMToU violation has already
been recorded.
Limit the number of ToMToU integrity violations per file open for read.
Note: The IMA_MAY_EMIT_TOMTOU atomic flag must be set from the reader
side based on policy. This may result in a per file open for read
ToMToU violation.
Since IMA_MUST_MEASURE is only used for violations, rename the atomic
IMA_MUST_MEASURE flag to IMA_MAY_EMIT_TOMTOU.
Cc: stable@vger.kernel.org # applies cleanly up to linux-6.6
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Tested-by: Petr Vorel <pvorel@suse.cz>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Each time a file in policy, that is already opened for write, is opened
for read, an open-writers integrity violation audit message is emitted
and a violation record is added to the IMA measurement list. This
occurs even if an open-writers violation has already been recorded.
Limit the number of open-writers integrity violations for an existing
file open for write to one. After the existing file open for write
closes (__fput), subsequent open-writers integrity violations may be
emitted.
Cc: stable@vger.kernel.org # applies cleanly up to linux-6.6
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Tested-by: Petr Vorel <pvorel@suse.cz>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl updates from Joel Granados:
- Move vm_table members out of kernel/sysctl.c
All vm_table array members have moved to their respective subsystems
leading to the removal of vm_table from kernel/sysctl.c. This
increases modularity by placing the ctl_tables closer to where they
are actually used and at the same time reducing the chances of merge
conflicts in kernel/sysctl.c.
- ctl_table range fixes
Replace the proc_handler function that checks variable ranges in
coredump_sysctls and vdso_table with the one that actually uses the
extra{1,2} pointers as min/max values. This tightens the range of the
values that users can pass into the kernel effectively preventing
{under,over}flows.
- Misc fixes
Correct grammar errors and typos in test messages. Update sysctl
files in MAINTAINERS. Constified and removed array size in
declaration for alignment_tbl
* tag 'sysctl-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (22 commits)
selftests/sysctl: fix wording of help messages
selftests: fix spelling/grammar errors in sysctl/sysctl.sh
MAINTAINERS: Update sysctl file list in MAINTAINERS
sysctl: Fix underflow value setting risk in vm_table
coredump: Fixes core_pipe_limit sysctl proc_handler
sysctl: remove unneeded include
sysctl: remove the vm_table
sh: vdso: move the sysctl to arch/sh/kernel/vsyscall/vsyscall.c
x86: vdso: move the sysctl to arch/x86/entry/vdso/vdso32-setup.c
fs: dcache: move the sysctl to fs/dcache.c
sunrpc: simplify rpcauth_cache_shrink_count()
fs: drop_caches: move sysctl to fs/drop_caches.c
fs: fs-writeback: move sysctl to fs/fs-writeback.c
mm: nommu: move sysctl to mm/nommu.c
security: min_addr: move sysctl to security/min_addr.c
mm: mmap: move sysctl to mm/mmap.c
mm: util: move sysctls to mm/util.c
mm: vmscan: move vmscan sysctls to mm/vmscan.c
mm: swap: move sysctl to mm/swap.c
mm: filemap: move sysctl to mm/filemap.c
...
|
|
Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF for the case of sandboxer
tools, init systems, or runtime containers launching programs sandboxing
themselves in an inconsistent way. Setting this flag should only
depends on runtime configuration (i.e. not hardcoded).
We don't create a new ruleset's option because this should not be part
of the security policy: only the task that enforces the policy (not the
one that create it) knows if itself or its children may request denied
actions.
This is the first and only flag that can be set without actually
restricting the caller (i.e. without providing a ruleset).
Extend struct landlock_cred_security with a u8 log_subdomains_off.
struct landlock_file_security is still 16 bytes.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Closes: https://github.com/landlock-lsm/linux/issues/3
Link: https://lore.kernel.org/r/20250320190717.2287696-19-mic@digikod.net
[mic: Fix comment]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Most of the time we want to log denied access because they should not
happen and such information helps diagnose issues. However, when
sandboxing processes that we know will try to access denied resources
(e.g. unknown, bogus, or malicious binary), we might want to not log
related access requests that might fill up logs.
By default, denied requests are logged until the task call execve(2).
If the LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF flag is set, denied
requests will not be logged for the same executed file.
If the LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON flag is set, denied
requests from after an execve(2) call will be logged.
The rationale is that a program should know its own behavior, but not
necessarily the behavior of other programs.
Because LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF is set for a specific
Landlock domain, it makes it possible to selectively mask some access
requests that would be logged by a parent domain, which might be handy
for unprivileged processes to limit logs. However, system
administrators should still use the audit filtering mechanism. There is
intentionally no audit nor sysctl configuration to re-enable these logs.
This is delegated to the user space program.
Increment the Landlock ABI version to reflect this interface change.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-18-mic@digikod.net
[mic: Rename variables and fix __maybe_unused]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add audit support for unix_stream_connect, unix_may_send, task_kill, and
file_send_sigiotask hooks.
The related blockers are:
- scope.abstract_unix_socket
- scope.signal
Audit event sample for abstract unix socket:
type=LANDLOCK_DENY msg=audit(1729738800.268:30): domain=195ba459b blockers=scope.abstract_unix_socket path=00666F6F
Audit event sample for signal:
type=LANDLOCK_DENY msg=audit(1729738800.291:31): domain=195ba459b blockers=scope.signal opid=1 ocomm="systemd"
Refactor and simplify error handling in LSM hooks.
Extend struct landlock_file_security with fown_layer and use it to log
the blocking domain. The struct aligned size is still 16 bytes.
Cc: Günther Noack <gnoack@google.com>
Cc: Tahera Fahimi <fahimitahera@gmail.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-17-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add audit support to socket_bind and socket_connect hooks.
The related blockers are:
- net.bind_tcp
- net.connect_tcp
Audit event sample:
type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=net.connect_tcp daddr=127.0.0.1 dest=80
Cc: Günther Noack <gnoack@google.com>
Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Cc: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-16-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add audit support to the file_truncate and file_ioctl hooks.
Add a deny_masks_t type and related helpers to store the domain's layer
level per optional access rights (i.e. LANDLOCK_ACCESS_FS_TRUNCATE and
LANDLOCK_ACCESS_FS_IOCTL_DEV) when opening a file, which cannot be
inferred later. In practice, the landlock_file_security aligned blob size is
still 16 bytes because this new one-byte deny_masks field follows the
existing two-bytes allowed_access field and precede the packed
fown_subject.
Implementing deny_masks_t with a bitfield instead of a struct enables a
generic implementation to store and extract layer levels.
Add KUnit tests to check the identification of a layer level from a
deny_masks_t, and the computation of a deny_masks_t from an access right
with its layer level or a layer_mask_t array.
Audit event sample:
type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.ioctl_dev path="/dev/tty" dev="devtmpfs" ino=9 ioctlcmd=0x5401
Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-15-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Compat and non-compat IOCTL hooks are almost the same, except to compare
the IOCTL command. Factor out these two IOCTL hooks to highlight the
difference and minimize audit changes (see next commit).
Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-14-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add audit support for path_mkdir, path_mknod, path_symlink, path_unlink,
path_rmdir, path_truncate, path_link, path_rename, and file_open hooks.
The dedicated blockers are:
- fs.execute
- fs.write_file
- fs.read_file
- fs.read_dir
- fs.remove_dir
- fs.remove_file
- fs.make_char
- fs.make_dir
- fs.make_reg
- fs.make_sock
- fs.make_fifo
- fs.make_block
- fs.make_sym
- fs.refer
- fs.truncate
- fs.ioctl_dev
Audit event sample for a denied link action:
type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.refer path="/usr/bin" dev="vda2" ino=351
type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.make_reg,fs.refer path="/usr/local" dev="vda2" ino=365
We could pack blocker names (e.g. "fs:make_reg,refer") but that would
increase complexity for the kernel and log parsers. Moreover, this
could not handle blockers of different classes (e.g. fs and net). Make
it simple and flexible instead.
Add KUnit tests to check the identification from a layer_mask_t array of
the first layer level denying such request.
Cc: Günther Noack <gnoack@google.com>
Depends-on: 058518c20920 ("landlock: Align partial refer access checks with final ones")
Depends-on: d617f0d72d80 ("landlock: Optimize file path walks and prepare for audit support")
Link: https://lore.kernel.org/r/20250320190717.2287696-13-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add audit support for sb_mount, move_mount, sb_umount, sb_remount, and
sb_pivot_root hooks.
The new related blocker is "fs.change_topology".
Audit event sample:
type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.change_topology name="/" dev="tmpfs" ino=1
Remove landlock_get_applicable_domain() and get_current_fs_domain()
which are now fully replaced with landlock_get_applicable_subject().
Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-12-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Asynchronously log domain information when it first denies an access.
This minimize the amount of generated logs, which makes it possible to
always log denials for the current execution since they should not
happen. These records are identified with the new AUDIT_LANDLOCK_DOMAIN
type.
The AUDIT_LANDLOCK_DOMAIN message contains:
- the "domain" ID which is described;
- the "status" which can either be "allocated" or "deallocated";
- the "mode" which is for now only "enforcing";
- for the "allocated" status, a minimal set of properties to easily
identify the task that loaded the domain's policy with
landlock_restrict_self(2): "pid", "uid", executable path ("exe"), and
command line ("comm");
- for the "deallocated" state, the number of "denials" accounted to this
domain, which is at least 1.
This requires each domain to save these task properties at creation
time in the new struct landlock_details. A reference to the PID is kept
for the lifetime of the domain to avoid race conditions when
investigating the related task. The executable path is resolved and
stored to not keep a reference to the filesystem and block related
actions. All these metadata are stored for the lifetime of the related
domain and should then be minimal. The required memory is not accounted
to the task calling landlock_restrict_self(2) contrary to most other
Landlock allocations (see related comment).
The AUDIT_LANDLOCK_DOMAIN record follows the first AUDIT_LANDLOCK_ACCESS
record for the same domain, which is always followed by AUDIT_SYSCALL
and AUDIT_PROCTITLE. This is in line with the audit logic to first
record the cause of an event, and then add context with other types of
record.
Audit event sample for a first denial:
type=LANDLOCK_ACCESS msg=audit(1732186800.349:44): domain=195ba459b blockers=ptrace opid=1 ocomm="systemd"
type=LANDLOCK_DOMAIN msg=audit(1732186800.349:44): domain=195ba459b status=allocated mode=enforcing pid=300 uid=0 exe="/root/sandboxer" comm="sandboxer"
type=SYSCALL msg=audit(1732186800.349:44): arch=c000003e syscall=101 success=no [...] pid=300 auid=0
Audit event sample for a following denial:
type=LANDLOCK_ACCESS msg=audit(1732186800.372:45): domain=195ba459b blockers=ptrace opid=1 ocomm="systemd"
type=SYSCALL msg=audit(1732186800.372:45): arch=c000003e syscall=101 success=no [...] pid=300 auid=0
Log domain deletion with the "deallocated" state when a domain was
previously logged. This makes it possible for log parsers to free
potential resources when a domain ID will never show again.
The number of denied access requests is useful to easily check how many
access requests a domain blocked and potentially if some of them are
missing in logs because of audit rate limiting, audit rules, or Landlock
log configuration flags (see following commit).
Audit event sample for a deletion of a domain that denied something:
type=LANDLOCK_DOMAIN msg=audit(1732186800.393:46): domain=195ba459b status=deallocated denials=2
Cc: Günther Noack <gnoack@google.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-11-mic@digikod.net
[mic: Update comment and GFP flag for landlock_log_drop_domain()]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add a new AUDIT_LANDLOCK_ACCESS record type dedicated to an access
request denied by a Landlock domain. AUDIT_LANDLOCK_ACCESS indicates
that something unexpected happened.
For now, only denied access are logged, which means that any
AUDIT_LANDLOCK_ACCESS record is always followed by a SYSCALL record with
"success=no". However, log parsers should check this syscall property
because this is the only sign that a request was denied. Indeed, we
could have "success=yes" if Landlock would support a "permissive" mode.
We could also add a new field to AUDIT_LANDLOCK_DOMAIN for this mode
(see following commit).
By default, the only logged access requests are those coming from the
same executed program that enforced the Landlock restriction on itself.
In other words, no audit record are created for a task after it called
execve(2). This is required to avoid log spam because programs may only
be aware of their own restrictions, but not the inherited ones.
Following commits will allow to conditionally generate
AUDIT_LANDLOCK_ACCESS records according to dedicated
landlock_restrict_self(2)'s flags.
The AUDIT_LANDLOCK_ACCESS message contains:
- the "domain" ID restricting the action on an object,
- the "blockers" that are missing to allow the requested access,
- a set of fields identifying the related object (e.g. task identified
with "opid" and "ocomm").
The blockers are implicit restrictions (e.g. ptrace), or explicit access
rights (e.g. filesystem), or explicit scopes (e.g. signal). This field
contains a list of at least one element, each separated with a comma.
The initial blocker is "ptrace", which describe all implicit Landlock
restrictions related to ptrace (e.g. deny tracing of tasks outside a
sandbox).
Add audit support to ptrace_access_check and ptrace_traceme hooks. For
the ptrace_access_check case, we log the current/parent domain and the
child task. For the ptrace_traceme case, we log the parent domain and
the current/child task. Indeed, the requester and the target are the
current task, but the action would be performed by the parent task.
Audit event sample:
type=LANDLOCK_ACCESS msg=audit(1729738800.349:44): domain=195ba459b blockers=ptrace opid=1 ocomm="systemd"
type=SYSCALL msg=audit(1729738800.349:44): arch=c000003e syscall=101 success=no [...] pid=300 auid=0
A following commit adds user documentation.
Add KUnit tests to check reading of domain ID relative to layer level.
The quick return for non-landlocked tasks is moved from task_ptrace() to
each LSM hooks.
It is not useful to inline the audit_enabled check because other
computation are performed by landlock_log_denial().
Use scoped guards for RCU read-side critical sections.
Cc: Günther Noack <gnoack@google.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-10-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Extend struct landlock_cred_security with a domain_exec bitmask to
identify which Landlock domain were created by the current task's bprm.
The whole bitmask is reset on each execve(2) call.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-9-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
This cosmetic change is needed for audit support, specifically to be
able to filter according to cross-execution boundaries.
struct landlock_file_security's size stay the same for now but it will
increase with struct landlock_cred_security's size.
Only save Landlock domain in hook_file_set_fowner() if the current
domain has LANDLOCK_SCOPE_SIGNAL, which was previously done for each
hook_file_send_sigiotask() calls. This should improve a bit
performance.
Replace hardcoded LANDLOCK_SCOPE_SIGNAL with the signal_scope.scope
variable.
Use scoped guards for RCU read-side critical sections.
Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-8-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
This cosmetic change that is needed for audit support, specifically to
be able to filter according to cross-execution boundaries.
Replace hardcoded LANDLOCK_SCOPE_SIGNAL with the signal_scope.scope
variable.
Use scoped guards for RCU read-side critical sections.
Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-7-mic@digikod.net
[mic: Update headers]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
This cosmetic change that is needed for audit support, specifically to
be able to filter according to cross-execution boundaries.
Optimize current_check_access_socket() to only handle the access
request.
Remove explicit domain->num_layers check which is now part of the
landlock_get_applicable_subject() call.
Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-6-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
This cosmetic change is needed for audit support, specifically to be
able to filter according to cross-execution boundaries.
Add landlock_get_applicable_subject(), mainly a copy of
landlock_get_applicable_domain(), which will fully replace it in a
following commit.
Optimize current_check_access_path() to only handle the access request.
Partially replace get_current_fs_domain() with explicit calls to
landlock_get_applicable_subject(). The remaining ones will follow with
more changes.
Remove explicit domain->num_layers check which is now part of the
landlock_get_applicable_subject() call.
Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-5-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Create a new domain.h file containing the struct landlock_hierarchy
definition and helpers. This type will grow with audit support. This
also prepares for a new domain type.
Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-4-mic@digikod.net
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Landlock IDs can be generated to uniquely identify Landlock objects.
For now, only Landlock domains get an ID at creation time. These IDs
map to immutable domain hierarchies.
Landlock IDs have important properties:
- They are unique during the lifetime of the running system thanks to
the 64-bit values: at worse, 2^60 - 2*2^32 useful IDs.
- They are always greater than 2^32 and must then be stored in 64-bit
integer types.
- The initial ID (at boot time) is randomly picked between 2^32 and
2^33, which limits collisions in logs across different boots.
- IDs are sequential, which enables users to order them.
- IDs may not be consecutive but increase with a random 2^4 step, which
limits side channels.
Such IDs can be exposed to unprivileged processes, even if it is not the
case with this audit patch series. The domain IDs will be useful for
user space to identify sandboxes and get their properties.
These Landlock IDs are more secure that other absolute kernel IDs such
as pipe's inodes which rely on a shared global counter.
For checkpoint/restore features (i.e. CRIU), we could easily implement a
privileged interface (e.g. sysfs) to set the next ID counter.
IDR/IDA are not used because we only need a bijection from Landlock
objects to Landlock IDs, and we must not recycle IDs. This enables us
to identify all Landlock objects during the lifetime of the system (e.g.
in logs), but not to access an object from an ID nor know if an ID is
assigned. Using a counter is simpler, it scales (i.e. avoids growing
memory footprint), and it does not require locking. We'll use proper
file descriptors (with IDs used as inode numbers) to access Landlock
objects.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-3-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Extract code from dump_common_audit_data() into the audit_log_lsm_data()
helper. This helps reuse common LSM audit data while not abusing
AUDIT_AVC records because of the common_lsm_audit() helper.
Depends-on: 7ccbe076d987 ("lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set")
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: James Morris <jmorris@namei.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-2-mic@digikod.net
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Because Linux credentials are managed per thread, user space relies on
some hack to synchronize credential update across threads from the same
process. This is required by the Native POSIX Threads Library and
implemented by set*id(2) wrappers and libcap(3) to use tgkill(2) to
synchronize threads. See nptl(7) and libpsx(3). Furthermore, some
runtimes like Go do not enable developers to have control over threads
[1].
To avoid potential issues, and because threads are not security
boundaries, let's relax the Landlock (optional) signal scoping to always
allow signals sent between threads of the same process. This exception
is similar to the __ptrace_may_access() one.
hook_file_set_fowner() now checks if the target task is part of the same
process as the caller. If this is the case, then the related signal
triggered by the socket will always be allowed.
Scoping of abstract UNIX sockets is not changed because kernel objects
(e.g. sockets) should be tied to their creator's domain at creation
time.
Note that creating one Landlock domain per thread puts each of these
threads (and their future children) in their own scope, which is
probably not what users expect, especially in Go where we do not control
threads. However, being able to drop permissions on all threads should
not be restricted by signal scoping. We are working on a way to make it
possible to atomically restrict all threads of a process with the same
domain [2].
Add erratum for signal scoping.
Closes: https://github.com/landlock-lsm/go-landlock/issues/36
Fixes: 54a6e6bbf3be ("landlock: Add signal scoping")
Fixes: c8994965013e ("selftests/landlock: Test signal scoping for threads")
Depends-on: 26f204380a3c ("fs: Fix file_set_fowner LSM hook inconsistencies")
Link: https://pkg.go.dev/kernel.org/pub/linux/libs/security/libcap/psx [1]
Link: https://github.com/landlock-lsm/linux/issues/2 [2]
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Tahera Fahimi <fahimitahera@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20250318161443.279194-6-mic@digikod.net
[mic: Add extra pointer check and RCU guard, and ease backport]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Pull smack updates from Casey Schaufler:
"This is a larger set of patches than usual, consisting of a set of
build clean-ups, a rework of error handling in setting up CIPSO label
specification and a bug fix in network labeling"
* tag 'Smack-for-6.15' of https://github.com/cschaufler/smack-next:
smack: recognize ipv4 CIPSO w/o categories
smack: Revert "smackfs: Added check catlen"
smack: remove /smack/logging if audit is not configured
smack: ipv4/ipv6: tcp/dccp/sctp: fix incorrect child socket label
smack: dont compile ipv6 code unless ipv6 is configured
Smack: fix typos and spelling errors
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
- Add additional SELinux access controls for kernel file reads/loads
The SELinux kernel file read/load access controls were never updated
beyond the initial kernel module support, this pull request adds
support for firmware, kexec, policies, and x.509 certificates.
- Add support for wildcards in network interface names
There are a number of userspace tools which auto-generate network
interface names using some pattern of <XXXX>-<NN> where <XXXX> is a
fixed string, e.g. "podman", and <NN> is a increasing counter.
Supporting wildcards in the SELinux policy for network interfaces
simplifies the policy associted with these interfaces.
- Fix a potential problem in the kernel read file SELinux code
SELinux should always check the file label in the
security_kernel_read_file() LSM hook, regardless of if the file is
being read in chunks. Unfortunately, the existing code only
considered the file label on the first chunk; this pull request fixes
this problem.
There is more detail in the individual commit, but thankfully the
existing code didn't expose a bug due to multi-stage reads only
taking place in one driver, and that driver loading a file type that
isn't targeted by the SELinux policy.
- Fix the subshell error handling in the example policy loader
Minor fix to SELinux example policy loader in scripts/selinux due to
an undesired interaction with subshells and errexit.
* tag 'selinux-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: get netif_wildcard policycap from policy instead of cache
selinux: support wildcard network interface names
selinux: Chain up tool resolving errors in install_policy.sh
selinux: add permission checks for loading other kinds of kernel files
selinux: always check the file label in selinux_kernel_read_file()
selinux: fix spelling error
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore:
- Various minor updates to the LSM Rust bindings
Changes include marking trivial Rust bindings as inlines and comment
tweaks to better reflect the LSM hooks.
- Add LSM/SELinux access controls to io_uring_allowed()
Similar to the io_uring_disabled sysctl, add a LSM hook to
io_uring_allowed() to enable LSMs a simple way to enforce security
policy on the use of io_uring. This pull request includes SELinux
support for this new control using the io_uring/allowed permission.
- Remove an unused parameter from the security_perf_event_open() hook
The perf_event_attr struct parameter was not used by any currently
supported LSMs, remove it from the hook.
- Add an explicit MAINTAINERS entry for the credentials code
We've seen problems in the past where patches to the credentials code
sent by non-maintainers would often languish on the lists for
multiple months as there was no one explicitly tasked with the
responsibility of reviewing and/or merging credentials related code.
Considering that most of the code under security/ has a vested
interest in ensuring that the credentials code is well maintained,
I'm volunteering to look after the credentials code and Serge Hallyn
has also volunteered to step up as an official reviewer. I posted the
MAINTAINERS update as a RFC to LKML in hopes that someone else would
jump up with an "I'll do it!", but beyond Serge it was all crickets.
- Update Stephen Smalley's old email address to prevent confusion
This includes a corresponding update to the mailmap file.
* tag 'lsm-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
mailmap: map Stephen Smalley's old email addresses
lsm: remove old email address for Stephen Smalley
MAINTAINERS: add Serge Hallyn as a credentials reviewer
MAINTAINERS: add an explicit credentials entry
cred,rust: mark Credential methods inline
lsm,rust: reword "destroy" -> "release" in SecurityCtx
lsm,rust: mark SecurityCtx methods inline
perf: Remove unnecessary parameter of security check
lsm: fix a missing security_uring_allowed() prototype
io_uring,lsm,selinux: add LSM hooks for io_uring_setup()
io_uring: refactor io_uring_allowed()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
"As usual, it's scattered changes all over. Patches touching things
outside of our traditional areas in the tree have been Acked by
maintainers or were trivial changes:
- loadpin: remove unsupported MODULE_COMPRESS_NONE (Arulpandiyan
Vadivel)
- samples/check-exec: Fix script name (Mickaël Salaün)
- yama: remove needless locking in yama_task_prctl() (Oleg Nesterov)
- lib/string_choices: Sort by function name (R Sundar)
- hardening: Allow default HARDENED_USERCOPY to be set at compile
time (Mel Gorman)
- uaccess: Split out compile-time checks into ucopysize.h
- kbuild: clang: Support building UM with SUBARCH=i386
- x86: Enable i386 FORTIFY_SOURCE on Clang 16+
- ubsan/overflow: Rework integer overflow sanitizer option
- Add missing __nonstring annotations for callers of
memtostr*()/strtomem*()
- Add __must_be_noncstr() and have memtostr*()/strtomem*() check for
it
- Introduce __nonstring_array for silencing future GCC 15 warnings"
* tag 'hardening-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits)
compiler_types: Introduce __nonstring_array
hardening: Enable i386 FORTIFY_SOURCE on Clang 16+
x86/build: Remove -ffreestanding on i386 with GCC
ubsan/overflow: Enable ignorelist parsing and add type filter
ubsan/overflow: Enable pattern exclusions
ubsan/overflow: Rework integer overflow sanitizer option to turn on everything
samples/check-exec: Fix script name
yama: don't abuse rcu_read_lock/get_task_struct in yama_task_prctl()
kbuild: clang: Support building UM with SUBARCH=i386
loadpin: remove MODULE_COMPRESS_NONE as it is no longer supported
lib/string_choices: Rearrange functions in sorted order
string.h: Validate memtostr*()/strtomem*() arguments more carefully
compiler.h: Introduce __must_be_noncstr()
nilfs2: Mark on-disk strings as nonstring
uapi: stddef.h: Introduce __kernel_nonstring
x86/tdx: Mark message.bytes as nonstring
string: kunit: Mark nonstring test strings as __nonstring
scsi: qla2xxx: Mark device strings as nonstring
scsi: mpt3sas: Mark device strings as nonstring
scsi: mpi3mr: Mark device strings as nonstring
...
|
|
Use the "struct" keyword in kernel-doc when describing struct
ipefs_file. Add kernel-doc for the struct members also.
Don't use kernel-doc notation for 'policy_subdir'. kernel-doc does
not support documentation comments for data definitions.
This eliminates multiple kernel-doc warnings:
security/ipe/policy_fs.c:21: warning: cannot understand function prototype: 'struct ipefs_file '
security/ipe/policy_fs.c:407: warning: cannot understand function prototype: 'const struct ipefs_file policy_subdir[] = '
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Fan Wu <wufan@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Fan Wu <wufan@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs async dir updates from Christian Brauner:
"This contains cleanups that fell out of the work from async directory
handling:
- Change kern_path_locked() and user_path_locked_at() to never return
a negative dentry. This simplifies the usability of these helpers
in various places
- Drop d_exact_alias() from the remaining place in NFS where it is
still used. This also allows us to drop the d_exact_alias() helper
completely
- Drop an unnecessary call to fh_update() from nfsd_create_locked()
- Change i_op->mkdir() to return a struct dentry
Change vfs_mkdir() to return a dentry provided by the filesystems
which is hashed and positive. This allows us to reduce the number
of cases where the resulting dentry is not positive to very few
cases. The code in these places becomes simpler and easier to
understand.
- Repack DENTRY_* and LOOKUP_* flags"
* tag 'vfs-6.15-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
doc: fix inline emphasis warning
VFS: Change vfs_mkdir() to return the dentry.
nfs: change mkdir inode_operation to return alternate dentry if needed.
fuse: return correct dentry for ->mkdir
ceph: return the correct dentry on mkdir
hostfs: store inode in dentry after mkdir if possible.
Change inode_operations.mkdir to return struct dentry *
nfsd: drop fh_update() from S_IFDIR branch of nfsd_create_locked()
nfs/vfs: discard d_exact_alias()
VFS: add common error checks to lookup_one_qstr_excl()
VFS: change kern_path_locked() and user_path_locked_at() to never return negative dentry
VFS: repack LOOKUP_ bit flags.
VFS: repack DENTRY_ flags.
|