linux-xiaomi-chiron/kernel
Frederick Lawler 7cd4c5c210 security, lsm: Introduce security_create_user_ns()
User namespaces are an effective tool to allow programs to run with
permission without requiring the need for a program to run as root. User
namespaces may also be used as a sandboxing technique. However, attackers
sometimes leverage user namespaces as an initial attack vector to perform
some exploit. [1,2,3]

While it is not the unprivileged user namespace functionality, which
causes the kernel to be exploitable, users/administrators might want to
more granularly limit or at least monitor how various processes use this
functionality, while vulnerable kernel subsystems are being patched.

Preventing user namespace already creation comes in a few of forms in
order of granularity:

        1. /proc/sys/user/max_user_namespaces sysctl
        2. Distro specific patch(es)
        3. CONFIG_USER_NS

To block a task based on its attributes, the LSM hook cred_prepare is a
decent candidate for use because it provides more granular control, and
it is called before create_user_ns():

        cred = prepare_creds()
                security_prepare_creds()
                        call_int_hook(cred_prepare, ...
        if (cred)
                create_user_ns(cred)

Since security_prepare_creds() is meant for LSMs to copy and prepare
credentials, access control is an unintended use of the hook. [4]
Further, security_prepare_creds() will always return a ENOMEM if the
hook returns any non-zero error code.

This hook also does not handle the clone3 case which requires us to
access a user space pointer to know if we're in the CLONE_NEW_USER
call path which may be subject to a TOCTTOU attack.

Lastly, cred_prepare is called in many call paths, and a targeted hook
further limits the frequency of calls which is a beneficial outcome.
Therefore introduce a new function security_create_user_ns() with an
accompanying userns_create LSM hook.

With the new userns_create hook, users will have more control over the
observability and access control over user namespace creation. Users
should expect that normal operation of user namespaces will behave as
usual, and only be impacted when controls are implemented by users or
administrators.

This hook takes the prepared creds for LSM authors to write policy
against. On success, the new namespace is applied to credentials,
otherwise an error is returned.

Links:
1. https://nvd.nist.gov/vuln/detail/CVE-2022-0492
2. https://nvd.nist.gov/vuln/detail/CVE-2022-25636
3. https://nvd.nist.gov/vuln/detail/CVE-2022-34918
4. https://lore.kernel.org/all/1c4b1c0d-12f6-6e9e-a6a3-cdce7418110c@schaufler-ca.com/

Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Reviewed-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Frederick Lawler <fred@cloudflare.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-08-16 17:32:46 -04:00
..
bpf bpf: Shut up kern_sys_bpf warning. 2022-08-10 23:58:13 -07:00
cgroup Various fixes: a deadline scheduler fix, a migration fix, a Sparse fix and a comment fix. 2022-08-06 17:34:06 -07:00
configs xen: branch for v6.0-rc1b 2022-08-14 09:28:54 -07:00
debug Modules updates for v5.19-rc1 2022-05-26 17:13:43 -07:00
dma remoteproc updates for v5.20 2022-08-08 15:16:29 -07:00
entry context_tracking: Take NMI eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
events Misc fixes to kprobes and the faddr2line script, plus a cleanup. 2022-08-06 17:28:12 -07:00
futex drm for 5.19-rc1 2022-05-25 16:18:27 -07:00
gcov gcov: Remove compiler version check 2021-12-02 17:25:21 +09:00
irq irqchip/genirq updates for 5.20: 2022-07-28 12:36:35 +02:00
kcsan kcsan: test: Add a .kunitconfig to run KCSAN tests 2022-07-22 09:22:59 -06:00
livepatch Livepatching changes for 5.19 2022-06-02 08:55:01 -07:00
locking RCU pull request for v5.20 (or whatever) 2022-08-02 19:12:45 -07:00
module Modules updates for 6.0 2022-08-08 14:12:19 -07:00
power Char / Misc driver changes for 6.0-rc1 2022-08-04 11:05:48 -07:00
printk printk: do not wait for consoles when suspended 2022-07-15 10:52:11 +02:00
rcu - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
sched Various fixes: a deadline scheduler fix, a migration fix, a Sparse fix and a comment fix. 2022-08-06 17:34:06 -07:00
time time: Correct the prototype of ns_to_kernel_old_timeval and ns_to_timespec64 2022-08-09 20:02:13 +02:00
trace Tracing updates for 5.20 / 6.0 2022-08-05 09:41:12 -07:00
.gitignore
acct.c kernel/acct: move acct sysctls to its own file 2022-04-06 13:43:44 -07:00
async.c Revert "module, async: async_synchronize_full() on module init iff async is used" 2022-02-03 11:20:34 -08:00
audit.c audit: make is_audit_feature_set() static 2022-06-13 14:08:57 -04:00
audit.h audit: log AUDIT_TIME_* records only from rules 2022-02-22 13:51:40 -05:00
audit_fsnotify.c fsnotify: make allow_dups a property of the group 2022-04-25 14:37:18 +02:00
audit_tree.c audit: use fsnotify group lock helpers 2022-04-25 14:37:28 +02:00
audit_watch.c fsnotify: pass flags argument to fsnotify_alloc_group() 2022-04-25 14:37:12 +02:00
auditfilter.c audit/stable-5.17 PR 20220110 2022-01-11 13:08:21 -08:00
auditsc.c audit, io_uring, io-wq: Fix memory leak in io_sq_thread() and io_wqe_worker() 2022-08-04 08:33:54 -06:00
backtracetest.c
bounds.c
capability.c xfs: don't generate selinux audit messages for capability testing 2022-03-09 10:32:06 -08:00
cfi.c context_tracking: Take IRQ eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
compat.c arch: remove compat_alloc_user_space 2021-09-08 15:32:35 -07:00
configs.c
context_tracking.c MAINTAINERS: Add Paul as context tracking maintainer 2022-07-05 13:33:00 -07:00
cpu.c Intel Trust Domain Extensions 2022-05-23 17:51:12 -07:00
cpu_pm.c context_tracking: Take IRQ eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
crash_core.c kdump: round up the total memory size to 128M for crashkernel reservation 2022-07-17 17:31:40 -07:00
crash_dump.c
cred.c x86: Mark __invalid_creds() __noreturn 2022-03-15 10:32:44 +01:00
delayacct.c delayacct: track delays from write-protect copy 2022-06-01 15:55:25 -07:00
dma.c
exec_domain.c
exit.c exit: Fix typo in comment: s/sub-theads/sub-threads 2022-08-03 10:44:54 +02:00
extable.c context_tracking: Take NMI eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
fail_function.c
fork.c Tracing updates for 5.20 / 6.0 2022-08-05 09:41:12 -07:00
freezer.c
gen_kheaders.sh kheaders: Have cpio unconditionally replace files 2022-05-08 03:16:59 +09:00
groups.c security: Add LSM hook to setgroups() syscall 2022-07-15 18:21:49 +00:00
hung_task.c kernel/hung_task: fix address space of proc_dohung_task_timeout_secs 2022-07-29 18:12:35 -07:00
iomem.c
irq_work.c irq_work: use kasan_record_aux_stack_noalloc() record callstack 2022-04-15 14:49:55 -07:00
jump_label.c jump_label: make initial NOP patching the special case 2022-06-24 09:48:55 +02:00
kallsyms.c Updates to various subsystems which I help look after. lib, ocfs2, 2022-08-07 10:03:24 -07:00
kallsyms_internal.h kallsyms: move declarations to internal header 2022-07-17 17:31:39 -07:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/rwlock: Provide RT variant 2021-08-17 17:50:51 +02:00
Kconfig.preempt Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels" 2022-03-31 10:36:55 +02:00
kcov.c kcov: update pos before writing pc in trace function 2022-05-25 13:05:42 -07:00
kexec.c kexec: avoid compat_alloc_user_space 2021-09-08 15:32:34 -07:00
kexec_core.c kexec: drop weak attribute from functions 2022-07-15 12:21:16 -04:00
kexec_elf.c
kexec_file.c Updates to various subsystems which I help look after. lib, ocfs2, 2022-08-07 10:03:24 -07:00
kexec_internal.h
kheaders.c
kmod.c
kprobes.c kprobes: Forbid probing on trampoline and BPF code areas 2022-08-02 11:47:29 +02:00
ksysfs.c kernel/ksysfs.c: use helper macro __ATTR_RW 2022-03-23 19:00:33 -07:00
kthread.c kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal 2022-06-16 19:11:30 -07:00
latencytop.c latencytop: move sysctl to its own file 2022-04-21 11:40:59 -07:00
Makefile kernel: remove platform_has() infrastructure 2022-08-01 07:42:56 +02:00
module_signature.c
notifier.c notifier: Add blocking/atomic_notifier_chain_register_unique_prio() 2022-05-19 19:30:30 +02:00
nsproxy.c fs/exec: allow to unshare a time namespace on vfork+exec 2022-06-15 07:58:04 -07:00
padata.c padata: replace cpumask_weight with cpumask_empty in padata.c 2022-01-31 11:21:46 +11:00
panic.c linux-kselftest-kunit-5.20-rc1 2022-08-02 19:34:45 -07:00
params.c kobject: remove kset from struct kset_uevent_ops callbacks 2021-12-28 11:26:18 +01:00
pid.c pid: add pidfd_get_task() helper 2021-10-14 13:29:18 +02:00
pid_namespace.c kernel: pid_namespace: use NULL instead of using plain integer as pointer 2022-04-29 14:38:00 -07:00
profile.c profile: setup_profiling_timer() is moslty not implemented 2022-07-29 18:12:36 -07:00
ptrace.c ptrace: fix clearing of JOBCTL_TRACED in ptrace_unfreeze_traced() 2022-07-09 11:06:19 -07:00
range.c
reboot.c Merge branch 'rework/kthreads' into for-linus 2022-06-23 19:11:28 +02:00
regset.c
relay.c relay: remove redundant assignment to pointer buf 2022-05-12 20:38:37 -07:00
resource.c resource: Introduce alloc_free_mem_region() 2022-07-21 17:19:25 -07:00
resource_kunit.c
rseq.c rseq: Kill process when unknown flags are encountered in ABI structures 2022-08-01 15:21:42 +02:00
scftorture.c scftorture: Fix distribution of short handler delays 2022-04-11 17:07:29 -07:00
scs.c kasan, vmalloc: only tag normal vmalloc allocations 2022-03-24 19:06:48 -07:00
seccomp.c seccomp: Add wait_killable semantic to seccomp user notifier 2022-05-03 14:11:58 -07:00
signal.c signal handling: don't use BUG_ON() for debugging 2022-07-07 09:53:43 -07:00
smp.c locking/csd_lock: Change csdlock_debug from early_param to __setup 2022-07-19 11:40:00 -07:00
smpboot.c cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again. 2022-04-12 14:13:01 +02:00
smpboot.h
softirq.c context_tracking: Take IRQ eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
stackleak.c stackleak: add on/off stack variants 2022-05-08 01:33:09 -07:00
stacktrace.c uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
static_call.c static_call: Don't make __static_call_return0 static 2022-04-05 09:59:38 +02:00
static_call_inline.c static_call: Don't make __static_call_return0 static 2022-04-05 09:59:38 +02:00
stop_machine.c Scheduler changes in this cycle were: 2022-05-24 11:11:13 -07:00
sys.c arm64/sme: Implement vector length configuration prctl()s 2022-04-22 18:50:54 +01:00
sys_ni.c mm/mempolicy: wire up syscall set_mempolicy_home_node 2022-01-15 16:30:30 +02:00
sysctl-test.c
sysctl.c kernel/sysctl.c: Remove trailing white space 2022-08-08 09:01:36 -07:00
task_work.c task_work: allow TWA_SIGNAL without a rescheduling IPI 2022-04-30 08:39:32 -06:00
taskstats.c kernel: make taskstats available from all net namespaces 2022-04-29 14:38:03 -07:00
torture.c torture: Wake up kthreads after storing task_struct pointer 2022-02-01 17:24:39 -08:00
tracepoint.c tracepoint: Fix kerneldoc comments 2021-08-16 11:39:51 -04:00
tsacct.c taskstats: version 12 with thread group and exe info 2022-04-29 14:38:03 -07:00
ucount.c ucounts: Handle wrapping in is_ucounts_overlimit 2022-02-17 09:11:57 -06:00
uid16.c
uid16.h
umh.c kthread: Don't allocate kthread_struct for init and umh 2022-05-06 14:49:44 -05:00
up.c
user-return-notifier.c
user.c fs/epoll: use a per-cpu counter for user's watches count 2021-09-08 11:50:27 -07:00
user_namespace.c security, lsm: Introduce security_create_user_ns() 2022-08-16 17:32:46 -04:00
usermode_driver.c blob_to_mnt(): kern_unmount() is needed to undo kern_mount() 2022-05-19 23:25:47 -04:00
utsname.c
utsname_sysctl.c
watch_queue.c This was a moderately busy cycle for documentation, but nothing all that 2022-08-02 19:24:24 -07:00
watchdog.c powerpc updates for 6.0 2022-08-06 16:38:17 -07:00
watchdog_hld.c Revert "printk: add functions to prefer direct printing" 2022-06-23 18:41:40 +02:00
workqueue.c drm for 5.20/6.0 2022-08-03 19:52:08 -07:00
workqueue_internal.h workqueue: Assign a color to barrier work items 2021-08-17 07:49:10 -10:00