Merge branch 'ucount-fixes-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull ucounts fixes from Eric Biederman:
 "There has been one very hard to track down bug in the ucount code that
  we have been tracking since roughly v5.14 was released. Alex managed
  to find a reliable reproducer a few days ago and then I was able to
  instrument the code and figure out what the issue was.

  It turns out the sigqueue_alloc single atomic operation optimization
  did not play nicely with ucounts multiple level rlimits. It turned out
  that either sigqueue_alloc or sigqueue_free could be operating on
  multiple levels and trigger the conditions for the optimization on
  more than one level at the same time.

  To deal with that situation I have introduced inc_rlimit_get_ucounts
  and dec_rlimit_put_ucounts that just focuses on the optimization and
  the rlimit and ucount changes.

  While looking into the big bug I found I couple of other little issues
  so I am including those fixes here as well.

  When I have time I would very much like to dig into process ownership
  of the shared signal queue and see if we could pick a single owner for
  the entire queue so that all of the rlimits can count to that owner.
  That should entirely remove the need to call get_ucounts and
  put_ucounts in sigqueue_alloc and sigqueue_free. It is difficult
  because Linux unlike POSIX supports setuid that works on a single
  thread"

* 'ucount-fixes-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ucounts: Move get_ucounts from cred_alloc_blank to key_change_session_keyring
  ucounts: Proper error handling in set_cred_ucounts
  ucounts: Pair inc_rlimit_ucounts with dec_rlimit_ucoutns in commit_creds
  ucounts: Fix signal ucount refcounting
This commit is contained in:
Linus Torvalds 2021-10-21 17:27:17 -10:00
commit 9d235ac01f
5 changed files with 69 additions and 24 deletions

View file

@ -426,22 +426,10 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
*/
rcu_read_lock();
ucounts = task_ucounts(t);
sigpending = inc_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 1);
switch (sigpending) {
case 1:
if (likely(get_ucounts(ucounts)))
break;
fallthrough;
case LONG_MAX:
/*
* we need to decrease the ucount in the userns tree on any
* failure to avoid counts leaking.
*/
dec_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 1);
rcu_read_unlock();
return NULL;
}
sigpending = inc_rlimit_get_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING);
rcu_read_unlock();
if (!sigpending)
return NULL;
if (override_rlimit || likely(sigpending <= task_rlimit(t, RLIMIT_SIGPENDING))) {
q = kmem_cache_alloc(sigqueue_cachep, gfp_flags);
@ -450,8 +438,7 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
}
if (unlikely(q == NULL)) {
if (dec_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 1))
put_ucounts(ucounts);
dec_rlimit_put_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING);
} else {
INIT_LIST_HEAD(&q->list);
q->flags = sigqueue_flags;
@ -464,8 +451,8 @@ static void __sigqueue_free(struct sigqueue *q)
{
if (q->flags & SIGQUEUE_PREALLOC)
return;
if (q->ucounts && dec_rlimit_ucounts(q->ucounts, UCOUNT_RLIMIT_SIGPENDING, 1)) {
put_ucounts(q->ucounts);
if (q->ucounts) {
dec_rlimit_put_ucounts(q->ucounts, UCOUNT_RLIMIT_SIGPENDING);
q->ucounts = NULL;
}
kmem_cache_free(sigqueue_cachep, q);