bianbu-linux-6.6/kernel
wanghongzhe a381b70a1c seccomp: Improve performace by optimizing rmb()
According to Kees's suggest, we started with the patch that just replaces
rmb() with smp_rmb() and did a performance test with UnixBench. The
results showed the overhead about 2.53% in rmb() test compared to the
smp_rmb() one, in a x86-64 kernel with CONFIG_SMP enabled running inside a
qemu-kvm vm. The test is a "syscall" testcase in UnixBench, which executes
5 syscalls in a loop during a certain timeout (100 second in our test) and
counts the total number of executions of this 5-syscall sequence. We set
a seccomp filter with all allow rule for all used syscalls in this test
(which will go bitmap path) to make sure the rmb() will be executed. The
details for the test:

with rmb():
/txm # ./syscall_allow_min 100
COUNT|35861159|1|lps
/txm # ./syscall_allow_min 100
COUNT|35545501|1|lps
/txm # ./syscall_allow_min 100
COUNT|35664495|1|lps

with smp_rmb():
/txm # ./syscall_allow_min 100
COUNT|36552771|1|lps
/txm # ./syscall_allow_min 100
COUNT|36491247|1|lps
/txm # ./syscall_allow_min 100
COUNT|36504746|1|lps

For a x86-64 kernel with CONFIG_SMP enabled, the smp_rmb() is just a
compiler barrier() which have no impact in runtime, while rmb() is a
lfence which will prevent all memory access operations (not just load
according the recently claim by Intel) behind itself. We can also figure
it out in disassembly:

with rmb():
0000000000001430 <__seccomp_filter>:
    1430:   41 57                   push   %r15
    1432:   41 56                   push   %r14
    1434:   41 55                   push   %r13
    1436:   41 54                   push   %r12
    1438:   55                      push   %rbp
    1439:   53                      push   %rbx
    143a:   48 81 ec 90 00 00 00    sub    $0x90,%rsp
    1441:   89 7c 24 10             mov    %edi,0x10(%rsp)
    1445:   89 54 24 14             mov    %edx,0x14(%rsp)
    1449:   65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
    1450:   00 00
    1452:   48 89 84 24 88 00 00    mov    %rax,0x88(%rsp)
    1459:   00
    145a:   31 c0                   xor    %eax,%eax
*   145c:   0f ae e8                lfence
    145f:   48 85 f6                test   %rsi,%rsi
    1462:   49 89 f4                mov    %rsi,%r12
    1465:   0f 84 42 03 00 00       je     17ad <__seccomp_filter+0x37d>
    146b:   65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
    1472:   00 00
    1474:   48 8b 98 80 07 00 00    mov    0x780(%rax),%rbx
    147b:   48 85 db                test   %rbx,%rbx

with smp_rmb();
0000000000001430 <__seccomp_filter>:
    1430:   41 57                   push   %r15
    1432:   41 56                   push   %r14
    1434:   41 55                   push   %r13
    1436:   41 54                   push   %r12
    1438:   55                      push   %rbp
    1439:   53                      push   %rbx
    143a:   48 81 ec 90 00 00 00    sub    $0x90,%rsp
    1441:   89 7c 24 10             mov    %edi,0x10(%rsp)
    1445:   89 54 24 14             mov    %edx,0x14(%rsp)
    1449:   65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
    1450:   00 00
    1452:   48 89 84 24 88 00 00    mov    %rax,0x88(%rsp)
    1459:   00
    145a:   31 c0                   xor    %eax,%eax
    145c:   48 85 f6                test   %rsi,%rsi
    145f:   49 89 f4                mov    %rsi,%r12
    1462:   0f 84 42 03 00 00       je     17aa <__seccomp_filter+0x37a>
    1468:   65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
    146f:   00 00
    1471:   48 8b 98 80 07 00 00    mov    0x780(%rax),%rbx
    1478:   48 85 db                test   %rbx,%rbx

Signed-off-by: wanghongzhe <wanghongzhe@huawei.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/1612496049-32507-1-git-send-email-wanghongzhe@huawei.com
2021-02-10 12:40:11 -08:00
..
bpf Merge branch 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-12-15 19:29:43 -08:00
cgroup Merge branch 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2020-12-28 11:16:38 -08:00
configs Input: gtco - remove driver 2020-12-09 17:47:36 -08:00
debug smp: Cleanup smp_call_function*() 2020-11-24 16:47:49 +01:00
dma dma-mapping updates for 5.11: 2020-12-22 13:19:43 -08:00
entry The new preemtible kmap_local() implementation: 2020-12-14 18:35:53 -08:00
events Merge branch 'exec-update-lock-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-12-15 19:36:48 -08:00
gcov gcov: fix kernel-doc markup issue 2020-12-15 22:46:18 -08:00
irq genirq: Fix export of irq_to_desc() for powerpc KVM 2020-12-25 11:02:39 -08:00
kcsan kcsan: Fix encoding masks and regain address bit 2020-11-06 17:19:26 -08:00
livepatch livepatch: Use the default ftrace_ops instead of REGS when ARGS is available 2020-11-13 12:15:28 -05:00
locking A moderate set of locking updates: 2020-12-14 17:27:47 -08:00
power Power management updates for 5.11-rc1 2020-12-15 16:30:31 -08:00
printk printk changes for 5.11 2020-12-16 10:45:11 -08:00
rcu Scheduler updates: 2020-12-14 18:29:11 -08:00
sched Fix a context switch performance regression. 2020-12-27 09:00:47 -08:00
time Update/fix two CPU sanity checks in the hotplug and the boot code, 2020-12-27 09:03:41 -08:00
trace Tracing updates for 5.11 2020-12-17 13:22:17 -08:00
.gitignore
acct.c kernel/acct.c: use #elif instead of #end and #elif 2020-12-15 22:46:15 -08:00
async.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
audit.c audit: replace atomic_add_return() 2020-12-02 22:52:16 -05:00
audit.h audit: change unnecessary globals into statics 2020-08-17 20:26:58 -04:00
audit_fsnotify.c fsnotify: generalize handle_inode_event() 2020-12-03 14:58:35 +01:00
audit_tree.c fsnotify: generalize handle_inode_event() 2020-12-03 14:58:35 +01:00
audit_watch.c fsnotify: generalize handle_inode_event() 2020-12-03 14:58:35 +01:00
auditfilter.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
auditsc.c audit/stable-5.11 PR 20201214 2020-12-16 10:54:03 -08:00
backtracetest.c treewide: Replace DECLARE_TASKLET() with DECLARE_TASKLET_OLD() 2020-07-30 11:15:58 -07:00
bounds.c
capability.c LSM: Signal to SafeSetID when setting group IDs 2020-10-13 09:17:34 -07:00
compat.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
configs.c
context_tracking.c context_tracking: Ensure that the critical path cannot be instrumented 2020-06-11 15:14:36 +02:00
cpu.c Scheduler updates: 2020-12-14 18:29:11 -08:00
cpu_pm.c notifier: Fix broken error handling pattern 2020-09-01 09:58:03 +02:00
crash_core.c kdump: append uts_namespace.name offset to VMCOREINFO 2020-12-15 22:46:18 -08:00
crash_dump.c
cred.c exec: Teach prepare_exec_creds how exec treats uids & gids 2020-05-20 14:44:21 -05:00
delayacct.c
dma.c
exec_domain.c
exit.c kernel/io_uring: cancel io_uring before task works 2020-12-30 19:36:54 -07:00
extable.c
fail_function.c fault-injection: handle EI_ETYPE_TRUE 2020-12-15 22:46:19 -08:00
fork.c kasan: rename (un)poison_shadow to (un)poison_range 2020-12-22 12:55:06 -08:00
freezer.c
futex.c Merge branch 'locking/rwsem' 2020-12-09 17:08:45 +01:00
gen_kheaders.sh kbuild: add variables for compression tools 2020-06-06 23:42:01 +09:00
groups.c LSM: Signal to SafeSetID when setting group IDs 2020-10-13 09:17:34 -07:00
hung_task.c kernel/hung_task.c: make type annotations consistent 2020-11-02 12:14:19 -08:00
iomem.c
irq_work.c irq_work: Optimize irq_work_single() 2020-11-24 16:47:49 +01:00
jump_label.c jump_label: Fix usage in module __init 2020-12-18 16:53:12 +01:00
kallsyms.c treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
kcmp.c Merge branch 'exec-update-lock-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-12-15 19:36:48 -08:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kernel: make kcov_common_handle consider the current context 2020-11-02 18:00:20 -08:00
kexec.c LSM: Introduce kernel_post_load_data() hook 2020-10-05 13:37:03 +02:00
kexec_core.c crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
kexec_elf.c
kexec_file.c crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
kexec_internal.h
kheaders.c
kmod.c kmod: remove redundant "be an" in the comment 2020-08-12 10:58:01 -07:00
kprobes.c Merge branch 'linus' into perf/kprobes 2020-11-07 13:18:49 +01:00
ksysfs.c
kthread.c Merge branch 'akpm' (patches from Andrew) 2020-12-15 12:53:37 -08:00
latencytop.c
Makefile kcov: don't instrument with UBSAN 2020-12-15 22:46:19 -08:00
module-internal.h
module.c Modules updates for v5.11 2020-12-17 13:01:31 -08:00
module_signature.c
module_signing.c
notifier.c notifier: Fix broken error handling pattern 2020-09-01 09:58:03 +02:00
nsproxy.c fixes-v5.11 2020-12-14 16:40:27 -08:00
padata.c padata: fix possible padata_works_lock deadlock 2020-09-04 17:51:55 +10:00
panic.c panic: don't dump stack twice on warn 2020-11-14 11:26:04 -08:00
params.c Modules updates for v5.11 2020-12-17 13:01:31 -08:00
pid.c Merge branch 'exec-update-lock-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-12-15 19:36:48 -08:00
pid_namespace.c fixes-v5.11 2020-12-14 16:40:27 -08:00
profile.c
ptrace.c Merge branch 'akpm' (patches from Andrew) 2020-12-15 12:53:37 -08:00
range.c kernel.h: split out min()/max() et al. helpers 2020-10-16 11:11:19 -07:00
reboot.c reboot: hide from sysfs not applicable settings 2020-12-15 22:46:19 -08:00
regset.c regset: kill ->get() 2020-07-27 14:31:12 -04:00
relay.c relay: allow the use of const callback structs 2020-12-15 22:46:18 -08:00
resource.c kernel/resource.c: fix kernel-doc markups 2020-12-15 22:46:18 -08:00
resource_kunit.c resource: provide meaningful MODULE_LICENSE() in test suite 2020-11-25 18:52:35 +01:00
rseq.c
scftorture.c scftorture: Add full-test stutter capability 2020-11-06 17:13:56 -08:00
scs.c scs: switch to vmapped shadow stacks 2020-12-01 10:30:28 +00:00
seccomp.c seccomp: Improve performace by optimizing rmb() 2021-02-10 12:40:11 -08:00
signal.c tif-task_work.arch-2020-12-14 2020-12-16 12:33:35 -08:00
smp.c Merge branch 'linus' into sched/core, to resolve semantic conflict 2020-11-27 11:10:50 +01:00
smpboot.c
smpboot.h
softirq.c Misc fixes/updates: 2020-12-27 09:06:10 -08:00
stackleak.c stackleak: let stack_erasing_sysctl take a kernel pointer buffer 2020-09-19 13:13:39 -07:00
stacktrace.c stacktrace: Remove reliable argument from arch_stack_walk() callback 2020-09-18 14:24:16 +01:00
static_call.c static_call: Fix return type of static_call_init 2020-10-02 21:18:25 +02:00
stop_machine.c Merge branch 'linus' into sched/core, to resolve semantic conflict 2020-11-27 11:10:50 +01:00
sys.c kernel: Implement selective syscall userspace redirection 2020-12-02 15:07:56 +01:00
sys_ni.c epoll: wire up syscall epoll_pwait2 2020-12-19 11:18:38 -08:00
sysctl-test.c
sysctl.c rcu: Panic after fixed number of stalls 2020-11-19 19:37:16 -08:00
task_work.c task_work: remove legacy TWA_SIGNAL path 2020-12-12 09:17:38 -07:00
taskstats.c treewide: rename nla_strlcpy to nla_strscpy. 2020-11-16 08:08:54 -08:00
test_kprobes.c
torture.c rcutorture: Make stutter_wait() caller restore priority 2020-11-06 17:13:54 -08:00
tracepoint.c tracepoints: Migrate to use SYSCALL_WORK flag 2020-11-16 21:53:15 +01:00
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c usermodehelper: reset umask to default before executing user process 2020-10-06 10:31:52 -07:00
up.c
user-return-notifier.c
user.c user: Use generic ns_common::count 2020-08-19 14:14:12 +02:00
user_namespace.c fixes-v5.11 2020-12-14 16:40:27 -08:00
usermode_driver.c umd: Stop using split_argv 2020-07-07 11:58:59 -05:00
utsname.c uts: Use generic ns_common::count 2020-08-19 14:13:20 +02:00
utsname_sysctl.c
watch_queue.c watch_queue: Limit the number of watches a user can hold 2020-08-17 09:39:18 -07:00
watchdog.c kernel/watchdog: fix watchdog_allowed_mask not used warning 2020-11-14 11:26:03 -08:00
watchdog_hld.c
workqueue.c Merge branch 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2020-12-28 11:23:02 -08:00
workqueue_internal.h