bianbu-linux-6.6/include/trace/events
Mel Gorman 1b4e3f26f9 mm: vmscan: Reduce throttling due to a failure to make progress
Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar
problems due to reclaim throttling for excessive lengths of time.  In
Alexey's case, a memory hog that should go OOM quickly stalls for
several minutes before stalling.  In Mike and Darrick's cases, a small
memcg environment stalled excessively even though the system had enough
memory overall.

Commit 69392a403f ("mm/vmscan: throttle reclaim when no progress is
being made") introduced the problem although commit a19594ca4a
("mm/vmscan: increase the timeout if page reclaim is not making
progress") made it worse.  Systems at or near an OOM state that cannot
be recovered must reach OOM quickly and memcg should kill tasks if a
memcg is near OOM.

To address this, only stall for the first zone in the zonelist, reduce
the timeout to 1 tick for VMSCAN_THROTTLE_NOPROGRESS and only stall if
the scan control nr_reclaimed is 0, kswapd is still active and there
were excessive pages pending for writeback.  If kswapd has stopped
reclaiming due to excessive failures, do not stall at all so that OOM
triggers relatively quickly.  Similarly, if an LRU is simply congested,
only lightly throttle similar to NOPROGRESS.

Alexey's original case was the most straight forward

	for i in {1..3}; do tail /dev/zero; done

On vanilla 5.16-rc1, this test stalled heavily, after the patch the test
completes in a few seconds similar to 5.15.

Alexey's second test case added watching a youtube video while tail runs
10 times.  On 5.15, playback only jitters slightly, 5.16-rc1 stalls a
lot with lots of frames missing and numerous audio glitches.  With this
patch applies, the video plays similarly to 5.15.

[lkp@intel.com: Fix W=1 build warning]

Link: https://lore.kernel.org/r/99e779783d6c7fce96448a3402061b9dc1b3b602.camel@gmx.de
Link: https://lore.kernel.org/r/20211124011954.7cab9bb4@mail.inbox.lv
Link: https://lore.kernel.org/r/20211022144651.19914-1-mgorman@techsingularity.net
Link: https://lore.kernel.org/r/20211202150614.22440-1-mgorman@techsingularity.net
Link: https://linux-regtracking.leemhuis.info/regzbot/regression/20211124011954.7cab9bb4@mail.inbox.lv/
Reported-and-tested-by: Alexey Avramov <hakavlad@inbox.lv>
Reported-and-tested-by: Mike Galbraith <efault@gmx.de>
Reported-and-tested-by: Darrick J. Wong <djwong@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Hugh Dickins <hughd@google.com>
Tracked-by: Thorsten Leemhuis <regressions@leemhuis.info>
Fixes: 69392a403f ("mm/vmscan: throttle reclaim when no progress is being made")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-31 11:17:07 -08:00
..
9p.h
afs.h netfs, 9p, afs, ceph: Use folios 2021-11-10 21:16:56 +00:00
alarmtimer.h
asoc.h ASoC: soc-core: tidyup jack.h 2020-11-30 12:54:01 +00:00
avc.h selinux: add basic filtering for audit trace events 2020-08-21 17:07:29 -04:00
bcache.h block: remove superfluous param in blk_fill_rwbs() 2021-02-22 06:37:41 -07:00
block.h block: don't call blk_status_to_errno in blk_update_request 2021-10-19 05:54:57 -06:00
bpf_test_run.h selftests: bpf: test writable buffers in raw tps 2019-04-26 19:04:19 -07:00
bridge.h net: bridge: fdb: br_fdb_update can take flags directly 2019-11-01 10:32:43 -07:00
btrfs.h btrfs: use delalloc_bytes to determine flush amount for shrink_delalloc 2021-08-23 13:19:07 +02:00
cachefiles.h cachefiles: Fix oops with cachefiles_cull() due to NULL object 2021-10-05 11:22:06 +01:00
cgroup.h cgroup: use cgrp->kn->id as the cgroup ID 2019-11-12 08:18:04 -08:00
clk.h clk: Trace clk_set_rate() "range" functions 2020-12-17 01:54:31 -08:00
cma.h mm, tracing: unify PFN format strings 2021-06-29 10:53:52 -07:00
compaction.h mm/page_alloc: integrate classzone_idx and high_zoneidx 2020-06-03 20:09:44 -07:00
context_tracking.h
cpuhp.h treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively 2019-04-09 14:19:06 +02:00
damon.h mm/damon: add a tracepoint 2021-09-08 11:50:24 -07:00
devfreq.h PM / devfreq: Add tracepoint for frequency changes 2020-10-26 10:52:37 +09:00
devlink.h devlink: Reduce struct devlink exposure 2021-10-12 16:29:16 -07:00
dma_fence.h treewide: Add missing semicolons to __assign_str uses 2021-06-30 09:19:14 -04:00
erofs.h erofs: get compression algorithms directly on mapping 2021-10-18 00:15:55 +08:00
error_report.h tracing: add error_report_end trace point 2021-02-26 09:41:02 -08:00
ext4.h ext4: delete some unused tracepoint definitions 2021-04-02 11:38:06 -04:00
f2fs.h f2fs: multidevice: support direct IO 2021-10-26 14:04:30 -07:00
fib.h net: Replace nhc_has_gw with nhc_gw_family 2019-04-08 15:22:40 -07:00
fib6.h ipv6: Add fib6_type and fib6_flags to fib6_result 2019-04-17 23:11:30 -07:00
filelock.h locks: Remove extra "0x" in tracepoint format specifier 2020-09-01 18:09:34 -04:00
filemap.h mm, tracing: unify PFN format strings 2021-06-29 10:53:52 -07:00
fs.h NFS: Move generic FS show macros to global header 2021-11-02 12:31:23 -04:00
fs_dax.h
fscache.h fscache: Use refcount_t for the cookie refcount instead of atomic_t 2021-08-27 13:34:03 +01:00
fsi.h trace: fsi: Print transfer size unsigned 2019-11-08 11:23:37 +01:00
fsi_master_aspeed.h fsi: aspeed: Add trace points 2019-11-08 11:28:20 +01:00
fsi_master_ast_cf.h
fsi_master_gpio.h
gpio.h tracing: stop making gpio tracing configurable 2019-04-08 15:11:48 +02:00
gpu_mem.h gpu/trace: Minor comment updates for gpu_mem_total tracepoint 2020-05-07 13:32:57 -04:00
host1x.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 1 2019-05-21 11:28:39 +02:00
huge_memory.h khugepaged: introduce 'max_ptes_shared' tunable 2020-06-03 20:09:46 -07:00
hwmon.h
i2c.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
ib_mad.h IB/MAD: Add SMP details to MAD tracing 2019-03-27 15:52:01 -03:00
ib_umad.h IB/UMAD: Add umad trace points 2019-03-27 15:52:01 -03:00
initcall.h
intel-sst.h
intel_iommu.h iommu/vt-d: Add prq_report trace event 2021-06-10 09:06:13 +02:00
intel_ish.h
io_uring.h io_uring: dump sqe contents if issue fails 2021-10-19 05:49:52 -06:00
iocost.h blk-iocost: Add iocg idle state tracepoint 2020-12-17 07:55:44 -07:00
iommu.h
ipi.h
irq.h
irq_matrix.h
iscsi.h scsi: iscsi: Capture iscsi debug messages using tracepoints 2018-12-20 20:03:55 -05:00
jbd2.h jbd2,ext4: add a shrinker to release checkpointed buffers 2021-06-24 10:54:49 -04:00
kmem.h mm, tracing: unify PFN format strings 2021-06-29 10:53:52 -07:00
kvm.h KVM: x86/mmu: Drop trace_kvm_age_page() tracepoint 2021-04-17 08:30:56 -04:00
kyber.h kyber: avoid q->disk dereferences in trace points 2021-10-15 21:02:57 -06:00
libata.h
lock.h
mce.h
mctp.h mctp: Add tracepoints for tag/key handling 2021-09-29 11:00:11 +01:00
mdio.h
migrate.h mm/migrate: demote pages during reclaim 2021-09-03 09:58:16 -07:00
mlxsw.h mlxsw: spectrum_acl: Rename rehash_dis trace 2019-03-31 11:01:23 -07:00
mmap.h mm: mmap: add trace point of vm_unmapped_area 2020-04-02 09:35:30 -07:00
mmap_lock.h mm: mmap_lock: use DECLARE_EVENT_CLASS and DEFINE_EVENT_FN 2021-11-06 13:30:36 -07:00
mmc.h
mmflags.h Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
module.h
mptcp.h mptcp: dump csum fields in mptcp_dump_mpext 2021-06-18 11:40:11 -07:00
napi.h tracing: Fix header include guards in trace event headers 2019-07-30 21:49:06 -04:00
nbd.h nbd: add tracepoints for send/receive timing 2019-04-26 19:04:19 -07:00
neigh.h neighbor: Add tracepoint to __neigh_create 2019-05-22 17:50:24 -07:00
net.h net: use %px to print skb address in trace_netif_receive_skb 2021-07-15 10:28:48 -07:00
net_probe_common.h
netfs.h netfs: Move cookie debug ID to struct netfs_cache_resources 2021-08-25 15:20:25 +01:00
netlink.h netlink: add tracepoint at NL_SET_ERR_MSG 2021-02-04 18:05:59 -08:00
nfs.h NFS: Move NFS protocol display macros to global header 2021-11-02 12:31:23 -04:00
nilfs2.h
nmi.h
objagg.h lib: introduce initial implementation of object aggregation manager 2018-11-15 14:43:43 -08:00
oom.h
osnoise.h tracing: Fix spelling in osnoise tracer "interferences" -> "interference" 2021-06-28 14:12:27 -04:00
page_isolation.h
page_pool.h mm, tracing: unify PFN format strings 2021-06-29 10:53:52 -07:00
page_ref.h mm: introduce PAGEFLAGS_MASK to replace ((1UL << NR_PAGEFLAGS) - 1) 2021-09-08 11:50:24 -07:00
pagemap.h mm/lru: Convert __pagevec_lru_add_fn to take a folio 2021-10-18 07:49:40 -04:00
percpu.h
power.h PM: QoS: Simplify definitions of CPU latency QoS trace events 2020-02-13 11:26:39 +01:00
power_cpu_migrate.h
preemptirq.h tracing: Change offset type to s32 in preempt/irq tracepoints 2020-01-03 11:34:37 -05:00
printk.h
pwc.h media: usb: pwc: Introduce TRACE_EVENTs for pwc_isoc_handler() 2019-01-16 11:15:11 -05:00
pwm.h pwm: Implement tracing for .get_state() and .apply_state() 2020-01-20 12:28:37 +01:00
qdisc.h qdisc: add new field for qdisc_enqueue tracepoint 2021-07-27 14:16:38 +01:00
qla.h scsi: qla2xxx: Suppress two recently introduced compiler warnings 2020-05-19 21:43:01 -04:00
qrtr.h net: qrtr: Add tracepoint support 2020-04-22 12:55:54 -07:00
random.h random: remove dead code left over from blocking pool 2021-04-02 18:28:12 +11:00
rcu.h rcu/nocb: Unify timers 2021-05-12 12:10:23 -07:00
rdma.h RDMA/core: Move the rdma_show_ib_cm_event() macro 2020-08-24 16:01:47 -03:00
rdma_core.h RDMA/core: Add trace points to follow MR allocation 2020-01-07 16:10:53 -04:00
regulator.h regulator: core: Add regulator bypass trace points 2020-05-29 17:17:02 +01:00
rpcgss.h sunrpc: fix header include guard in trace header 2021-11-17 18:27:32 -05:00
rpcrdma.h A slow cycle for nfsd: mainly cleanup, including Neil's patch dropping 2021-11-10 16:45:54 -08:00
rpm.h PM-runtime: add tracepoints for usage_count changes 2020-01-13 12:28:29 +01:00
rseq.h
rtc.h
rxrpc.h rxrpc: Fix a missing NULL-pointer check in a trace 2020-09-14 16:18:59 +01:00
sched.h sched/tracing: Remove the redundant 'success' in the sched tracepoint 2021-06-10 11:16:20 -04:00
scmi.h firmware: arm_scmi: Use signed integer to report transfer status 2020-06-30 14:07:08 +01:00
scsi.h scsi: core: Kill message byte 2021-05-31 22:48:24 -04:00
sctp.h sctp: move trace_sctp_probe_path into sctp_outq_sack 2019-12-26 13:06:45 -08:00
signal.h
siox.h
skb.h
smbus.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
sock.h net: sock: add trace for socket errors 2021-06-29 11:28:21 -07:00
spi.h spi: Enable tracing of the SPI setup CS selection 2021-05-26 21:22:13 +01:00
spmi.h
sunrpc.h A slow cycle for nfsd: mainly cleanup, including Neil's patch dropping 2021-11-10 16:45:54 -08:00
sunrpc_base.h SUNRPC: Tracepoints should display tk_pid and cl_clid as a fixed-size field 2021-10-20 18:09:54 -04:00
sunvnet.h
swiotlb.h
syscalls.h syscalls: Remove start and number from syscall_get_arguments() args 2019-04-05 09:26:43 -04:00
target.h scsi: target: core: Add CONTROL field for trace events 2020-10-02 18:36:19 -04:00
task.h
tcp.h tcp: add tracepoint for checksum errors 2021-05-14 15:26:03 -07:00
tegra_apb_dma.h tracing: Fix header include guards in trace event headers 2019-07-30 21:49:06 -04:00
thermal.h thermal: devfreq_cooling: change tracing function and arguments 2020-12-11 14:10:44 +01:00
thermal_power_allocator.h
thp.h
timer.h tracing: Fix various typos in comments 2021-03-23 14:08:18 -04:00
tlb.h
udp.h
ufs.h scsi: ufs: core: Enable power management for wlun 2021-05-10 22:28:20 -04:00
v4l2.h media: v4l2: abstract timeval handling in v4l2_buffer 2020-01-03 15:43:35 +01:00
vb2.h
vmscan.h mm: vmscan: Reduce throttling due to a failure to make progress 2021-12-31 11:17:07 -08:00
vsock_virtio_transport_common.h virtio/vsock: update trace event for SEQPACKET 2021-06-11 13:32:47 -07:00
wbt.h bdi: use bdi_dev_name() to get device name 2020-05-09 16:07:39 -06:00
workqueue.h workqueue/tracing: Copy workqueue name to buffer in trace event 2021-03-18 12:57:37 -04:00
writeback.h Merge branch 'akpm' (patches from Andrew) 2021-11-06 14:08:17 -07:00
xdp.h xdp: Extend xdp_redirect_map with broadcast support 2021-05-26 09:46:16 +02:00
xen.h x86/mm/tlb: Flush remote and local TLBs concurrently 2021-03-06 12:59:10 +01:00