Add RSS field, in LMK_PROCKILL cmd, to report the latest memory usage of
the killed process.
Test: Verified RSS field is captured in ApplicationExitInfo
Bug: 322549716
Change-Id: Ic1788e8121da97cd879bd7e9d685c7b879ea5475
Signed-off-by: Carlos Galo <carlosgalo@google.com>
When we get nothing from /proc/<PID>/status and /proc/<PID>/cmdline,
we should return NULL or False because this usually indicates the
process has already terminated. We should avoid attempting
to kill a non-existent process, as it's an unnecessary waste
of kill timeout.
Bug: 331612600
Test: give memory pressure to trigger LMKD
Change-Id: I468ff25012f9bb6fc842a7fad268ebcad0de4690
Signed-off-by: Martin Liu <liumartin@google.com>
Experiments on low-RAM devices indicate regressions due to the new low
memory kill reason which cause LMKD to kill too many processes. Change
ro.lmk.lowmem_min_oom_score to disable kills for this reason by default.
Bug: 341257415
Change-Id: Id7137c4c8d888061353b253dc6906d2854e31b1d
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
We are trying to remove BPF_FD_JUST_USE_INT since we now have access to
libbase everywhere.
Test: builds
Change-Id: Ie9445d3d648e6837deb718aa38ebef3c936653d6
Integrate io_uring within LMKD to batch the read, and write, system
calls needed to process, and register, processes and adjust their OOM
scores.
Test: atest lmkd_tests
Bug: 325525024
Change-Id: I339be2b6f569189519e0e11d07cd6d7d1cf2566d
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Adjusting the string copy message to utiilize the correct format for
direct_reclaim_duration_ms.
Test: m
Bug: 244232958
Change-Id: I3fbfc33e2520ef38b829db67ddb59c636a2bc3e1
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Creating new cmd in LMKD to batch PROCPRIO requests.
Test: atest lmkd_tests
Bug: 325525024
Change-Id: I5460446d4e968e80263aa25298e2a893863eece4
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Refactor cmd_procprio() to reuse its main functionality for bulk updates
later on.
Test: m
Bug: 295231583
Change-Id: Ic42de6e256b813349530f19a20e3ef9d484b20cf
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Updating the cmd_procprio function with the clang-format suggestions. No logic changes.
Test: m
Bug: 325525024
Change-Id: Id6c1feb717259406d953e5e2a174398bccf65d23
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Moving logic for registering proc, after oom adjustment, to its
own function. This work is for the introduction of the new
PROCS_PRIO cmd. This logic will be shared between the current PROCPRIO
and PROCS_PRIO cmd.
Test: m
Bug: 325525024
Change-Id: I0683f63faa3dfa2e4534cdfb8935b4d2f83a6af9
Signed-off-by: Carlos Galo <carlosgalo@google.com>
All the PATH_MAX usages are used to store proc/<pid>/filename
information in lmkd. PATH_MAX is 4096, which is an overkill
of buffer sizes for their usage. Replace PATH_MAX with a smaller size.
Test: m
Bug: 325525024
Change-Id: If6d500102fca532a8afc331d0c847675d6e9e96f
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Current lmkd behavior to kill cached apps when free memory hits low
watermark threshold does not work well on certain devices where more
or less aggressive behavior would yield better results. Introduce a
tunable to control the min oom_score_adj level at which lmkd considers
to kill processes when the system gets into this state. The default
value is set to 701 which preserves the current behavior of killing
cached apps except for the last active one. Setting it to lower values
will make more processes eligible to be killed, setting it to higher
values will limit the kills to a smaller set of processes. Setting it
to 1001 will prevent any process from being killed for this reason.
Bug: 334867461
Bug: 337063274
Change-Id: I1447436e0a0cd1e696b34d2c06b92ff73a5100a9
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Update the init_psi_monitor API to monitor MEMORY, IO, and CPU
resources.
Test: m libpsi
Bug: 335872571
Change-Id: Ieae8c98be0e6353a1d0ca0728c84bcf1897b259c
Free swap is calculated using the min of free swap that kernel would
consider using and easily available memory which can be used by ZRAM
for swapping purposes. However calculation does not consider the
average data compression ratio of ZRAM. Introduce a tunable to set
the average swap compression ratio used when evaluating the amount
of data which can be swapped. Default is set to 1 (no compression)
to keep current behavior. Setting it to 0 will ignore available memory
and assume that configured swap size can be always utilized fully.
Bug: 285854307
Bug: 327561101
Change-Id: I6b0f93ce24179ebf7365a3dbcd52c6e4a52ac200
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Replacing mechanism of reading vmstats to detect kswapd with
memevents instead. Maintain vmstats mechanism if bpf is not supported by
current kernel.
Test: Verified lmkd receives kswapd state changes
Test: m
Bug: 330606003
Change-Id: I9b980a8b94e015d1b8e0986fff9113890420d102
Signed-off-by: Carlos Galo <carlosgalo@google.com>
This monitoring will no longer only track direct reclaim state changes.
Adjusting naming to reflect the broader utilization.
Test: m
Bug: 330606003
Change-Id: Ib77b8b58cd6e8ce1296ffa14481018c29e979754
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Add new command to allow for post-boot actions to occur. This
will allow for the MemEventListener to start after the BPF files are
loaded, removing the need to stall during boot-up until they are loaded
in the device.
Test: Verified memevent listener initialized post-boot
Test: Verified LMKD no longer stalls until BPF progs are loaded
Bug: 331008250
Bug: 244232958
Change-Id: I55f97b41349ea7693cff81b1170d33712b820292
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Add kill reason for when the device is stuck in direct reclaim for
longer than the configurable threshold. Only allow configurable
threshold, and direct reclaim stuck detection, if memevents direct
reclaim monitoring is enabled.
Test: Verified direct reclaim stuck kill log with memory pressure test
Test: m
Bug: 244232958
Change-Id: I1156899874d2eb7e0f4b61597741087c110b3414
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Replacing mechanism of reading vmstats to detect direct reclaim with
memevents instead. Maintain vmstats mechanism if bpf ring buffer is not
supported by current kernel.
Test: Verified lmkd receives direct reclaim state changes
Test: m
Bug: 244232958
Change-Id: I59ee7657da1240355d611dfa129c4d50bed2c330
Signed-off-by: Carlos Galo <carlosgalo@google.com>
Adjust lmk cycle after kill to honor an optional min score.
Test: cycle after kill under mem pressure honors specified score
Bug: 309380316
Change-Id: I9ab8e29b58846cc291acb2834638ddf7a7759eca
Signed-off-by: Matt Stephenson <stephensonmatt@google.com>
Add kill reason for a cached app kill when free memory is under low
watermark.
Bug: 306755741
Change-Id: Idf92da326f6e0990e6d9fd9acdd21b19f6bdd241
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
With Android U removing AMS kills, lmkd has additional duty to kill
cached apps which previously were killed by AMS. The former logic is
not proactive enough and leads to too many cached apps contributing
to memory pressure.
Implement additional logic to kill cached apps (excluding previous
foreground apps) when low watermark is breached or when device is
thrashing.
Bug: 300660611
Change-Id: I356eac1fe6d44dad292a7ea2fadee69a5be61479
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
As a result of experiments, the default relation between critical and
normal thrashing limits has been shown to be insufficient. Increase the
relation from 2x to 3x.
Bug: 194316048
Change-Id: I19877e0df56be07f3f503688f408f5f91f4b1e67
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
In rare cases it's possible that pgscan is not changing because inactive
LRU is empty and can't be refilled from the active LRU due to all
pages being hot. In such conditions pgscan_kswapd/pgscan_direct will
not change while pgrefill will be increasing due to active LRU being
scanned. Lmkd would incorrectly treat this situation as if no reclaim
activity happened.
Change lmkd to check pgrefill as well to detect such conditions.
Bug: 288383787
Change-Id: I6b49607429e2f673bba2645ccddff1a141afbcd1
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
To save CPU cycles during boot for low resource device a new
configuration is added to delay initialization of monitoring until boot
is complete.
Bug: 288566858
Test: Build, boot and verified boot logs to confirm the behavior.
Merged-In: I17cfbf4c7f83bc80dd92a99dfb0254a7e20289be
Change-Id: I17cfbf4c7f83bc80dd92a99dfb0254a7e20289be
The LmkStateChanged atom was historically used to mark lmk activity
and trigger additional stats polling. For more than a year this has
not been used at all (as statsd supported event-based triggering).
Remove unnecessary functionality.
Bug: 278174420
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I9f7f56711fabb751cf7a57ea7279759bcc4a3dff
Previously the min_oom score of the candidate search was sent to
lmkd_free_memory_before_kill_hook. This is incorrect as the hook expects
the actual oom score of the process.
Bug: b/273670531
Test: cq
Change-Id: Id72c8b39f9c745a8f20fde15266857cb2d2222bf
In the case of ZRAM, SwapFree does not represent the actual free swap
amount because swap space is taken from the free memory or reclaimed.
Therefore use free memory and easily reclaimable memory as an
approximation of how much free swap system can use. Use SwapFree as
a measure of how much swap space the system will consider using. Min
of those two measurements is used to decide how much usable swap the
system still has.
Bug: 238495258
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ia7b0cc4a744d14c0d6e52603795917cf5824ea15
pid_remove() is not a thread-safe function and can be called only from
the main thread. Calling it from the watchdog_callback() executed in the
context of the watchdog thread can cause a use-after-free failure if the
same record is being used by the main thread. Fix this issue by marking
the process record as invalid instead of destroying it. Later on invalid
records will be cleared in kill_one_process() called from the context of
the main thread.
Fixes: f8727745f9 ("lmkd: Remove process record after it is killed by lmkd watchdog")
Bug: 248448498
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I0c7776aea1518c17f0a29904a44b7fe8f33980ca
After lmkd watchdog kills a process, its record should be removed.
Bug: 243567425
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I70bb2a432df8088ebc9865fbc36b065738248d19
We already emit a richer slice (including pid and oom score) so there's
no reason for the additional print event
Bug: 195085238
Change-Id: I1140f0287934e5f0abdeeb64554a249c4c940def
Underscore character may cause bpf prog/map naming collision. For
example, x.o with map y_z and x_y.o with map z both result in x_y_z
prog/map name, which should be prevented during compile-time.
aosp/2147825 will prohibit underscore character in bpf source name
(source name derives the obj name). Existing bpf modules with underscore
characters in source name need to be updated accordingly.
Bug: 236706995
Test: adb root; adb shell ls -l /sys/fs/bpf/ | grep gpuMem
Change-Id: I213624e59ce1bca6ee7c22028504f2d51e9c50df
The lmkd run wrong for Go devices, because min_score_adj is unused
when use_minfree_levels is set TURE.
Bug: 237947900
Signed-off-by: Yuming Han <yuming.han@unisoc.com>
Change-Id: I717561cd9e5f4d1a2ca60d9fc84adcd6e129420a
Both pgscan_kwsapd and pgscan_direct are defined as unsigned long,
the overflow issues occur on ARM kernel space. Just check whether
their values changed.
Signed-off-by: Yuming Han <yuming.han@unisoc.com>
Change-Id: I73b27855ede9ca729208775e982660bae967ab92
The size of the vmstat_field_names array should correspond to the number
of elements in vmstat_field enum (VS_FIELD_COUNT).
Bug: 227769256
Reported-by: Yuming Han <yuming.han@unisoc.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Icac2810c4efca2a07cefba6e220165ef4f194867
If hooks are enabled in LMKD and kill_info is not supplied to
kill_one_process, there will be a null dereference on kill_info. This
changes validates ki before dereferencing.
Bug: b/210075795
Test: cq
Change-Id: Ie81ca9bdb73a71f16dc5682c8721a557b8b094fb
Merged-In: Ie81ca9bdb73a71f16dc5682c8721a557b8b094fb
Adds several hooks to LMKD that can be overridden by the vendor. This
allows for device specific control of LMKD when necessary.
Bug: b/210075795
Test: cq
Change-Id: Ib231743183134b05148d45d681765860da6274ae
(cherry picked from commit 2c1248381a52fc520c6cd1acfaee80818eaa9ee1)
Merged-In: Ib231743183134b05148d45d681765860da6274ae