Some log messages mention "oom_adj" instead of "oom_score_adj" when
referring to oom_score_adj. This is confusing because "oom_adj" is a
separate value which was supplanted by oom_score_adj, but can still be
used.
Test: trigger memory pressure and view logs
Change-Id: I23825083cecfff6bd32bfb39c6dac1f2b17a72a7
Added SPDX-license-identifier-Apache-2.0 to:
Android.bp
libpsi/Android.bp
tests/Android.bp
Bug: 68860345
Bug: 151177513
Bug: 151953481
Test: m all
Exempt-From-Owner-Approval: janitorial work
Change-Id: I5fed190764c763388c50c2fea58c5c421579bd30
Memory cgroups are disabled on non-AndroidGo devices. Change the test
not to fail due to missing in-kernel memory cgroup support.
Bug: 172296409
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I35d724c23c22e97458976c1cad45fe9d993326f9
LMK events are an important platform memory monitoring signal. Enable
them by default.
Changes:
- Compile lmkd with statsd by default
- Signal lmkd by default
Test: build, statsd cts
Bug: 177985094
Change-Id: I070660767db6e3bc8926ff82b64b99c7ee9a0108
Linux kernel 5.9 change some vmstat fields including workingset_refault
which affects lmkd operation. Update vmstat parsing to handle both
old (workingset_refault) and new (workingset_refault_file) names for
that field.
Bug: 175617952
Test: lmkd_unit_test
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I8f9b3d027ca96154f07e7252902a5aa04cf05a9f
workingset_refault field in zoneinfo is currently being parsed but
is not used. Instead the same field in vmstat is being used to
capture the number of file-backed workingset refaults. Remove the
unused field parsing code.
Bug: 175617952
Test: lmkd_unit_test
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I79641a833c252cf50ac08c0c7d17c8294236d82d
Make vendor_available = true so that other modules in vendor image
can leverage this library to init, register and unregister to psi.
Bug: 169346507
Change-Id: I47f7d25984e09d61703e7b2bd6fcb8db9d3814f5
Signed-off-by: Sudarshan Rajagopalan <sudaraja@codeaurora.org>
The libpsi source code is missing cutils and stdio header files.
Add cutils and stdio header files, and add libcutils_headers to
the header library in Android.bp.
Bug: 169346507
Change-Id: I2d613d5724d3c5f52dd52dcae7024439f2e8d5bb
Signed-off-by: Sudarshan Rajagopalan <sudaraja@codeaurora.org>
Information like free memory and swap as well as kill reason would be
useful for understanding regressions in the number of lmk kills in the
field.
Bug: 168117803
Test: statsd_testdrive 51, load with lmk_unit_test
Merged-In: Ic46aed3c85b880b32ac5ad61b55f90e0d33517c7
Change-Id: Ic46aed3c85b880b32ac5ad61b55f90e0d33517c7
Information like free memory and swap as well as kill reason would be
useful for understanding regressions in the number of lmk kills in the
field.
Bug: 168117803
Change-Id: Ic46aed3c85b880b32ac5ad61b55f90e0d33517c7
Test: statsd_testdrive 51, load with lmk_unit_test
If the first PSI event triggers a kill, lmkd won't resume polling
immediately after the process has died. Instead, it will wait until the
next PSI event to resume the polling which is too late when the device
is under memory pressure. This happens if data communication with AMS
happens after previous polling window expired, in which case paused
handler gets reset and polling does not resume after the kill.
Fix this by changing pause handler reset logic.
Bug: 167562248
Test: memory pressure test
Signed-off-by: Martin Liu <liumartin@google.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Merged-In: I10c65c85b718a656e3d8991bf09948b96da895cb
Change-Id: I10c65c85b718a656e3d8991bf09948b96da895cb
It seems we have chance that file_base_lru is zero.
Avoid it by adding 1.
Bug: 167660459
Bug: 163134367
Test: boot
Signed-off-by: Martin Liu <liumartin@google.com>
Merged-In: If19dbbaafe6cd28a9d5b7f8a002f3cd33daab5e7
Change-Id: If19dbbaafe6cd28a9d5b7f8a002f3cd33daab5e7
When a device is thrashing the file cache, workingset refaults can
grow slowly because of variant reasons. Current thrashing detection
mechanism could reset the thrashing counter frequently as it relies
on presence of reclaim activity, however refaults can keep increasing
even when the device is not actively reclaiming. In addition, the
thrashing counter gets reset when conditions require a kill but lmkd
could not find an eligible process to be killed. This is problematic
because when this happens thrashing is being ignored.
Use a fixed 1 sec periods to aggregate the thrashing counter. Also we
need to keep monitoring thrashing counter while retrying as someone
could release the memory to mitigate the thrashing. If thrashing
counter is greater than the limit at the end of the 1 sec period this
means lmkd failed to find an eligible process to kill. In this case
we store accumulated thrashing in case a new eligible process appears
until accumulated thrashing is less that the limit or we miss an
entire 1 sec window.
Bug: 163134367
Test: heavy loading launch
Signed-off-by: Martin Liu <liumartin@google.com>
Merged-In: Ie9f4121ea604179c0ad510cc8430e7a6aec6e6b2
Change-Id: Ie9f4121ea604179c0ad510cc8430e7a6aec6e6b2
This reverts commit 95551f816a.
Reason to revert: don't need this change.
Bug: 163134367
Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: I8b209b054b6caec553bce13cd51f931401c1e42a
If the first PSI event triggers a kill, lmkd won't resume polling
immediately after the process has died. Instead, it will wait until the
next PSI event to resume the polling which is too late when the device
is under memory pressure. This happens if data communication with AMS
happens after previous polling window expired, in which case paused
handler gets reset and polling does not resume after the kill.
Fix this by changing pause handler reset logic.
Bug: 167562248
Test: memory pressure test
Signed-off-by: Martin Liu <liumartin@google.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I10c65c85b718a656e3d8991bf09948b96da895cb
It seems we have chance that file_base_lru is zero.
Avoid it by adding 1.
Bug: 167660459
Test: boot
Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: If19dbbaafe6cd28a9d5b7f8a002f3cd33daab5e7
When a device is thrashing the file cache, workingset refaults can
grow slowly because of variant reasons. Current thrashing detection
mechanism could reset the thrashing counter frequently as it relies
on presence of reclaim activity, however refaults can keep increasing
even when the device is not actively reclaiming. In addition, the
thrashing counter gets reset when conditions require a kill but lmkd
could not find an eligible process to be killed. This is problematic
because when this happens thrashing is being ignored.
Use a fixed 1 sec periods to aggregate the thrashing counter. Also we
need to keep monitoring thrashing counter while retrying as someone
could release the memory to mitigate the thrashing. If thrashing
counter is greater than the limit at the end of the 1 sec period this
means lmkd failed to find an eligible process to kill. In this case
we store accumulated thrashing in case a new eligible process appears
until accumulated thrashing is less that the limit or we miss an
entire 1 sec window.
Bug: 163134367
Test: heavy loading launch
Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: Ie9f4121ea604179c0ad510cc8430e7a6aec6e6b2