Process RSS (ps / top) is much lower than rss in cgroup memory.stat

We have a Go program (Go 1.19) running in a k8s pod. The RSS for the process shown on the node (via ps / top) is much lower than the value reported in the cgroup's memory.stat.

This is the memory.stat of the cgroup: rss is 7860129792 (~7.3 GiB) and rss_huge is 5066719232 (~4.7 GiB).

$ cat memory.stat

cache 547885056
rss 7860129792    <--  the rss in cgroup is much higher than the value in "ps -aux"
rss_huge 5066719232  <-- notice that there is also high rss_huge
shmem 0
mapped_file 0
dirty 20480
writeback 0
swap 0
pgpgin 450943252
pgpgout 450125090
pgfault 1097413913
pgmajfault 0
inactive_anon 0
active_anon 7859318784
inactive_file 546922496
active_file 962560
unevictable 0
hierarchical_memory_limit 12884901888
hierarchical_memsw_limit 12884901888
total_cache 547885056
total_rss 7860129792
total_rss_huge 5066719232
total_shmem 0
total_mapped_file 0
total_dirty 20480
total_writeback 0
total_swap 0
total_pgpgin 450943252
total_pgpgout 450125090
total_pgfault 1097413913
total_pgmajfault 0
total_inactive_anon 0
total_active_anon 7859318784
total_inactive_file 546922496
total_active_file 962560
total_unevictable 0
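
For readability, the byte counters can be converted to GiB with a quick awk one-liner (run in the same cgroup directory): rss is ~7.3 GiB, of which rss_huge, the THP-backed part, is ~4.7 GiB.

$ awk '/^(rss_huge|rss|cache) / {printf "%-10s %.2f GiB\n", $1, $2/2^30}' memory.stat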

docker stats shows almost the same value as the cgroup.

$ docker stats c39bc01d525e

CONTAINER ID          CPU %               MEM USAGE / LIMIT   MEM %               NET I/O             BLOCK I/O           PIDS
c39bc01d525e          49.27%              7.88GiB / 12GiB            65.67%              0B / 0B             0B / 24.6kB         106
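
If I understand docker correctly, docker stats reads the same cgroup counters (memory.usage_in_bytes, possibly minus some page cache depending on the docker version), so the agreement is expected. It can be cross-checked from the same cgroup directory:

$ cat memory.usage_in_bytes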

However, there are 3 processes managed by this cgroup. The main one is pid 496687, and its RSS is only 5205340 kB (~5.0 GiB), much lower than the values reported by the cgroup and docker stats.

$ cat cgroup.procs 
496644
496687
496688

$ ps -aux | grep -E "496644|496687|496688"

USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     496644  0.0  0.0   1604  1464 ?        Ss   Oct28   0:00 sh ./bin/start.sh
root     496687 26.5  0.4 6466348 5205340 ?     Sl   Oct28 7271:55 /go/release/bin/golang-app
root     496688  0.0  0.0   1588   608 ?        S    Oct28   0:31 tail -f /dev/null
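
Even summing the RSS of all three processes only gives 5207412 kB (~5.0 GiB), still far below the cgroup's rss of ~7.3 GiB:

$ ps -o rss= -p 496644,496687,496688 | awk '{sum+=$1} END {print sum " kB"}'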

I also checked smaps for more details. The summed values of Rss / AnonHugePages / Size (all in kB) are shown below; the Rss sum is close to the one shown by ps, and still much lower than the one in the cgroup's memory.stat.

sum for Rss:

$ cat /proc/496687/smaps | grep Rss | awk -F':' '{print $2 }' | awk 'BEGIN {sum=0} {sum+=$1} END {print sum}'
4645704

sum for AnonHugePages:

$ cat /proc/496687/smaps | grep AnonHugePages | awk -F':' '{print $2 }' | awk 'BEGIN {sum=0} {sum+=$1} END {print sum}'
524288

sum for Size:

$ cat /proc/496687/smaps | grep -E "^Size:" | awk -F':' '{print $2 }' | awk 'BEGIN {sum=0} {sum+=$1} END {print sum}'
6466352
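
Note that the AnonHugePages sum is 524288 kB (512 MiB), far below the cgroup's rss_huge of ~4.7 GiB. Since the kernel here is 4.14, the same sums should also be available pre-aggregated via smaps_rollup (added in kernel 4.14, if I remember correctly):

$ grep -E '^(Rss|AnonHugePages):' /proc/496687/smaps_rollup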

Here are the THP settings:

$ cat /sys/kernel/mm/transparent_hugepage/enabled 
[always] madvise never

$ cat /sys/kernel/mm/transparent_hugepage/defrag 
always defer defer+madvise [madvise] never

$ cat /sys/kernel/mm/transparent_hugepage/shmem_enabled 
always within_size advise [never] deny force
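
To see how much anonymous memory is THP-backed node-wide, and how often huge pages are faulted in, collapsed, or split, the global counters may also help:

$ grep AnonHugePages /proc/meminfo
$ grep -E '^thp_' /proc/vmstat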

And the system info is as below; we have also turned swap off, so there is no swap cache:

$ uname -a
Linux 4.14.15-1.el7.elrepo.x86_64 #1 SMP Tue Jan 23 20:28:26 EST 2018 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.9 (Maipo)

So what may cause the RSS in ps / top to be much lower than the one in cgroup memory.stat?

As per the doc at https://kernel.org/doc/Documentation/cgroup-v1/memory.txt, the rss in cgroup memory.stat includes "transparent hugepages". Is THP also counted in the RSS shown by ps / top?

If it is caused by THP, what is the mechanism by which it affects the memory accounting?
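
If THP turns out to be the cause, one experiment I can think of (not tried yet; it needs root, changes behavior node-wide, and does not split huge pages that already exist) is switching THP to madvise and watching whether the gap between process RSS and cgroup rss stops growing:

$ echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
$ watch -n 5 'grep -E "^(rss|rss_huge) " memory.stat'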
