Understanding Process Sleep

I used the Yocto Project to create an embedded Linux image and developed an application to run on it. The application uses third-party vendor code to handle Ethernet communications. One function in that code, ReceiveRawTimeout, uses system calls such as recv and select.
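
For readers unfamiliar with the pattern, here is a minimal sketch of a select-then-recv receive with a timeout. It is written in Python only because its select and socket modules mirror the system calls closely. The vendor's actual ReceiveRawTimeout is not shown in this post, so the function name receive_raw_timeout, its parameters, and its return values are assumptions made for illustration.

```python
import select
import socket


def receive_raw_timeout(sock, max_bytes, timeout_s):
    """Hypothetical stand-in for the vendor's ReceiveRawTimeout.

    Waits up to timeout_s seconds for sock to become readable, then reads
    at most max_bytes. Returns None on timeout and b"" if the peer closed.
    """
    # select() blocks the calling thread until the socket is readable or
    # the timeout expires. While blocked here the process is asleep.
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return None
    return sock.recv(max_bytes)


if __name__ == "__main__":
    a, b = socket.socketpair()
    print(receive_raw_timeout(a, 1024, 0.1))  # nothing has been sent yet
    b.sendall(b"ping")
    print(receive_raw_timeout(a, 1024, 0.1))
    a.close()
    b.close()
```

A process blocked inside select like this is reported in state S, so short spells of S are expected for this kind of code and are not a fault in themselves.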

While running, my application communicates constantly over CAN and Ethernet. About 170 seconds into operation the entire application stalls: all communications go down, and a few seconds later they come back. This repeats until the application is killed. When I run watch cat /proc/PID/status, I can see the voluntary and nonvoluntary context switch counters pause, and the application state changes from R to S and back to R.
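
The same observation can be logged with timestamps instead of watched by eye. The sketch below reads the State, voluntary_ctxt_switches, and nonvoluntary_ctxt_switches fields from /proc/PID/status. The helper names parse_status and read_status are hypothetical, and it assumes a Python interpreter is available on the target image.

```python
import os
import time

WANTED = ("State", "voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")


def parse_status(text):
    """Pick the fields of interest out of the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in WANTED:
            fields[key] = value.strip()
    return fields


def read_status(pid):
    """Read and parse /proc/<pid>/status for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_status(f.read())


if __name__ == "__main__":
    pid = os.getpid()  # placeholder: substitute the application's PID
    for _ in range(3):
        print(time.strftime("%H:%M:%S"), read_status(pid))
        time.sleep(1)
```

If both counters stop increasing while the state reads R, the process wants to run but is not being scheduled, which points at something else on the system holding the CPU rather than at blocking I/O.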

All of this leads me to believe the process is waiting on I/O and going to sleep until resources are available. I have a very long strace of the application running, but I do not see any clear indication of an issue. To be clear, I had not used strace before this, and I am trying to gather as much information as possible.

My application is built in Eclipse on Ubuntu and deployed to the device. If I run or debug the application from Eclipse, or run it from the command line, the issue does not occur. It only occurs when the application is started at boot by a systemd service.

[Unit]
Description=Application with strace

[Service]
Type=simple
Restart=always
RestartSec=1
TimeoutStopSec=5
ExecStart=/usr/bin/strace -o /home/root/strace_log.txt -f -e trace=all /usr/bin/app

[Install]
WantedBy=multi-user.target

I believe I have found the area of code that causes the issue, but I am unsure how to solve it. I have considered three approaches:

  1. Add nice to my image and run my application at a higher priority
  2. Reduce how often I call the function that calls ReceiveRawTimeout, to ease the I/O burden
  3. Use a non-blocking thread

Are there any other paths I can follow in solving this? Is there a way to run my application that prevents it from going to sleep?
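
To make the third approach concrete, here is a rough sketch of moving the blocking receive into its own thread so the main loop only polls a queue. It is in Python for brevity and is not taken from the application. The names start_receiver, out_queue, stop_event, and poll_s are hypothetical, and whether this helps at all depends on what actually causes the stall.

```python
import queue
import socket
import threading


def start_receiver(sock, out_queue, stop_event, poll_s=0.1):
    """Read from sock in a background thread and pass data to out_queue."""
    # A short socket timeout lets the loop check stop_event regularly.
    sock.settimeout(poll_s)

    def worker():
        while not stop_event.is_set():
            try:
                data = sock.recv(4096)
            except socket.timeout:
                continue  # nothing arrived in this interval
            if not data:
                break  # peer closed the connection
            out_queue.put(data)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


if __name__ == "__main__":
    a, b = socket.socketpair()
    frames = queue.Queue()
    stop = threading.Event()
    receiver = start_receiver(a, frames, stop)
    b.sendall(b"frame")
    print(frames.get(timeout=1))  # the main loop never blocks on the socket
    stop.set()
    receiver.join()
    a.close()
    b.close()
```

In a real main loop, frames.get_nowait() with a queue.Empty handler would replace the blocking get, so the rest of the application keeps running when no data is pending.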

There is 1 solution below

So this had nothing and everything to do with my application.

It turns out my application was logging too much data. journald has to perform background tasks to reclaim memory, and while it does so it takes 90%+ of the CPU and stalls my application. Removing the majority of my log output resolved the issue.
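
If removing log statements outright is not an option, throttling them at the source is another way to get the same effect. The following is only a sketch in Python using the standard logging module. The RateLimitFilter class and its parameters are hypothetical, and the real application would need an equivalent in its own language.

```python
import logging
import time


class RateLimitFilter(logging.Filter):
    """Allow at most max_records log records per period_s seconds."""

    def __init__(self, max_records, period_s, clock=time.monotonic):
        super().__init__()
        self.max_records = max_records
        self.period_s = period_s
        self.clock = clock  # injectable so the filter can be tested
        self.window_start = clock()
        self.count = 0
        self.dropped = 0

    def filter(self, record):
        now = self.clock()
        # Start a fresh window once the current one has expired.
        if now - self.window_start >= self.period_s:
            self.window_start = now
            self.count = 0
        if self.count < self.max_records:
            self.count += 1
            return True
        self.dropped += 1  # record is suppressed
        return False


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("app")
    limiter = RateLimitFilter(max_records=5, period_s=1.0)
    log.addFilter(limiter)
    for i in range(1000):
        log.info("frame %d received", i)
    print("suppressed records:", limiter.dropped)
```

journald also has its own rate limiting, configured with RateLimitIntervalSec= and RateLimitBurst= in journald.conf, which may be worth checking before changing application code.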