I have developed a simple kernel module to access system registers.
This module sets up a timer with an expiration time of 10 microseconds.
Upon timer expiration, the module's timer handler function is invoked,
which reads the contents of the system registers and logs them into the
trace log file (/sys/kernel/debug/tracing/trace) using the trace_printk() statement.
//Global variables
int time_interval = 10;
struct timer_list g_timer;
/* timer handler function */
void TimerHandler(unsigned long data)
{
int reg1, reg2;
/*
Code to read system registers...
reg1=...
reg2=...
*/
/* Reset the Timer */
mod_timer(&g_timer, jiffies + msecs_to_jiffies(time_interval));
trace_printk("%d %d:\n", reg1, reg2);
}
I have written two functions to start/stop the periodic timer. The start_timer() function is written as below:
/* Start the periodic timer */
setup_timer(&g_timer, TimerHandler, 0);
mod_timer(&g_timer, jiffies + msecs_to_jiffies(time_interval));
The piece of code in stop_timer() is as follows:
del_timer_sync(&g_timer);
I have created a debugfs file named "start" which starts/stops the timer.
Writing 1 is for starting the timer and 0 is for stopping the timer.
I have written a shell script to test the above kernel module:
echo "enabling kernel tracing "
echo 1 > /sys/kernel/debug/tracing/tracing_on
echo 1 > /sys/kernel/debug/kmodule/start
echo "starting the matmul prgram "
taskset -c 0 ./matrixMul -wA=64 -hA=64 -wB=64 -hB=64 & PID_TO_WAIT=$!
wait $PID_TO_WAIT
echo "stopping kernel module"
echo 0 > /sys/kernel/debug/kmodule/start
echo "Disable kernel tracing"
echo 0 > /sys/kernel/debug/tracing/tracing_on
echo "copying kernel trace log file "
echo -n > mem_cycles.txt
cp /sys/kernel/debug/tracing/trace mem_cycles.txt
#clear the kernel trace file
echo > /sys/kernel/debug/tracing/trace
I have a few queries about the above setup:
Query 1: The kernel trace log file (mem_cycles.txt file):
During a few executions of the above script program, I get the following log traces:
# tracer: nop
#
# entries-in-buffer/entries-written: 8/8 #P:4
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
matrixMul-7455 [000] ..s1 2266.295984: TimerHandler: 435 204
<idle>-0 [000] ..s1 2266.311961: TimerHandler: 143 0
<idle>-0 [000] .ns1 2266.327974: TimerHandler: 726 488
matrixMul-7455 [000] ..s1 2266.343976: TimerHandler: 665 414
matrixMul-7455 [000] ..s. 2266.359970: TimerHandler: 386 150
matrixMul-7455 [000] ..s. 2266.375972: TimerHandler: 442 192
<idle>-0 [000] .ns1 2266.391974: TimerHandler: 3271 3014
<idle>-0 [000] .ns1 2266.407964: TimerHandler: 195 0
In the above output, the first column typically displays the program name
(matrixMul),followed by the contents of the system registers as the last two entries.
However, during most runs of the above shell script, the first column often shows
-0 with no entries matrixMul program. Sample log file is as below:
# tracer: nop
#
# entries-in-buffer/entries-written: 8/8 #P:4
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [001] ..s1 2231.424448: TimerHandler: 488 254
<idle>-0 [001] ..s1 2231.440438: TimerHandler: 129 0
<idle>-0 [001] .ns1 2231.456447: TimerHandler: 140 0
<idle>-0 [001] ..s1 2231.472443: TimerHandler: 358 126
<idle>-0 [001] ..s1 2231.492444: TimerHandler: 436 211
<idle>-0 [001] .ns1 2231.508445: TimerHandler: 377 129
mmcqd/0-1883 [001] ..s. 2231.524451: TimerHandler: 1889 1656
<idle>-0 [001] ..s1 2231.540440: TimerHandler: 210 0
How to figure out the entries for my program named matrixMul from the above output? Since the matrixMul program takes around 40 microseconds for its execution, I expect at least four entries in the log trace file for my program, which I do not get most of the time.
Query 2: About the number of entries in the kernel log file.
I have configured the timer function to expire after 10 microseconds. If I increase the matrix size to 1024 x 1024, the program execution time extends to around 70 milliseconds. Surprisingly, I found approximately 450 entries in the kernel log file, which is far less than the expected number of entries. The number of entries does not change even when setting a higher size of the kernel buffer by
echo 8192 > /sys/kernel/debug/tracing/buffer_size_kb
How can I ensure that all entries are captured in the log file?
Working on Jetson Nano device, with Ubuntu 18.04 and Linux kernel 4.9
Kindly provide your valuable input.