I'm debugging a system load problem that a customer encounters on their production system. They've written a test application that simulates the load in order to reproduce the problem.
In this particular workload, one of the things the coder does is the following:
while (1)
    initialize inotify
    watch a directory for events
    receive event
    process event
    remove watch
    close inotify fd
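For concreteness, here is a minimal C sketch of that per-iteration pattern. The watched path is taken from the trace below; the event handling (a printf) is a placeholder and error handling is omitted, so this is not the customer's actual code:

#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

int main(void)
{
    /* Buffer large enough for at least one event plus a file name. */
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));

    for (;;) {
        int fd = inotify_init();                        /* initialize inotify  */
        int wd = inotify_add_watch(fd,                  /* watch the directory */
                     "/mnt/tmp/msys_sim/QUEUES/Child_032", IN_CREATE);

        ssize_t n = read(fd, buf, sizeof(buf));         /* receive event(s)    */
        if (n > 0) {
            struct inotify_event *ev = (struct inotify_event *)buf;
            if (ev->len)
                printf("created: %s\n", ev->name);      /* process event       */
        }

        inotify_rm_watch(fd, wd);                       /* remove watch        */
        close(fd);                                      /* close inotify fd    */
    }
}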
Strangely enough, the high system load comes from the close()
of the inotify fd:
inotify_init() = 4 <0.000020>
inotify_add_watch(4, "/mnt/tmp/msys_sim/QUEUES/Child_032", IN_CREATE) = 1 <0.059537>
write(1, "Child [032] sleeping\n", 21) = 21 <0.000012>
read(4, "\1\0\0\0\0\1\0\0\0\0\0\0\20\0\0\0SrcFile.b8tlfT\0\0", 512) = 32 <0.231012>
inotify_rm_watch(4, 1) = 0 <0.000044>
close(4) = 0 <0.702530>
open("/mnt/tmp/msys_sim/QUEUES/Child_032", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 4 <0.000031>
lseek(4, 0, SEEK_SET) = 0 <0.000010>
getdents(4, /* 3 entries */, 32768) = 88 <0.000048>
getdents(4, /* 0 entries */, 32768) = 0 <0.000009>
write(1, "Child [032] dequeue [SrcFile.b8t"..., 37) = 37 <0.000011>
unlink("/mnt/tmp/msys_sim/QUEUES/Child_032/SrcFile.b8tlfT") = 0 <0.059298>
lseek(4, 0, SEEK_SET) = 0 <0.000011>
getdents(4, /* 2 entries */, 32768) = 48 <0.000038>
getdents(4, /* 0 entries */, 32768) = 0 <0.000009>
close(4) = 0 <0.000012>
inotify_init() = 4 <0.000020>
inotify_add_watch(4, "/mnt/tmp/msys_sim/QUEUES/Child_032", IN_CREATE) = 1 <0.040385>
write(1, "Child [032] sleeping\n", 21) = 21 <0.000903>
read(4, "\1\0\0\0\0\1\0\0\0\0\0\0\20\0\0\0SrcFile.mQgUSh\0\0", 512) = 32 <0.023423>
inotify_rm_watch(4, 1) = 0 <0.000012>
close(4) = 0 <0.528736>
What could possibly be causing the close() call to take such an enormous amount of time? I can identify two possible causes:
- closing and reinitializing inotify on every iteration
- There are 256K files (flat) in /mnt/tmp/msys_sim/SOURCES, and a particular file in /mnt/tmp/msys_sim/QUEUES/Child_032 is hard-linked to one in that directory. But SOURCES is never opened by the above process.
Is it an artifact of using inotify wrong? What can I point at to say "What you're doing is WRONG!"?
Output of perf top (I had been looking for this!):
Events: 109K cycles
70.01% [kernel] [k] _spin_lock
24.30% [kernel] [k] __fsnotify_update_child_dentry_flags
2.24% [kernel] [k] _spin_unlock_irqrestore
0.64% [kernel] [k] __do_softirq
0.60% [kernel] [k] __rcu_process_callbacks
0.46% [kernel] [k] run_timer_softirq
0.40% [kernel] [k] rcu_process_gp_end
Sweet! I suspect a spinlock somewhere, and the entire system becomes highly latent when this happens.
Usually the pseudocode for an inotify loop would look more like this:

initialize inotify
watch a directory for events
while (1)
    receive event
    process event

There is no need to remove the watch and reinitialize inotify on every loop.
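A minimal C sketch of that structure, using the same illustrative path as the trace above (error handling omitted):

#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

int main(void)
{
    /* Initialize inotify and add the watch once, before the loop. */
    int fd = inotify_init();
    inotify_add_watch(fd, "/mnt/tmp/msys_sim/QUEUES/Child_032", IN_CREATE);

    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));   /* blocks until events arrive */
        if (n <= 0)
            break;                                /* bail out on error */

        /* One read() may return several events; walk them all. */
        for (char *p = buf; p < buf + n; ) {
            struct inotify_event *ev = (struct inotify_event *)p;
            if (ev->len)
                printf("created: %s\n", ev->name);
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
    /* The watch and the fd live for the whole process; inotify_rm_watch()
     * and close() are never called inside the loop. */
}

Keeping the watch and the inotify fd for the lifetime of the process moves the teardown work seen in the trace (inotify_rm_watch() and the slow close()) out of the hot path entirely.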