We have a large number of Linux machines as a compute farm. We launch jobs on the farm using LSF. Sporadically, and randomly, some job is creating and deleting thousands of 'tmp' files in my home directory:
ls /home/cpp_home/tmp*
---------- 1 cpp_home dev 0 Dec 10 14:25 tmpxJL9In
-rw------- 1 cpp_home dev 0 Dec 10 14:25 tmpnvAtiS
-rw------- 1 cpp_home dev 0 Dec 10 14:25 tmphSrnk7
-rw------- 1 cpp_home dev 0 Dec 10 14:25 tmpJFO5Cr
---------- 1 cpp_home dev 0 Dec 10 14:25 tmpRIzn7A
-rw------- 1 cpp_home dev 0 Dec 10 14:25 tmpvulwsT
---------- 1 cpp_home dev 0 Dec 10 14:25 tmpeSz_gN
---------- 1 cpp_home dev 0 Dec 10 14:25 tmpEcatTM
-rw------- 1 cpp_home dev 0 Dec 10 14:25 tmpOy1jdi
---------- 1 cpp_home dev 0 Dec 10 14:26 tmp4oB8ua
How the hell can I find out what process is doing this? They look suspiciously like standard C library temp files, or standard Python tempfiles... but since they don't stick around for long, I can't find out which job (of the thousands that are running via LSF) is creating them.
I don't have source code for all the jobs... There are a great many third-party CAD/EDA tools in use, so it could be one of them. Or it could be Perl or Python scripts, or...
If the jobs are highly varied, this could be difficult to track down. If the job types are not too diverse, however, you should be able to correlate the times at which these files appear with the jobs running as the user who owns the files. By running similar jobs in isolation, you could then verify the behavior.
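To get timestamps worth correlating, something has to record when the files appear, since they vanish quickly. Below is a minimal polling sketch in Python, not a definitive tool. The function names (snapshot, new_entries, watch), the 0.5 second interval and the "tmp" prefix are all assumptions chosen for illustration. Polling can miss files that live for less than one interval, and on a network-mounted home directory the listing may lag behind what the compute hosts see.

```python
import os
import time
from datetime import datetime


def snapshot(directory, prefix="tmp"):
    """Return {name: mtime} for entries in `directory` whose names start with `prefix`."""
    found = {}
    try:
        with os.scandir(directory) as entries:
            for entry in entries:
                if not entry.name.startswith(prefix):
                    continue
                try:
                    found[entry.name] = entry.stat(follow_symlinks=False).st_mtime
                except FileNotFoundError:
                    # The file was deleted between listing and stat; skip it.
                    continue
    except FileNotFoundError:
        # The directory itself does not exist; report nothing.
        pass
    return found


def new_entries(previous, current):
    """Return the entries present in `current` but not in `previous`."""
    return {name: mtime for name, mtime in current.items() if name not in previous}


def watch(directory, interval=0.5, prefix="tmp", rounds=None):
    """Poll `directory` and print a timestamped line for each newly seen file.

    `rounds=None` polls until interrupted; an integer limits the number of polls.
    """
    seen = snapshot(directory, prefix)
    done = 0
    while rounds is None or done < rounds:
        time.sleep(interval)
        current = snapshot(directory, prefix)
        for name, mtime in sorted(new_entries(seen, current).items()):
            stamp = datetime.fromtimestamp(mtime).isoformat(timespec="seconds")
            print(f"{stamp} {name}", flush=True)
        seen = current
        done += 1
```

Calling watch("/home/cpp_home") and redirecting the output to a log gives a list of times to compare against the start and run times of the jobs LSF reports for that user. This only narrows the time window; it does not by itself identify the host or process.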
There is another possibility, though. Depending on the LSF settings, some output/logging files may be created as temp files and then copied to their final location. That could explain the phenomenon, but normally these files would fall under a $HOME/.lsbatch directory. There may be settings that change this location. See this text from the BSUB command reference: