python multiprocessing.Pool: "Too many open files" when logging to separate files


This is in regard to Too many files open with multiprocessing.Pool

I am having a similar problem. My setup is Ubuntu on a quad core, running this simple script (Python 2.7 with pathos==0.2a1.dev; pathos is only used so that the multiprocessing map works with a class method) to write a separate log file for each item.

Why doesn't Python close the log files?

Here is the code...

import multiprocessing
from pathos.multiprocessing import ProcessingPool
import logging

class CreateLogger(object):
    def create_logger(self, i):
        i = str(i)
        logger = logging.getLogger(i)
        # One FileHandler per item; the file stays open as long as the handler lives
        hdlr = logging.FileHandler('/tmp/test/{0}.log'.format(i))
        formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
        hdlr.setFormatter(formatter)
        logger.addHandler(hdlr)
        logger.setLevel(logging.WARNING)
        # Note: INFO is below the WARNING threshold, so this line never appears in the log
        logger.info('{0}-test'.format(i))
        logger.propagate = 0

if __name__ == '__main__':
    cl = CreateLogger()
    ilist = range(30000)
    pool = ProcessingPool(multiprocessing.cpu_count())
    pool.map(cl.create_logger, ilist)

The error is:

pydev debugger: starting (pid: 21825)
Traceback (most recent call last):
  File "/opt/eclipse/plugins/org.python.pydev_3.8.0.201409251235/pysrc/pydevd.py", line 2183, in <module>
    globals = debugger.run(setup['file'], None, None)
  File "/opt/eclipse/plugins/org.python.pydev_3.8.0.201409251235/pysrc/pydevd.py", line 1622, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/amit/workspace/amit/device_polling/tests/simple_mp_test.py", line 21, in <module>
    pool.map(cl.create_logger, ilist)
  File "/miniconda/envs/test/lib/python2.7/site-packages/pathos-0.2a1.dev-py2.7.egg/pathos/multiprocessing.py", line 123, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "/miniconda/envs/test/lib/python2.7/site-packages/processing/pool.py", line 130, in map
    return self.mapAsync(func, iterable, chunksize).get()
  File "/miniconda/envs/test/lib/python2.7/site-packages/processing/pool.py", line 373, in get
    raise self._value
IOError: [Errno 24] Too many open files: '/tmp/test/15336.log'

You can see that the "too many open files" error starts at 15336, and while this script was stopped in the debugger I could see ~16K files open for it:

amit@sharknado:/tmp/test$ lsof -u amit | grep test | wc -l
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/112/gvfs
      Output information may be incomplete.
16622
amit@sharknado:/tmp/test$

1 Answer

Found my answer... It looks like Python's logging module doesn't close the log files on its own; I need to close them manually.

First, I tried using __del__ in a MyLogger class to close the logging file's handlers:

def __del__(self):
    if self.logger:
        # Iterate over a copy: removeHandler() mutates logger.handlers,
        # and mutating a list while iterating over it skips entries.
        for hdlr in self.logger.handlers[:]:
            hdlr.flush()
            hdlr.close()
            self.logger.removeHandler(hdlr)

I soon realized that this wasn't getting called automatically. I had to call .__del__() myself, and that fixed the problem.
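
In practice, each task ended up looking roughly like this (a minimal sketch, assuming MyLogger wraps the create_logger logic from the question; the explicit __del__() call is the part that matters):

ml = MyLogger(i)                          # sets up the logger and FileHandler
ml.logger.warning('{0}-test'.format(i))   # do the actual logging
ml.__del__()                              # explicit cleanup; it was not being invoked automatically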

Learnt:

  1. Nothing wrong with multiprocessing; it works as expected.

  2. logging doesn't close the log file for you; you need to do it manually (see the sketch below).
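
For reference, here is a minimal sketch of the worker with the manual cleanup folded in (my assumptions: the same /tmp/test layout as the question, and the level lowered to INFO so the info() call actually writes):

import logging

def create_logger(i):
    """Log one line to /tmp/test/<i>.log, then release the file descriptor."""
    i = str(i)
    logger = logging.getLogger(i)
    hdlr = logging.FileHandler('/tmp/test/{0}.log'.format(i))
    hdlr.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
    logger.addHandler(hdlr)
    logger.setLevel(logging.INFO)   # INFO so the message below is emitted
    logger.propagate = 0
    logger.info('{0}-test'.format(i))
    # The manual part: flush and close the file, then detach the handler,
    # so each call releases its file descriptor instead of leaking it.
    hdlr.flush()
    hdlr.close()
    logger.removeHandler(hdlr)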