Running asyncio.subprocess.Process from Tornado RequestHandler


I'm trying to write a Tornado web app that runs a local command asynchronously, as a coroutine. This is the stripped-down example code:

#! /usr/bin/env python3

import shlex
import asyncio
import logging
import subprocess

from tornado.web import Application, url, RequestHandler
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop

logging.getLogger('asyncio').setLevel(logging.DEBUG)


async def run():
    command = "python3 /path/to/my/script.py"
    logging.debug('Calling command: {}'.format(command))
    process = asyncio.create_subprocess_exec(
        *shlex.split(command),
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT
    )
    logging.debug('  - process created')

    result = await process
    stdout, stderr = result.communicate()
    output = stdout.decode()
    return output

def run_sync():
    command = "python3 /path/to/my/script.py"
    logging.debug('Calling command: {}'.format(command))
    try:
        # subprocess.run takes the argument list as one positional argument
        result = subprocess.run(
            shlex.split(command),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            check=True
        )
    except subprocess.CalledProcessError as ex:
        # RunnerError is the app's own exception class (definition not shown)
        raise RunnerError(ex.output) from ex
    else:
        return result.stdout


class TestRunner(RequestHandler):

    async def get(self):
        result = await run()
        self.write(result)

url_list = [
    url(r"/test", TestRunner),
]
HTTPServer(Application(url_list, debug=True)).listen(8080)
logging.debug("Tornado server started at port {}.".format(8080))
IOLoop.configure('tornado.platform.asyncio.AsyncIOLoop')
IOLoop.instance().start()

When /path/to/my/script.py is called directly, it executes as expected. It also executes correctly when I implement TestRunner.get as a regular, synchronous method (see run_sync). But when I run the app above and request /test, the log shows:

DEBUG:asyncio:Using selector: EpollSelector
DEBUG:asyncio:execute program 'python3' stdout=stderr=<pipe>
DEBUG:asyncio:process 'python3' created: pid 21835

However, ps shows that the process has hung in a defunct state:

$ ps -ef | grep 21835
berislav 21835 21834  0 19:19 pts/2    00:00:00 [python3] <defunct>

I suspect that I'm using the wrong event loop or otherwise misusing the API, but every example I've seen uses asyncio.get_event_loop().run_until_complete(your_coro()), and I couldn't find much about combining asyncio and Tornado. All suggestions welcome!

Best answer

Subprocesses are tricky because of the singleton SIGCHLD handler. In asyncio, this means that they only work with the "main" event loop. If you change tornado.ioloop.IOLoop.configure('tornado.platform.asyncio.AsyncIOLoop') to tornado.platform.asyncio.AsyncIOMainLoop().install(), then the example works. A few other cleanups were also necessary; here's the full code:

#! /usr/bin/env python3

import shlex
import asyncio
import logging

import tornado.platform.asyncio
from tornado.web import Application, url, RequestHandler
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop

logging.getLogger('asyncio').setLevel(logging.DEBUG)

async def run():
    command = "python3 /path/to/my/script.py"
    logging.debug('Calling command: {}'.format(command))
    process = await asyncio.create_subprocess_exec(
        *shlex.split(command),
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT
    )
    logging.debug('  - process created')

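    # communicate() reads all output and waits for the process to exit;
    # calling wait() first can deadlock if the child fills the pipe buffer.
    stdout, _ = await process.communicate()  # stderr is merged into stdout
    output = stdout.decode()
    return output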

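# Needed on Tornado 4.x; Tornado 5.0+ uses the asyncio event loop by default.
tornado.platform.asyncio.AsyncIOMainLoop().install()
# IOLoop.current() replaces the deprecated IOLoop.instance().
IOLoop.current().run_sync(run)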
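
For trying the subprocess pattern without Tornado or the placeholder script path, here is a minimal asyncio-only sketch. The helper name run_command and the one-line child program are assumptions made for illustration; they are not part of the original code. It relies on communicate() alone to collect the output and wait for the child to exit.

```python
import asyncio
import sys


async def run_command(*args):
    """Run a command and return (exit code, combined stdout/stderr text)."""
    process = await asyncio.create_subprocess_exec(
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr into stdout
    )
    # communicate() drains the pipe and waits for exit in one step, so a
    # chatty child cannot block on a full pipe buffer.
    stdout, _ = await process.communicate()
    return process.returncode, stdout.decode()


def main():
    # Hypothetical child: a one-liner that writes to both streams.
    child = "import sys; print('out'); print('err', file=sys.stderr)"
    return asyncio.run(run_command(sys.executable, "-c", child))


if __name__ == "__main__":
    code, output = main()
    print(code)
    print(output)
```

Because stderr is redirected to stdout, the second value returned by communicate() is None, which is why it is discarded.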

Also note that Tornado has its own subprocess interface in tornado.process.Subprocess, so if subprocesses are the only thing you need asyncio for, consider using the Tornado version instead. Be aware that combining Tornado's and asyncio's subprocess interfaces in the same process may produce conflicts over the SIGCHLD handler. Pick one or the other, or use the libraries in such a way that the SIGCHLD handler is unnecessary (for example, by relying solely on stdout/stderr instead of the process's exit status).
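
As a rough sketch of the Tornado-only route, the following reads a child's output through tornado.process.Subprocess and never asks for the exit status, so no SIGCHLD handler is involved. It assumes Tornado 5.0 or later on a Unix-like system (Subprocess.STREAM is not supported on Windows); the helper name run_with_tornado and the child one-liner are hypothetical.

```python
import asyncio
import subprocess
import sys

from tornado.process import Subprocess


async def run_with_tornado(*args):
    """Return the combined stdout/stderr of a command as text."""
    proc = Subprocess(
        list(args),
        stdout=Subprocess.STREAM,  # expose stdout as a PipeIOStream
        stderr=subprocess.STDOUT,  # merge stderr into stdout
    )
    # Reading until the pipe closes needs no SIGCHLD handler; the exit
    # status is deliberately not collected here.
    output = await proc.stdout.read_until_close()
    return output.decode()


def main():
    # Hypothetical child: a one-liner that prints a single line.
    child = "print('hello from child')"
    return asyncio.run(run_with_tornado(sys.executable, "-c", child))


if __name__ == "__main__":
    print(main())
```

If the exit status does matter, awaiting proc.wait_for_exit() is the Tornado way to get it, at the cost of installing the SIGCHLD handler discussed above.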