I am trying to terminate a whole supervision tree from a supervised worker process. Here is my supervision tree:

                          +--------------------------+
                          |                          |
                 +--------+ Sup1: Dynamic Supervisor +---------+
                 |        |                          |         |
                 |        +-------------+------------+         |
                 |                      |                      |
                 |                      |                      |
                 v                      v                      v
        +------------------+   +------------------+   +------------------+
        |                  |   |                  |   |                  |
        | Job1: Supervisor |   | Job2: Supervisor |   | Job3: Supervisor |
        |                  |   |                  |   |                  |
        +------------------+   +-+-------------+--+   +------------------+
                                 |             |
                                 |             |
                                 |             |
                                 |             |
                                 v             v
                    +-------------------+   +--------------+
                    |                   |   |              |
                    | Progress Monitor: |   | Work: Worker |
                    |       Worker      |   |              |
                    |                   |   +--------------+
                    +-------------------+

Process life cycle:
- A Job is started via: `DynamicSupervisor.start_child(__MODULE__, spec)`
- Each job is a supervision tree as well: 1 supervisor (restart strategy `one_for_one`) -> 2 workers
- The Progress Monitor worker knows when the given job is done
- On job done, the Progress Monitor worker makes an attempt to terminate the whole job supervision tree by calling: `DynamicSupervisor.terminate_child(__MODULE__, pid)`
- Progress Monitor is expected to do cleanup steps in its `terminate` callback; it is trapping exit signals
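
For reference, the tree described above could be sketched roughly as follows. This is a minimal sketch, not the asker's actual code: the module names `Sup1`, `JobSupervisor`, `ProgressMonitor`, and `Work`, the `start_job/1` helper, and the `job_id` argument are all hypothetical stand-ins.

```elixir
defmodule ProgressMonitor do
  # Hypothetical worker standing in for "Progress Monitor: Worker".
  use GenServer

  def start_link(job_id), do: GenServer.start_link(__MODULE__, job_id)

  @impl true
  def init(job_id) do
    # Trap exits so terminate/2 runs when the parent sends :shutdown.
    Process.flag(:trap_exit, true)
    {:ok, job_id}
  end

  @impl true
  def terminate(_reason, _job_id), do: :ok # cleanup steps would go here
end

defmodule Work do
  # Hypothetical worker standing in for "Work: Worker".
  use GenServer

  def start_link(job_id), do: GenServer.start_link(__MODULE__, job_id)

  @impl true
  def init(job_id), do: {:ok, job_id}
end

defmodule JobSupervisor do
  # One job = one supervisor with two workers. :temporary is an assumption,
  # chosen so a finished job is not restarted by Sup1.
  use Supervisor, restart: :temporary

  def start_link(job_id), do: Supervisor.start_link(__MODULE__, job_id)

  @impl true
  def init(job_id) do
    children = [{ProgressMonitor, job_id}, {Work, job_id}]
    Supervisor.init(children, strategy: :one_for_one)
  end
end

defmodule Sup1 do
  # Stands in for "Sup1: Dynamic Supervisor".
  use DynamicSupervisor

  def start_link(opts \\ []) do
    DynamicSupervisor.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Starts one job subtree under Sup1.
  def start_job(job_id) do
    DynamicSupervisor.start_child(__MODULE__, {JobSupervisor, job_id})
  end

  @impl true
  def init(_opts), do: DynamicSupervisor.init(strategy: :one_for_one)
end
```

With these definitions, `Sup1.start_link()` followed by `Sup1.start_job(1)` starts one job subtree with both workers.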
Problems and observations:
- `DynamicSupervisor.terminate_child` is a blocking call, which means it waits for all child processes to terminate as well, including the calling process, Progress Monitor
- Progress Monitor is in a deadlock and cannot terminate. The parent supervisor sends a `:kill` signal, which does not trigger the `terminate` callback
Quick workarounds:
- Call `DynamicSupervisor.terminate_child` from the Progress Monitor worker asynchronously: `spawn(fn -> DynamicSupervisor.terminate_child(__MODULE__, pid) end)`
- Define a shutdown strategy for Sup1: Dynamic Supervisor: `shutdown: 5_000`. It will wait at most 5s for a job supervision tree termination and then it will send a `shutdown` exit signal. This will ensure the `terminate` callback is called for the Progress Monitor process.
I am not happy with either of them.
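
To make the first workaround concrete, here is a minimal sketch of a worker that asks for its own job tree to be terminated from a separate process, so it stays free to receive the `:shutdown` exit signal and run `terminate/2`. Everything named here is an assumption for illustration: the `Demo.ProgressMonitor` module, the `:job_done` message, the `:sup` and `:notify` options, and the use of the `$ancestors` process-dictionary entry (set by `proc_lib`) to find the job supervisor.

```elixir
defmodule Demo.ProgressMonitor do
  # Hypothetical worker; names and messages are illustrative only.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    # Trap exits so terminate/2 runs when the job supervisor shuts us down.
    Process.flag(:trap_exit, true)
    {:ok, Map.new(opts)}
  end

  @impl true
  def handle_info(:job_done, %{sup: sup} = state) do
    # The first ancestor of a supervised process is its direct supervisor.
    job_sup = hd(Process.get(:"$ancestors"))

    # Terminate from another process; calling terminate_child/2 here
    # directly would block this worker while the supervisor waits for it.
    spawn(fn -> DynamicSupervisor.terminate_child(sup, job_sup) end)
    {:noreply, state}
  end

  @impl true
  def terminate(reason, %{notify: notify}) do
    # Cleanup steps would go here; the message only makes the call visible.
    send(notify, {:cleaned_up, reason})
    :ok
  end
end

# Usage: one dynamic supervisor, one job supervisor, one monitor.
{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

job_spec = %{
  id: :job,
  type: :supervisor,
  restart: :temporary,
  start:
    {Supervisor, :start_link,
     [[{Demo.ProgressMonitor, sup: sup, notify: self()}], [strategy: :one_for_one]]}
}

{:ok, job_sup} = DynamicSupervisor.start_child(sup, job_spec)
[{_id, monitor, _type, _modules}] = Supervisor.which_children(job_sup)
send(monitor, :job_done)
```

The spawned process is unlinked and unsupervised, so a failure inside it goes unnoticed; that trade-off may or may not be acceptable.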
Questions:
- How to trigger supervision tree termination from a worker process and avoid deadlocks?
- If terminating a supervision tree from a worker is not the best practice, what is the recommended way then?
- Any recommendations on how to redesign the supervision tree to make graceful termination easier?
Just call it in an async task:

    Task.async(fn -> Process.exit(Sup1, :shutdown) end)

It will terminate Sup1, and all of its children will shut down with it.
EDIT:
If you need a prettier solution, it depends on what else you need. In most cases, I create a Bootstrapper worker that does initialization and some other work, and you can easily add other features to it.
So, roughly speaking, I would add another DynamicSupervisor in a layer above (`AppSupervisor`) so that it can start the Bootstrapper and pass `self()` to it (or register it under a local name to avoid this injection). On start, the Bootstrapper worker starts Sup1 (your dynamic supervisor) and then waits for other messages, e.g. `:terminate_sup1`, which shuts down the `Sup1` process. Later, any of the workers below can shut down `Sup1` by casting the `:terminate_sup1` message to the Bootstrapper. This also leaves the door open to start Sup1 again when another message is sent to the Bootstrapper worker.
Furthermore, if you just need to shut down Sup1, just go with Task. But if you need control, put it into a single worker process that decides when Sup1 is up or down.
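
A rough sketch of such a Bootstrapper, under stated assumptions: the `Bootstrapper` module, its `:parent` and `:sup1_spec` options, the `:start_sup1` message, the `sup1_pid/0` helper, and the `ControlSup` name for the upper DynamicSupervisor are all hypothetical. Only the `:terminate_sup1` message comes from the description above.

```elixir
defmodule Bootstrapper do
  # Hypothetical controller that owns the decision of when Sup1 is up or down.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def terminate_sup1, do: GenServer.cast(__MODULE__, :terminate_sup1)
  def start_sup1, do: GenServer.cast(__MODULE__, :start_sup1)
  def sup1_pid, do: GenServer.call(__MODULE__, :sup1_pid)

  @impl true
  def init(opts) do
    state = %{
      parent: Keyword.fetch!(opts, :parent),
      spec: Keyword.fetch!(opts, :sup1_spec),
      sup1: nil
    }

    # Start Sup1 after init/1 returns, because the parent supervisor is
    # still busy starting this process while init/1 runs.
    {:ok, state, {:continue, :start_sup1}}
  end

  @impl true
  def handle_continue(:start_sup1, state), do: {:noreply, start(state)}

  @impl true
  def handle_cast(:start_sup1, state), do: {:noreply, start(state)}

  def handle_cast(:terminate_sup1, %{sup1: nil} = state), do: {:noreply, state}

  def handle_cast(:terminate_sup1, state) do
    # This process is not a descendant of Sup1, so the blocking call
    # cannot wait on itself.
    :ok = DynamicSupervisor.terminate_child(state.parent, state.sup1)
    {:noreply, %{state | sup1: nil}}
  end

  @impl true
  def handle_call(:sup1_pid, _from, state), do: {:reply, state.sup1, state}

  defp start(%{sup1: nil} = state) do
    {:ok, pid} = DynamicSupervisor.start_child(state.parent, state.spec)
    %{state | sup1: pid}
  end

  defp start(state), do: state
end

# Usage: the upper layer starts the Bootstrapper, which then starts Sup1.
{:ok, _control} = DynamicSupervisor.start_link(strategy: :one_for_one, name: ControlSup)

sup1_spec = %{
  id: Sup1,
  type: :supervisor,
  restart: :temporary,
  start: {DynamicSupervisor, :start_link, [[strategy: :one_for_one, name: Sup1]]}
}

{:ok, _bootstrapper} =
  DynamicSupervisor.start_child(ControlSup, {Bootstrapper, parent: ControlSup, sup1_spec: sup1_spec})
```

Any worker below Sup1 can then call `Bootstrapper.terminate_sup1()` without blocking on its own shutdown. The sketch does not monitor Sup1, so it would not notice if Sup1 exited for another reason.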