I've created a cluster of EC2 instances using cfncluster and now I need to run the dispynode.py command on all the nodes.
I do that by first creating a list of private IP addresses called "workers.txt", then running the following bash command:
for host in $(cat workers.txt); do
ssh $host "dispynode.py --ext_ip_addr $host &";
done
This appears to work, since I get the expected dispynode output for each IP address. For example, for each IP address I get output similar to this:
NOTE: Using dispy port 61591 (was 51348 in earlier versions)
2019-08-22 06:07:12 dispynode - dispynode version: 4.11.0, PID: 16074
2019-08-22 06:07:12 dispynode - Files will be saved under "/tmp/dispy/node"
2019-08-22 06:07:12 pycos - version 4.8.11 with epoll I/O notifier
2019-08-22 06:07:12 dispynode - "ip-172-31-8-242" serving 8 cpus
Enter "quit" or "exit" to terminate dispynode,
"stop" to stop service, "start" to restart service,
"release" to check and close computation,
"cpus" to change CPUs used, anything else to get status:
Enter "quit" or "exit" to terminate dispynode,
"stop" to stop service, "start" to restart service,
"release" to check and close computation,
"cpus" to change CPUs used, anything else to get status:
NOTE: Using dispy port 61591 (was 51348 in earlier versions)
The problem is that when I SSH into a node and check whether the process is running, it isn't:
ssh 172.31.8.242
kill -0 16074
-bash: kill: (16074) - No such process
And the dispy client doesn't work and can't discover the nodes.
Question: Why isn't my parallel ssh command starting the program on the nodes, and/or why doesn't the process remain running if it was started?
I haven't used dispy myself, but the "Enter 'quit' or 'exit' to terminate dispynode..." message suggests that dispynode is running interactively and reading from standard input. In that case, when you close the SSH session, dispynode will read an end-of-file condition on its standard input, and it might exit when that happens.
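To illustrate the end-of-file behaviour I have in mind, here is a small stand-in written in bash. It is not dispynode's actual code; `interactive_loop` is a hypothetical function that just mimics a program reading commands from standard input until the input runs out.

```shell
# Hypothetical stand-in for an interactive program (not dispynode itself):
# it reads commands from standard input and quits when read reports EOF.
interactive_loop() {
    local line
    while read -r line; do
        echo "command: $line"
    done
    echo "EOF on stdin, exiting"
}

# /dev/null yields an immediate end-of-file, similar to what a program
# may see on its standard input once the SSH session that started it ends.
interactive_loop < /dev/null
```

If dispynode treats end-of-file the way this loop does, it would exit as soon as the ssh command returns.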
According to the dispy documentation, dispynode has a --daemon option which prevents it from running interactively. So, try adding the --daemon option to the dispynode.py command in your loop. The "&" may be unnecessary here, because dispynode might put itself in the background.
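As a rough sketch of what that could look like, here is your loop wrapped in a function with --daemon added. The function name `start_nodes` is mine, and the `ssh -n`, `nohup`, and output redirection are my own precautions rather than anything from the dispy documentation: `-n` stops ssh from reading the loop's standard input, and redirecting the remote output lets ssh return instead of waiting on the background process. I have not tested this against a real dispy cluster.

```shell
# Sketch only: start dispynode in daemon mode on every host in a file.
# start_nodes is a hypothetical helper name, not part of dispy.
start_nodes() {
    local workers_file="$1"
    local host
    while read -r host; do
        # Skip blank lines in the workers file
        [ -z "$host" ] && continue
        # --daemon: dispynode does not read commands from standard input
        # ssh -n:   ssh does not consume the rest of the workers file
        # nohup, redirection, and & are assumptions added as a precaution
        ssh -n "$host" "nohup dispynode.py --ext_ip_addr $host --daemon > /dev/null 2>&1 &"
    done < "$workers_file"
}
```

You would then call it as `start_nodes workers.txt` and check on each node with `kill -0` or `pgrep -f dispynode` that the process is still alive after the ssh command has returned.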