On Ubuntu 12.04, I am trying to set up an IPython cluster over a LAN. The details:
Controller Config
# Configuration file for ipcontroller.
c = get_config()
c.IPControllerApp.reuse_files = True
c.IPControllerApp.engine_ssh_server = u'bar@bar1'
c.HubFactory.ip = '*'
c.HubFactory.db_class = 'NoDB'
Cluster Config
# Configuration file for ipcluster.
c = get_config()
c.IPClusterEngines.engine_launcher_class = 'SSH'
c.SSHEngineSetLauncher.engine_args = ['--profile-dir=~/.config/ipython/profile_foo']
c.SSHEngineSetLauncher.engines = {'foo@foo1' : 1, 'foo@foo2' : 1, 'foo@foo3' : 1, 'foo@foo4' : 1}
Engine Config
# Configuration file for ipengine.
c = get_config()
c.EngineFactory.timeout = 10
So, then running
ipcluster start --profile=foo --debug
yields the following:
2013-09-03 19:43:45.772 [IPClusterStart] Process 'ssh' started: 5198
2013-09-03 19:43:45.773 [IPClusterStart] Process 'engine set' started: [None, None, None, None]
2013-09-03 19:43:47.086 [IPClusterStart] 2013-09-03 19:44:02.726 [IPEngineApp] Completed registration with id 0
2013-09-03 19:43:47.795 [IPClusterStart] 2013-09-03 19:43:53.737 [IPEngineApp] Completed registration with id 1
2013-09-03 19:43:48.561 [IPClusterStart] 2013-09-03 19:43:59.793 [IPEngineApp] Completed registration with id 2
2013-09-03 19:43:49.667 [IPClusterStart] 2013-09-03 19:44:03.859 [IPEngineApp] Completed registration with id 3
2013-09-03 19:44:15.773 [IPClusterStart] Engines appear to have started successfully
Looks good to me. But when I try to connect with a Client, I get fewer engines than expected. This occurs even with only 1 or 2 engines running on a single remote machine:
In [22]: rc=Client(profile='foo')
In [23]: rc.ids
Out[23]: [1, 2]
I set the timeout high in case that was the issue, but it persists.
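To rule out the possibility that the Client is simply being queried before all engines have registered, the check can be made explicit by polling. The following is a minimal sketch, not part of IPython: `wait_for_engines` is a hypothetical helper, and it assumes that reading `rc.ids` reflects newly registered engines each time it is accessed. The expected count is derived from the `engines` dict in the cluster config above.

```python
import time


def wait_for_engines(client, expected, timeout=30.0, poll=0.5):
    """Poll client.ids until `expected` engines are visible or `timeout` seconds pass.

    `client` is any object with an `ids` attribute (assumed here to be an
    IPython.parallel Client). Returns the ids seen on the last poll.
    """
    deadline = time.time() + timeout
    ids = list(client.ids)
    while len(ids) < expected and time.time() < deadline:
        time.sleep(poll)
        ids = list(client.ids)
    return ids


# Expected engine count, taken from c.SSHEngineSetLauncher.engines above.
engines = {'foo@foo1': 1, 'foo@foo2': 1, 'foo@foo3': 1, 'foo@foo4': 1}
expected = sum(engines.values())

# Usage against the real cluster (needs IPython.parallel and a running controller):
#   from IPython.parallel import Client
#   rc = Client(profile='foo')
#   print(wait_for_engines(rc, expected))
```

If the returned list is still shorter than `expected` after the timeout, the missing engines are not just late, which matches what I see.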
If I run ipcontroller and ipengines separately, the process succeeds, but I would really prefer being able to start and stop a cluster with ipcluster.