ipcluster command not creating full set of engines


I am trying to set up a LAN cluster on Ubuntu 12.04. The details:

Controller Config

# Configuration file for ipcontroller.

c = get_config()
c.IPControllerApp.reuse_files = True
c.IPControllerApp.engine_ssh_server = u'bar@bar1'
c.HubFactory.ip = '*'
c.HubFactory.db_class = 'NoDB'

Cluster Config

# Configuration file for ipcluster.

c = get_config()
c.IPClusterEngines.engine_launcher_class = 'SSH'
c.SSHEngineSetLauncher.engine_args = ['--profile-dir=~/.config/ipython/profile_foo']
c.SSHEngineSetLauncher.engines = {'foo@foo1' : 1, 'foo@foo2' : 1, 'foo@foo3' : 1, 'foo@foo4' : 1}

Engine Config

# Configuration file for ipengine.

c = get_config()
c.EngineFactory.timeout = 10

Then running

ipcluster start --profile=foo --debug

yields the following:

2013-09-03 19:43:45.772 [IPClusterStart] Process 'ssh' started: 5198
2013-09-03 19:43:45.773 [IPClusterStart] Process 'engine set' started: [None, None, None, None]
2013-09-03 19:43:47.086 [IPClusterStart] 2013-09-03 19:44:02.726 [IPEngineApp] Completed registration with id 0
2013-09-03 19:43:47.795 [IPClusterStart] 2013-09-03 19:43:53.737 [IPEngineApp] Completed registration with id 1
2013-09-03 19:43:48.561 [IPClusterStart] 2013-09-03 19:43:59.793 [IPEngineApp] Completed registration with id 2
2013-09-03 19:43:49.667 [IPClusterStart] 2013-09-03 19:44:03.859 [IPEngineApp] Completed registration with id 3
2013-09-03 19:44:15.773 [IPClusterStart] Engines appear to have started successfully

Looks good to me. But when I try to connect with a Client, I get fewer engines than expected. This happens even with only 1 or 2 engines running on a single remote machine.

In [22]: rc=Client(profile='foo')

In [23]: rc.ids
Out[23]: [1, 2]
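
To rule out the possibility that the Client is simply queried before every engine has finished registering, the id list can be polled for a while instead of being read once. This is only a rough sketch: `wait_for_engines` is a hypothetical helper, not part of IPython, and it takes any callable returning the current ids so it can be tried without a running cluster.

```python
import time

def wait_for_engines(get_ids, expected, timeout=30.0, interval=0.5):
    """Poll get_ids() until it reports at least `expected` ids or `timeout` seconds pass.

    Returns the last list of ids seen, which may be shorter than `expected`.
    """
    deadline = time.time() + timeout
    ids = list(get_ids())
    while len(ids) < expected and time.time() < deadline:
        time.sleep(interval)
        ids = list(get_ids())
    return ids

# Intended use against the cluster above (requires a running controller):
#   from IPython.parallel import Client
#   rc = Client(profile='foo')
#   ids = wait_for_engines(lambda: rc.ids, expected=4)
```

If the returned list is still short after the timeout, the missing engines are probably not a matter of timing.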

I set the timeout high in case that was the issue, but it persists.

If I run ipcontroller and ipengines separately, the process succeeds, but I would really prefer being able to start and stop a cluster with ipcluster.
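
For comparing the debug output with `rc.ids`, the ids the engines report as registered can be pulled out of the log text. A minimal sketch, assuming the log lines keep the format shown above; `registered_ids` is a hypothetical helper name.

```python
import re

# Matches the engine-side message relayed in the ipcluster debug output.
REGISTRATION = re.compile(r"\[IPEngineApp\] Completed registration with id (\d+)")

def registered_ids(log_text):
    """Return the sorted engine ids that reported a completed registration."""
    return sorted(int(m.group(1)) for m in REGISTRATION.finditer(log_text))

# Two lines copied from the log above.
sample = (
    "2013-09-03 19:43:47.086 [IPClusterStart] 2013-09-03 19:44:02.726 [IPEngineApp] Completed registration with id 0\n"
    "2013-09-03 19:43:47.795 [IPClusterStart] 2013-09-03 19:43:53.737 [IPEngineApp] Completed registration with id 1\n"
)
ids_in_log = registered_ids(sample)
```

Any id that appears in the log but not in `rc.ids` belongs to an engine that registered and then dropped out of the Client's view.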
