I have recently upgraded a small development cluster to Cassandra 2.0.0 from 1.2.9. I used DataStax OpsCenter free edition and it worked OK before. After the upgrade it refused to see the cluster - it was showing the cluster name with 0 nodes alive. Trying to stop/start agents etc changed nothing. I ended up deleting OpsCenter keyspace and reinstalling opscenter from scratch. But the problem is still there - I cannot add the running cluster. When I try to do it as "existing cluster" and click "Save" button I get "Error creating cluster: Call to /cluster-configs timed out." message in about 20-30 seconds.
I did some digging and found that OpsCenter never responds to the HTTP POST:
{"cassandra":{"seed_hosts":"10.X.Y.Z","api_port":"9160","username":"","password":""},"jmx":{"port":"7199","username":"","password":""},"agents":{}}'
sent to http://:8888/cluster-configs
This is what I see in the opscenter logs:
2013-09-11 19:40:19+0000 [] DEBUG: Trying to connect to node XXXXXX over thrift
2013-09-11 19:40:19+0000 [] DEBUG: Not returning SASL credentials for XXXXXXX
2013-09-11 19:40:19+0000 [] INFO: Starting factory <opscenterd.ThriftService.NoReconnectCassandraClientFactory instance at 0x2b3d3f8>
2013-09-11 19:40:19+0000 [] DEBUG: Node ping successful: XXXXXXXX
2013-09-11 19:40:19+0000 [] INFO: Adding new cluster 'my-cluster-name': {u'jmx': {u'username': u'', u'password': u'', u'port': u'7199'}, 'kerberos_client_principals': {}, 'kerberos': {}, u'agents': {}, 'kerberos_hostnames': {}, 'kerberos_services': {}, u'cassandra': {u'username': u'', u'seed_hosts': u'XXXXXXXX', u'api_port': u'9160', u'password': u''}}
2013-09-11 19:40:19+0000 [] INFO: Starting new cluster services for my-cluster-name
2013-09-11 19:40:19+0000 [my-cluster-name] INFO: Starting services for cluster my-cluster-name
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: keyspace type system are {'system': [u'system', u'system_traces', u'system_auth', u'dse_auth']}
2013-09-11 19:40:19+0000 [] DEBUG: Not using SSL for Thrift communication
2013-09-11 19:40:19+0000 [] DEBUG: ignored_keyspaces are [u'system', u'system_traces', u'system_auth', u'dse_auth']
2013-09-11 19:40:19+0000 [] DEBUG: Not using Kerberos authentication for Thrift
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Not using separate storage cluster
2013-09-11 19:40:19+0000 [] INFO: Metric caching enabled with 50 points and 1000 metrics cached
2013-09-11 19:40:19+0000 [] INFO: Starting PushService
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Adding connection to <CassandraNode XXXXXXXX:9160 @0x2ac18c0>
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Adding connection to <CassandraNode XXXXXXXX:9160 @0x2ac18c0>
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Adding connection to <CassandraNode XXXXXXXX:9160 @0x2ac18c0>
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Adding connection to <CassandraNode XXXXXXXX:9160 @0x2ac18c0>
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Adding connection to <CassandraNode XXXXXXXX:9160 @0x2ac18c0>
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Waiting for describe_version() results
2013-09-11 19:40:19+0000 [my-cluster-name] INFO: Starting CassandraCluster service
2013-09-11 19:40:19+0000 [my-cluster-name] INFO: agent_config items: {'cassandra_log_location': '/var/log/cassandra/system.log', 'thrift_port': 9160, 'thrift_ssl_truststore': None, 'rollups300_ttl': 2419200, 'rollups86400_ttl': -1, 'jmx_port': 7199, 'metrics_ignored_solr_cores': '', 'api_port': '61621', 'metrics_enabled': 1, 'thrift_ssl_truststore_type': 'JKS', 'kerberos_use_ticket_cache': True, 'kerberos_renew_tgt': True, 'rollups60_ttl': 604800, 'cassandra_install_location': '', 'rollups7200_ttl': 31536000, 'kerberos_debug': False, 'storage_keyspace': 'OpsCenter', 'ec2_metadata_api_host': '169.254.169.254', 'provisioning': 0, 'kerberos_use_keytab': True, 'metrics_ignored_column_families': '', 'thrift_ssl_truststore_password': None, 'metrics_ignored_keyspaces': 'system, system_traces, system_auth, dse_auth, OpsCenter'}
2013-09-11 19:40:19+0000 [] INFO: Stopping factory <opscenterd.ThriftService.NoReconnectCassandraClientFactory instance at 0x2b3d3f8>
2013-09-11 19:41:07+0000 [] DEBUG: Average opscenterd CPU usage: 0.40%, memory usage: 38 MB
2013-09-11 19:42:07+0000 [] DEBUG: Average opscenterd CPU usage: 0.02%, memory usage: 38 MB
I did some tcpdump'ing on the seed host and I do see Thrift traffic, quite a bit, in fact. Nobody else is using the cluster right now and this traffic is certainly from the opscenter.
Cassandra seems to be alive, responds to the queries, does not show anything disturbing in the logs.
Any ideas what causes these problems with opscenter? DataStax claims they support Cassandra 2.0.0.
What version of OpsCenter are you using? You need to be on the very latest 3.2.2 version to have 2.0 work with it.