Cannot add existing Cassandra 2.0.0 cluster to OpsCenter


I recently upgraded a small development cluster from Cassandra 1.2.9 to 2.0.0. I had been using the free edition of DataStax OpsCenter, and it worked fine before the upgrade. After the upgrade it no longer saw the cluster: it showed the cluster name with 0 nodes alive. Stopping and restarting the agents changed nothing. I ended up deleting the OpsCenter keyspace and reinstalling OpsCenter from scratch, but the problem is still there: I cannot add the running cluster. When I add it as an "existing cluster" and click "Save", I get the message "Error creating cluster: Call to /cluster-configs timed out." after about 20-30 seconds.

I did some digging and found that OpsCenter never responds to the HTTP POST:

{"cassandra":{"seed_hosts":"10.X.Y.Z","api_port":"9160","username":"","password":""},"jmx":{"port":"7199","username":"","password":""},"agents":{}}

sent to http://:8888/cluster-configs
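
For anyone who wants to reproduce the request outside the browser, here is a minimal Python sketch. The payload fields are the ones shown above; the function names, the `opscenter_host` parameter, the `Content-Type` header, and the 120-second timeout are assumptions made for illustration and are not taken from OpsCenter's documentation.

```python
import json
import urllib.request


def build_cluster_config_request(opscenter_host, seed_host, port=8888):
    """Build (but do not send) a POST like the one the OpsCenter UI issues."""
    payload = {
        "cassandra": {
            "seed_hosts": seed_host,
            "api_port": "9160",
            "username": "",
            "password": "",
        },
        "jmx": {"port": "7199", "username": "", "password": ""},
        "agents": {},
    }
    url = "http://%s:%d/cluster-configs" % (opscenter_host, port)
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the endpoint accepts a JSON body with this header.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_cluster_config(request, timeout=120):
    """Send the request; a long timeout shows whether a reply ever arrives."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

Calling `post_cluster_config(build_cluster_config_request(host, seed))` with your own host names should make it easier to tell a slow response from one that never comes, since the timeout is under your control rather than the UI's.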

This is what I see in the OpsCenter logs:

2013-09-11 19:40:19+0000 [] DEBUG: Trying to connect to node XXXXXX over thrift
2013-09-11 19:40:19+0000 [] DEBUG: Not returning SASL credentials for XXXXXXX
2013-09-11 19:40:19+0000 []  INFO: Starting factory <opscenterd.ThriftService.NoReconnectCassandraClientFactory instance at 0x2b3d3f8>
2013-09-11 19:40:19+0000 [] DEBUG: Node ping successful: XXXXXXXX
2013-09-11 19:40:19+0000 []  INFO: Adding new cluster 'my-cluster-name': {u'jmx': {u'username': u'', u'password': u'', u'port': u'7199'}, 'kerberos_client_principals': {}, 'kerberos': {}, u'agents': {}, 'kerberos_hostnames': {}, 'kerberos_services': {}, u'cassandra': {u'username': u'', u'seed_hosts': u'XXXXXXXX', u'api_port': u'9160', u'password': u''}}
2013-09-11 19:40:19+0000 []  INFO: Starting new cluster services for my-cluster-name
2013-09-11 19:40:19+0000 [my-cluster-name]  INFO: Starting services for cluster my-cluster-name
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: keyspace type system are {'system': [u'system', u'system_traces', u'system_auth', u'dse_auth']}
2013-09-11 19:40:19+0000 [] DEBUG: Not using SSL for Thrift communication
2013-09-11 19:40:19+0000 [] DEBUG: ignored_keyspaces are [u'system', u'system_traces', u'system_auth', u'dse_auth']
2013-09-11 19:40:19+0000 [] DEBUG: Not using Kerberos authentication for Thrift
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Not using separate storage cluster
2013-09-11 19:40:19+0000 []  INFO: Metric caching enabled with 50 points and 1000 metrics cached
2013-09-11 19:40:19+0000 []  INFO: Starting PushService
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Adding connection to <CassandraNode XXXXXXXX:9160 @0x2ac18c0>
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Adding connection to <CassandraNode XXXXXXXX:9160 @0x2ac18c0>
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Adding connection to <CassandraNode XXXXXXXX:9160 @0x2ac18c0>
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Adding connection to <CassandraNode XXXXXXXX:9160 @0x2ac18c0>
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Adding connection to <CassandraNode XXXXXXXX:9160 @0x2ac18c0>
2013-09-11 19:40:19+0000 [my-cluster-name] DEBUG: Waiting for describe_version() results
2013-09-11 19:40:19+0000 [my-cluster-name]  INFO: Starting CassandraCluster service
2013-09-11 19:40:19+0000 [my-cluster-name]  INFO: agent_config items: {'cassandra_log_location': '/var/log/cassandra/system.log', 'thrift_port': 9160, 'thrift_ssl_truststore': None, 'rollups300_ttl': 2419200, 'rollups86400_ttl': -1, 'jmx_port': 7199, 'metrics_ignored_solr_cores': '', 'api_port': '61621', 'metrics_enabled': 1, 'thrift_ssl_truststore_type': 'JKS', 'kerberos_use_ticket_cache': True, 'kerberos_renew_tgt': True, 'rollups60_ttl': 604800, 'cassandra_install_location': '', 'rollups7200_ttl': 31536000, 'kerberos_debug': False, 'storage_keyspace': 'OpsCenter', 'ec2_metadata_api_host': '169.254.169.254', 'provisioning': 0, 'kerberos_use_keytab': True, 'metrics_ignored_column_families': '', 'thrift_ssl_truststore_password': None, 'metrics_ignored_keyspaces': 'system, system_traces, system_auth, dse_auth, OpsCenter'}
2013-09-11 19:40:19+0000 []  INFO: Stopping factory <opscenterd.ThriftService.NoReconnectCassandraClientFactory instance at 0x2b3d3f8>
2013-09-11 19:41:07+0000 [] DEBUG: Average opscenterd CPU usage: 0.40%, memory usage: 38 MB
2013-09-11 19:42:07+0000 [] DEBUG: Average opscenterd CPU usage: 0.02%, memory usage: 38 MB

I ran tcpdump on the seed host and I do see Thrift traffic, quite a bit in fact. Nobody else is using the cluster right now, so this traffic must be coming from OpsCenter.

Cassandra appears healthy: it responds to queries and shows nothing alarming in its logs.

Any ideas what causes these problems with OpsCenter? DataStax claims they support Cassandra 2.0.0.


There are 3 best solutions below


What version of OpsCenter are you using? You need to be on the latest version, 3.2.2, for Cassandra 2.0 to work with it.


DataStax only officially supports versions of Cassandra that come with DataStax Enterprise. The current version of Cassandra packaged in DataStax Enterprise is 1.2.x, which is why OpsCenter worked with that version of Apache Cassandra. OpsCenter doesn't work with Apache Cassandra 2.0 yet, but we are working on it to prepare for when DataStax Enterprise does support it.


I had a similar problem when switching to Cassandra 2.0.1 and OpsCenter 3.2.2. I found it was related to the rpc_server_type setting in cassandra.yaml. If rpc_server_type is set to hsha, OpsCenter has a problem connecting to the cluster. When I switched it to sync, OpsCenter connected just fine. Hope that helps.
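
To check quickly which value a node is using, something like the following Python sketch can help. The helper name `rpc_server_type` is hypothetical, and it only scans for an uncommented top-level `rpc_server_type:` line rather than parsing the YAML properly, so treat it as a rough check.

```python
import re


def rpc_server_type(yaml_text):
    """Return the uncommented rpc_server_type value, or None if it is absent."""
    for line in yaml_text.splitlines():
        # Commented-out lines start with '#', so they never match here.
        match = re.match(r"^rpc_server_type:\s*(\S+)", line)
        if match:
            # Strip optional quotes around the value.
            return match.group(1).strip("'\"")
    return None
```

Pass it the contents of your cassandra.yaml (the file's location depends on how Cassandra was installed). If it returns `hsha`, changing the setting to `sync` and restarting the node is the change described above.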