Members not joining JGroups cluster for ActiveMQ Artemis on Azure


I am trying to set up an ActiveMQ Artemis cluster (WildFly 18) that runs on Azure. From everything I have read, UDP discovery does not work on Azure, so JGroups has to be configured to use azure-ping. My JGroups config is:

<subsystem xmlns="urn:jboss:domain:jgroups:7.0">
    <channels default="activemq_channel">
        <channel name="activemq_channel" stack="udp" cluster="activemq-cluster"/>
    </channels>
    <stacks>
        <stack name="udp">
            <transport type="UDP" socket-binding="jgroups-udp"/>
            <protocol type="azure.AZURE_PING">
                <property name="storage_account_name">${jboss.jgroups.azure_ping.storage_account_name}</property>
                <property name="storage_access_key">${jboss.jgroups.azure_ping.storage_access_key}</property>
                <property name="container">${jboss.jgroups.azure_ping.container}</property>
            </protocol>
            <protocol type="MERGE3"/>
            <socket-protocol type="FD_SOCK" socket-binding="jgroups-udp-fd"/>
            <protocol type="FD_ALL"/>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2"/>
            <protocol type="UNICAST3"/>
            <protocol type="pbcast.STABLE"/>
            <protocol type="pbcast.GMS"/>
            <protocol type="UFC"/>
            <protocol type="FRAG3"/>
        </stack>
        <stack name="tcp">
            <transport type="TCP" socket-binding="jgroups-tcp"/>
            <protocol type="azure.AZURE_PING">
                <property name="storage_account_name">${jboss.jgroups.azure_ping.storage_account_name}</property>
                <property name="storage_access_key">${jboss.jgroups.azure_ping.storage_access_key}</property>
                <property name="container">${jboss.jgroups.azure_ping.container}</property>
            </protocol>
            <protocol type="MERGE3"/>
            <socket-protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
            <protocol type="FD_ALL"/>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2"/>
            <protocol type="UNICAST3"/>
            <protocol type="pbcast.STABLE"/>
            <protocol type="pbcast.GMS"/>
            <protocol type="FRAG3"/>
        </stack>
    </stacks>
</subsystem>

When I start the first server, it creates a new cluster. Server W101:

2020-10-08 15:04:16,405 DEBUG [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 84) address=w101, cluster=activemq-cluster, physical address=10.100.20.7:55200
2020-10-08 15:04:16,671 INFO  [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 84) w101: no members discovered after 266 ms: creating cluster as first member
2020-10-08 15:04:16,686 DEBUG [org.jgroups.protocols.pbcast.NAKACK2] (ServerService Thread Pool -- 84) 
[w101 setDigest()]
existing digest:  []
new digest:       w101: [0 (0)]
resulting digest: w101: [0 (0)]
2020-10-08 15:04:16,686 DEBUG [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 84) w101: installing view [w101|0] (1) [w101]
2020-10-08 15:04:16,702 DEBUG [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 84) w101: created cluster (first member). My view is [w101|0], impl is org.jgroups.protocols.pbcast.CoordGmsImpl

When I then start the second server, it tries to join, but the JOIN requests time out; eventually it gives up and creates a separate cluster.

Server W102:

2020-10-08 15:13:06,931 DEBUG [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 84) address=w102, cluster=activemq-cluster, physical address=10.100.20.8:55200
2020-10-08 15:13:06,994 DEBUG [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 84) w102: sending JOIN(w102) to w101
2020-10-08 15:13:10,009 WARN  [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 84) w102: JOIN(w102) sent to w101 timed out (after 3000 ms), on try 0
2020-10-08 15:13:10,009 DEBUG [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 84) w102: sending JOIN(w102) to w101
2020-10-08 15:13:13,025 WARN  [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 84) w102: JOIN(w102) sent to w101 timed out (after 3000 ms), on try 1

During the join attempt, the first server logs the following. Server W101:

2020-10-08 15:13:07,447 DEBUG [org.jgroups.protocols.pbcast.GMS] (thread-9,activemq-cluster,w101) w101: installing view [w101|1] (2) [101, 65cdad94-a068-ee86-af75-5cff9c7afa42]
2020-10-08 15:13:07,447 DEBUG [org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger-15,activemq-cluster,w101) w101: pingable_mbrs=[w101, 65cdad94-a068-ee86-af75-5cff9c7afa42], ping_dest=65cdad94-a068-ee86-af75-5cff9c7afa42
2020-10-08 15:13:07,447 WARN  [org.jgroups.protocols.UDP] (TQ-Bundler-4,activemq-cluster,w053tomreg101) JGRP000032: w053tomreg101: no physical address for 65cdad94-a068-ee86-af75-5cff9c7afa42, dropping message
2020-10-08 15:13:09,478 WARN  [org.jgroups.protocols.UDP] (TQ-Bundler-4,activemq-cluster,w053tomreg101) JGRP000032: w053tomreg101: no physical address for 65cdad94-a068-ee86-af75-5cff9c7afa42, dropping message
2020-10-08 15:13:09,494 DEBUG [org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger-15,activemq-cluster,w101) w101: pingable_mbrs=[w101, 65cdad94-a068-ee86-af75-5cff9c7afa42], ping_dest=65cdad94-a068-ee86-af75-5cff9c7afa42
2020-10-08 15:13:09,603 WARN  [org.jgroups.protocols.pbcast.GMS] (thread-9,activemq-cluster,w101) w101: failed to collect all ACKs (expected=1) for view [w101|1] after 2156 ms, missing 1 ACKs from (1) 65cdad94-a068-ee86-af75-5cff9c7afa42
2020-10-08 15:13:11,541 DEBUG [org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger-15,activemq-cluster,w101) w101: pingable_mbrs=[w101, 65cdad94-a068-ee86-af75-5cff9c7afa42], ping_dest=65cdad94-a068-ee86-af75-5cff9c7afa42
2020-10-08 15:13:11,541 WARN  [org.jgroups.protocols.UDP] (TQ-Bundler-4,activemq-cluster,w101) JGRP000032: w101: no physical address for 65cdad94-a068-ee86-af75-5cff9c7afa42, dropping message

Am I missing some additional configuration?
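
One detail worth checking in the config above: the channel references stack="udp", so the transport in use is UDP, which matches the org.jgroups.protocols.UDP warnings in the W101 log. A minimal sketch of the alternative, assuming the "tcp" stack defined above is the one intended for Azure and that the jgroups-tcp and jgroups-tcp-fd socket bindings are reachable between the two nodes. Only the stack attribute changes; whether this resolves the JOIN timeouts has not been verified here.

```xml
<channels default="activemq_channel">
    <!-- Assumption: switch the channel from the "udp" stack to the existing "tcp" stack -->
    <channel name="activemq_channel" stack="tcp" cluster="activemq-cluster"/>
</channels>
```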
