I'm trying to set up cross-datacenter replication between two Infinispan 9.4.x instances as per the Keycloak documentation, but in a slightly modified environment:
- multicast doesn't work between DCs, for obvious reasons
- I have to use port 7601 because 7600 is already used on this host by the Keycloak JGroups transport (yes, by its internal Infinispan; my future question would be "why do I need an extra external Infinispan instance instead of setting up replication between Keycloak's internal Infinispans?", but first things first).
These are the parts of my config that I added/modified:
[...]
<replicated-cache-configuration name="sessions-cfg" mode="SYNC" start="EAGER" batching="false">
    <backups>
        <backup site="site2" failure-policy="FAIL" strategy="SYNC" enabled="true">
            <take-offline after-failures="3" min-wait="60000"/>
        </backup>
    </backups>
    <locking acquire-timeout="0"/>
</replicated-cache-configuration>
[...]
<subsystem xmlns="urn:infinispan:server:jgroups:9.4">
    <channels default="infinicluster">
        <channel name="infinicluster" stack="tcp"/>
        <channel name="xsite" stack="tcp"/>
    </channels>
    <stacks default="${jboss.default.jgroups.stack:udp}">
        <stack name="udp">
            <transport type="UDP" socket-binding="jgroups-udp"/>
            <protocol type="PING"/>
            <protocol type="MERGE3"/>
            <protocol type="FD_SOCK" socket-binding="jgroups-udp-fd"/>
            <protocol type="FD_ALL"/>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2"/>
            <protocol type="UNICAST3"/>
            <protocol type="pbcast.STABLE"/>
            <protocol type="pbcast.GMS"/>
            <protocol type="UFC_NB"/>
            <protocol type="MFC_NB"/>
            <protocol type="FRAG3"/>
            <relay site="site1">
                <remote-site name="site2" channel="xsite"/>
            </relay>
        </stack>
        <stack name="tcp">
            <transport type="TCP" socket-binding="jgroups-tcp"/>
            <protocol type="TCPPING">
                <property name="initial_hosts">
                    host1.tld[7601],host2.tld[7601]
                </property>
                <property name="ergonomics">
                    false
                </property>
            </protocol>
            <protocol type="MERGE3"/>
            <protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
            <protocol type="FD_ALL"/>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2">
                <property name="use_mcast_xmit">
                    false
                </property>
Of course, I changed the port numbers in the JGroups socket bindings accordingly. Both instances seem to start okay (complaining only about the REST HTTPS bindings, which seems to be a minor error), and I can even see the communication between the instances in the logs:
2020-05-06 23:28:54,713 INFO [org.infinispan.CLUSTER] (remote-thread--p2-t20)[Context=___hotRodTopologyCache]ISPN100002: Starting rebalance with members [host1.tld, host2.tld], phase READ_OLD_WRITE_ALL, topology id 2
2020-05-06 23:28:54,779 INFO [org.infinispan.CLUSTER] (remote-thread--p2-t2) [Context=___hotRodTopologyCache]ISPN100009: Advancing to rebalance phase READ_ALL_WRITE_ALL, topology id 3
2020-05-06 23:28:54,807 INFO [org.infinispan.CLUSTER] (remote-thread--p2-t21)[Context=___hotRodTopologyCache]ISPN100009: Advancing to rebalance phase READ_NEW_WRITE_ALL, topology id 4
2020-05-06 23:28:54,834 INFO [org.infinispan.CLUSTER] (remote-thread--p2-t4)[Context=___hotRodTopologyCache]ISPN100010: Finished rebalance with members [host1.tld,host2.tld], topology id 5
The main issue is that as soon as I open the web management page of either instance, I get the following error in the logs (suppose I open the management page on site1, host1.tld):
2020-05-06 23:30:49,057 ERROR [org.jboss.as.controller.management-operation] (External Management Request Threads -- 1) WFLYCTL0013: Operation ("read-attribute") failed - address: ([
("subsystem" => "datagrid-infinispan"),
("cache-container" => "clustered")]): java.lang.IllegalStateException: Site host2.tld not defined in all the cluster members
at org.infinispan.xsite.XSiteAdminOperations.clusterStatus(XSiteAdminOperations.java:78)
at org.infinispan.xsite.GlobalXSiteAdminOperations.globalStatus(GlobalXSiteAdminOperations.java:93)
at org.jboss.as.clustering.infinispan.subsystem.CacheContainerMetricsHandler.filterSitesByStatus(CacheContainerMetricsHandler.java:343)
at org.jboss.as.clustering.infinispan.subsystem.CacheContainerMetricsHandler.executeRuntimeStep(CacheContainerMetricsHandler.java:297)
at org.jboss.as.controller.AbstractRuntimeOnlyHandler$1.execute(AbstractRuntimeOnlyHandler.java:59)
at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:999)
at org.jboss.as.controller.AbstractOperationContext.processStages(AbstractOperationContext.java:743)
at org.jboss.as.controller.AbstractOperationContext.executeOperation(AbstractOperationContext.java:467)
at org.jboss.as.controller.OperationContextImpl.executeOperation(OperationContextImpl.java:1411)
at org.jboss.as.controller.ModelControllerImpl.internalExecute(ModelControllerImpl.java:423)
at org.jboss.as.controller.ModelControllerImpl.lambda$execute$1(ModelControllerImpl.java:243)
at org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:265)
at org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:231)
at org.jboss.as.controller.ModelControllerImpl.execute(ModelControllerImpl.java:243)
at org.jboss.as.domain.http.server.DomainApiHandler.handleRequest(DomainApiHandler.java:212)
at io.undertow.server.handlers.encoding.EncodingHandler.handleRequest(EncodingHandler.java:72)
at org.jboss.as.domain.http.server.DomainApiCheckHandler.handleRequest(DomainApiCheckHandler.java:93)
at org.jboss.as.domain.http.server.security.ElytronIdentityHandler.lambda$handleRequest$0(ElytronIdentityHandler.java:62)
at org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:289)
at org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:246)
at org.jboss.as.controller.AccessAuditContext.doAs(AccessAuditContext.java:254)
at org.jboss.as.controller.AccessAuditContext.doAs(AccessAuditContext.java:225)
at org.jboss.as.domain.http.server.security.ElytronIdentityHandler.handleRequest(ElytronIdentityHandler.java:61)
at io.undertow.server.handlers.BlockingHandler.handleRequest(BlockingHandler.java:56)
at io.undertow.server.Connectors.executeRootHandler(Connectors.java:360)
at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1985)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1487)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1349)
at java.lang.Thread.run(Thread.java:748)
at org.jboss.threads.JBossThread.run(JBossThread.java:485)
If I open the web management page on the other site, the error is mirrored: this time it complains about host1.tld. I obviously did something wrong, but I have no idea what exactly. I'd be glad if someone could help.