Infinispan Hot Rod client: massive put of data fails with ISPN000476: Timed out waiting for responses for request


I found other answers about this issue, but I would like to describe my use case, in case someone can suggest a configuration fix or confirm that this is a limit of the distributed cache service.

Data Grid Server 8.2.3 runs in a cluster of 4 VMs with the following cache configuration:

{
  "distributed-cache": {
    "mode": "SYNC",
    "remote-timeout": 17500,
    "state-transfer": {
      "timeout": 60000
    },
    "encoding": {
      "key": {
        "media-type": "text/plain"
      },
      "value": {
        "media-type": "application/x-protostream"
      }
    },
    "locking": {
      "concurrency-level": 1000,
      "acquire-timeout": 15000,
      "striping": false
    },
    "statistics": true
  }
}

On the application side, the client uses Hot Rod through the standard JCache library, version 12.1.11.Final-redhat-00001:

import java.io.File;
import java.net.URI;
import javax.annotation.PostConstruct;
import javax.cache.Caching;

@PostConstruct
private void setUp() {
    LOGGER.info("START [setUp] CACHE");

    // Location of the Hot Rod client configuration file, taken from the environment
    File conf = new File(System.getenv("CLIENT_HOTROD_FILE_PATH"));
    URI uri = conf.toURI();

    // Retrieve the cache manager via org.infinispan.jcache.remote.JCachingProvider
    javax.cache.CacheManager cacheManager = Caching
            .getCachingProvider("org.infinispan.jcache.remote.JCachingProvider")
            .getCacheManager(uri, this.getClass().getClassLoader(), null);

    // Look up the remote cache by name
    this.cache = cacheManager.getCache(DATAGRIDKEY);

    LOGGER.info("END [setUp] cache " + this.cache.getName());
}

The client configuration is the default one.

The test I ran is a massive put of data into the distributed Infinispan cache on the cluster. The application frequently receives timeout responses from the server, such as the following:

[1/26/22 14:58:02:767 CET] 00001ffd HOTROD W org.infinispan.client.hotrod.impl.protocol.Codec20 checkForErrorsInResponseStatus ISPN004005: Error received from the server: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 1770 from DM10RH08

Is there a way to optimize performance on the server side and the client side through configuration?


There is 1 best solution below


I resolved the issue by changing the distributed-cache mode in the Data Grid server configuration from SYNC to ASYNC.

The server-side remote-timeout was being exceeded because of the time spent synchronously replicating each key to a subset of the nodes in the Data Grid cluster.

From the Data Grid documentation (doc SYNC and ASYNC replication):

Replication mode can be synchronous or asynchronous depending on the problem being addressed.

Synchronous replication blocks a thread or caller (for example on a put() operation) until the modifications are replicated across all nodes in the cluster. By waiting for acknowledgments, synchronous replication ensures that all replications are successfully applied before the operation is concluded.

Asynchronous replication operates significantly faster than synchronous replication because it does not need to wait for responses from nodes. Asynchronous replication performs the replication in the background and the call returns immediately. Errors that occur during asynchronous replication are written to a log. As a result, a transaction can be successfully completed despite the fact that replication of the transaction may not have succeeded on all the cache instances in the cluster.
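
To make the difference concrete, here is a minimal toy model in plain Java. It is not Infinispan code: sendToNode, putSync and putAsync are hypothetical names, and the node acknowledgment is simulated with a sleep. It only illustrates why a synchronous write waits for every acknowledgment, while an asynchronous write returns at once and only logs failures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class ReplicationSketch {

    // Hypothetical stand-in for sending a write to one node and receiving its ack.
    static CompletableFuture<Void> sendToNode(String node, long ackDelayMillis) {
        return CompletableFuture.runAsync(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(ackDelayMillis); // simulated network + apply time
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // SYNC model: the caller blocks until every node has acknowledged.
    // Returns the time the caller spent waiting, in milliseconds.
    static long putSync(List<String> nodes, long ackDelayMillis) {
        long start = System.nanoTime();
        List<CompletableFuture<Void>> acks = new ArrayList<>();
        for (String node : nodes) {
            acks.add(sendToNode(node, ackDelayMillis));
        }
        CompletableFuture.allOf(acks.toArray(new CompletableFuture[0])).join();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // ASYNC model: the caller returns right after dispatching the writes.
    // A failure would only be logged; the caller never sees it.
    static long putAsync(List<String> nodes, long ackDelayMillis) {
        long start = System.nanoTime();
        for (String node : nodes) {
            sendToNode(node, ackDelayMillis).exceptionally(t -> {
                System.err.println("replication to " + node + " failed: " + t);
                return null;
            });
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node-a", "node-b", "node-c");
        System.out.println("sync put waited ms:  " + putSync(nodes, 200));
        System.out.println("async put waited ms: " + putAsync(nodes, 200));
    }
}
```

In this model the synchronous caller always waits at least as long as the slowest acknowledgment, which is the wait that the server-side remote-timeout bounds. The asynchronous caller gives up that guarantee in exchange for latency, so a write can be reported as done before every node has applied it.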

My final server configuration:

{
  "distributed-cache": {
    "mode": "ASYNC",
    "remote-timeout": 17500,
    "state-transfer": {
      "timeout": 60000
    },
    "encoding": {
      "key": {
        "media-type": "text/plain"
      },
      "value": {
        "media-type": "application/x-protostream"
      }
    },
    "locking": {
      "concurrency-level": 1000,
      "acquire-timeout": 15000,
      "striping": false
    },
    "statistics": true
  }
}

Furthermore, still in the server configuration file ( /opt/redhat/redhat-datagrid-8.2-server/bin/server.conf ), I set Xms to the same size as Xmx to avoid the performance degradation caused by heap resizing when the GC runs.
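
One way to confirm that the JVM really started with equal Xms and Xmx is to inspect its input arguments. This is a small sketch using only the standard java.lang.management API. HeapFlagsCheck, flagValue and heapIsFixed are hypothetical names, and the comparison is purely textual, so -Xms2g and -Xmx2048m would be reported as different even though they are the same size.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class HeapFlagsCheck {

    // Returns the value of the last argument starting with the given prefix
    // (for example "-Xms"), or null if the flag is absent.
    static String flagValue(List<String> jvmArgs, String prefix) {
        String value = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                value = arg.substring(prefix.length());
            }
        }
        return value;
    }

    // True when both flags are present and textually equal (e.g. -Xms4g -Xmx4g).
    static boolean heapIsFixed(List<String> jvmArgs) {
        String xms = flagValue(jvmArgs, "-Xms");
        String xmx = flagValue(jvmArgs, "-Xmx");
        return xms != null && xms.equalsIgnoreCase(xmx);
    }

    public static void main(String[] args) {
        // Arguments the running JVM was started with
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Xms == Xmx: " + heapIsFixed(jvmArgs));

        // Currently committed heap versus the maximum the JVM may use
        Runtime rt = Runtime.getRuntime();
        System.out.println("committed heap bytes: " + rt.totalMemory());
        System.out.println("max heap bytes:       " + rt.maxMemory());
    }
}
```

Running this inside the server JVM is not required. The same check can be applied to the argument list taken from the process command line.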

I also found a useful performance tuning guide: performance tuning cache link