Our application has an app server (JBoss 7.4) that reads from and writes to a Redis cluster (1 master and 2 slaves, Redis 7.0.8). The app servers and Redis servers are hosted on RHEL 7 Linux servers (each instance on its own server, nothing shared).
In the production environment we suddenly started getting socket timeout errors while connecting to Redis. After we restarted Redis, the socket timeout errors went away, but we were unable to write anything to the Redis servers. We had to flush all keys in Redis (FLUSHALL or FLUSHDB) before writes started working again.
What could lead to this behavior? How can we debug such an issue or find the root cause (RCA) to ensure it does not happen again?