How to retry an operation when it fails

782 Views Asked by At

I have singleton client with the below contract

public interface MQPublisher {
    void publish(String message) throws ClientConnectionException, ClientErrorException;

    void start() throws ClientException;

    void stop();
}

The class which is using this publisher is as below :

public class MessagePublisher {
    @Autowired
    private MQPublisher publisher;
    private AtomicBoolean isPublisherRunning;

    public void startPublisher() {
        if (!isPublisherRunning.get()) {
            publisher.start();
            isPublisherRunning.compareAndSet(false, true);
        }
    }

    @Retry(RETRY_MSG_UPLOAD)
    public void sendMessage(String msg) {
        try {
            startPublisher();
            publisher.publish(msg); // when multiple requests fail with the same exception, what will happen??
        } catch (Exception e) {
            log.error("Exception while publishing message : {}", msg, e);
            publisher.stop();
            isPublisherRunning.compareAndSet(true, false);
            throw e;
        }
    }

We are using resilience4j retry functionality to retry the sendMessage method. This works fine in case of a single request. Consider a case when multiple requests are processed parallely and all of them fails with an exception. In this case, these requests will be retried and there is a chance that one thread will start the publisher while the other will stop it and it will throw exceptions again. How to handle this scenario in a cleaner way?

1

There are 1 best solutions below

0
On

It isn't clear why the whole publisher should be stopped in case of failure. Nevertheless, if there are real reasons for that, I would change the stop method to use an atomic timer that will restart on each message sending and stop the publisher only after at least 5 seconds (or the time needed for a message to be successfully sent) have passed from the message sending. Something like that:

@Slf4j
public class MessagePublisher {
    private static final int RETRY_MSG_UPLOAD = 10;
    @Autowired
    private MQPublisher publisher;
    private AtomicBoolean isPublisherRunning;
    private AtomicLong publishStart;


    public void startPublisher() {
       if (!isPublisherRunning.get()) {
           publisher.start();
           isPublisherRunning.compareAndSet(false, true);
       }
    }

    @Retryable(maxAttempts = RETRY_MSG_UPLOAD)
    public void sendMessage(String msg) throws InterruptedException {
        try {
            startPublisher();
            publishStart.set(System.nanoTime());
            publisher.publish(msg); // when multiple requests fail with the same exception, what will happen??
        } catch (Exception e) {
            log.error("Exception while publishing message : {}", msg, e);
            while (System.nanoTime() < publishStart.get() + 5000000000L) {
                Thread.sleep(1000);
            }
            publisher.stop();
            isPublisherRunning.compareAndSet(true, false);
            throw e;
        }
    }
}

I think it is important to mention (as you just did) that this is a terrible design, and that such calculations should be done by the publisher implementer and not by the caller.