Mutiny - How does retry... retries?
Last week, David, a Quarkus user, asked me about retrying an asynchronous
operation. David has a picky remote HTTP service, which sometimes
misbehaves. Easy answer! Just do: uni.onFailure().retry().atMost(x)
.
But, David is curious, and asked me a second question: how does retry work?
Well, that’s simple; it just retries… As you can imagine, that did not
satisfy David’s curiosity.
While I was answering to David, I realized that retrying is not that simple and deserves more explanation. That’s what we are going to see in this blog post.
Disclaimer about retries
Okay, if you are the kind of reader who skips the terms and conditions, you can jump to the next section. But, for others, I need to warn you about retries. Retries may not be the solution to your problem, as it can be terrible. Retrying can only be done if your system can handle duplicated requests or messages. In other words, you can only retry if your system is idempotent, i.e., sending a request or a message multiple times does not change the overall state. In doubt, check before implementing a retry, as it can harm your system.
Disclaimer said! Let’s look under the hood of retry.
Retry is about resubscribing
Let’s imagine you have a Uni
representing your asynchronous action, like
in David’s case, an invocation of a remote service:
Uni<String> uni = invokePickyService(client);
Unis are lazy. Until someone subscribes to them, nothing happens. In our
case, the request is only sent to the remote service when someone subscribes
to the uni
. So to execute the request, we need to subscribe:
Uni<String> uni = invokePickyService(client);
uni.subscribe().with(
resp -> System.out.println("Success: " + resp),
failure -> System.out.println("Failed: " + failure.getMessage())
);
In Quarkus, most of the time, you return the Uni
, and Quarkus subscribes
to it. So, don’t be mistaken, there is a subscription, you may not see it.
This laziness is the retry secret. In the following sequence diagram, you can see that the request is sent when we get a subscriber:
An interesting consequence of this is that if you subscribe twice, you emit two requests, and so get two responses:
But let’s go back to retry. What’s a retry? A retry is a reaction to a
failure, so you are going to write: uni.onFailure().retry()
. You also
need to indicate how long do you want to retry:
Uni<String> uni = invokePickyService(client)
.onFailure().retry().indefinitely();
uni.subscribe().with(
resp -> System.out.println("Success: " + resp)
);
In this snippet, we retry indefinitely until we get a successful result.
But, how does it work under the hood? It’s quite simple. If it gets a failure, it just re-subscribes:
It resubscribes until it gets a successful response. In other words, retrying is resubscribing.
How many times should I retry?
That’s always a good question. Should I retry indefinitely? Most probably, not. Indefinitely can be very long, and it could never end if the called service fails continuously.
You can configure the number of retries using atMost
:
Uni<String> uni = invokePickyService(client)
.onFailure().retry().atMost(2);
uni.subscribe().with(
resp -> System.out.println("Success: " + resp),
failure -> System.out.println("Failure: " + failure.getMessage())
);
atMost
indicates that at most n
attempts will be done before failing.
If we still get a failure after that number of resubscription, the last failure is sent to the subscriber.
You can also use until
and decide to retry by looking at the received failure:
Uni<String> uni = invokePickyService(client)
.onFailure().retry().until(failure -> ! (failure instanceof TooManyRequestsException));
Bonus: Exponential backoff
So far, our retries happen immediately. It might not be very wise, and separating a bit our retries may give better results, especially when facing intermittent failures due to the load or other external causes. Using exponential backoff provides a reasonable tradeoff. Retrying with exponential backoff:
-
retries after an initial delay,
-
on every failure, it doubles the previous delay, with an optional maximum,
-
it can use a jitter to add a random duration to the delay,
-
it can have a max delay if needed,
-
it is still constrained by
atMost
Uni<String> uni = invokePickyService(client)
.onFailure().retry()
.withBackOff(Duration.ofSeconds(1)).withJitter(0.2).atMost(10);
This last snippet configures the retry with exponential backoff. The first retry happens after 1 second, and then it doubles every time without an upper limit. Random jitter is applied to delays. If, unfortunately, after ten attempts, it still fails, the failure is sent to the subscriber.
Demo time!
Ok, enough talking. I’ve built a simple playground where you can adjust and
try the various retry policies:
https://gist.github.com/cescoffier/e9abce907a1c3d05d70bea3dae6dc3d5. You
can jbang the script by downloading it and running jbang Retry.java
, or
just run:
jbang https://gist.github.com/cescoffier/e9abce907a1c3d05d70bea3dae6dc3d5
The called service fails 50% of the time (well, it uses a random, so statistically 50% of the time).
That concludes this blog post. Again, before using retry, please verify
that your system can support it. Retrying is resubscribing. You can
configure how long, how many times, and when retrying should be attempted.
There are many more options offered by Mutiny, like when
or using
deadlines (expireIn
and expireAt
) when using exponential backoff. You
can try all these with the provided playground.
Stay tuned! We have plenty of other gems to discuss here!