Automatic retries
When you request verification of an AI-generated answer or want to connect with a human Expert, the process may take some time. The verification duration depends on factors such as answer complexity and Expert availability. Because the response may not be ready when you first request it, you must implement automatic retries.
One effective way to implement automatic retries is exponential backoff, which helps prevent unnecessary load on the API while improving resilience.
Why use exponential backoff?
Exponential backoff is a retry strategy that increases the wait time after each failed request, typically by doubling it. It has several advantages:
- Prevents crashes and data loss by handling temporary failures gracefully.
- Optimizes retry timing by starting with short waits and increasing them if failures persist.
- Reduces API congestion by adding random jitter to prevent multiple retries from happening at the same time.
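To make the timing concrete, here is a minimal sketch of how the wait grows from one retry to the next. It assumes a 1-second base delay that doubles on each retry, plus up to 0.5 seconds of random jitter. The function name `backoff_delay` and its parameters are illustrative and not part of the Pearl API.

```python
import random


def backoff_delay(attempt, base_delay=1.0, max_jitter=0.5):
    """Return the wait in seconds before retry number `attempt` (0-based).

    The deterministic part doubles on every attempt; the random jitter
    spreads out clients that would otherwise retry at the same instant.
    """
    return base_delay * (2 ** attempt) + random.uniform(0, max_jitter)


# Print the approximate schedule for the first five retries.
for attempt in range(5):
    print(f"retry {attempt + 1}: wait about {backoff_delay(attempt):.2f}s")
```

With these assumed values the waits are roughly 1, 2, 4, 8, and 16 seconds, each extended by a small random amount.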
Example: implementing retries in Python
The following code snippet demonstrates how to implement exponential backoff retries using the OpenAI Python library:
import os
import openai
import uuid
import time
import random
...
# Initialize the OpenAI client
client = openai.OpenAI(
    api_key=api_key,
    base_url="https://eaasapi.justanswer.com.bangkok.ord2a.ja.team/api/v1"
)
...
# Example of implementing exponential backoff retries
def get_completion_with_retry(messages, session_id, max_retries=10):
    """Get a completion from the Pearl API with exponential backoff retries."""
    retry_count = 0
    base_delay = 1  # Start with a 1-second delay
    while retry_count <= max_retries:
        try:
            metadata = {
                "sessionId": session_id,
                "mode": "pearl-ai-verified"
            }
            # Add the retry flag if this is a retry attempt
            if retry_count > 0:
                metadata["retry"] = "true"
            response = client.chat.completions.create(
                model="pearl-ai",
                messages=messages,
                metadata=metadata
            )
            return response.choices[0].message.content
        except openai.UnprocessableEntityError:
            # A 422 response means the completion is still being processed
            if retry_count < max_retries:
                # Calculate delay with exponential backoff and jitter
                delay = base_delay * (2 ** retry_count) + random.uniform(0, 0.5)
                print(f"Completion still processing. Retrying in {delay:.2f} seconds...")
                time.sleep(delay)
                retry_count += 1
            else:
                # Retries exhausted: re-raise the last 422 error
                raise
        # Any other exception propagates to the caller unchanged
...
When making a retry request, you must include the metadata attribute { "retry": "true" } to indicate that the request is a retry. This helps the system distinguish it from a standard message intended to generate a new completion.
Without this flag, the system treats the message as a new request, which may lead to duplicate or inconsistent behavior. Include this metadata only when resending the exact same message after receiving a 422 (Unprocessable Entity) status code.
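One way to keep the flag from leaking into first attempts is to build the metadata in a single place. The sketch below is only an illustration: the helper name `build_metadata` and its `is_retry` parameter are hypothetical, while the `sessionId`, `mode`, and `retry` keys are the ones described above.

```python
def build_metadata(session_id, is_retry=False, mode="pearl-ai-verified"):
    """Build request metadata, adding the retry flag only when resending.

    `is_retry` should be True only when the exact same message is being
    resent after a 422 (Unprocessable Entity) response.
    """
    metadata = {"sessionId": session_id, "mode": mode}
    if is_retry:
        # The flag is sent as the string "true", not a boolean
        metadata["retry"] = "true"
    return metadata


# First attempt: no retry flag. Resend after a 422: flag included.
first_attempt = build_metadata("example-session")
resend = build_metadata("example-session", is_retry=True)
print(first_attempt)
print(resend)
```

Routing every request through one helper like this makes it harder to send the flag with a new message by accident.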