Automatic retries

When you request verification of an AI-generated answer or ask to connect with a human Expert, the response may take some time. How long it takes depends on factors such as the complexity of the answer and Expert availability. Because the answer may not be ready when you first request it, your client should implement automatic retries.

One effective way to implement automatic retries is exponential backoff, which avoids placing unnecessary load on the API while making your client more resilient.

Why use exponential backoff?

Exponential backoff is a retry strategy that increases the wait time after each failed request, typically by doubling it. It has several advantages:

Prevents crashes and data loss by handling temporary failures gracefully.
Optimizes retry timing by starting with short waits and increasing them if failures persist.
Reduces API congestion by adding random jitter, so that many clients do not retry at the same moment.
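
To make the timing concrete, here is a minimal sketch of how a delay schedule with doubling and jitter can be computed. The function name backoff_delay and the max_delay cap are assumptions for illustration and are not part of the Pearl API or the OpenAI library. The cap is optional: without one, a 1-second base delay doubled over 10 retries gives waits of 1, 2, 4, ... up to 512 seconds.

```python
import random


def backoff_delay(attempt, base_delay=1.0, max_delay=60.0, max_jitter=0.5):
    """Return the wait in seconds before retry number `attempt` (0-based).

    Hypothetical helper: the name, the defaults, and the max_delay cap
    are illustrative assumptions.
    """
    # Exponential growth: base_delay, 2 * base_delay, 4 * base_delay, ...
    exponential = base_delay * (2 ** attempt)
    # Cap the wait so late retries do not sleep for many minutes.
    capped = min(exponential, max_delay)
    # Random jitter spreads out clients that failed at the same moment.
    return capped + random.uniform(0, max_jitter)


# Print the delay schedule for the first six retries.
for attempt in range(6):
    print(f"retry {attempt + 1}: wait {backoff_delay(attempt):.2f}s")
```

Because of the jitter, the printed values differ slightly on each run, but each falls within half a second above 1, 2, 4, 8, 16, and 32 seconds.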

Example: implementing retries in Python

The following code snippet demonstrates how to implement exponential backoff retries using the OpenAI Python library:

import os
import openai
import uuid
import time
import random
...

# Initialize the OpenAI client
client = openai.OpenAI(
    api_key=api_key,
    base_url="https://api.pearl.com/api/v1/"
)
...

# example of implementing exponential backoff retries
def get_completion_with_retry(messages, session_id, max_retries=10):
    """Get a completion from the Pearl API with exponential backoff retries."""
    retry_count = 0
    base_delay = 1  # Start with 1 second delay
    
    while retry_count <= max_retries:
        try:
            metadata = {
                "sessionId": session_id,
                "mode":"pearl-ai-verified"
            }
            
            response = client.chat.completions.create(
                model="pearl-ai", 
                messages=messages,
                metadata=metadata
            )
            return response.choices[0].message.content
            
        except openai.UnprocessableEntityError:
            # HTTP 422: the answer is not ready yet, so wait and retry
            if retry_count >= max_retries:
                # Out of retries: re-raise the last 422 error to the caller
                raise
            # Calculate delay with exponential backoff and jitter
            delay = base_delay * (2 ** retry_count) + random.uniform(0, 0.5)
            print(f"Completion still processing. Retrying in {delay:.2f} seconds...")
            time.sleep(delay)
            retry_count += 1
        # Any other exception propagates to the caller unchanged
...

Whenever an answer is not yet available, the API will return a 422 Unprocessable Entity status code. In this case, you should retry the same request until a valid response is returned.
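
If you call the API without the OpenAI library, the same rule can be applied directly to the status code. The following is a rough sketch under stated assumptions: send_request is a hypothetical zero-argument callable that resends the identical request and returns a (status_code, body) tuple, and the retry limit and delay cap are illustrative defaults, not values documented by the API.

```python
import random
import time


def retry_while_processing(send_request, max_retries=5, base_delay=1.0,
                           max_delay=30.0, sleep=time.sleep):
    """Call send_request() until it stops returning HTTP 422.

    send_request is a hypothetical callable that resends the same request
    and returns (status_code, body). The sleep parameter exists only so the
    waiting can be replaced in tests.
    """
    for attempt in range(max_retries + 1):
        status_code, body = send_request()
        if status_code != 422:
            # Any status other than 422 (success or a different error)
            # ends the retry loop and is returned to the caller.
            return status_code, body
        if attempt < max_retries:
            # Exponential backoff with a cap, plus a small random jitter.
            delay = min(base_delay * (2 ** attempt), max_delay)
            sleep(delay + random.uniform(0, 0.5))
    raise TimeoutError(f"Answer still not available after {max_retries} retries")
```

Passing the request as a callable keeps the retry logic independent of the HTTP client you use, and makes it clear that every attempt sends the same request rather than a new message.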

The system automatically recognizes retry requests and distinguishes them from standard messages, ensuring that duplicate responses are not generated.