Getting ChatGPT to Write Accurate Retry Logic Without Infinite Loop Traps

June 30, 2026 10 min read

You ask ChatGPT to write retry logic for an API call, and it gives you something that looks reasonable at a glance. Then it hits production and hammers a failing service with thousands of requests per minute, or worse, loops forever waiting for a condition that never becomes true. The output is syntactically fine — the bugs are in the design.

The good news is that retry logic has well-understood rules. Once you learn how to encode those rules into your prompt, ChatGPT produces code you can actually ship.

What You'll Learn

  • Why ChatGPT's default retry patterns create infinite loops and thundering herd problems
  • The exact prompt structure that gets you safe exponential backoff with jitter
  • How to enforce maximum attempt caps and per-operation timeouts
  • How to tell ChatGPT which errors should not be retried
  • Common gotchas to audit before you commit any AI-generated retry code

Prerequisites

This guide uses Python for all examples, but the prompting principles apply to any language. You should be comfortable reading async/await code and have a basic understanding of HTTP status codes. The hand-rolled examples rely on the standard asyncio, logging, random, and time modules, with aiohttp as the HTTP client; the guide also shows how to prompt for tenacity-based versions.

How ChatGPT Typically Generates Retry Logic (and Where It Goes Wrong)

When you give ChatGPT a vague prompt like "write a retry wrapper for my API call", it almost always produces one of two patterns: a bare while True loop with a break on success, or a for loop with a fixed sleep. Both patterns carry silent traps.

The while True version creates a genuine infinite loop the moment the break condition is never met — for example, when the remote service returns a non-exception error code that the code doesn't recognize. The for loop version is safer on attempt count, but the fixed sleep causes a thundering herd: if ten clients all back off for exactly two seconds, they all retry at exactly the same moment.
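To make the thundering herd concrete, here is a small simulation sketch. The client count, the two-second delay, the seed, and the function names are illustrative assumptions rather than anything measured; the sketch only compares when ten clients would retry under a fixed sleep versus a randomized one.

```python
import random


def retry_times_fixed(num_clients: int, delay: float) -> list[float]:
    """Every client sleeps the same fixed delay, so all of them retry together."""
    return [delay for _ in range(num_clients)]


def retry_times_full_jitter(num_clients: int, cap: float, rng: random.Random) -> list[float]:
    """Each client sleeps a random duration drawn from [0, cap]."""
    return [rng.uniform(0, cap) for _ in range(num_clients)]


rng = random.Random(42)  # seeded only so the sketch is repeatable
fixed = retry_times_fixed(10, 2.0)
jittered = retry_times_full_jitter(10, 2.0, rng)

print("distinct retry instants, fixed sleep:", len(set(fixed)))
print("distinct retry instants, full jitter:", len(set(jittered)))
```

With the fixed sleep there is a single retry instant shared by all ten clients; with the randomized sleep the retries are spread across the two-second window.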

A third common issue is that ChatGPT often catches Exception broadly without distinguishing between transient errors (worth retrying) and permanent ones like 400 Bad Request or 401 Unauthorized (never worth retrying). Retrying a 401 just generates noise and delays the caller from receiving a useful error.

These aren't hallucination problems — ChatGPT knows what exponential backoff is. The issue is that without explicit constraints in your prompt, it takes the path of least tokens.

The Prompt Pattern That Gets You Safe Retry Logic

The key is to front-load your constraints so ChatGPT has no room to default to lazy patterns. Structure your prompt in four parts: context, requirements, non-requirements, and output format.

Here is a prompt template you can adapt directly:

You are writing production Python retry logic for an HTTP API client.

Requirements (all must be satisfied):
1. Exponential backoff: wait = base_delay * (2 ** attempt), starting at 1 second.
2. Full jitter: actual wait = random.uniform(0, wait) to prevent thundering herd.
3. Maximum attempts: exactly 5. After 5 failures, raise the last exception — do not loop forever.
4. Total timeout: if cumulative elapsed time exceeds 30 seconds, stop retrying and raise a TimeoutError.
5. Only retry on transient HTTP errors: 429, 500, 502, 503, 504. Do NOT retry on 4xx errors other than 429.
6. Log each retry attempt with the attempt number, status code, and wait duration.
7. The function must be async (asyncio).

Non-requirements:
- Do not use any third-party retry library.
- Do not catch bare Exception — be specific about what you catch.

Output: a single async Python function called `fetch_with_retry(url, session)` with type hints.
Include a brief docstring. No usage example needed.

Notice how each requirement eliminates one of the failure modes described above. Naming the specific HTTP codes to retry forces ChatGPT to write an explicit allowlist instead of a broad catch-all. Setting both an attempt cap and a wall-clock timeout covers two distinct failure scenarios: a fixed number of fast failures, and a smaller number of slow failures that drag on too long.

If you're building similar structured prompts for other infrastructure code, the approach in getting ChatGPT to write accurate Celery task configs without silent failures follows the same constraint-first pattern and is worth reading alongside this guide.

Exponential Backoff With Jitter: The Right Implementation

When you use the prompt above, ChatGPT should produce something close to this. Here's the target output to use as a reference when auditing what you receive:

import asyncio
import logging
import random
import time
from typing import Set

import aiohttp

logger = logging.getLogger(__name__)

RETRYABLE_STATUS_CODES: Set[int] = {429, 500, 502, 503, 504}
MAX_ATTEMPTS = 5
BASE_DELAY = 1.0  # seconds
MAX_TOTAL_SECONDS = 30.0


async def fetch_with_retry(url: str, session: aiohttp.ClientSession) -> aiohttp.ClientResponse:
    """Fetch a URL with exponential backoff and full jitter.

    Retries only on transient HTTP errors (429, 500, 502, 503, 504).
    Raises the last exception after MAX_ATTEMPTS or when the total
    elapsed time exceeds MAX_TOTAL_SECONDS.
    """
    start_time = time.monotonic()
    last_exception: Exception | None = None

    for attempt in range(MAX_ATTEMPTS):
        try:
            response = await session.get(url)

            if response.status not in RETRYABLE_STATUS_CODES:
                response.raise_for_status()  # raises on 4xx/5xx not in retry set
                return response

            # Retryable status: hand the connection back before the next attempt.
            response.release()
            last_exception = aiohttp.ClientResponseError(
                response.request_info,
                response.history,
                status=response.status,
            )

        except aiohttp.ClientResponseError as exc:
            if exc.status not in RETRYABLE_STATUS_CODES:
                raise  # non-retryable; propagate immediately
            last_exception = exc

        except aiohttp.ClientConnectionError as exc:
            last_exception = exc  # network-level errors are always retryable

        if attempt == MAX_ATTEMPTS - 1:
            break  # final attempt failed; skip the pointless sleep and raise below

        elapsed = time.monotonic() - start_time
        if elapsed >= MAX_TOTAL_SECONDS:
            raise TimeoutError(
                f"Retry budget of {MAX_TOTAL_SECONDS}s exhausted after {attempt + 1} attempts."
            ) from last_exception

        wait = random.uniform(0, BASE_DELAY * (2 ** attempt))
        logger.warning(
            "Attempt %d failed. Retrying in %.2fs. Elapsed: %.2fs.",
            attempt + 1, wait, elapsed,
        )
        await asyncio.sleep(wait)

    raise last_exception  # type: ignore[misc]

A few things to notice here. The jitter is full jitter, random.uniform(0, cap), rather than cap + random_fraction. Full jitter spreads retries across the whole backoff window instead of clustering them near the cap, which matters when many clients are hitting the same endpoint simultaneously. The elapsed-time check happens before sleeping, so the function gives up as soon as the budget is spent rather than sleeping first and discovering the overrun afterwards. Note that the check does not account for the sleep that follows, so the total can still overshoot the budget by up to one backoff interval.

Capping Retries: Max Attempts, Timeouts, and Deadlines

One of the trickiest things to get right when prompting ChatGPT is the interaction between attempt count and wall-clock time. The model frequently generates code that enforces one but not the other.

Consider the scenario: you allow 5 attempts with exponential backoff starting at 1 second. The four waits between attempts are drawn from 0–1s, 0–2s, 0–4s, and 0–8s, so in the worst case the sleeps alone add up to 15 seconds, before counting the time the requests themselves take. That might be fine for a batch job but unacceptable for a user-facing request with a 5-second SLA.
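The arithmetic behind that estimate can be checked with a few lines. This is a sketch under the assumptions stated in the text (5 attempts, 1-second base delay, one sleep between consecutive attempts); backoff_caps is a hypothetical helper name.

```python
def backoff_caps(max_attempts: int, base_delay: float) -> list[float]:
    """Upper bound of each sleep; there is one sleep between consecutive attempts."""
    return [base_delay * (2 ** attempt) for attempt in range(max_attempts - 1)]


caps = backoff_caps(max_attempts=5, base_delay=1.0)
worst_case = sum(caps)    # every jittered draw lands exactly on its cap
average = worst_case / 2  # uniform(0, cap) averages cap / 2

print(caps)        # [1.0, 2.0, 4.0, 8.0]
print(worst_case)  # 15.0
print(average)     # 7.5
```

Neither figure includes the time spent inside the requests themselves, which is why the wall-clock deadline is still needed.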

The fix is to include a deadline check after every failed attempt, before the next sleep. Your prompt should say this explicitly: "after each failed attempt, check whether total elapsed time exceeds N seconds and abort if so." Without that instruction, ChatGPT often adds the timeout at the outer level using asyncio.wait_for, which works but gives you less control over the error message and makes it harder to log the attempt count at the time of timeout.
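A minimal, library-free sketch of that combined check might look like the following. TransientError and retry_with_deadline are hypothetical names introduced for illustration, and the operation is any zero-argument coroutine function; treat it as one possible shape, not a drop-in replacement for the aiohttp version.

```python
import asyncio
import random
import time
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


async def retry_with_deadline(
    operation: Callable[[], Awaitable[T]],
    max_attempts: int = 5,
    deadline_seconds: float = 30.0,
    base_delay: float = 1.0,
) -> T:
    start = time.monotonic()
    last_error: Optional[TransientError] = None
    for attempt in range(max_attempts):
        try:
            return await operation()
        except TransientError as exc:
            last_error = exc
        if attempt == max_attempts - 1:
            break  # attempt cap reached; no point sleeping again
        elapsed = time.monotonic() - start
        if elapsed >= deadline_seconds:
            # Wall-clock budget spent: report the attempt count at the moment of timeout.
            raise TimeoutError(
                f"gave up after {attempt + 1} attempts and {elapsed:.2f}s"
            ) from last_error
        await asyncio.sleep(random.uniform(0, base_delay * (2 ** attempt)))
    assert last_error is not None  # only reachable after at least one failure
    raise last_error
```

Because the deadline is checked inside the loop, the error message can carry the attempt count, which an outer asyncio.wait_for cannot provide.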

If you are working with tenacity instead of hand-rolled logic, here is a prompt snippet that gets the combination right:

Use the `tenacity` library. Configure `retry` with:
- stop = stop_after_attempt(5) | stop_after_delay(30)
- wait = wait_exponential(multiplier=1, min=1, max=16) + wait_random(0, 1) for jitter
- retry = retry_if_exception(lambda e: isinstance(e, aiohttp.ClientConnectionError)
    or (isinstance(e, aiohttp.ClientResponseError) and e.status in {429,500,502,503,504}))
- before_sleep = before_sleep_log(logger, logging.WARNING)
- reraise = True

The stop_after_attempt(5) | stop_after_delay(30) combination is the key: tenacity will stop on whichever condition triggers first, giving you both caps simultaneously.

Handling Non-Retryable Errors Correctly

ChatGPT almost always needs an explicit nudge here. The default output either retries everything or retries nothing — both extremes are wrong.

The rule is straightforward: retry on transient infrastructure failures, never on client-caused errors. A 400 means your request is malformed; retrying it is pointless. A 401 means your credentials are missing or invalid; retrying it adds noise and can get you locked out faster. A 403 means you lack permission; retrying will not grant it. Only 429 (rate limited) and the transient 5xx codes in the allowlist (500, 502, 503, 504) are candidates for retry.

Tell ChatGPT this directly in your prompt. Use a concrete allowlist, not a blocklist:

Retryable HTTP status codes: 429, 500, 502, 503, 504 only.
For all other status codes, raise the error immediately without retrying.
For connection-level errors (timeouts, DNS failures), always retry up to the attempt cap.
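As a rough illustration of what the allowlist instruction should translate into, here is a small sketch. The should_retry name is a hypothetical helper; the status codes are the ones listed above.

```python
RETRYABLE_STATUS_CODES = frozenset({429, 500, 502, 503, 504})


def should_retry(status: int) -> bool:
    """Allowlist check: anything not explicitly listed is treated as permanent."""
    return status in RETRYABLE_STATUS_CODES


for status in (400, 401, 403, 404, 429, 500, 501, 503):
    print(status, "retry" if should_retry(status) else "raise immediately")
```

Note that 501 is not retried even though it is a 5xx code. That is the practical difference between an allowlist and a rule like "retry anything at or above 500".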

It's also worth specifying what happens to the Retry-After header on 429 responses. Some APIs send it; when they do, you should respect it instead of your computed backoff. Ask ChatGPT to check for this header and use its value when present:

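retry_after = response.headers.get("Retry-After")
# Retry-After can also be an HTTP-date; only the integer-seconds form is used here.
use_header = retry_after is not None and retry_after.strip().isdigit()
wait = float(retry_after) if use_header else random.uniform(0, BASE_DELAY * (2 ** attempt))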
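The header can carry either a number of seconds or an HTTP date, so a slightly fuller parser is worth asking for. The sketch below uses only the standard library; parse_retry_after and its now parameter are hypothetical names added for illustration and testability.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Return the wait in seconds, or None if the header is absent or unparseable."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():  # delta-seconds form, e.g. "120"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None  # unparseable: the caller falls back to computed backoff
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

When the function returns None, fall back to the jittered backoff. It is also sensible to clamp the returned value against your remaining retry budget, since the server controls how large it is.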

This is the kind of edge case that rarely appears in ChatGPT's unsolicited output but produces correct behavior the moment you ask for it. The pattern is similar to what you encounter when prompting for rate limiting middleware — there's a relevant treatment of header-based throttle signals in getting ChatGPT to write accurate rate limiting middleware without gaps.

Common Pitfalls to Watch For

Even with a good prompt, always audit AI-generated retry logic against this checklist before committing it.

Bare while True without a guaranteed exit

Search for while True in any output. If it appears, confirm that every code path inside the loop either returns, raises, or breaks. A missing exception branch or an unexpected status code that falls through without incrementing the attempt counter is enough to create an infinite loop in production.

Swallowed exceptions on the last attempt

A classic mistake: the loop catches an exception and stores it in a variable, but the final raise is nested inside the except block behind a condition that never fires on the last attempt, so the loop simply ends, the last failure is silently swallowed, and the function returns None. Make sure the final raise sits after the loop, at the same indentation level as the for statement, not nested inside an exception handler. You can see a similar pattern in error handling middleware discussed in getting ChatGPT to write accurate logging middleware without swallowing errors.

Missing jitter on the first attempt's backoff

When attempt starts at 0, 2 ** 0 = 1, so the cap for the first retry is just 1 second. With full jitter, that means the first retry could happen almost immediately (anywhere from 0 to 1 second). That's fine. But check that random.uniform is actually called — some ChatGPT outputs compute the cap correctly and then forget to apply randomness to it.

Re-raising the wrong exception type

If your function accumulates last_exception across multiple attempts but the final raise is raise RuntimeError("max retries exceeded"), the original cause is lost. Always use raise last_exception or raise RuntimeError("...") from last_exception so the original error and its traceback are preserved for debugging.
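A short sketch of the chained form, using a hypothetical RetriesExhausted wrapper and give_up helper, shows where the original error ends up:

```python
from typing import NoReturn


class RetriesExhausted(RuntimeError):
    """Hypothetical wrapper raised once every attempt has failed."""


def give_up(last_exception: Exception, attempts: int) -> NoReturn:
    # "from" stores the original error on __cause__, so it appears in the traceback.
    raise RetriesExhausted(f"failed after {attempts} attempts") from last_exception


try:
    give_up(ConnectionError("connection reset by peer"), attempts=5)
except RetriesExhausted as exc:
    print(type(exc.__cause__).__name__, "-", exc.__cause__)
```

The handler prints the type and message of the underlying ConnectionError, which is exactly the information a bare RuntimeError would have discarded.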

No timeout on the individual request

Retry logic and per-request timeouts are separate concerns. If your underlying HTTP call has no timeout, a single hung request can block a retry attempt indefinitely — making your attempt cap meaningless. Always set a per-request timeout in your session configuration and mention this requirement to ChatGPT explicitly.
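To illustrate the idea without depending on a particular HTTP client, the sketch below bounds a single attempt with asyncio.wait_for. The hung_request coroutine is a hypothetical stand-in for a call that never responds; with aiohttp you would more likely configure the client's own timeout setting (aiohttp.ClientTimeout) than wrap each call yourself.

```python
import asyncio


async def hung_request() -> str:
    """Stand-in for an HTTP call that never responds (hypothetical)."""
    await asyncio.sleep(3600)
    return "unreachable"


async def attempt_with_timeout(timeout_seconds: float) -> str:
    # wait_for cancels the inner call and raises TimeoutError when time runs out.
    return await asyncio.wait_for(hung_request(), timeout=timeout_seconds)


async def main() -> str:
    try:
        return await attempt_with_timeout(0.05)
    except asyncio.TimeoutError:
        return "attempt timed out; the retry loop can move on"


print(asyncio.run(main()))
```

Without the per-attempt bound, the first call would hold the retry loop for an hour and the attempt cap would never come into play.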

The same principle applies to other async patterns. If you are generating WebSocket handlers alongside retry logic, the guide on getting ChatGPT to write accurate WebSocket handlers without dropped messages covers per-connection timeout strategies that pair well with what's described here.

Wrapping Up: Next Steps

ChatGPT writes retry logic quickly, but the defaults skew toward simplicity rather than correctness. The patterns above give you a systematic way to close that gap.

  • Copy the prompt template from the "Prompt Pattern" section and adapt the status codes, attempt cap, and timeout values to match your specific service SLA.
  • Audit every output against the checklist in "Common Pitfalls" — especially the bare while True check and the last-attempt exception propagation.
  • Add a per-request timeout to your HTTP session separately from the retry timeout; they solve different failure modes and both are necessary.
  • Test with a mock server that returns 503s on every request and verify the function stops at exactly the attempt cap and raises the correct exception type.
  • Consider tenacity for production code where you need more configuration options; use the prompt snippet from the "Capping Retries" section to get the stop conditions right from the start.
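The mock-server check from the list above can be approximated without any network at all. The sketch below is one possible shape, assuming a simplified synchronous retry loop: ServiceUnavailable, call_with_retry, and the injected sleep parameter are all hypothetical names, and jitter is left out so the assertions stay deterministic.

```python
from typing import Callable, List


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for a 503 response."""


def call_with_retry(operation: Callable[[], str], max_attempts: int,
                    sleep: Callable[[float], None]) -> str:
    """Minimal synchronous retry loop; `sleep` is injected so tests run instantly."""
    last_error: Exception = RuntimeError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except ServiceUnavailable as exc:
            last_error = exc
        if attempt < max_attempts - 1:
            sleep(2 ** attempt)  # jitter omitted to keep the test deterministic
    raise last_error


def test_stops_at_attempt_cap() -> None:
    calls: List[int] = []
    sleeps: List[float] = []

    def always_503() -> str:
        calls.append(1)
        raise ServiceUnavailable("503")

    try:
        call_with_retry(always_503, max_attempts=5, sleep=sleeps.append)
    except ServiceUnavailable:
        pass
    else:
        raise AssertionError("expected ServiceUnavailable")

    assert len(calls) == 5         # exactly the cap, no more
    assert sleeps == [1, 2, 4, 8]  # no sleep after the final attempt


test_stops_at_attempt_cap()
```

The same two assertions, call count equal to the cap and one fewer sleep than attempts, are worth porting to whatever retry function ChatGPT actually generated for you.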

Frequently Asked Questions

How do I stop ChatGPT from generating infinite retry loops?

Include an explicit maximum attempt count and a wall-clock timeout in your prompt, and specify that the function must raise as soon as either limit is reached. Without these constraints, ChatGPT often defaults to open-ended while True loops that have no guaranteed exit path.

What HTTP status codes should retry logic actually retry on?

Retry on 429 (Too Many Requests) and the transient 5xx server errors: 500, 502, 503, and 504. Never retry on other 4xx client errors like 400, 401, or 403, because those errors indicate a problem with the request itself that retrying will not fix.

What is full jitter in exponential backoff and why does it matter?

Full jitter means selecting a random wait time between zero and the computed exponential cap — for example, random.uniform(0, base_delay * 2**attempt). It matters because without jitter, all clients that fail at the same time back off for the same duration and then retry simultaneously, creating a thundering herd that can overwhelm the recovering service.

Should I use tenacity or write retry logic by hand when prompting ChatGPT?

Either approach works, but tenacity gives you more reliable stop-condition composition out of the box, particularly the ability to stop on whichever of two conditions triggers first. For hand-rolled logic, you must explicitly check both the attempt count and elapsed time after every failure, which requires more careful prompting to get right.

How do I make ChatGPT respect the Retry-After header in its retry output?

Add a specific line to your prompt such as 'If the response includes a Retry-After header, use its value as the wait duration instead of the computed backoff.' ChatGPT rarely includes this behavior unprompted, but it generally produces workable header-parsing code once you ask for it. Check that the result handles both forms the header can take: a number of seconds and an HTTP date.
