AI Prompt Engineering

Getting ChatGPT to Write Accurate Pagination Logic Without Off-By-One Bugs

June 30, 2026 10 min read 3 views

You ask ChatGPT to write pagination logic and it looks right at first glance. Then in production, the last page shows one extra empty fetch, your total-page count is off by one, or page 0 and page 1 both return the same records. These are not random bugs — they are predictable failure modes that come from how language models generalize pagination patterns without locking down the boundary math.

The good news is that the bugs are consistent, which means you can eliminate them with more precise prompts. This guide shows you exactly how.

What You'll Learn

  • Why LLMs systematically produce off-by-one errors in pagination code
  • How to write prompts that force correct offset/limit and total-page math
  • How to handle edge cases like empty result sets and single-item pages
  • How to prompt for cursor-based pagination without boundary confusion
  • A checklist for reviewing any AI-generated pagination before it ships

Prerequisites

This guide assumes you are building or reviewing a backend API or data layer that returns paginated results. Examples use Python and SQL, but the prompting techniques apply regardless of language. You should understand what LIMIT and OFFSET do in SQL, and have a basic idea of how cursor-based pagination differs from offset-based.

Why ChatGPT Gets Pagination Wrong

Pagination math looks deceptively simple. There are only a handful of variables: total item count, page size, current page number, and offset. But every one of those variables has an ambiguity baked in: is your page index zero-based or one-based? Does "total pages" round up or truncate? What offset does page 1 produce?

ChatGPT is trained on a huge variety of codebases that make different choices about all of these questions. Without explicit constraints, the model averages across conventions and picks whatever felt most common in its training data. That average is rarely consistent with your specific codebase, and it almost never handles edge cases explicitly.

As explained in the analysis of how early tokens skew LLM outputs, the model's early assumptions about a problem heavily influence every token it writes afterward. If your prompt implies a 1-based page index early on, the rest of the generated function should follow — but only if you make that assumption explicit. Leave it ambiguous and you get whatever the model defaulted to.

Understanding the Two Most Common Off-By-One Patterns

Pattern 1: Wrong offset for the first page

When ChatGPT generates offset/limit pagination with a 1-based page number, it often writes the offset as page * page_size. For page 1 with a page size of 10, that gives offset 10 — skipping the first ten rows entirely. The correct formula for 1-based pagination is (page - 1) * page_size.

Conversely, for 0-based page indexes, the model sometimes writes (page - 1) * page_size, which produces a negative offset on page 0. Neither bug is hard to fix, but both are hard to catch in a code review if you don't specifically check the formula.

Pattern 2: Total-page count that drops the last page

ChatGPT frequently writes total page count as total_items / page_size using integer division, which truncates rather than rounds up. With 101 items and a page size of 10, you get 10 pages instead of 11 — and the last item is silently unreachable. The correct expression is ceil(total_items / page_size) or equivalently (total_items + page_size - 1) // page_size in integer arithmetic.

How to Prompt for Offset/Limit Pagination

The key is to spell out every assumption that has any ambiguity. Do not let the model infer indexing convention, rounding behavior, or edge-case handling. The following prompt template is what works well in practice.

Write a Python function called `paginate_query` that accepts:
- `query`: a SQLAlchemy Select object
- `page`: an integer, 1-based (page 1 is the first page, NOT page 0)
- `page_size`: a positive integer, default 20

Requirements:
1. The SQL OFFSET must be `(page - 1) * page_size`. Do NOT use `page * page_size`.
2. Return a dict with keys: `items` (list), `total_items` (int), `total_pages` (int), `current_page` (int), `page_size` (int).
3. `total_pages` must use ceiling division: `math.ceil(total_items / page_size)`.
4. If `total_items` is 0, return `total_pages` as 0, not 1.
5. If `page` exceeds `total_pages`, return an empty `items` list without raising an exception.
6. Do NOT use any ORM pagination helpers — compute offset and limit explicitly.

Notice the prompt does three things. First, it states the indexing convention explicitly and tells the model what NOT to write. Second, it lists the exact output shape. Third, it enumerates every edge case as a numbered constraint. When you give ChatGPT explicit numbered requirements, it tends to address each one in sequence rather than inferring behavior from context.

Here is the kind of output that prompt reliably produces:

import math
from sqlalchemy import select, func
from sqlalchemy.orm import Session

def paginate_query(query, page: int, page_size: int = 20, session: Session = None):
    offset = (page - 1) * page_size

    count_query = select(func.count()).select_from(query.subquery())
    total_items = session.execute(count_query).scalar_one()

    total_pages = math.ceil(total_items / page_size) if total_items > 0 else 0

    items = session.execute(
        query.offset(offset).limit(page_size)
    ).scalars().all()

    return {
        "items": items,
        "total_items": total_items,
        "total_pages": total_pages,
        "current_page": page,
        "page_size": page_size,
    }

This matches the spec you gave. Review it against each numbered constraint before accepting it.

Prompting for Total-Page Count Without Rounding Errors

Even when you ask ChatGPT to compute total pages correctly, it sometimes hedges by adding a comment like "you may want to use ceiling division here" without actually using it. Force the issue by being explicit about the formula.

Compute total_pages using: math.ceil(total_items / page_size)
Do NOT use integer division (//) for this calculation.
Special case: if total_items == 0, set total_pages = 0.

If you are working in a language without a built-in ceil, tell the model which equivalent to use. For JavaScript:

// Instruct ChatGPT to use this formula:
const totalPages = totalItems === 0 ? 0 : Math.ceil(totalItems / pageSize);
// Do NOT use Math.floor or integer truncation.

Stating the "do NOT" version of the instruction is not redundant. LLMs respond better to both the positive instruction and the explicit prohibition, particularly for numeric expressions where several equivalent-looking options exist.

Handling Edge Cases: Empty Sets, Single Pages, and Boundaries

Off-by-one bugs are worst at the edges. An empty result set, a result set that is exactly one page, and a request for a page beyond the last one are the three cases ChatGPT most often gets wrong without explicit guidance.

Add this block to any pagination prompt:

Edge cases to handle explicitly:
- total_items == 0: return total_pages=0, items=[], current_page=page
- total_items == page_size exactly: return total_pages=1, items=[all items], no second empty page
- page > total_pages: return items=[], do not raise an error
- page < 1: raise ValueError("page must be >= 1")

The "exactly one page" case trips up generated code more than you might expect. Without the explicit constraint, ChatGPT sometimes returns total_pages=2 for a 20-item result with page_size=20 because it uses (total_items + page_size) // page_size instead of math.ceil(total_items / page_size).

This mirrors the kind of silent edge-case failure described in the article on writing accurate retry logic without infinite loop traps — in both cases, the model produces code that looks correct for the happy path but breaks at boundaries.

Cursor-Based Pagination: A More Specific Prompt Pattern

Cursor-based pagination avoids offset drift on frequently updated datasets, but it introduces a different class of ambiguity. The cursor is typically an opaque value encoding the last-seen record's sort key. ChatGPT regularly conflates the cursor with a page number, encodes it in ways that break on special characters, or forgets to handle the "no next page" case.

Here is a prompt structure that eliminates those issues:

Write a Python function `cursor_paginate` for a SQLAlchemy model `Event` with columns `id` (int, primary key) and `created_at` (datetime), ordered by `created_at DESC, id DESC`.

Requirements:
1. Accept `cursor` (optional str, default None) and `page_size` (int, default 20).
2. The cursor encodes the `created_at` and `id` of the LAST item returned in the previous page.
3. Encode the cursor as a base64 JSON string: base64(json({"created_at": iso8601_str, "id": int}))
4. If `cursor` is None, return the first page (no WHERE clause on cursor fields).
5. If `cursor` is provided, filter WHERE (created_at, id) < (cursor_created_at, cursor_id) using a row-value comparison or equivalent.
6. Return: {"items": [...], "next_cursor": str | None}
7. `next_cursor` is None when the returned items count is less than page_size — do NOT fetch an extra item to detect the next page.

Requirement 7 is critical. A common ChatGPT shortcut is to fetch page_size + 1 rows and use the extra row's existence to determine whether a next page exists, then discard it from the response. That approach works but it is inconsistent — if you do not explicitly ask for it, ChatGPT might use it on some calls and not others, or forget to discard the extra row. Locking down the detection method prevents that inconsistency.

Common Pitfalls When Reviewing AI-Generated Pagination

Even with a precise prompt, always review generated pagination code against this checklist before committing it:

  • Offset formula: Is it (page - 1) * page_size for 1-based, or page * page_size for 0-based? Does it match your convention?
  • Total pages rounding: Is math.ceil or equivalent used? Not floor division?
  • Zero-item edge case: Does total_pages return 0 or 1 for an empty dataset? Pick one and be consistent across your API.
  • Over-page request: Does requesting page 999 on a 3-page dataset return an empty list or throw an exception? Make sure it matches your API contract.
  • Cursor decoding: Is the cursor decoded inside a try/except? A malformed cursor should return a 400, not a 500.
  • SQL injection: Is the page/offset value passed as a parameter, not string-interpolated into the query?

Similar discipline is useful when reviewing any AI-generated backend logic. The guide on writing accurate rate-limiting middleware shows the same pattern of edge-case enumeration applying to a different problem domain.

One more thing worth checking: if ChatGPT generated both a paginator function and an API endpoint that calls it, verify the page number is extracted from the request correctly. A common mistake is parsing page as a string and passing it directly into the offset formula, which makes string multiplication behave unexpectedly in some languages.

Testing the boundaries automatically

Ask ChatGPT to generate the tests alongside the implementation, and specify the exact cases in your prompt:

Also write pytest tests covering:
- page=1 with 0 items: expect total_pages=0, items=[]
- page=1 with exactly page_size items: expect total_pages=1, items has page_size elements
- page=1 with page_size + 1 items: expect total_pages=2
- page=2 with page_size + 1 items: expect items has 1 element
- page=999 with 5 items and page_size=10: expect items=[], no exception

When you provide the test cases alongside the implementation prompt, the model is more likely to write implementation code that actually passes them. There is a feedback loop between the spec and the code when both exist in the same context window. This is the same principle behind the approach in the guide on writing accurate Playwright test scripts — specifying expected outcomes up front reduces hallucinated behavior.

Wrapping Up

Pagination bugs from AI-generated code are predictable and preventable. Here are the concrete steps to take right now:

  1. State your indexing convention explicitly in every pagination prompt — 0-based or 1-based — and include a "do NOT" instruction for the wrong formula.
  2. Enumerate all edge cases as numbered requirements: empty sets, exact-page-size results, and out-of-bounds page requests.
  3. Force the ceiling division formula by name (math.ceil, Math.ceil) rather than describing the intent in words.
  4. For cursor-based pagination, specify the encoding format, the WHERE clause shape, and exactly how "no next page" is detected.
  5. Ask for boundary tests in the same prompt and verify the generated implementation passes each one before you ship it.

The underlying principle is that ChatGPT writes better code when you remove ambiguity, not when you ask more nicely. Every assumption you leave implicit is a decision the model makes for you — and it will not always make the same decision twice.

Frequently Asked Questions

Why does ChatGPT generate the wrong OFFSET in SQL pagination?

ChatGPT often uses `page * page_size` as the offset instead of `(page - 1) * page_size` for 1-based page numbers because it averages across codebases that mix 0-based and 1-based conventions. You fix it by explicitly stating the indexing convention and the exact formula in your prompt.

How do I get ChatGPT to correctly calculate the total number of pages?

Instruct the model to use `math.ceil(total_items / page_size)` by name, and explicitly say not to use floor or integer division. Without this, the model commonly uses integer truncation, which drops the final partial page from the count.

What edge cases should I always specify when prompting for pagination logic?

Always specify: what happens when total_items is 0, what happens when results fit exactly in one page, what to return when the requested page exceeds the total page count, and how to handle a page number below 1. Each of these is a case where generated code commonly produces incorrect results without explicit instructions.

Is cursor-based pagination harder to get right with ChatGPT than offset-based?

Yes, because cursor-based pagination has more encoding and filtering choices that are easy to underspecify. The model frequently conflates cursors with page numbers or forgets the null-cursor first-page case. Providing the exact cursor encoding format and the WHERE clause structure in your prompt prevents most of these mistakes.

Should I ask ChatGPT to write tests alongside the pagination implementation?

Yes, and you should specify the exact boundary cases in the same prompt — empty sets, exact-page-size results, and out-of-bounds page numbers. When the test cases exist in the same context as the implementation request, the model tends to write code that handles those cases rather than ignoring them.

📤 Share this article

Sign in to save

Comments (0)

No comments yet. Be the first!

Leave a Comment

Sign in to comment with your profile.

📬 Weekly Newsletter

Stay ahead of the curve

Get the best programming tutorials, data analytics tips, and tool reviews delivered to your inbox every week.

No spam. Unsubscribe anytime.