Your API returns the first page of results in 12ms. Page 10 takes 45ms. Page 100 takes 800ms. The query hasn't changed. The table hasn't grown. The only thing that changed is the offset. This is not a mystery once you understand what the database is actually doing, and it is entirely avoidable.
The Problem
Most APIs are built with offset pagination first because it maps naturally to how humans think about pages. "Give me items 1 through 10, then 11 through 20." It also maps naturally to the SQL you already know:
SELECT * FROM posts ORDER BY created_at DESC LIMIT 10 OFFSET 100;
The problem is what happens inside the database when you run this. The query planner cannot seek directly to row 100. It must scan the index from the beginning, count 100 rows, discard them, and then return the next 10. At offset 100 that cost is small. At offset 100,000 on a busy table, you are discarding 100,000 rows per request, every request, for every user sitting on a late page.
There is a second problem: drift. If a new row is inserted into the table between the time a client fetches page 1 and page 2, every subsequent page shifts by one. Items get duplicated or silently skipped. This is not theoretical. It happens on any live table with ongoing writes.
Prerequisites
- Familiarity with SQL: SELECT, WHERE, ORDER BY, indexes
- Basic understanding of how database indexes work (B-tree lookup vs sequential scan)
- Some exposure to building or consuming paginated REST APIs
The Three Approaches
Offset Pagination
SELECT id, title, created_at
FROM posts
ORDER BY created_at DESC
LIMIT 10 OFFSET $offset;
The client tracks a page number or offset integer and increments it on each request. Simple to implement, simple to reason about, and completely broken at scale.
Cost: O(offset + page_size). The database must touch every row before the offset to count past them.
Drift: any write between pages can shift results.
Random access: works. You can jump to page 47 with OFFSET 460.
Cursor Pagination
The client instead passes the position of the last item it saw, and the server fetches rows after that position:
-- First page (no cursor)
SELECT id, title, created_at
FROM posts
ORDER BY id DESC
LIMIT 10;
-- Subsequent pages (cursor = last seen id)
SELECT id, title, created_at
FROM posts
WHERE id < $last_seen_id
ORDER BY id DESC
LIMIT 10;
The WHERE id < $last_seen_id clause turns this into an index seek. The database goes directly to the position in the B-tree and reads forward. Cost is O(log N + page_size) regardless of how far into the dataset you are.
Cost: O(log N + page_size). Constant with respect to pagination depth.
Drift: none. The cursor encodes an absolute position, not a relative count, so concurrent inserts cannot duplicate or skip rows relative to what the client has already seen.
Random access: impossible. You cannot jump to page 47 without traversing pages 1 through 46 first.
Keyset Pagination
Keyset pagination is the generalization of cursor pagination for arbitrary sort orders. When your sort column is not unique (common in practice: created_at, score, price), you add a tiebreaker:
-- Sort by created_at DESC, id DESC (stable, unique composite key)
SELECT id, title, created_at
FROM posts
WHERE (created_at, id) < ($last_ts, $last_id)
ORDER BY created_at DESC, id DESC
LIMIT 10;
The tuple comparison (created_at, id) < ($last_ts, $last_id) is a row value comparison. PostgreSQL supports it and, given a composite index on (created_at, id), executes it as an index seek rather than a scan. Not every database optimizes row value comparisons the same way, so check your query plan.
Technical Decisions
Cursor column requirements
Not every column works as a cursor. The requirements are strict:
- Indexed: the column must be part of an index the database can seek on.
- Monotonic or stable for your sort order: the cursor encodes a position in the sort order, so the sort must be deterministic.
- Unique (or made unique via tiebreaker): if two rows have the same cursor value, the WHERE clause will skip all of them or return duplicates.
Auto-increment integer IDs satisfy all three naturally. UUIDs do not work unless they are time-ordered (UUIDv7, ULID). created_at timestamps are not unique, so you always need (created_at, id) as a composite cursor.
Opaque cursors
Exposing raw id or timestamp values as the cursor leaks schema internals to clients and creates fragile contracts. If you later switch from integer IDs to UUIDs, every client breaks.
The standard practice is to base64-encode the cursor value:
// Encode: serialize the cursor payload and base64 it
import (
	"encoding/base64"
	"encoding/json"
	"time"
)

type CursorPayload struct {
	CreatedAt time.Time `json:"created_at"`
	ID        int64     `json:"id"`
}

func EncodeCursor(p CursorPayload) string {
	b, _ := json.Marshal(p) // marshalling a plain struct cannot fail here
	return base64.StdEncoding.EncodeToString(b)
}

func DecodeCursor(s string) (CursorPayload, error) {
	b, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return CursorPayload{}, err
	}
	var p CursorPayload
	return p, json.Unmarshal(b, &p)
}
The API response includes the cursor for the next page:
{
"items": [...],
"next_cursor": "eyJjcmVhdGVkX2F0IjoiMjAyNi0wMy0xMVQxMjowMDowMFoiLCJpZCI6NDJ9"
}
The client passes ?cursor=eyJ... on the next request. You can change the internal encoding at any time without breaking the contract, as long as you version or gracefully handle old cursors.
Forward-only is a real constraint
Cursor pagination does not support backward navigation or random page access without significant additional complexity. If your product has "page N of M" UI with a page number input, cursor pagination forces one of the following:
- Drop the random-access feature
- Pre-paginate results and cache page cursors server-side
- Accept offset pagination's costs for this specific use case
Many consumer products (Twitter, Instagram, GitHub notifications) use cursor-based infinite scroll precisely because the UX does not require random page access.
Implementation
Setting up the index
Before writing any application code, make sure the index exists. A missing index turns a cursor seek into a full table scan:
-- For cursor on id only (simpler case)
CREATE INDEX IF NOT EXISTS posts_id_desc ON posts (id DESC);
-- For cursor on (created_at, id) composite
CREATE INDEX IF NOT EXISTS posts_created_id ON posts (created_at DESC, id DESC);
PostgreSQL's query planner will use these for the tuple comparison WHERE (created_at, id) < ($1, $2).
The query
-- No cursor (first page)
SELECT id, title, body, created_at
FROM posts
ORDER BY created_at DESC, id DESC
LIMIT $1;
-- With cursor
SELECT id, title, body, created_at
FROM posts
WHERE (created_at, id) < ($1, $2)
ORDER BY created_at DESC, id DESC
LIMIT $3;
In Go with database/sql:
func ListPosts(ctx context.Context, db *sql.DB, cursor *CursorPayload, limit int) ([]Post, *CursorPayload, error) {
	var (
		rows *sql.Rows
		err  error
	)
	if cursor == nil {
		rows, err = db.QueryContext(ctx, `
			SELECT id, title, body, created_at
			FROM posts
			ORDER BY created_at DESC, id DESC
			LIMIT $1
		`, limit+1) // fetch one extra to detect if there's a next page
	} else {
		rows, err = db.QueryContext(ctx, `
			SELECT id, title, body, created_at
			FROM posts
			WHERE (created_at, id) < ($1, $2)
			ORDER BY created_at DESC, id DESC
			LIMIT $3
		`, cursor.CreatedAt, cursor.ID, limit+1)
	}
	if err != nil {
		return nil, nil, err
	}
	defer rows.Close()

	var posts []Post
	for rows.Next() {
		var p Post
		if err := rows.Scan(&p.ID, &p.Title, &p.Body, &p.CreatedAt); err != nil {
			return nil, nil, err
		}
		posts = append(posts, p)
	}

	// If we got limit+1 results, there is a next page.
	var nextCursor *CursorPayload
	if len(posts) > limit {
		last := posts[limit-1] // last row we will actually return
		nextCursor = &CursorPayload{CreatedAt: last.CreatedAt, ID: last.ID}
		posts = posts[:limit] // trim the extra row
	}
	return posts, nextCursor, rows.Err()
}
The limit+1 trick avoids a separate COUNT(*) query to determine whether a next page exists. You fetch one more than you need: if you get it, there is a next page and the cursor points to the last item you actually return.
The HTTP handler
func (h *Handler) ListPostsHandler(w http.ResponseWriter, r *http.Request) {
	limit := 20

	var cursor *CursorPayload
	if raw := r.URL.Query().Get("cursor"); raw != "" {
		decoded, err := DecodeCursor(raw)
		if err != nil {
			http.Error(w, "invalid cursor", http.StatusBadRequest)
			return
		}
		cursor = &decoded
	}

	posts, nextCursor, err := ListPosts(r.Context(), h.db, cursor, limit)
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}

	resp := map[string]any{"items": posts}
	if nextCursor != nil {
		resp["next_cursor"] = EncodeCursor(*nextCursor)
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(resp)
}
How It All Fits Together
A client fetches the first page with no cursor. The server returns items plus next_cursor. The client stores the cursor and passes it as ?cursor=... on the next request. The server decodes the cursor, uses it in a WHERE (created_at, id) < (...) index seek, returns the next page plus a new cursor. This continues until next_cursor is absent from the response, signalling the last page.
Client                          Server                                 DB
  |                               |                                     |
  |-- GET /posts ---------------->|                                     |
  |                               |-- SELECT ... LIMIT 21 ------------->|
  |                               |<-- 21 rows -------------------------|
  |<-- {items, next_cursor} ------|                                     |
  |                               |                                     |
  |-- GET /posts?cursor=eyJ... -->|                                     |
  |                               |-- SELECT ... WHERE (...) < (...) -->|
  |                               |<-- 21 rows -------------------------|
  |<-- {items, next_cursor} ------|                                     |
Every request is an index seek at the same cost, regardless of which page you are on.
Lessons Learned
Offset pagination is fine for small, stable datasets. If your table has fewer than 10,000 rows and write volume is low, the offset cost is negligible and the simplicity is worth it. Optimise when you have a measured problem, not before.
Composite cursors are the rule, not the exception. Pure id-based cursors only work when sorting by ID. The moment a client wants to sort by created_at, score, or any non-unique column, you need a composite cursor. Build the infrastructure for it once and all sort orders become easy.
The limit+1 trick is underused. Many implementations do a separate SELECT COUNT(*) to determine if a next page exists. That count query is expensive on large tables and becomes a bottleneck as the table grows. Fetching one extra row is always cheaper.
Backwards pagination is genuinely hard. If you need "previous page", you either need to store the cursor history client-side (feasible) or add a second query that reverses the sort direction. Neither is terrible, but neither is as clean as forward-only. Design your UX around this constraint early.
Do not sort by RANDOM() with cursor pagination. Cursor pagination requires a stable, deterministic sort order. Randomised feeds need a different approach entirely (pre-generated feed tables, snapshot isolation, or accepting that cursor pagination does not apply).
What's Next
If your dataset is large enough that even keyset pagination struggles (extremely high-cardinality columns, cross-shard queries), the next step is usually pre-materialized feed tables or seek-based pagination with snapshot reads. These are common patterns in high-scale social feeds but add significant infrastructure complexity.
For most APIs, keyset pagination on a composite index is the right answer, and the scale at which it stops being enough is very high.