Cloud & DevOps Backend Infrastructure

Debugging DigitalOcean Managed Postgres Connection Pool Exhaustion Under Load

June 23, 2026 11 min read 1 views

Your app is humming along fine at normal traffic, then a load spike hits and Postgres starts rejecting connections with FATAL: sorry, too many clients already. You bump the pool size, redeploy, and it happens again two hours later. The problem isn't just the number you set — it's that the whole connection lifecycle is leaking.

This guide walks you through diagnosing DigitalOcean Managed Postgres connection pool exhaustion systematically: reading the right metrics, finding the actual source of leak, and tuning both PgBouncer and your application so the fix holds under real load.

What you'll learn

  • How DigitalOcean's connection architecture (PgBouncer + Postgres) actually works and where limits bite
  • Which pg_stat_activity and PgBouncer queries to run during an incident
  • How to spot connection leaks in application code without guessing
  • The right PgBouncer pool mode, size, and timeout settings for high-concurrency workloads
  • Application-side patterns that prevent exhaustion from recurring

Prerequisites

  • A DigitalOcean Managed Postgres cluster (any plan) with at least one connection pool configured in the Control Panel
  • Access to the defaultdb or your target database via psql
  • Basic familiarity with connection pooling concepts and your application's database driver

How DigitalOcean Managed Postgres handles connections

DigitalOcean puts PgBouncer in front of every Managed Postgres cluster. When you connect on port 25060, you're talking to PgBouncer, not Postgres directly. PgBouncer multiplexes those client connections onto a smaller set of real server connections to Postgres on port 5432.

This means there are two separate limits you can hit:

  • PgBouncer pool size — the maximum client connections PgBouncer will accept for a named pool. Configured per pool in the Control Panel or via API.
  • Postgres max_connections — the hard limit on server-side connections Postgres will accept. On Managed Postgres this is determined by plan size and is not directly editable.

When your app's connection count exceeds the PgBouncer pool size, new connections queue up or get rejected immediately, depending on the pool mode and max_client_conn. When PgBouncer tries to open more server connections than max_connections allows, Postgres itself rejects them. Both scenarios look similar from your app's perspective but require different fixes.

PgBouncer pool mode matters more than pool size. Session mode holds a server connection for the entire client session. Transaction mode releases it after every transaction. For most web workloads, transaction mode multiplies your effective capacity by an order of magnitude.

Reading the right metrics before you change anything

Changing pool size without understanding what's consuming connections is guesswork. Run these queries first — either during an incident or by reproducing load in staging.

Check active vs. idle connections in Postgres

Connect directly to the database (using the postgres user credentials on port 5432, not through PgBouncer) and run:

SELECT
  state,
  wait_event_type,
  wait_event,
  COUNT(*) AS count
FROM pg_stat_activity
WHERE datname = 'your_database_name'
GROUP BY state, wait_event_type, wait_event
ORDER BY count DESC;

You're looking at the state column. A healthy cluster under load shows mostly active connections doing real work. If you see a large number of idle or idle in transaction rows, you have a leak. idle in transaction is especially dangerous — those connections hold locks and consume a server slot indefinitely until the client closes or times out.

Check how close you are to the hard limit

SELECT
  COUNT(*) AS total_connections,
  (SELECT setting::int FROM pg_settings WHERE name = 'max_connections') AS max_connections,
  ROUND(
    100.0 * COUNT(*) /
    (SELECT setting::int FROM pg_settings WHERE name = 'max_connections'),
    1
  ) AS pct_used
FROM pg_stat_activity;

If you're above 80% of max_connections, you're close enough to the edge that any burst will tip you over. Note the number — it tells you whether the problem is on the Postgres side or the PgBouncer side.

Inspect PgBouncer stats directly

Connect to your PgBouncer admin interface. DigitalOcean exposes this on the same host and port as the pool, using the pgbouncer virtual database:

psql "postgresql://doadmin:YOUR_PASSWORD@your-cluster.db.ondigitalocean.com:25060/pgbouncer?sslmode=require"

Then run:

SHOW POOLS;
SHOW CLIENTS;
SHOW SERVERS;

SHOW POOLS gives you cl_active, cl_waiting, sv_active, and sv_idle per pool. If cl_waiting is non-zero, clients are queuing — your pool is saturated. If sv_idle is high while cl_waiting is also high, you're likely in session mode and PgBouncer can't reuse idle server connections for waiting clients.

Finding where connections are leaking

Idle and idle-in-transaction connections don't disappear on their own unless something closes them. The most common causes are application-level leaks: connections checked out of a pool and never returned, long-running transactions left open, or ORMs misconfigured to open more connections than needed.

Find long-running idle-in-transaction sessions

SELECT
  pid,
  usename,
  application_name,
  state,
  now() - state_change AS idle_duration,
  LEFT(query, 100) AS last_query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - state_change > INTERVAL '30 seconds'
ORDER BY idle_duration DESC;

The application_name column often tells you which service or dyno opened the connection. If you see hundreds of rows from a single application, that application is not closing transactions correctly.

Check application_name diversity

SELECT application_name, COUNT(*) AS connections
FROM pg_stat_activity
WHERE datname = 'your_database_name'
GROUP BY application_name
ORDER BY connections DESC;

If one service accounts for an outsized share, dig into its connection pool configuration and transaction handling first. This is the fastest way to narrow the blast radius without instrumenting every service.

Terminate stuck connections when you need breathing room

During an active incident, you sometimes need to clear idle-in-transaction connections to restore capacity while you fix the root cause:

SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - state_change > INTERVAL '5 minutes';

This is a temporary measure, not a fix. The connections will accumulate again if the application keeps leaking them.

Tuning PgBouncer pool settings correctly

DigitalOcean lets you configure pool size and pool mode per named pool from the Control Panel under Databases → your cluster → Connection Pools. You can also manage pools via the DigitalOcean API if you want to automate it.

Switch to transaction mode

If your pools are in session mode, switch them to transaction mode. In session mode, a client holds a server connection for the entire time it's connected — even while idle between queries. In transaction mode, PgBouncer borrows a server connection only for the duration of each transaction, then returns it to the pool immediately.

Transaction mode works for most standard web application workloads. The main exception is code that uses session-level features like advisory locks, SET LOCAL configuration, or temporary tables that persist across transactions. Audit your codebase for those patterns before switching.

Set pool size based on actual Postgres capacity

A common mistake is setting the PgBouncer pool size to match the number of application threads or processes. That defeats the purpose of pooling. Instead, size it based on what Postgres can actually handle.

A reasonable starting formula for transaction mode:

  • Find your Postgres max_connections (from the query above)
  • Reserve a few connections (typically 3–5) for admin access and monitoring
  • Distribute the remainder across your pools proportionally

For example: if max_connections is 97 and you have one primary pool, set the pool size to around 90 and let PgBouncer queue the rest. With transaction mode, those 90 server connections can serve hundreds of simultaneous application connections.

Configure idle timeouts

PgBouncer has two timeout settings that prevent idle connections from wasting server slots. In DigitalOcean's managed configuration, you won't edit a pgbouncer.ini file directly, but you should understand what the defaults are doing. The key parameters are server_idle_timeout (how long an idle server connection is kept open) and client_idle_timeout (how long an idle client connection is kept before disconnection).

If your application doesn't implement connection keep-alives, set client_idle_timeout low enough that stale client connections get cleaned up. A value of 60 seconds is a sensible default for most web apps that have their own pool-level idle timeout.

Fixing the application side

PgBouncer tuning buys you headroom, but the real fix is making sure your application acquires and releases connections correctly. Most frameworks and ORMs have a configurable connection pool at the application layer that sits in front of PgBouncer. Getting that right is just as important as the PgBouncer settings.

Use a bounded application-level pool

Don't let your application open an unbounded number of connections to PgBouncer. Libraries like psycopg2 with psycopg2-pool, Node's pg with a pool config, or Django's CONN_MAX_AGE all let you cap how many connections a single process will open.

In Python with SQLAlchemy, a minimal production-safe pool config looks like this:

from sqlalchemy import create_engine

engine = create_engine(
    "postgresql+psycopg2://user:pass@host:25060/dbname",
    pool_size=5,          # connections kept open per process
    max_overflow=10,      # extra connections allowed under burst
    pool_timeout=30,      # seconds to wait before raising
    pool_recycle=1800,    # recycle connections every 30 min
    pool_pre_ping=True,   # check connection health before use
)

With 10 application processes each capped at 15 connections (5 + 10 overflow), you send at most 150 simultaneous connections to PgBouncer — a number you can reason about and match against your pool size.

Always close connections and commit or rollback transactions

The most common source of idle in transaction connections is code that opens a transaction, hits an exception, and never calls rollback(). Use context managers or try/finally blocks so cleanup is guaranteed:

from sqlalchemy.orm import Session

with Session(engine) as session:
    try:
        result = session.execute(some_query)
        session.commit()
    except Exception:
        session.rollback()
        raise

In Django, ATOMIC_REQUESTS = True wraps each HTTP request in a transaction that commits on success and rolls back on any unhandled exception — a safe default for most applications. Just be aware that long-running views hold a transaction (and a server connection in session mode) for the entire request duration.

Don't hold connections across slow external calls

A pattern that silently drains pools: opening a database transaction, calling an external HTTP API mid-transaction, then committing. If that HTTP call takes three seconds, the connection is occupied for three seconds — multiply that by concurrent requests and you exhaust the pool quickly. Move any non-database work outside the transaction boundary.

This kind of invisible performance problem is similar to how ElastiCache Redis evictions silently degrade app performance — the symptom shows up at the application layer long after the root cause should have been obvious in the infrastructure metrics.

Common pitfalls that keep you stuck

Bumping pool size without checking pool mode

Increasing the PgBouncer pool size in session mode just lets more clients hold idle server connections simultaneously. You end up hitting the Postgres max_connections ceiling faster. Always check pool mode first.

Miscounting connections across replicas and services

If you have read replicas, each replica has its own max_connections limit. Make sure your read traffic isn't silently falling back to the primary when a replica is under load — that doubles the primary's connection count without any warning. DigitalOcean surfaces replica metrics in the same Insights tab as the primary, so check both.

Forgetting that connection count scales with process count

Horizontal scaling helps throughput but multiplies connection count. Going from 5 to 20 application containers quadruples the connections arriving at PgBouncer. If your pool size doesn't account for this, auto-scaling becomes self-defeating — the more instances you spin up, the faster you hit the connection ceiling. This is the same category of problem as flapping ALB health checks that kill healthy ECS tasks: scaling events triggering cascading failures.

Ignoring pg_stat_statements for long-running queries

A query that runs for 30 seconds holds a connection (and possibly locks) for 30 seconds. Even with a well-tuned pool, a handful of slow queries under load can exhaust your server connections. Enable pg_stat_statements and look for queries with high mean_exec_time or total_exec_time — those are often the real culprits behind connection exhaustion events.

SELECT
  LEFT(query, 80) AS query,
  calls,
  ROUND(mean_exec_time::numeric, 2) AS mean_ms,
  ROUND(total_exec_time::numeric, 2) AS total_ms
FROM pg_stat_statements
WHERE dbid = (SELECT oid FROM pg_database WHERE datname = 'your_database_name')
ORDER BY total_exec_time DESC
LIMIT 15;

Not setting statement_timeout as a backstop

Set a statement_timeout at the database user level or per pool to cap runaway queries. Even 30 seconds is a reasonable ceiling for most OLTP workloads. A query that would have held a connection for five minutes instead fails fast, freeing the connection for the next request.

ALTER ROLE your_app_user SET statement_timeout = '30s';

Wrapping up: next steps

Connection pool exhaustion on DigitalOcean Managed Postgres almost always has a layered cause: a pool mode that doesn't match your workload, an application that leaks transactions, and slow queries that hold connections longer than expected. Fix all three and the problem stays fixed.

Here are the concrete actions to take right now:

  1. Run the pg_stat_activity queries above to get a snapshot of your current connection state. Identify idle-in-transaction connections and which application_name owns them.
  2. Switch affected pools to transaction mode if they're in session mode, and verify your application doesn't use session-level features that break under transaction mode.
  3. Cap your application-level pool size per process and confirm that pool_size × process_count stays well under your PgBouncer pool limit.
  4. Add statement_timeout to your app database user as a backstop against runaway queries consuming connections.
  5. Set up an alert on connection count in DigitalOcean Insights or an external monitoring tool so you know when you're approaching 80% utilization — before the next incident, not during it.

If you're also dealing with slow query performance rather than pure connection exhaustion, the same pg_stat_statements data points you toward index gaps and sequential scans that are worth fixing in their own right. Connection stability and query performance are two sides of the same coin under load.

Frequently Asked Questions

How do I find out what my DigitalOcean Managed Postgres max_connections limit is?

Connect to your database on port 5432 using the admin credentials and run: SELECT setting FROM pg_settings WHERE name = 'max_connections'; The limit is set by DigitalOcean based on your plan's memory and cannot be changed directly, but you can work around it by using PgBouncer in transaction mode to multiplex connections.

What is the difference between PgBouncer session mode and transaction mode for connection pooling?

In session mode, a client holds a dedicated server connection for its entire connected lifetime, even when idle. In transaction mode, PgBouncer borrows a server connection only for the duration of each transaction and returns it immediately after — allowing far more concurrent clients to share a smaller number of real Postgres connections.

Why do I keep seeing 'idle in transaction' connections piling up on my Postgres cluster?

Idle-in-transaction connections appear when application code opens a database transaction but fails to commit or rollback before going idle — typically because an exception was thrown and not properly handled. Ensure every transaction is wrapped in a try/finally or context manager block that guarantees rollback on error.

Will switching PgBouncer to transaction mode break my existing application?

Transaction mode breaks applications that rely on session-level Postgres features such as advisory locks, temporary tables that persist across transactions, or SET LOCAL configuration changes. Audit your codebase for these patterns before switching; if you use them, you can keep those specific connections in session mode while moving the rest to transaction mode.

How many connections should I set the PgBouncer pool size to on DigitalOcean?

Start by finding your Postgres max_connections value, subtract 3–5 for admin and monitoring use, and use the remainder as your pool ceiling across all pools. In transaction mode, that pool can serve far more simultaneous application connections than the raw number suggests, so resist the urge to set it equal to your total thread or process count.

📤 Share this article

Sign in to save

Comments (0)

No comments yet. Be the first!

Leave a Comment

Sign in to comment with your profile.

📬 Weekly Newsletter

Stay ahead of the curve

Get the best programming tutorials, data analytics tips, and tool reviews delivered to your inbox every week.

No spam. Unsubscribe anytime.