Skip to main content

Database Connection Errors

Database connection errors are among the most disruptive in production. They can cascade — a brief DB overload can exhaust the connection pool, causing all requests to fail. This lesson covers root-cause diagnosis for the most common DB errors.

Learning Focus

By the end of this lesson you can: diagnose connection refused, pool exhaustion, SSL, and Alembic migration errors — and implement preventative measures for each.

OperationalError: connection refused

Full error: asyncpg.exceptions.ConnectionRefusedError: Connection refused (os error 111)

Root causes:

  • PostgreSQL is not running
  • Wrong host or port in DATABASE_URL
  • Firewall blocking port 5432

Diagnose:

# Check if Postgres is running
systemctl status postgresql

# Test connectivity directly
psql -h localhost -U postgres -c "SELECT 1;"

# From inside Docker
docker compose exec app bash
psql "${DATABASE_URL}" -c "SELECT 1;"

Fix:

# Start PostgreSQL
systemctl start postgresql

# Verify DATABASE_URL
python -c "from app.core.config import settings; print(settings.DATABASE_URL)"

OperationalError: too many connections

Full error: asyncpg.exceptions.TooManyConnectionsError: sorry, too many clients already

Root cause: pool_size × workers exceeds PostgreSQL max_connections (default: 100).

Diagnose:

-- Check active connections
SELECT count(*) FROM pg_stat_activity WHERE datname = 'mydb';

-- See connection limit
SHOW max_connections;

Fix:

app/db/session.py
# Reduce pool_size (or increase PostgreSQL max_connections)
engine = create_async_engine(
settings.DATABASE_URL,
pool_size=5, # Reduced from 20
max_overflow=5,
pool_timeout=30,
)

Or increase PostgreSQL limits:

# postgresql.conf
max_connections = 200

Pool Exhaustion — All Requests Hang

Symptom: All requests hang, no error, eventually timeout.

Cause: All connections are checked out (likely a long-running query or a session not closed).

Diagnose:

-- Find long-running queries
SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY duration DESC;

Fix:

  • Ensure get_db always closes sessions (use async with)
  • Set pool_timeout=30 to raise instead of hanging
  • Set statement_timeout in PostgreSQL for runaway queries

SSL Certificate Errors

Full error: ssl.SSLCertVerificationError: certificate verify failed

Fix:

app/db/session.py
engine = create_async_engine(
settings.DATABASE_URL + "?ssl=require", # For managed DB services
# Or disable verification (dev only):
connect_args={"ssl": False},
)

Alembic Migration Errors

Can't locate revision:

# Reset migration state (dangerous — only on fresh DB)
alembic stamp head

# Or manually set to specific revision
alembic stamp <revision_id>

Table already exists:

# Migration already applied but not tracked
alembic stamp head # Mark as applied without running migrations

Column X does not exist after deploy:

# Migration not applied — run it
alembic upgrade head

# Check current state
alembic current

Connection Pool Pre-Ping

Enable pre-ping to detect stale connections:

app/db/session.py
engine = create_async_engine(
settings.DATABASE_URL,
pool_pre_ping=True, # Test connection before returning from pool
pool_recycle=1800, # Recycle connections every 30 minutes
)

Common Pitfalls Summary

ErrorRoot CauseFix
Connection refusedDB not running or wrong hostCheck service status and DATABASE_URL
Too many connectionsPool too large for DB max_connectionsReduce pool_size or increase max_connections
All requests hangPool exhausted (session not closed)Check for unclosed sessions; set pool_timeout
SSL cert errorManaged DB requires SSLAdd ?ssl=require to connection string
Migration not foundMissing import in env.pyImport all model modules in alembic/env.py
Stale connectionsLong-lived connections dyingEnable pool_pre_ping=True and pool_recycle

Hands-On Diagnostic Checklist

db-diagnostic.sh
# 1) Can the app reach the database?
python -c "
import asyncio
from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy import text
from app.core.config import settings

async def check():
engine = create_async_engine(settings.DATABASE_URL)
async with engine.connect() as conn:
result = await conn.execute(text('SELECT version()'))
print('DB version:', result.scalar())

asyncio.run(check())
"

# 2) How many active connections?
psql ${DATABASE_URL} -c "SELECT count(*) FROM pg_stat_activity;"

# 3) Are migrations up to date?
alembic current
alembic check # Fails if unapplied migrations exist

# 4) Test pool exhaustion resilience
hey -n 500 -c 100 http://localhost:8000/health

What's Next