Database Connection Errors
Database connection errors are among the most disruptive in production. They can cascade — a brief DB overload can exhaust the connection pool, causing all requests to fail. This lesson covers root-cause diagnosis for the most common DB errors.
By the end of this lesson you can: diagnose connection refused, pool exhaustion, SSL, and Alembic migration errors — and implement preventative measures for each.
OperationalError: connection refused
Full error: asyncpg.exceptions.ConnectionRefusedError: Connection refused (os error 111)
Root causes:
- PostgreSQL is not running
- Wrong host or port in
DATABASE_URL - Firewall blocking port 5432
Diagnose:
# Check if Postgres is running
systemctl status postgresql
# Test connectivity directly
psql -h localhost -U postgres -c "SELECT 1;"
# From inside Docker
docker compose exec app bash
psql "${DATABASE_URL}" -c "SELECT 1;"
Fix:
# Start PostgreSQL
systemctl start postgresql
# Verify DATABASE_URL
python -c "from app.core.config import settings; print(settings.DATABASE_URL)"
OperationalError: too many connections
Full error: asyncpg.exceptions.TooManyConnectionsError: sorry, too many clients already
Root cause: pool_size × workers exceeds PostgreSQL max_connections (default: 100).
Diagnose:
-- Check active connections
SELECT count(*) FROM pg_stat_activity WHERE datname = 'mydb';
-- See connection limit
SHOW max_connections;
Fix:
# Reduce pool_size (or increase PostgreSQL max_connections)
engine = create_async_engine(
settings.DATABASE_URL,
pool_size=5, # Reduced from 20
max_overflow=5,
pool_timeout=30,
)
Or increase PostgreSQL limits:
# postgresql.conf
max_connections = 200
Pool Exhaustion — All Requests Hang
Symptom: All requests hang, no error, eventually timeout.
Cause: All connections are checked out (likely a long-running query or a session not closed).
Diagnose:
-- Find long-running queries
SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY duration DESC;
Fix:
- Ensure
get_dbalways closes sessions (useasync with) - Set
pool_timeout=30to raise instead of hanging - Set
statement_timeoutin PostgreSQL for runaway queries
SSL Certificate Errors
Full error: ssl.SSLCertVerificationError: certificate verify failed
Fix:
engine = create_async_engine(
settings.DATABASE_URL + "?ssl=require", # For managed DB services
# Or disable verification (dev only):
connect_args={"ssl": False},
)
Alembic Migration Errors
Can't locate revision:
# Reset migration state (dangerous — only on fresh DB)
alembic stamp head
# Or manually set to specific revision
alembic stamp <revision_id>
Table already exists:
# Migration already applied but not tracked
alembic stamp head # Mark as applied without running migrations
Column X does not exist after deploy:
# Migration not applied — run it
alembic upgrade head
# Check current state
alembic current
Connection Pool Pre-Ping
Enable pre-ping to detect stale connections:
engine = create_async_engine(
settings.DATABASE_URL,
pool_pre_ping=True, # Test connection before returning from pool
pool_recycle=1800, # Recycle connections every 30 minutes
)
Common Pitfalls Summary
| Error | Root Cause | Fix |
|---|---|---|
| Connection refused | DB not running or wrong host | Check service status and DATABASE_URL |
| Too many connections | Pool too large for DB max_connections | Reduce pool_size or increase max_connections |
| All requests hang | Pool exhausted (session not closed) | Check for unclosed sessions; set pool_timeout |
| SSL cert error | Managed DB requires SSL | Add ?ssl=require to connection string |
| Migration not found | Missing import in env.py | Import all model modules in alembic/env.py |
| Stale connections | Long-lived connections dying | Enable pool_pre_ping=True and pool_recycle |
Hands-On Diagnostic Checklist
# 1) Can the app reach the database?
python -c "
import asyncio
from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy import text
from app.core.config import settings
async def check():
engine = create_async_engine(settings.DATABASE_URL)
async with engine.connect() as conn:
result = await conn.execute(text('SELECT version()'))
print('DB version:', result.scalar())
asyncio.run(check())
"
# 2) How many active connections?
psql ${DATABASE_URL} -c "SELECT count(*) FROM pg_stat_activity;"
# 3) Are migrations up to date?
alembic current
alembic check # Fails if unapplied migrations exist
# 4) Test pool exhaustion resilience
hey -n 500 -c 100 http://localhost:8000/health