Diagnosing and Fixing N+1 Query Problems in SQLAlchemy ORM

May 24, 2026 6 min read 58 views
Abstract diagram showing a single database query branching into many smaller queries, illustrating the N+1 query problem in a clean flat-style illustration

Your endpoint feels fine in development with a handful of rows, but under real traffic the database is getting hammered with hundreds of tiny queries per request. The culprit is almost always an N+1 query problem, and SQLAlchemy's lazy-loading defaults make it easy to introduce one without noticing.

This article walks you through identifying the problem, understanding why SQLAlchemy generates those extra queries, and applying the right fix for your situation.

What you'll learn

  • How to enable query logging to catch N+1 patterns in development
  • Why lazy loading causes N+1 problems and when it's actually fine
  • How to use selectinload, joinedload, and subqueryload to eliminate excess queries
  • How to apply eager loading on a per-query basis without changing your model definitions
  • Common mistakes that bring N+1 problems back after you've fixed them

Prerequisites

You should be comfortable writing SQLAlchemy ORM queries and have a basic understanding of Python. The examples use SQLAlchemy 2.x with the select() style API, but the loading strategies apply to 1.4 as well. A running PostgreSQL or SQLite database helps if you want to follow along.

What an N+1 Problem Actually Looks Like

The name comes from the query count: one query to fetch N parent rows, then one additional query for each of those rows to fetch related data. Fetch 100 posts and their authors, and you've just fired 101 queries instead of one or two.

Here's a minimal example. Say you have Post and Author models with a many-to-one relationship:

from sqlalchemy import Column, Integer, String, ForeignKey
from sqlalchemy.orm import relationship, DeclarativeBase

class Base(DeclarativeBase):
    pass

class Author(Base):
    __tablename__ = "authors"
    id = Column(Integer, primary_key=True)
    name = Column(String)

class Post(Base):
    __tablename__ = "posts"
    id = Column(Integer, primary_key=True)
    title = Column(String)
    author_id = Column(Integer, ForeignKey("authors.id"))
    author = relationship("Author")  # lazy="select" by default

Now you query all posts and print each author's name:

from sqlalchemy import select

posts = session.scalars(select(Post)).all()
for post in posts:
    print(post.author.name)  # triggers a new query for every post

SQLAlchemy fetches all posts in one query. Then, the moment you access post.author inside the loop, it fires a separate SELECT for each post's author. That's the N+1.

Turning On Query Logging to Confirm It

The fastest way to confirm you have an N+1 problem is to watch the SQL SQLAlchemy generates. Set the engine's echo flag to True during development:

from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://user:pass@localhost/mydb", echo=True)

Every SQL statement will print to stdout. If you see the same SELECT repeated with different primary key values, you've found your N+1. You can also route logging through Python's standard logging module if you want more control:

import logging
logging.basicConfig()
logging.getLogger("sqlalchemy.engine").setLevel(logging.INFO)

In production, consider using a query counting middleware or a profiling tool like SQLAlchemy's event system to track query counts per request without the full log noise.

Why Lazy Loading Is the Default

SQLAlchemy defaults to lazy="select" loading on relationships because it's the safest general choice. When you load a model, SQLAlchemy doesn't know whether you'll ever need the related objects. Fetching everything eagerly by default would make simple lookups wasteful.

Lazy loading is perfectly fine when you access a relationship on a single object. The problem appears when you iterate over a collection and touch the relationship for every item. That's when the query count multiplies.

Fixing It with selectinload

selectinload is usually the right default fix. It issues a second query that fetches all related objects in one shot using an IN clause, then maps them back to the parent objects in Python. No join required, and the result sets stay manageable.

from sqlalchemy.orm import selectinload

posts = session.scalars(
    select(Post).options(selectinload(Post.author))
).all()

for post in posts:
    print(post.author.name)  # no additional queries fired here

SQLAlchemy now executes exactly two queries: one for posts, one for all the authors whose IDs appeared in the first result. It doesn't matter if you have 10 or 10,000 posts.

selectinload works well for both many-to-one and one-to-many relationships. It keeps query complexity low and avoids the row duplication issues you get with joins on collection relationships.

When to Use joinedload Instead

joinedload uses a SQL JOIN to fetch the parent and related rows in a single query. That sounds more efficient, but it comes with trade-offs you need to understand.

from sqlalchemy.orm import joinedload

posts = session.scalars(
    select(Post).options(joinedload(Post.author))
).all()

For a many-to-one relationship (each post has one author), joinedload is a solid choice. The join produces exactly one row per post and there's no duplication.

For a one-to-many relationship (each author has many posts), the join multiplies rows. Fetch 5 authors, each with 20 posts, and the database returns 100 rows instead of 25. SQLAlchemy deduplicates them in Python, but you're still transferring more data over the wire. For large collections, selectinload is almost always better.

Use joinedload when the relationship is many-to-one or one-to-one and you're confident the result set won't balloon.

subqueryload: The Third Option

subqueryload issues a second query that embeds the original query as a subquery. It was more relevant in older SQLAlchemy versions; in most modern cases selectinload performs at least as well and generates simpler SQL. You may still encounter subqueryload in existing codebases, so it's worth knowing what it does.

from sqlalchemy.orm import subqueryload

posts = session.scalars(
    select(Post).options(subqueryload(Post.comments))
).all()

If you're starting fresh, default to selectinload and only reach for subqueryload if profiling shows a reason.

Chaining Loaders for Nested Relationships

Real applications often have multiple levels of relationships. You can chain loaders to eagerly load nested associations without writing joins by hand.

from sqlalchemy.orm import selectinload

# Load posts -> comments -> comment authors
posts = session.scalars(
    select(Post).options(
        selectinload(Post.comments).selectinload(Comment.author)
    )
).all()

Each level fires one additional query, so three levels deep means three queries total instead of exponentially more. Keep nesting in check β€” if you need four or five levels of eager loading, that's usually a sign the query or the data model needs rethinking.

Setting Load Strategies on the Model vs. Per Query

You can set a default loading strategy directly on the relationship definition:

from sqlalchemy.orm import relationship

class Post(Base):
    __tablename__ = "posts"
    id = Column(Integer, primary_key=True)
    title = Column(String)
    author_id = Column(Integer, ForeignKey("authors.id"))
    author = relationship("Author", lazy="selectin")  # always eager

This is convenient but blunt. Every query that touches Post will now eagerly load authors, even when you don't need them. That can create unnecessary overhead in queries where you only need post titles.

The better pattern for most applications is to keep the default lazy loading and apply eager loading explicitly in the queries that need it. That way you only pay for what you actually use.

Common Pitfalls That Bring N+1 Problems Back

Accessing relationships outside the session

If you load objects and then close the session before iterating, SQLAlchemy can't issue the lazy-load queries. You'll get a DetachedInstanceError. The fix is to eagerly load everything you need before the session closes, not to keep the session open longer than necessary.

Serializers triggering lazy loads

Libraries like Pydantic or Marshmallow will access every field you include in a schema. If your schema includes a nested relationship and you haven't eager-loaded it, the serializer fires a query per object. Add selectinload to any query feeding a serializer that touches relationships.

Forgetting about secondary relationships

You fix the first level but forget there's a second. Now you have a 1+N+NΒ² pattern. Use query logging every time you touch a new code path that involves relationships β€” don't assume a fix at one level covers the whole chain.

Mixing eager and lazy loading on the same relationship

If you set lazy="selectin" on the model but also call .options(lazyload(Post.author)) in a query, the per-query option wins. That's useful to know, but it also means a careless lazyload override can reintroduce the problem you just fixed.

Wrapping Up

N+1 query problems are quiet in development and painful in production. The steps to get ahead of them are straightforward:

  • Enable SQL logging during development on any endpoint that reads collections with relationships. Count the queries.
  • Apply selectinload as your default fix for most one-to-many and many-to-one relationships. Switch to joinedload only for single-object lookups or simple many-to-one cases.
  • Chain loaders for nested relationships rather than hoping lazy loading will be fast enough.
  • Audit your serializers β€” they're a common hidden trigger for lazy loads that don't show up until you profile a real request.
  • Keep load strategies in queries, not model definitions, so you only pay for eager loading where it's actually needed.

With these habits in place, your SQLAlchemy app will scale to real data volumes without the database becoming the bottleneck.

πŸ“€ Share this article

Sign in to save

Comments (0)

No comments yet. Be the first!

Leave a Comment

Sign in to comment with your profile.

πŸ“¬ Weekly Newsletter

Stay ahead of the curve

Get the best programming tutorials, data analytics tips, and tool reviews delivered to your inbox every week.

No spam. Unsubscribe anytime.