Fixing PostgreSQL CTE That Materializes When You Need It Inline
You write a clean CTE, expecting the planner to optimize through it like any other subquery β and instead your query takes ten times longer than the equivalent rewrite without the WITH clause. The data is the same, the logic is the same, but PostgreSQL is doing extra work you never asked for. The culprit is almost always CTE materialization.
This article explains exactly what materialization means, how to confirm it is happening, and how to control it with a single SQL keyword introduced in PostgreSQL 12.
What You'll Learn
- What materialization means and why PostgreSQL does it by default in older versions
- How to read
EXPLAIN ANALYZEoutput to confirm a CTE is being materialized - How to use
NOT MATERIALIZEDto allow the planner to inline your CTE - When you actually want materialization, and how to force it explicitly
- Edge cases and gotchas that can trip you up after the fix
What's Actually Happening When a CTE Materializes
A Common Table Expression (CTE) defined with WITH can be executed in one of two ways. When it is inlined, PostgreSQL folds it into the surrounding query as if it were a subquery, and the planner can push predicates down into it, use indexes, and generally optimize the whole query together. When it is materialized, PostgreSQL executes the CTE as a separate step, stores the full result set in memory (or on disk if it is large), and then reads from that temporary store for every reference in the outer query.
Materialization acts as an optimization fence. The planner cannot push a WHERE clause from the outer query down into the CTE body. If your CTE scans a million rows and the outer query only needs ten of them, materialization means you always pay for the million-row scan β even if an index could have given you those ten rows directly.
-- Example: a CTE that looks innocent but may materialize
WITH recent_orders AS (
SELECT order_id, customer_id, amount, created_at
FROM orders
WHERE created_at >= now() - interval '90 days'
)
SELECT *
FROM recent_orders
WHERE customer_id = 42;
If this CTE materializes, PostgreSQL fetches every order from the last 90 days before filtering to customer 42. If there is an index on (customer_id, created_at), that index goes completely unused inside the CTE.
How to Spot Materialization in EXPLAIN ANALYZE
Run EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) on your query. Look for the node labeled CTE Scan and, separately, a node labeled Materialize or a top-level CTE <name> section. If you see the CTE's body plan executed as its own independent subtree before the outer plan references it, it is materializing.
EXPLAIN (ANALYZE, BUFFERS)
WITH recent_orders AS (
SELECT order_id, customer_id, amount, created_at
FROM orders
WHERE created_at >= now() - interval '90 days'
)
SELECT *
FROM recent_orders
WHERE customer_id = 42;
A materialized output looks like this in the plan:
CTE Scan on recent_orders (cost=... rows=... width=...)
CTE recent_orders
-> Seq Scan on orders (cost=... rows=95000 ...) <-- full scan!
Filter: (created_at >= ...)
Notice the sequential scan returns 95,000 rows. The customer_id = 42 filter is nowhere inside the CTE subtree β it is applied after materialization, too late for any index to help. Compare this to what an inlined CTE plan looks like: the outer filter pushes inside, and you see an index scan on orders directly, returning a handful of rows.
If you want a deeper primer on reading EXPLAIN output, the article on fixing PostgreSQL EXPLAIN ANALYZE that shows a seq scan despite an index covers the mechanics of how the planner chooses scan strategies.
The Pre-PostgreSQL-12 Default and Why It Changed
Before PostgreSQL 12, every CTE was always materialized, no exceptions. This was intentional: it guaranteed that CTEs with side effects (like INSERT ... RETURNING inside a writable CTE) ran exactly once, and it gave developers a predictable optimization fence they could use deliberately. The trade-off was that even pure, read-only CTEs paid the materialization cost.
PostgreSQL 12 changed the default for CTEs that are referenced exactly once and have no side effects: those CTEs are now inlined automatically. This is the right default for the majority of cases, but it also introduced a subtle behavioral shift. Queries that were written to rely on the materialization fence (to, say, prevent repeated evaluation of a volatile function) now behave differently unless you add the explicit keyword.
The short version: if you are on PostgreSQL 12 or later, pure single-reference CTEs inline by default. If you are on an earlier version, they always materialize. Knowing your version matters when you read someone else's workaround online.
When Materialization Hurts You
The performance hit from materialization is worst when all of the following are true:
- The CTE reads a large table or produces a large intermediate result.
- The outer query filters that result down to a small fraction of the rows.
- An index on the base table could have served the combined predicate efficiently.
- The CTE is on PostgreSQL 11 or earlier (forced materialization), or it is referenced more than once in PostgreSQL 12+.
Materialization also silently disables predicate pushdown for joins. If your outer query joins the CTE result to another table and filters on a column from the CTE, the planner cannot reorder that join to use an index inside the CTE. You end up with a nested loop over the full materialized set instead of a small indexed lookup.
Forcing Inline Execution with NOT MATERIALIZED
PostgreSQL 12 added two explicit hints you can attach to a CTE definition: MATERIALIZED and NOT MATERIALIZED. To force inline execution, append NOT MATERIALIZED after the CTE name:
WITH recent_orders AS NOT MATERIALIZED (
SELECT order_id, customer_id, amount, created_at
FROM orders
WHERE created_at >= now() - interval '90 days'
)
SELECT *
FROM recent_orders
WHERE customer_id = 42;
With this in place, the planner treats the CTE body as a subquery it is free to optimize through. It can push the customer_id = 42 predicate inside, combine it with the date range filter, and use the (customer_id, created_at) index to return exactly the rows you need in one pass.
Run EXPLAIN ANALYZE again after adding NOT MATERIALIZED. You should see the CTE Scan node disappear and be replaced with an index scan or index-only scan directly on the base table. The node count in the plan shrinks and the actual row counts drop dramatically.
Handling Multiple CTE References
If your outer query references the same CTE more than once, NOT MATERIALIZED tells the planner it is allowed to execute the CTE body multiple times β once per reference site. For a pure, side-effect-free CTE that hits an indexed table, this is usually faster than materializing the full result and scanning it twice. But measure it: if the CTE body is expensive and the result is small, materializing once and reading twice can win.
-- Referenced twice β consider whether NOT MATERIALIZED is still better
WITH cte AS NOT MATERIALIZED (
SELECT id, value FROM my_table WHERE status = 'active'
)
SELECT a.id, b.value
FROM cte a
JOIN cte b ON a.id = b.id + 1;
Run EXPLAIN (ANALYZE, BUFFERS) for both variants and compare actual total time and buffer hits. Let the numbers decide.
When You Actually Want MATERIALIZED
Materialization is not always the enemy. There are legitimate reasons to keep it or even force it explicitly.
Volatile Functions
If your CTE calls a volatile function β one that can return different values on each call, like random(), now() with session-level side effects, or a custom function marked VOLATILE β you might need materialization to ensure the function runs exactly once. Without it, the planner might call it once per row in the outer query.
-- Force materialization to evaluate random() once
WITH sample AS MATERIALIZED (
SELECT * FROM large_table ORDER BY random() LIMIT 1000
)
SELECT * FROM sample WHERE category = 'A';
Writable CTEs
CTEs that contain INSERT, UPDATE, or DELETE statements are always materialized regardless of any keyword, because the planner must guarantee they execute exactly once. You do not need to add MATERIALIZED to writable CTEs β it is the default and cannot be overridden.
Expensive Subqueries Referenced Multiple Times
If a CTE body involves an expensive aggregation or a slow function and the outer query references it three or four times, materializing once and scanning the result repeatedly is cheaper than re-executing the body each time. Add MATERIALIZED explicitly to signal this intent clearly to anyone reading the query later.
Common Pitfalls and Edge Cases
PostgreSQL Version Compatibility
NOT MATERIALIZED and MATERIALIZED keywords are only available from PostgreSQL 12 onward. If you need to support older clusters, you have two options: rewrite the CTE as a subquery (which always inlines), or accept the materialization cost. Subquery rewrites are mechanical but effective.
-- Pre-12 workaround: use a subquery instead of a CTE
SELECT *
FROM (
SELECT order_id, customer_id, amount, created_at
FROM orders
WHERE created_at >= now() - interval '90 days'
) recent_orders
WHERE customer_id = 42;
The Planner Is Not Always Wrong to Materialize
Just because you add NOT MATERIALIZED does not guarantee a faster query. If the outer query has a very low selectivity filter and the planner's statistics are accurate, it might genuinely be cheaper to scan the materialized result than to re-plan around the base table. Always verify with EXPLAIN ANALYZE using representative data, not just a toy dataset.
Statistics Staleness
Sometimes a CTE materializes because the planner's row count estimates are wrong, leading it to believe the CTE result will be small enough that a sequential scan of the materialized set is fine. Running ANALYZE orders to refresh table statistics can sometimes change the plan even without touching the query. This is especially relevant after a large bulk insert β a scenario covered in the article on fixing PostgreSQL foreign key constraints that silently fail on bulk insert.
Nested CTEs
If you have CTEs that reference other CTEs, each level is evaluated independently for materialization. You may need NOT MATERIALIZED on multiple levels, and the interaction between them can produce unexpected plans. Flatten deeply nested CTEs into subqueries when the plan becomes hard to reason about.
UPSERT CTEs
A common pattern is to use a CTE with INSERT ... ON CONFLICT to return affected rows, then join the result in the outer query. Because the CTE contains a DML statement, it is always materialized. If you expect the outer join to be fast, make sure the materialized result is small β filter aggressively inside the CTE body itself. For more on UPSERT behavior in PostgreSQL, the article on fixing PostgreSQL UPSERT that silently skips updates on conflict is worth reading alongside this one.
Wrapping Up
CTE materialization is one of those issues that is invisible until it is catastrophic. A query that works fine on a small dataset starts timing out in production, and the only clue is buried three levels deep in an EXPLAIN plan. Now you know where to look and what to do about it.
Here are the concrete steps to take right now:
- Run
EXPLAIN (ANALYZE, BUFFERS)on any slow query that uses a CTE and look forCTE Scannodes with high row counts. - If you see the CTE executing as an independent subtree with a sequential scan, add
NOT MATERIALIZEDto the CTE definition (PostgreSQL 12+) or rewrite it as a subquery for older versions. - Run
ANALYZEon the affected table to ensure the planner has fresh statistics before drawing conclusions. - Benchmark both the materialized and inlined variants on production-scale data β use actual row counts, not test fixtures.
- For CTEs you intentionally want to materialize (volatile functions, expensive aggregations referenced multiple times), add the explicit
MATERIALIZEDkeyword so the intent is clear in the code.
Frequently Asked Questions
How do I know if my PostgreSQL CTE is being materialized or inlined?
Run EXPLAIN (ANALYZE, BUFFERS) on your query and look for a 'CTE Scan' node with the CTE's body executed as a separate subtree. If the inner plan shows a sequential scan returning far more rows than the outer query needs, the CTE is materializing and acting as an optimization fence.
Does adding NOT MATERIALIZED always make a CTE faster in PostgreSQL?
Not always. NOT MATERIALIZED allows the planner to inline the CTE, but whether that produces a faster plan depends on your data distribution, available indexes, and how many times the CTE is referenced. Always benchmark with EXPLAIN ANALYZE on realistic data before committing to the change.
Can I use NOT MATERIALIZED in PostgreSQL 11 or earlier?
No, the NOT MATERIALIZED and MATERIALIZED keywords were introduced in PostgreSQL 12. On earlier versions, all CTEs are always materialized, and the only workaround is to rewrite the CTE as a plain subquery, which the planner is free to inline.
Why does my CTE with INSERT or UPDATE still materialize even after adding NOT MATERIALIZED?
Writable CTEs β those containing INSERT, UPDATE, or DELETE β are always materialized regardless of any keyword hint, because the database must guarantee they execute exactly once. The NOT MATERIALIZED keyword only affects read-only CTEs.
If a CTE is referenced twice in the same query, will NOT MATERIALIZED cause the body to run twice?
Yes, when you use NOT MATERIALIZED on a CTE referenced multiple times, the planner is allowed to execute the CTE body once per reference site rather than caching the result. For lightweight CTEs that hit indexed tables this is usually still faster, but for expensive aggregations materializing once and reading the cached result twice may be cheaper.
π€ Share this article
Sign in to saveRelated Articles
Comments (0)
No comments yet. Be the first!