SQL Window Functions Giving Wrong Totals When PARTITION BY Is Missing
You run a query, scan the totals column, and something feels wrong. The numbers are too large, or every row in a group shows the same value instead of a running count. The query executed without an error β SQL didn't warn you. It just gave you silently incorrect results.
This is the most common symptom of a missing or misplaced PARTITION BY clause in a window function. The fix is usually one line, but finding it requires understanding what that clause actually controls.
What You'll Learn
- How
PARTITION BYdefines the scope of a window function calculation - Why omitting it inflates totals across the entire result set
- How to fix broken running totals, ranks, and cumulative sums
- How window frames interact with
ORDER BYto produce unexpected accumulation - A checklist to audit window functions before they go to production
Prerequisites
You should be comfortable writing basic SELECT statements with GROUP BY and aggregate functions. The examples below use PostgreSQL syntax, but the behavior applies equally to MySQL 8+, SQL Server, and BigQuery. Window frame syntax is standard SQL and behaves consistently across those engines.
What Window Functions Actually Do (and Where They Go Wrong)
A window function performs a calculation across a set of rows that are related to the current row. Unlike a GROUP BY aggregate, it does not collapse rows β each row keeps its own identity and gets an additional computed value alongside it.
The full syntax looks like this:
function_name(expression) OVER (
PARTITION BY column1, column2
ORDER BY column3
ROWS BETWEEN ... AND ...
)
The OVER() clause is what makes it a window function. Inside it, PARTITION BY divides the result set into independent groups (partitions), ORDER BY determines the row order within each partition, and the optional frame clause narrows the calculation further. Omit PARTITION BY entirely, and the window spans the entire result set β all rows become one giant partition.
How PARTITION BY Controls the Calculation Window
Think of PARTITION BY as the equivalent of GROUP BY, except the rows are not collapsed. Each partition gets its own fresh calculation. The function resets and starts over for every new partition value.
Consider a sales table with columns region, sales_rep, and sale_amount. If you want the total sales per region alongside each individual row, you write:
SELECT
region,
sales_rep,
sale_amount,
SUM(sale_amount) OVER (PARTITION BY region) AS region_total
FROM sales;
Every row where region = 'North' will show the same region_total β the sum of all North rows. Every row where region = 'South' will show the South total. The partitioning is what creates that boundary.
The Classic Bug: SUM() OVER () Without PARTITION BY
Drop the PARTITION BY and you get a single partition containing every row in the result set. The SUM() returns the grand total on every single row, regardless of region.
-- Broken: returns the grand total on every row
SELECT
region,
sales_rep,
sale_amount,
SUM(sale_amount) OVER () AS region_total -- missing PARTITION BY
FROM sales;
The output looks like this:
| region | sales_rep | sale_amount | region_total |
|---|---|---|---|
| North | Alice | 500 | 2100 |
| North | Bob | 700 | 2100 |
| South | Carol | 400 | 2100 |
| South | Dave | 500 | 2100 |
Every row shows 2100 β the total for all regions combined. This is not a SQL bug; it is exactly what you asked for. The engine has no idea you intended per-region totals unless you tell it with PARTITION BY region.
This pattern is especially dangerous when you later divide by that total to compute percentages. Every percentage calculation becomes wrong, and the numbers can look plausible enough to slip past a quick review.
The same class of silent-result bug appears in tools like Power BI, where a missing filter context produces inflated totals. If you've ever hit that problem, the Power BI DAX measures returning blank instead of the expected value troubleshooting guide covers the DAX equivalent.
Running Totals That Reset at the Wrong Time
Running totals introduce a second dimension of failure: the ORDER BY inside OVER(). A running total requires both a partition boundary and a sort order. Get either wrong, and the accumulation either never resets or resets too early.
The wrong query
-- Running total never resets between customers
SELECT
customer_id,
order_date,
order_total,
SUM(order_total) OVER (ORDER BY order_date) AS running_total
FROM orders;
Without PARTITION BY customer_id, the running total accumulates across all customers sorted by date. Row 1 for customer A, then row 1 for customer B, and so on β the total keeps growing across the entire table.
The correct query
-- Running total resets per customer
SELECT
customer_id,
order_date,
order_total,
SUM(order_total) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS running_total
FROM orders;
Now each customer has an independent running total. The window resets to zero every time customer_id changes.
Checking your work
A quick sanity check: filter the result to a single customer_id and verify the final running_total matches a plain SUM(order_total) for that customer. If it doesn't, your partition boundary is still wrong.
-- Sanity check
SELECT
customer_id,
SUM(order_total) AS expected_total
FROM orders
WHERE customer_id = 42
GROUP BY customer_id;
RANK() and ROW_NUMBER() Returning Duplicates or Wrong Order
Ranking functions are another place where a missing PARTITION BY produces results that look almost right but are systematically off. The symptom is usually that the ranking continues across groups instead of resetting, or that rows within the same logical group get ranks that bleed from the previous group.
-- Wrong: ranking is global across all departments
SELECT
department,
employee_name,
salary,
RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;
This ranks all employees together. The top earner company-wide gets rank 1, not the top earner per department.
-- Correct: ranking resets per department
SELECT
department,
employee_name,
salary,
RANK() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS salary_rank
FROM employees;
Now rank 1 appears once per department. If you want strictly sequential numbers with no gaps (unlike RANK() which skips after ties), use DENSE_RANK() instead β but the PARTITION BY requirement is identical.
This pattern of partitioned groupings mirrors what happens in Tableau when a Level of Detail expression runs at the wrong scope. The Tableau LOD expressions returning unexpected values guide walks through the same conceptual problem from a different angle.
When Your Window Frame Causes Unexpected Accumulation
Even with a correct PARTITION BY, you can still get wrong totals if the window frame is either implicit or misconfigured. A window frame controls exactly which rows within the partition participate in each calculation.
Implicit frame behavior
When you include ORDER BY inside OVER() without specifying a frame, most SQL engines default to ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This is the running-total frame. It accumulates from the first row of the partition up to and including the current row.
If you intended a static partition total (every row in the partition shows the same sum), you need to either remove the ORDER BY or explicitly set the frame:
-- Static total: every row in the partition shows the same sum
SELECT
region,
sale_amount,
SUM(sale_amount) OVER (
PARTITION BY region
ORDER BY sale_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS region_total
FROM sales;
The frame ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING always spans the entire partition, regardless of the current row. This gives you the partition total on every row even when ORDER BY is present.
RANGE vs ROWS
Swapping ROWS for RANGE introduces another subtlety. RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW includes all rows with the same ORDER BY value as the current row, not just rows physically before it. If two rows share the same date, RANGE includes both of them in the current row's frame even if one of them comes after. ROWS is almost always what you actually want for running totals.
Common Pitfalls and Gotchas
Partitioning on a column with NULLs
SQL treats NULL as its own partition value. If your PARTITION BY column contains nulls, all null rows form one partition together. This is usually fine, but it catches people off guard when they expect null rows to be excluded. They are included β they just live in the null partition.
Using window functions in WHERE or HAVING
Window functions are evaluated after WHERE and HAVING filters, so you cannot reference a window function alias in those clauses. Wrap your query in a CTE or subquery if you need to filter on a windowed value:
WITH ranked AS (
SELECT
department,
employee_name,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees
)
SELECT * FROM ranked WHERE salary_rank = 1;
Confusing PARTITION BY with GROUP BY
A GROUP BY collapses rows into one row per group and discards individual row detail. PARTITION BY keeps all rows and adds a computed column. You cannot use a window function as a substitute for GROUP BY when you want summarized output, and you cannot use GROUP BY when you need per-row calculations alongside group context.
If you've run into similar silent-result problems in Python, the same logic applies to grouped aggregations in pandas. The pandas groupby aggregation returning wrong numbers article covers how grouping scope errors surface there.
Multiple window functions with different partitions in one SELECT
You can mix window functions with different PARTITION BY and ORDER BY clauses in the same SELECT. Each OVER() clause is independent. Double-check that each one has the partition that matches its intent β it's easy to copy-paste one and forget to update the column name.
SELECT
region,
department,
sale_amount,
SUM(sale_amount) OVER (PARTITION BY region) AS region_total,
SUM(sale_amount) OVER (PARTITION BY department) AS dept_total,
SUM(sale_amount) OVER () AS grand_total
FROM sales;
All three window functions coexist. The grand total intentionally has no PARTITION BY here β that is deliberate and correct for a percentage-of-grand-total calculation. Intent matters: missing PARTITION BY is only a bug when you needed a narrower scope.
Duplicate rows inflating results
If your source table has unexpected duplicates, every window function total will be wrong β but the window function is not the cause. Run a quick count check before blaming the OVER() clause. Unintended row multiplication is a common upstream problem. The pandas merge silently dropping rows guide has a useful diagnostic mindset for tracking down where rows appear or disappear unexpectedly.
Wrapping Up: Next Steps
A missing PARTITION BY is a one-line fix, but it produces results that are fully valid SQL β so the engine will never tell you something went wrong. You need to catch it yourself.
Here are five concrete actions to take right now:
- Audit every window function in your query and ask: should this reset per group? If yes, add or correct
PARTITION BY. - Run a sanity check by comparing the window result for one partition value against a plain
GROUP BYaggregate on the same filter. - Be explicit about frames. If you add
ORDER BYinsideOVER(), decide whether you want a running total or a static partition total, then write the frame explicitly rather than relying on the default. - Use
ROWSinstead ofRANGEfor running totals unless you specifically need tie-inclusive accumulation. - Check for upstream duplicates before assuming the window function is wrong β inflated totals can come from the data, not the calculation logic.
Frequently Asked Questions
Why does my SQL SUM window function show the same value on every row?
This happens when you omit PARTITION BY from the OVER() clause. Without it, the function treats the entire result set as one partition and returns the grand total on every row. Add PARTITION BY with the column you want to group by, such as OVER (PARTITION BY region), to fix it.
What is the difference between PARTITION BY and GROUP BY in SQL?
GROUP BY collapses multiple rows into a single summary row per group, discarding individual row detail. PARTITION BY, used inside a window function's OVER() clause, keeps every row intact and adds a computed column based on the group β you get both the row-level detail and the group-level calculation at once.
How do I make a running total reset for each customer or group in SQL?
Include both PARTITION BY and ORDER BY inside your OVER() clause: SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date). The PARTITION BY resets the accumulation at each new customer, while ORDER BY controls the direction of accumulation within that partition.
Can you use a window function result in a WHERE clause?
No. Window functions are evaluated after WHERE and HAVING filters, so you cannot reference a windowed column alias in those clauses. Wrap the query in a CTE or subquery first, then filter on the windowed column in the outer query's WHERE clause.
Why does adding ORDER BY inside OVER() change my SUM totals?
Adding ORDER BY triggers an implicit window frame of ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW in most SQL engines, turning a static partition total into a running total. If you want each row to show the full partition sum even with ORDER BY present, explicitly set the frame to ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
π€ Share this article
Sign in to saveRelated Articles
Comments (0)
No comments yet. Be the first!