Fixing Python openpyxl That Skips Hidden Rows When Reading Cell Data

June 18, 2026 8 min read 1 views

You open an Excel file with openpyxl, iterate over the rows, and your row count is off. After some digging, you realize the missing rows are the hidden ones. The data is there in the file β€” you can see it when you unhide those rows in Excel β€” but your Python script never touches them.

This is a subtle issue because openpyxl does not strip hidden rows from the file. They are all there in memory after you load the workbook. The problem is almost always in how you iterate, not in what openpyxl loads.

What You'll Learn

  • How openpyxl represents row visibility internally
  • Which iteration patterns silently skip hidden rows and why
  • The correct approach to read every row, hidden or not
  • How to intentionally detect and separate hidden rows from visible ones
  • How AutoFilter-hidden rows differ from manually hidden rows

Prerequisites

You need openpyxl installed (pip install openpyxl) and a working knowledge of reading Excel files in Python. The examples below use Python 3.8+ syntax. No third-party dependencies beyond openpyxl are required.

How openpyxl Stores Row Visibility

When Excel saves a workbook, each row can carry a set of attributes including hidden, height, and outlineLevel. openpyxl maps these into a RowDimension object stored in the worksheet's row_dimensions dictionary, keyed by row number.

A row that has never been explicitly hidden in Excel will often have no entry in row_dimensions at all β€” openpyxl omits default values to keep memory usage low. A hidden row will have an entry where hidden is True.

import openpyxl

wb = openpyxl.load_workbook("data.xlsx")
ws = wb.active

# Inspect visibility of the first 10 rows
for row_num in range(1, 11):
    dim = ws.row_dimensions.get(row_num)
    if dim is not None:
        print(f"Row {row_num}: hidden={dim.hidden}")
    else:
        print(f"Row {row_num}: no dimension entry (visible by default)")

This tells you the raw state. Hidden rows still have their cells populated in the worksheet object β€” they are just flagged with that attribute.

Why Some Iteration Patterns Skip Hidden Rows

Here is the part that catches most people off guard. openpyxl itself does not skip hidden rows during iteration. If you are missing rows, the skip is almost always happening in your own code β€” or in a helper library layered on top of openpyxl.

Pattern 1: Manual row_dimensions check

A common pattern picked up from Stack Overflow filters out rows based on a row_dimensions check:

# This pattern silently skips hidden rows
for row in ws.iter_rows():
    row_num = row[0].row
    if ws.row_dimensions[row_num].hidden:
        continue  # intended to skip hidden rows
    # ... process visible rows only

If you copied this into your script to skip hidden rows but then forgot about it (or someone else added it), that continue is your culprit. This is the most common cause by far.

Pattern 2: Accessing row_dimensions with square brackets on a missing key

The row_dimensions dictionary is a defaultdict-like structure. When you access ws.row_dimensions[row_num] for a row that has no explicit dimension entry, openpyxl creates a blank RowDimension object. A freshly created RowDimension has hidden set to False, so this specific pattern does not skip rows by itself β€” but it can confuse logic that checks dim is not None.

Pattern 3: Third-party wrappers

Libraries like openpyxl-utils or custom in-house utilities sometimes wrap openpyxl's iteration and filter on the hidden flag internally. If you are using a wrapper function, check whether it has built-in visibility filtering before blaming openpyxl.

The Fix: Iterating All Rows Regardless of Visibility

The straightforward fix is to use ws.iter_rows() or index into ws directly, and simply not apply any hidden check. openpyxl will give you every row.

import openpyxl

wb = openpyxl.load_workbook("data.xlsx")
ws = wb.active

all_rows_data = []

for row in ws.iter_rows(values_only=True):
    # Every row is returned β€” hidden or not
    all_rows_data.append(row)

print(f"Total rows read: {len(all_rows_data)}")

The values_only=True argument returns plain Python values instead of Cell objects, which is usually what you want when you are extracting data. If you need cell metadata (like coordinates or styles), drop that argument and work with the Cell objects directly.

Reading a specific range that includes hidden rows

If your sheet has a header row and data starts on row 2, you can still pass a range to iter_rows() without any visibility side effects:

for row in ws.iter_rows(min_row=2, max_row=ws.max_row, values_only=True):
    if any(cell is not None for cell in row):  # skip genuinely empty rows
        all_rows_data.append(row)

The any(cell is not None ...) guard skips rows where every cell is empty β€” common at the bottom of sheets β€” without touching visibility at all.

If you have run into similar silent data-loss issues when working across multiple workbooks, the article on fixing openpyxl losing cell styles when copying rows between workbooks covers related row-level gotchas worth knowing about.

Detecting and Filtering Hidden Rows on Purpose

Sometimes you genuinely want to separate visible rows from hidden ones β€” for example, when a colleague has hidden rows that represent deleted or inactive records and you only want to process the active data.

Here is a clean, explicit way to do that:

import openpyxl

wb = openpyxl.load_workbook("data.xlsx")
ws = wb.active

visible_rows = []
hidden_rows = []

for row in ws.iter_rows(min_row=2, values_only=True):
    row_num = row[0] if row else None  # values_only strips cell ref, need index
    # Switch to Cell objects to read row number cleanly
    pass

# Better approach: iterate Cell objects, check row_dimensions separately
for row in ws.iter_rows(min_row=2):
    row_num = row[0].row
    dim = ws.row_dimensions.get(row_num)
    is_hidden = dim.hidden if dim is not None else False

    row_values = tuple(cell.value for cell in row)

    if is_hidden:
        hidden_rows.append(row_values)
    else:
        visible_rows.append(row_values)

print(f"Visible rows: {len(visible_rows)}")
print(f"Hidden rows:  {len(hidden_rows)}")

Notice the use of ws.row_dimensions.get(row_num) rather than ws.row_dimensions[row_num]. Using .get() returns None for rows with no explicit dimension entry instead of creating a new object, which keeps your dictionary clean and your intent explicit.

Handling Hidden Rows in Filtered Tables (AutoFilter)

AutoFilter is a different beast. When a user applies a filter in Excel, the non-matching rows are flagged as hidden in the file format β€” but the mechanism is not always row_dimensions[n].hidden. openpyxl reads the AutoFilter definition from the XML but does not re-evaluate the filter logic. This means:

  • Rows hidden by AutoFilter are stored the same way as manually hidden rows β€” with hidden=True in their RowDimension.
  • openpyxl itself will still return those rows during iteration.
  • If you want only the rows a user would see with the filter active, check row_dimensions[n].hidden as shown above.
  • If you want all data regardless of filtering, iterate without any visibility check.

The key insight is that openpyxl does not re-apply filter criteria β€” it just reads the hidden flags Excel already wrote into the file. So your code controls what you see.

For issues that arise when openpyxl reads formula results incorrectly, the guide on fixing openpyxl reading Excel formulas instead of values explains how cached values and data_only mode interact, which is often a related pain point in the same scripts.

Common Pitfalls

Pitfall 1: Trusting max_row when hidden rows exist

ws.max_row returns the last row that has any data, regardless of visibility. This is usually correct, but if hidden rows contain data below your visible data range, max_row will be higher than you expect. Always check your row count against the actual file if something looks off.

Pitfall 2: read_only mode and row dimensions

When you open a workbook with openpyxl.load_workbook("file.xlsx", read_only=True), the row_dimensions dictionary is not populated. Read-only mode skips worksheet dimensions to reduce memory. If you need to check visibility, open the file in normal mode. The trade-off is higher memory use for large files.

# read_only=True: row_dimensions is empty β€” cannot check hidden status
wb_ro = openpyxl.load_workbook("data.xlsx", read_only=True)
ws_ro = wb_ro.active
print(ws_ro.row_dimensions)  # {}

# Normal mode: row_dimensions is populated
wb = openpyxl.load_workbook("data.xlsx")
ws = wb.active
print(len(ws.row_dimensions))  # number of rows with explicit dimension entries

Pitfall 3: Outline groups treated as hidden

Excel's row grouping feature (the collapse/expand buttons on the left margin) can set rows to hidden when the group is collapsed. openpyxl stores these the same way β€” hidden=True in row_dimensions plus an outlineLevel value. If you are checking hidden to filter rows, collapsed group rows will also be filtered out. If that is not what you want, check dim.outlineLevel as well and decide your handling explicitly.

Pitfall 4: Confusing empty rows with hidden rows

A row can be hidden and have data, or visible and completely empty. Do not conflate the two. A check for all(cell is None for cell in row_values) tells you the row is empty β€” not hidden. Treat visibility and emptiness as independent attributes.

This kind of subtle state confusion also shows up in database libraries. If you find yourself debugging similar silent data gaps, the article on fixing sqlite3 returning wrong values after ALTER TABLE covers a structurally similar problem in a different context.

Pitfall 5: Expecting openpyxl to replicate Excel's display

openpyxl reads the file format, not the Excel rendering engine. It will not filter rows based on AutoFilter criteria at read time, will not run conditional formatting logic, and will not recalculate formulas (unless you open with data_only=True and the file has cached values). Keep your expectations grounded in what the library actually does: parse XML into Python objects.

For another angle on number formatting surprises during reads, the post on fixing openpyxl that fails to apply number formatting to cells is worth a read if your numeric cell values look wrong after processing.

Wrapping Up

The root cause of openpyxl appearing to skip hidden rows is almost never in openpyxl itself β€” it is a visibility check in your iteration logic, a third-party wrapper, or an assumption about what read_only mode preserves. Here are the concrete steps to get unstuck:

  1. Audit your iteration loop for any hidden check or continue statement and remove it if you want all rows.
  2. Avoid read_only=True when you need to inspect row visibility; the row_dimensions dictionary is not populated in that mode.
  3. Use ws.row_dimensions.get(row_num) instead of direct key access to avoid creating spurious dimension entries.
  4. Distinguish between hidden and empty rows β€” they are independent states and need separate handling logic.
  5. If you are using a wrapper library, check its source or docs to see whether it filters on the hidden flag internally before you dig into openpyxl itself.

Frequently Asked Questions

Does openpyxl automatically skip hidden rows when iterating a worksheet?

No, openpyxl does not skip hidden rows during iteration. All rows are returned by iter_rows() regardless of their hidden status. If rows are being skipped, the cause is a visibility check in your own code or a wrapper library, not openpyxl itself.

How do I check if a specific row is hidden in openpyxl?

Use ws.row_dimensions.get(row_number) to retrieve the RowDimension object for that row, then check its hidden attribute. If get() returns None, the row has no explicit dimension entry and is visible by default.

Why does row_dimensions appear empty when I open a file in read_only mode?

openpyxl skips populating row_dimensions when read_only=True to reduce memory usage. If you need to inspect or filter based on row visibility, open the workbook in normal mode by omitting the read_only argument.

Are rows hidden by Excel's AutoFilter handled the same way as manually hidden rows in openpyxl?

Yes, AutoFilter-hidden rows are stored with hidden=True in their RowDimension entry, the same as manually hidden rows. openpyxl does not re-evaluate AutoFilter criteria at read time; it simply reads the hidden flags that Excel wrote into the file.

How do I read only the visible rows from an Excel file using openpyxl?

Iterate over all rows using iter_rows(), then for each row check ws.row_dimensions.get(row_num) and skip rows where the hidden attribute is True. This approach correctly handles both manually hidden rows and AutoFilter-hidden rows.

πŸ“€ Share this article

Sign in to save

Comments (0)

No comments yet. Be the first!

Leave a Comment

Sign in to comment with your profile.

πŸ“¬ Weekly Newsletter

Stay ahead of the curve

Get the best programming tutorials, data analytics tips, and tool reviews delivered to your inbox every week.

No spam. Unsubscribe anytime.