Fixing Python xlrd Errors When Opening xlsx Files After Version 2.0

June 09, 2026

You ran your Excel-reading script this morning and it blew up with something like XLRDError: Excel xlsx file; not supported. Nothing in your code changed. The file is fine. What happened?

xlrd 2.0 dropped support for the xlsx format entirely, and if your environment upgraded the package without you noticing, every script reading xlsx files is now broken. This guide walks you through exactly why it happened and the fastest ways to fix it.

What You'll Learn

  • Why xlrd 2.0 stopped supporting xlsx and what that means for your code
  • How to fix pandas.read_excel calls that relied on xlrd
  • When to use openpyxl, xlrd, or other engines
  • How to handle legacy xls files alongside modern xlsx files in the same project
  • How to pin or manage your dependencies so this doesn't happen again

Prerequisites

You should be comfortable running Python from the command line and installing packages with pip. The examples use pandas and openpyxl, so a basic familiarity with DataFrames helps. Code samples are tested against Python 3.8+.

What Changed in xlrd 2.0

Before version 2.0, xlrd was the default engine behind pandas.read_excel and could handle both the old binary .xls format and the newer XML-based .xlsx format. The maintainer decided to strip xlsx support in the 2.0 release, citing security concerns and the fact that openpyxl already handles xlsx files much better.

The result is clean and intentional: xlrd 2.0+ reads only .xls files. If you pass it an xlsx file, it raises an error immediately rather than trying to parse it.

xlrd now only supports .xls files. For .xlsx, .xlsm, and .xlsb formats, use openpyxl or another dedicated library.

This is actually the right call. The old xlsx support in xlrd was incomplete and occasionally produced wrong results on files with complex formatting. But it still broke a lot of code quietly when packages auto-upgraded.

Confirming the Problem

First, check which version of xlrd you have installed:

pip show xlrd

If the version is 2.0 or higher and you're trying to read xlsx files, that's your problem. The error raised by xlrd looks like this:

XLRDError: Excel xlsx file; not supported

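If you're using pandas, the error surfaces through the read_excel call: pandas versions before 1.2 delegate to xlrd when no engine is specified. From pandas 1.2 onward, read_excel picks openpyxl for xlsx files when it is installed, and otherwise raises its own error rather than passing the file to xlrd.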
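The same check can be scripted, which is handy in CI or on a machine where you can't easily run pip by hand. This is a minimal sketch using only the standard library (importlib.metadata, available from Python 3.8); the function names xlrd_status and can_read_xlsx are made up for this example.

```python
from importlib import metadata


def can_read_xlsx(version: str) -> bool:
    """xlrd releases before 2.0 could open xlsx files; 2.0 and later cannot."""
    major = int(version.split(".")[0])
    return major < 2


def xlrd_status():
    """Return (version, can_read_xlsx) for the installed xlrd, or (None, False)."""
    try:
        version = metadata.version("xlrd")
    except metadata.PackageNotFoundError:
        return None, False
    return version, can_read_xlsx(version)


if __name__ == "__main__":
    version, reads_xlsx = xlrd_status()
    if version is None:
        print("xlrd is not installed")
    else:
        print(f"xlrd {version}: reads xlsx = {reads_xlsx}")
```

Run it inside the same virtual environment as the failing script, since each environment has its own installed version.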

Fix 1: Switch the pandas Engine to openpyxl

This is the correct fix for almost everyone reading xlsx files. Install openpyxl if you don't already have it:

pip install openpyxl

Then update your read_excel call to specify the engine explicitly:

# Before: breaks with xlrd 2.0
import pandas as pd
df = pd.read_excel("report.xlsx")

# After: explicitly use openpyxl for xlsx files
import pandas as pd
df = pd.read_excel("report.xlsx", engine="openpyxl")

That single argument change is all you need in most cases. openpyxl is actively maintained, reads both xlsx and xlsm files, and is the engine the pandas documentation points to for these formats.

Fix 2: Downgrade xlrd (Temporary Stopgap Only)

You do not need to downgrade to keep reading old-style .xls binary files: xlrd 2.x still handles them. Downgrading only makes sense as a short-term stopgap when you cannot yet change code that passes xlsx files to xlrd.

pip install "xlrd==1.2.0"

Version 1.2.0 is the last release before xlsx support was removed. Be aware that its xlsx reader is unmaintained, and that recent pandas releases require xlrd 2.0.1 or newer, so this pin may also force you onto an older pandas. Treat it as temporary and route xlsx files through openpyxl as soon as you can.

Fix 3: Route by File Extension

If your script processes both xls and xlsx files from a folder, detect the format and pick the right engine automatically:

import pandas as pd
from pathlib import Path

def read_excel_file(filepath: str) -> pd.DataFrame:
    path = Path(filepath)
    suffix = path.suffix.lower()

    if suffix == ".xls":
        # xlrd handles old binary xls files
        return pd.read_excel(filepath, engine="xlrd")
    elif suffix in (".xlsx", ".xlsm"):
        # openpyxl handles all modern formats
        return pd.read_excel(filepath, engine="openpyxl")
    else:
        raise ValueError(f"Unsupported file format: {suffix}")

# Usage
df = read_excel_file("data/sales_2019.xls")
df2 = read_excel_file("data/sales_2024.xlsx")

This pattern makes the intent explicit in code. Anyone reading it later will immediately understand the format-handling logic rather than wondering why different engine keywords appear in different places.
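File extensions can be wrong, for example after a rename or an upload that dropped the suffix. One possible complement to the extension check is to look at the first bytes of the file: legacy .xls files are OLE2 compound documents, while .xlsx, .xlsm, and .xlsb are ZIP containers. The sketch below uses only the standard library; the function name and the "ole2" / "zip" labels are illustrative, and the check identifies the container type only, not the exact Excel format (a password-protected xlsx, for instance, can also be stored in an OLE2 container).

```python
from typing import Optional

# First 8 bytes of an OLE2 compound file, the container used by legacy .xls
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature of a ZIP archive (.xlsx, .xlsm, .xlsb)
ZIP_MAGIC = b"PK\x03\x04"


def sniff_excel_container(filepath) -> Optional[str]:
    """Return "ole2", "zip", or None based on the first bytes of the file."""
    with open(filepath, "rb") as handle:
        header = handle.read(8)
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return None
```

A result of "ole2" suggests engine="xlrd" and "zip" suggests engine="openpyxl"; None means the file is probably not an Excel workbook at all.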

Fix 4: Using openpyxl Directly (Without pandas)

Sometimes you don't need a DataFrame at all. If you're reading cell values, checking formulas, or writing back to a file, openpyxl directly is cleaner than going through pandas:

from openpyxl import load_workbook

wb = load_workbook("report.xlsx", data_only=True)
ws = wb.active

for row in ws.iter_rows(min_row=2, values_only=True):
    print(row)

The data_only=True flag is important: it tells openpyxl to return cached cell values instead of formula strings. Without it, a cell containing =SUM(A1:A10) will return that formula string, not the number.

Common Pitfalls

Forgetting to install openpyxl

Specifying engine="openpyxl" without installing the package gives you a different error: pandas raises an ImportError reporting the missing optional dependency 'openpyxl', while importing openpyxl directly raises ModuleNotFoundError: No module named 'openpyxl'. Run pip install openpyxl first. In a project with a requirements.txt, add it there so teammates don't hit the same issue.

Using xlrd for xlsm or xlsb files

Even xlrd 1.x doesn't handle xlsm (macro-enabled) or xlsb (binary xlsx) files well. openpyxl handles xlsm. For xlsb, the pyxlsb library is the current best option: pd.read_excel("file.xlsb", engine="pyxlsb").

The engine argument position matters

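A common mistake is passing the engine name positionally. The second positional parameter of read_excel is sheet_name, so pandas treats the string as a worksheet name and looks for a sheet called "openpyxl". Always pass engine as a keyword:

# Wrong: "openpyxl" is interpreted as sheet_name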
df = pd.read_excel("report.xlsx", "openpyxl")

# Right
df = pd.read_excel("report.xlsx", engine="openpyxl")

Virtual environments with stale packages

If you fixed the issue in one environment but the error keeps appearing in CI or on another machine, check that requirements.txt or pyproject.toml reflects the change. A stale lockfile will reinstall the wrong version of xlrd on the next fresh install.

openpyxl and password-protected files

openpyxl cannot open password-protected xlsx files. If you need to handle encrypted spreadsheets, look at the msoffcrypto-tool package to decrypt the file first, then pass the decrypted stream to openpyxl.
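The decrypt-then-load flow might look like the sketch below. It assumes both msoffcrypto-tool and openpyxl are installed and follows the usage pattern from the msoffcrypto-tool README (OfficeFile, load_key, decrypt); the function name load_protected_workbook is made up for this example, and the password handling is deliberately simplistic.

```python
import io


def load_protected_workbook(path, password):
    """Decrypt a password-protected xlsx in memory and open it with openpyxl."""
    # Imported lazily so this module still imports when the optional
    # third-party packages are not installed.
    import msoffcrypto
    from openpyxl import load_workbook

    decrypted = io.BytesIO()
    with open(path, "rb") as handle:
        office_file = msoffcrypto.OfficeFile(handle)
        office_file.load_key(password=password)
        office_file.decrypt(decrypted)  # writes the decrypted xlsx bytes

    decrypted.seek(0)  # rewind so openpyxl reads from the start
    return load_workbook(decrypted, data_only=True)
```

Keeping the decrypted copy in a BytesIO buffer means the plain-text workbook never touches the disk. In real code, read the password from an environment variable or a secrets manager rather than hard-coding it.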

Pinning Your Dependencies

The reason this broke quietly for many teams is unpinned dependencies. If your requirements.txt just says xlrd, pip will happily install the latest version, which is 2.x. Be explicit:

# requirements.txt

# For xlsx files: preferred approach
openpyxl>=3.0

# For xls files only, if needed (xlrd 2.x still reads xls)
xlrd>=2.0.1

If you're using a modern packaging tool like Poetry or pip-tools, let it generate a lockfile and commit that to your repository. Anyone who clones the repo gets identical package versions, eliminating an entire class of environment-specific bugs.
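To spot bare entries like an unpinned xlrd line before they cause trouble, a small script can scan a requirements file for packages with no version specifier. This is a rough sketch, not a full requirements parser: the function name unpinned_requirements is invented here, and it ignores pip options (lines starting with "-") and treats anything after the package name as a specifier.

```python
import re

# Package name, optional [extras], then whatever follows (the specifier).
_REQ_LINE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(\[[^\]]*\])?\s*(.*)$")


def unpinned_requirements(text: str):
    """Return package names that appear with no version specifier."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = _REQ_LINE.match(line)
        if match is None:
            continue
        name, _extras, rest = match.groups()
        rest = rest.split(";", 1)[0].strip()  # drop environment markers
        if not rest:
            unpinned.append(name)
    return unpinned
```

For example, unpinned_requirements(open("requirements.txt").read()) returns a list you can fail a CI step on when it is not empty.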

Quick Reference: Which Engine for Which Format

File format         Extension      Recommended engine   Install
Modern Excel        .xlsx, .xlsm   openpyxl             pip install openpyxl
Legacy Excel        .xls           xlrd                 pip install xlrd
Binary Excel        .xlsb          pyxlsb               pip install pyxlsb
ODS (LibreOffice)   .ods           odf                  pip install odfpy
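The same reference can live in code as a lookup table, so the engine choice is defined in one place. A minimal sketch, assuming the engine names from the table above; ENGINE_BY_SUFFIX and engine_for are names invented for this example.

```python
from pathlib import Path

# Extension -> value for the engine= keyword of pandas.read_excel
ENGINE_BY_SUFFIX = {
    ".xlsx": "openpyxl",
    ".xlsm": "openpyxl",
    ".xls": "xlrd",
    ".xlsb": "pyxlsb",
    ".ods": "odf",
}


def engine_for(filepath) -> str:
    """Return the pandas engine name for a spreadsheet path."""
    suffix = Path(filepath).suffix.lower()
    try:
        return ENGINE_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file format: {suffix!r}") from None
```

Usage would then be pd.read_excel(path, engine=engine_for(path)), with each engine's package installed as listed in the table.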

Next Steps

You now have the full picture of the xlrd 2.0 change and the right paths forward. Here's what to do next:

  • Update your code today: Add engine="openpyxl" to every pd.read_excel call that handles xlsx files.
  • Pin your dependencies: Edit your requirements.txt or lockfile so the fix sticks across environments and teammates' machines.
  • Audit your project for xls vs xlsx: Run a quick search for read_excel calls and confirm each one uses the right engine for its file format.
  • Add openpyxl to your standard toolkit: If you write back to Excel files at all, openpyxl gives you full control over formatting, formulas, and multiple sheets.
  • Test in CI: Add a test that reads a sample xlsx file so a future dependency change that breaks Excel reading fails loudly before it reaches production.
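For the audit step, a plain text search works, but a short script can also flag read_excel calls that pass no engine keyword. The sketch below uses the standard-library ast module; the function name is hypothetical, and it matches any call whose function name is read_excel, so it may also flag unrelated functions with the same name.

```python
import ast


def read_excel_calls_without_engine(source: str):
    """Return line numbers of read_excel(...) calls that pass no engine keyword."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # pd.read_excel(...) is an Attribute; a bare read_excel(...) is a Name.
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name != "read_excel":
            continue
        # A **kwargs splat shows up as a keyword whose arg is None; skip those
        # calls because the engine may be supplied through the dict.
        keywords = {kw.arg for kw in node.keywords}
        if "engine" not in keywords and None not in keywords:
            missing.append(node.lineno)
    return sorted(missing)
```

Feed it the text of each .py file in the project and review the reported lines by hand, since a missing engine is only a problem where the file format is not what the default engine expects.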
