Fixing Python Pandas to_excel That Overwrites All Sheets Except the Last
You write a loop, call df.to_excel() for each DataFrame, and open the file β only to find a single sheet containing whatever you wrote last. Everything else is gone. This is one of the most common pandas gotchas, and it trips up even experienced Python developers the first time they hit it.
What You'll Learn
- Why calling
to_excel()directly on a file path destroys existing sheets - How to use
ExcelWritercorrectly to write multiple sheets in one pass - How to append new sheets to an Excel file that already exists on disk
- How to preserve existing sheets when writing into a workbook with openpyxl
- Which common mistakes will silently eat your data even after you think you've fixed the problem
The Problem: Only the Last Sheet Survives
Here's the pattern that causes the issue. It looks perfectly reasonable at first glance:
import pandas as pd
df1 = pd.DataFrame({"col": [1, 2, 3]})
df2 = pd.DataFrame({"col": [4, 5, 6]})
df3 = pd.DataFrame({"col": [7, 8, 9]})
df1.to_excel("output.xlsx", sheet_name="Sheet1")
df2.to_excel("output.xlsx", sheet_name="Sheet2")
df3.to_excel("output.xlsx", sheet_name="Sheet3")
Open output.xlsx and you'll find exactly one sheet: Sheet3. Sheet1 and Sheet2 were overwritten each time a new call was made. This isn't a bug β it's exactly what the API is designed to do.
Why to_excel() Overwrites Everything
Each call to df.to_excel("output.xlsx") creates a brand-new ExcelWriter object internally, opens the file path for writing, writes the single sheet, and closes the file. "Opens for writing" means it creates the file from scratch. Any previous content is gone before the new content is even written.
Think of it like calling open("file.txt", "w") three times in a row. Each call truncates the file. The fact that you're specifying a different sheet_name each time doesn't change that β each call is operating on a fresh workbook with no memory of what came before.
The underlying engines (openpyxl for .xlsx, xlwt for legacy .xls) work at the workbook level. When you open a new workbook, it starts empty. There's no mechanism to "add a sheet" by calling to_excel on a path string multiple times.
The Fix: Use ExcelWriter as a Context Manager
The correct approach is to create one ExcelWriter object and pass all your DataFrames to it before it closes. Using it as a context manager (with with) guarantees the file is saved and closed exactly once, after all sheets have been written.
import pandas as pd
df1 = pd.DataFrame({"col": [1, 2, 3]})
df2 = pd.DataFrame({"col": [4, 5, 6]})
df3 = pd.DataFrame({"col": [7, 8, 9]})
with pd.ExcelWriter("output.xlsx", engine="openpyxl") as writer:
df1.to_excel(writer, sheet_name="Sheet1", index=False)
df2.to_excel(writer, sheet_name="Sheet2", index=False)
df3.to_excel(writer, sheet_name="Sheet3", index=False)
Now all three sheets are present when you open the file. The with block keeps the writer open until all three to_excel calls complete, then flushes and closes the file once. Set index=False unless you actually want the DataFrame index written as a column β it's off by default in most reporting use cases.
Writing Multiple DataFrames in a Loop
The same pattern works cleanly when you're iterating over a collection of DataFrames. A common real-world scenario is splitting a large dataset by category and writing each slice to its own sheet.
import pandas as pd
# Simulate a dataset with a category column
data = pd.DataFrame({
"region": ["North", "North", "South", "South", "East"],
"sales": [120, 95, 200, 175, 88]
})
with pd.ExcelWriter("sales_by_region.xlsx", engine="openpyxl") as writer:
for region, group_df in data.groupby("region"):
group_df.to_excel(writer, sheet_name=region, index=False)
Each iteration calls to_excel on the same open writer object. The writer accumulates sheets and writes the complete workbook when the with block exits. No sheets are lost.
One caveat: Excel sheet names have a 31-character limit. If your category values are longer than that, you'll get an error. Truncate the sheet name before passing it: sheet_name=str(region)[:31].
Appending to an Already-Existing File
Sometimes the file already exists on disk and you want to add new sheets without touching the ones already there. Opening with the default mode creates a fresh file, so you need to tell ExcelWriter to open in append mode instead.
import pandas as pd
new_df = pd.DataFrame({"metric": ["revenue", "cost"], "value": [5000, 3200]})
with pd.ExcelWriter(
"output.xlsx",
engine="openpyxl",
mode="a", # append mode
if_sheet_exists="new" # required in pandas >= 1.4 with mode='a'
) as writer:
new_df.to_excel(writer, sheet_name="Summary", index=False)
The mode="a" argument tells openpyxl to load the existing workbook rather than create a new one. The if_sheet_exists parameter (introduced in pandas 1.4) controls what happens if the target sheet name already exists in the file. Set it to "new" to create a uniquely named sheet, "replace" to overwrite just that sheet, or "overlay" to write data on top of existing content without clearing the sheet first.
Preserving Existing Sheets With openpyxl
If you need finer control β for example, you want to update two specific sheets while leaving five others untouched β you can load the workbook directly with openpyxl and pass it to ExcelWriter via the book attribute. This technique was common before mode="a" was available, and it still works for edge cases that append mode doesn't cover cleanly.
import pandas as pd
from openpyxl import load_workbook
path = "output.xlsx"
# Load the existing workbook
book = load_workbook(path)
with pd.ExcelWriter(path, engine="openpyxl") as writer:
writer.book = book
# Rebuild the sheet map so pandas knows what already exists
writer.sheets = {ws.title: ws for ws in book.worksheets}
updated_df = pd.DataFrame({"status": ["done", "pending"]})
updated_df.to_excel(writer, sheet_name="Status", index=False)
This pattern manually wires the existing workbook into the writer before any to_excel call is made. All sheets not touched during the with block remain exactly as they were. Note that in newer versions of pandas (1.4+) the preferred path is mode="a" with if_sheet_exists, but the manual approach still works and gives you programmatic control over which sheets to keep or remove.
If you've run into other issues with openpyxl and workbook state, the article on fixing openpyxl merged cells that lose data on re-open covers a related class of problems around how openpyxl handles existing workbook content.
Common Pitfalls and Gotchas
Calling .save() or .close() manually
If you use ExcelWriter without the with statement, you must call writer.close() (or the older writer.save()) explicitly. Forgetting this means the file either isn't written at all or is written in a corrupted state. The context manager handles this for you, which is why it's the recommended approach.
Duplicate sheet names
Passing the same sheet_name twice within a single ExcelWriter session will raise a ValueError in recent pandas versions, or silently create a sheet with a numeric suffix in older ones. Make sure your sheet names are unique before entering the loop. A quick check: assert len(names) == len(set(names)).
Engine mismatch
Using engine="xlsxwriter" does not support mode="a". xlsxwriter can only create new files. If you need to append sheets to an existing file, you must use engine="openpyxl". Mixing engines between read and write operations on the same file is a reliable source of confusion β pick openpyxl and stay consistent.
Forgetting to install the engine
pandas does not ship with openpyxl or xlsxwriter. If you get an ImportError when creating an ExcelWriter, install the missing package:
pip install openpyxl
Writing to a file that's open in Excel
On Windows, Excel locks the file while it's open. Writing to a locked file raises a PermissionError. Close the file in Excel before running your script. This is a filesystem-level constraint, not a pandas issue, so no amount of flag-tweaking will work around it.
Index column appearing unexpectedly
By default, to_excel writes the DataFrame index as the first column. If your index is just 0, 1, 2, ..., that's almost certainly not data you want in the sheet. Pass index=False to suppress it. This is the same problem you'd encounter when writing pivot table output to Excel where the index carries the row labels.
Large files and memory
openpyxl loads the entire workbook into memory in normal mode. For files with hundreds of thousands of rows across multiple sheets, this can exhaust available RAM. In that scenario, consider xlsxwriter (write-only, no append support) or write each sheet to a separate file and merge them with openpyxl's read-only / write-only cell modes. For a related look at how pandas handles data irregularities before writing, see fixing pandas resample that returns NaN for irregular time series.
One more subtle issue: if you're running this code in a data pipeline that also does complex merges before writing, double-check that your source DataFrames don't carry unexpected duplicate rows. The article on fixing pandas merge() that creates duplicate rows from non-unique keys walks through how to catch that before it inflates your Excel output.
Wrapping Up
The overwrite problem comes down to one thing: each bare call to df.to_excel("file.xlsx") creates a new workbook from scratch. Once you route all writes through a single ExcelWriter context manager, the problem disappears.
Here are the concrete actions to take right now:
- Replace every multi-call pattern with a single
with pd.ExcelWriter(...) as writer:block. - Add
engine="openpyxl"explicitly β don't rely on pandas to pick the right default, especially if your environment has multiple engines installed. - Use
mode="a"withif_sheet_existswhen you need to add sheets to a file that already exists on disk, rather than loading the book manually. - Truncate sheet names to 31 characters before passing them in, particularly when names come from user data or category values.
- Pass
index=Falseunless your index is meaningful data that belongs in the sheet.
Frequently Asked Questions
Why does pandas to_excel only keep the last sheet when writing to the same file?
Each call to df.to_excel('file.xlsx') creates a brand-new workbook from scratch and overwrites whatever was there before. To write multiple sheets, you must use a single pd.ExcelWriter context manager and pass all DataFrames to it before it closes.
How do I append a new sheet to an existing Excel file with pandas without deleting the other sheets?
Open ExcelWriter with mode='a' (append) and engine='openpyxl', then use the if_sheet_exists parameter to control what happens if the sheet name already exists. This loads the existing workbook rather than creating a new one, so all untouched sheets are preserved.
Can I use xlsxwriter to append sheets to an existing Excel file in pandas?
No. xlsxwriter only supports creating new files and does not have an append mode. If you need to add sheets to an existing workbook, you must use engine='openpyxl' with mode='a'.
How do I write multiple DataFrames to separate sheets in one Excel file using pandas?
Wrap all your to_excel calls inside a single 'with pd.ExcelWriter(path, engine="openpyxl") as writer:' block and pass the writer object instead of a file path. Each call can specify a different sheet_name, and the complete workbook is saved when the block exits.
What does the if_sheet_exists parameter do in pandas ExcelWriter?
Introduced in pandas 1.4, if_sheet_exists controls behavior when you try to write to a sheet name that already exists in append mode. Setting it to 'replace' overwrites that sheet, 'new' creates a sheet with a unique name, and 'overlay' writes data on top of existing content without clearing the sheet first.
π€ Share this article
Sign in to saveRelated Articles
Comments (0)
No comments yet. Be the first!