Fixing openpyxl: Stop It From Overwriting Your Excel Sheets on Save
You run your Python script, open the Excel file, and half your data is gone. The sheets you carefully built yesterday have been replaced, and the only thing left is whatever your script wrote this time. If you've hit this problem with openpyxl, you're not alone β it's one of the most common traps when writing Excel automation scripts.
The good news is the fix is straightforward once you understand what's actually happening under the hood.
What you'll learn
- Why
openpyxloverwrites sheets when you save - How to load an existing workbook instead of creating a new one
- How to write to a specific sheet without touching the others
- How to add a new sheet to an existing file safely
- Common mistakes that still cause data loss even after you think you've fixed it
Prerequisites
You'll need Python 3.7 or newer and openpyxl installed. If you haven't installed it yet, run:
pip install openpyxlThe examples below assume you have an existing .xlsx file on disk. The patterns apply whether you're working with a small reporting script or a larger data pipeline.
Why openpyxl Overwrites in the First Place
The root cause is almost always this pattern:
from openpyxl import Workbook
wb = Workbook() # creates a brand-new, empty workbook
ws = wb.active
ws['A1'] = 'Hello'
wb.save('report.xlsx') # overwrites whatever was there beforeWhen you call Workbook(), you're creating a fresh workbook in memory. It has no knowledge of the file that already exists on disk. When you then call wb.save('report.xlsx'), you're writing that empty-except-for-your-new-data workbook over the top of the original file. Everything that was in the old file is gone.
This isn't a bug β it's the expected behavior of creating a new object. The fix is to stop creating a new workbook and start loading the existing one.
The Core Fix: Use load_workbook Instead of Workbook()
openpyxl provides load_workbook() specifically for opening files that already exist on disk. It reads the entire workbook into memory, preserving all sheets, formatting, and data.
from openpyxl import load_workbook
wb = load_workbook('report.xlsx') # loads the existing file
ws = wb.active
ws['A1'] = 'Updated value'
wb.save('report.xlsx') # saves back to the same fileNow when you save, you're writing the loaded workbook β including all its existing sheets β back to disk. Only the cell you explicitly changed is different.
Writing to a Specific Sheet Without Touching Others
If your workbook has multiple sheets and you only want to update one, reference it by name rather than using wb.active. The active sheet is just whichever sheet was last selected when the file was saved, which may not be what you want.
from openpyxl import load_workbook
wb = load_workbook('report.xlsx')
# Reference a specific sheet by name
ws = wb['Sales Data']
ws['B2'] = 9500
wb.save('report.xlsx')If the sheet name you reference doesn't exist, openpyxl will raise a KeyError. You can check what sheets are available with wb.sheetnames, which returns a list of sheet name strings.
print(wb.sheetnames)
# ['Sales Data', 'Summary', 'Raw Data']Adding a New Sheet to an Existing Workbook
Sometimes you want to append a new sheet to a file without touching any of the existing ones. Load the workbook first, then create the new sheet.
from openpyxl import load_workbook
wb = load_workbook('report.xlsx')
# Create a new sheet (appended at the end by default)
new_ws = wb.create_sheet(title='Q3 Results')
new_ws['A1'] = 'Quarter'
new_ws['B1'] = 'Revenue'
new_ws['A2'] = 'Q3'
new_ws['B2'] = 42000
wb.save('report.xlsx')The existing sheets are untouched. You can also control where the new sheet appears by passing a position argument: wb.create_sheet(title='Q3 Results', index=0) inserts it as the first sheet.
Replacing Sheet Contents Without Deleting the Sheet
If you need to clear a sheet and write fresh data to it β without deleting and recreating it β iterate over the rows and clear the values, or overwrite cell by cell. A clean approach is to delete and recreate the sheet at the same position, then repopulate it.
from openpyxl import load_workbook
wb = load_workbook('report.xlsx')
sheet_name = 'Sales Data'
position = wb.sheetnames.index(sheet_name)
# Remove the old sheet
del wb[sheet_name]
# Recreate it at the same position
new_ws = wb.create_sheet(title=sheet_name, index=position)
# Write fresh data
headers = ['Product', 'Units', 'Revenue']
for col_num, header in enumerate(headers, start=1):
new_ws.cell(row=1, column=col_num, value=header)
wb.save('report.xlsx')This preserves the sheet's position in the tab order, which matters if other sheets have named references or if users expect a consistent layout.
Handling the Case Where the File May Not Exist Yet
A common scenario in reporting scripts: the first run creates the file, and every subsequent run updates it. You need to handle both cases gracefully.
import os
from openpyxl import Workbook, load_workbook
filepath = 'report.xlsx'
if os.path.exists(filepath):
wb = load_workbook(filepath)
ws = wb['Sales Data']
else:
wb = Workbook()
ws = wb.active
ws.title = 'Sales Data'
# Write your data
ws.append(['Product A', 150, 7500])
wb.save(filepath)The os.path.exists() check is the simplest approach. If you're in an environment where the file might be created concurrently by another process, you'll want more robust file locking, but for most scripting use cases this is sufficient.
Common Pitfalls That Still Cause Data Loss
Saving to a different filename by accident
Double-check that the filename you pass to load_workbook() and wb.save() are the same. If they differ β even by a typo or a path separator β you end up with a new file and the original is untouched (but your script thinks it updated it).
Using Workbook() and then saving with the existing filename
This is the original mistake. If any part of your code still calls Workbook() and saves to the target path, you'll overwrite. Search your script for any remaining Workbook() calls that aren't intentional.
Opening the file in Excel at the same time
On Windows, Excel locks .xlsx files when they're open. openpyxl may raise a PermissionError on save. Close the file in Excel before running your script, or save to a temporary file and rename it afterward.
The read_only flag
load_workbook() has a read_only parameter that defaults to False. If you've copied code from a snippet that sets read_only=True, you'll get an error when you try to save. Only use read_only=True when you genuinely only need to read the file and want the performance benefit of not loading the full structure.
# This will raise an error on save β don't do this if you need to write
wb = load_workbook('report.xlsx', read_only=True)
wb.save('report.xlsx') # AttributeError: 'ReadOnlyWorkbook' has no attribute 'save'Forgetting to call wb.save()
All your changes live in memory until you call wb.save(). If an exception is raised before that line, none of your changes are written to disk. Wrap your save in a try/finally block if partial writes would leave the file in a bad state, or write to a temporary file and rename on success.
A Complete Working Example
Here's a script that ties everything together: it loads an existing workbook, updates one sheet, adds a new one, and saves safely.
import os
from openpyxl import Workbook, load_workbook
from openpyxl.utils import get_column_letter
filepath = 'monthly_report.xlsx'
# Load or create
if os.path.exists(filepath):
wb = load_workbook(filepath)
else:
wb = Workbook()
wb.active.title = 'Summary'
# Update the Summary sheet
if 'Summary' in wb.sheetnames:
ws_summary = wb['Summary']
else:
ws_summary = wb.create_sheet('Summary')
ws_summary['A1'] = 'Last Updated'
ws_summary['B1'] = '2024-07'
# Add a new data sheet if it doesn't already exist
if 'July Data' not in wb.sheetnames:
ws_data = wb.create_sheet('July Data')
headers = ['Date', 'Product', 'Sales']
for i, h in enumerate(headers, start=1):
ws_data.cell(row=1, column=i, value=h)
# Save back to the same file
wb.save(filepath)
print(f'Saved {filepath} β sheets: {wb.sheetnames}')Wrapping Up
The overwrite problem comes down to one thing: using Workbook() when you should be using load_workbook(). Once you have the right object in memory, all the standard read and write operations work as expected without touching sheets you didn't intend to modify.
Here are the concrete next steps to take:
- Audit your script for any
Workbook()call that targets an existing file path, and replace it withload_workbook(). - Add a filename guard using
os.path.exists()so your script handles both the first run and subsequent runs without branching logic errors. - Reference sheets by name rather than
wb.activeso your script is explicit about what it's changing. - Test with a copy of your real file before running any destructive update on production data β keep the original as a backup until you're confident the script is correct.
- Read the openpyxl documentation on
load_workbookparameters, particularlydata_only, which controls whether formula cells return their cached values or the formula string itself.
π€ Share this article
Sign in to saveRelated Articles
Comments (0)
No comments yet. Be the first!