Fix Silent Data Loss When Merging Upstream in Forked Repos

You sync your fork with upstream, the merge completes without a single conflict, and everything looks fine. Then three days later a colleague notices that a feature you shipped two weeks ago has simply vanished from the branch. No error, no warning — just gone.

This is one of the more disorienting problems in collaborative Git workflows because the tooling gives you no signal that anything went wrong. Understanding why it happens and building a repeatable process to catch it is the difference between a trustworthy contribution workflow and a repo that quietly eats work.

What you'll learn

Why Git can drop your commits silently during an upstream merge
How to detect lost commits before they reach a shared branch
Which merge strategies are safe and which are dangerous for forks
A repeatable sync workflow that protects your history
Common pitfalls maintainers and contributors both miss

Prerequisites

This article assumes you are comfortable with basic Git commands (commit, merge, rebase, log) and that you have a forked repository with a configured upstream remote. Examples use Git 2.x on the command line. The concepts apply to any hosting platform (GitHub, GitLab, Gitea).

Why Silent Loss Happens

When you merge upstream into your fork's main branch, Git performs a three-way merge. It finds the common ancestor of your branch and the upstream branch, then applies changes from both sides. The problem is what Git considers a "change."

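If upstream rewrote a file in a way that encompasses or reverts what you changed, Git can resolve the merge cleanly by accepting upstream's version. This happens when, relative to the common ancestor, only upstream changed that region, for example because your change was already part of the shared history. Your commit's effect disappears even though the commit itself is still in the branch history. The merge commit is legitimate, but the content you added is no longer in the files.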

This scenario is especially common when:

Upstream did a large refactor that touched the same files you modified
Upstream squash-merged a branch that effectively reverted interim history
You rebased your fork's branch onto a different base commit before syncing
Upstream force-pushed to rewrite history (rare, but it happens on younger projects)
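
To make the mechanism concrete, here is a minimal sketch that reproduces a conflict-free merge which removes a feature. It is a demo under stated assumptions, not a description of any particular project: it builds a throwaway repository, uses a local branch named upstream as a stand-in for upstream/main, and assumes the feature line was already part of the shared history before upstream refactored it away. The file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical demo: a local branch "upstream" stands in for upstream/main.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.email "demo@example.com"
git config user.name "Demo User"
git config commit.gpgsign false

# Shared history: the feature line exists on both sides of the future merge.
printf 'core logic\nexport feature\n' > app.txt
git add app.txt
git commit -q -m "Add core logic and export feature"

# "upstream" refactors the file and drops the feature line.
git checkout -q -b upstream
printf 'core logic (refactored)\n' > app.txt
git commit -q -am "Refactor app"

# The fork makes an unrelated change, so the sync is a true merge.
git checkout -q main
echo "fork notes" > NOTES.txt
git add NOTES.txt
git commit -q -m "Add fork notes"

# No conflict is reported because only upstream touched app.txt.
git merge -q --no-edit upstream

if grep -q "export feature" app.txt; then
  feature_present=yes
else
  feature_present=no
fi
echo "feature present after merge: ${feature_present}"
```

The merge succeeds without a conflict, yet the feature line is gone from app.txt, which is the situation the rest of this article is about.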

Confirming the Problem: How to Detect Lost Commits

Before you can fix anything, you need to know which commits are missing. Git gives you two reliable tools for this.

Using git log with symmetric difference

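The two-dot range (A..B) lists commits reachable from B but not from A, so running it in both directions shows what each side has that the other lacks. The three-dot symmetric difference (A...B) combines both lists into one; add --left-right to mark which side each commit belongs to.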

# Commits on your branch not reachable from upstream/main
git log upstream/main..HEAD --oneline

# Commits on upstream/main not reachable from your branch
git log HEAD..upstream/main --oneline

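Run the first command before the sync so you have a baseline list of your commits, then run it again afterward. After a merge, every baseline commit should still be listed, together with the new merge commit. After a rebase the SHAs change, so compare commit subjects instead. If a commit is missing from the second list, or it is still listed but its changes are no longer in the files, its work was silently dropped.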
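
The three-dot form can be tried in a throwaway repository. The sketch below is illustrative only: a local branch named upstream stands in for upstream/main, and the commit subjects and file names are made up for the demo.

```shell
#!/bin/sh
# Hypothetical demo: a local branch "upstream" stands in for upstream/main.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.email "demo@example.com"
git config user.name "Demo User"
git config commit.gpgsign false

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
git branch upstream

# Two commits that exist only on the fork.
echo "a" > a.txt; git add a.txt; git commit -q -m "Fork: add a"
echo "b" > b.txt; git add b.txt; git commit -q -m "Fork: add b"

# One commit that exists only upstream.
git checkout -q upstream
echo "u" > u.txt; git add u.txt; git commit -q -m "Upstream: add u"
git checkout -q main

# "<" marks commits only on the left side (upstream), ">" only on the right (main).
git --no-pager log --left-right --oneline upstream...main

# The same comparison as two numbers: left-only count, then right-only count.
counts=$(git rev-list --left-right --count upstream...main)
behind=$(printf '%s\n' "$counts" | cut -f1)
ahead=$(printf '%s\n' "$counts" | cut -f2)
echo "behind upstream by ${behind}, ahead by ${ahead}"
```

Recording the ahead count before and after a sync gives a quick numeric baseline, although it says nothing about whether the content of those commits survived.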

Using git cherry

git cherry compares commits by their patch content rather than their SHA. A commit prefixed with - means an equivalent patch already exists in the upstream; one prefixed with + means it does not.

git cherry -v upstream/main HEAD

Save this output to a file before you sync. After the merge, run it again and diff the two outputs. Any commit that moved from + to absent without you explicitly intending it should be investigated immediately.

# Save pre-merge cherry output
git cherry -v upstream/main HEAD > before-sync.txt

# After syncing...
git cherry -v upstream/main HEAD > after-sync.txt

diff before-sync.txt after-sync.txt
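
The + and - markers are easier to read once you have seen them produced. The following sketch builds a throwaway repository in which upstream has picked up one of the fork's two commits; the local branch named upstream stands in for upstream/main, and all commit subjects and file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical demo: a local branch "upstream" stands in for upstream/main.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.email "demo@example.com"
git config user.name "Demo User"
git config commit.gpgsign false

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
git branch upstream

# Two fork commits.
echo "a" > a.txt; git add a.txt; git commit -q -m "Fork: add a"
echo "b" > b.txt; git add b.txt; git commit -q -m "Fork: add b"

# Upstream adds its own work, then applies the fork's first commit as a new commit.
git checkout -q upstream
echo "u" > u.txt; git add u.txt; git commit -q -m "Upstream: add u"
git cherry-pick main~1 > /dev/null
git checkout -q main

# "-" means an equivalent patch is already upstream, "+" means it is not.
git cherry -v upstream main

# Collapse the markers into one string, oldest commit first.
marks=$(git cherry upstream main | cut -c1 | tr -d '\n')
echo "markers: ${marks}"
```

Here the first fork commit is reported with - because upstream carries the same patch under a different SHA, while the second is reported with + and still needs to reach upstream.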

Safe Upstream Sync Strategies

The way you pull upstream changes determines how much risk you carry. Not all approaches are equal.

Rebase your work on top of upstream

Rebasing replays your commits on top of the latest upstream HEAD. Each commit is re-applied one at a time, so Git is forced to surface conflicts at the exact point where they occur rather than silently resolving them at a higher level.

# Fetch latest upstream
git fetch upstream

# Rebase your branch on upstream main
git rebase upstream/main

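If a commit conflicts, Git pauses and asks you to resolve it manually, so conflicting changes are never discarded without your involvement. One caveat: by default, rebase skips any commit whose patch already exists upstream, so your commit count can legitimately drop. This is the preferred method for feature branches that have not yet been shared widely.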
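
The pause is visible as a non-zero exit status, which makes it easy to handle in a script. The sketch below is a demo in a throwaway repository, with a local branch named upstream standing in for upstream/main and a hypothetical app.txt that both sides edit. It aborts the rebase instead of resolving it, purely to keep the example self-contained.

```shell
#!/bin/sh
# Hypothetical demo: a local branch "upstream" stands in for upstream/main.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.email "demo@example.com"
git config user.name "Demo User"
git config commit.gpgsign false

echo "version 1" > app.txt
git add app.txt
git commit -q -m "Base"
git branch upstream

# The fork and upstream both rewrite the same line.
echo "fork version" > app.txt
git commit -q -am "Fork: edit app"
original_tip=$(git rev-parse HEAD)

git checkout -q upstream
echo "upstream version" > app.txt
git commit -q -am "Upstream: edit app"
git checkout -q main

conflicted=""
if git rebase upstream > /dev/null 2>&1; then
  rebase_result="clean"
else
  rebase_result="conflict"
  # List the files Git could not reconcile on its own.
  conflicted=$(git diff --name-only --diff-filter=U)
  echo "rebase stopped; conflicted files: ${conflicted}"
  # A real sync would fix the files and run "git rebase --continue".
  # The demo aborts instead, which restores the pre-rebase branch tip.
  git rebase --abort
fi

restored_tip=$(git rev-parse HEAD)
echo "result: ${rebase_result}"
```

In a real sync you would edit the conflicted files, stage them, and continue the rebase rather than abort it.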

Merge with --no-ff and inspect the result

If you must merge (for example, because the branch history is already public), use --no-ff to force a merge commit even when a fast-forward is possible. Then immediately audit what happened.

git fetch upstream
git merge --no-ff upstream/main

# Inspect what the merge changed relative to your pre-merge branch tip
git diff HEAD^1 HEAD

Read this diff carefully. (git show HEAD prints a combined diff for merge commits, which is usually empty when the merge was clean, so it is not enough here.) If the diff touches your feature files, treat it as a warning sign: upstream's version may have replaced yours, and you should verify your changes are still present.
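
One way to narrow that audit is to list only the files that you changed and that the merge changed again. The sketch below shows the idea in a throwaway repository; the local branch named upstream stands in for upstream/main, the file names are hypothetical, and the approach (intersecting two git diff --name-only lists) is just one possible heuristic.

```shell
#!/bin/sh
# Hypothetical demo: a local branch "upstream" stands in for upstream/main.
set -eu

repo=$(mktemp -d)
tmp=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.email "demo@example.com"
git config user.name "Demo User"
git config commit.gpgsign false

printf 'line1\nline2\nline3\nline4\nline5\nline6\nline7\n' > app.txt
git add app.txt
git commit -q -m "Base"
git branch upstream

# The fork edits the top of app.txt.
printf 'fork edit\nline2\nline3\nline4\nline5\nline6\nline7\n' > app.txt
git commit -q -am "Fork: edit top of app"

# Upstream edits the bottom of app.txt and adds a file of its own.
git checkout -q upstream
printf 'line1\nline2\nline3\nline4\nline5\nline6\nupstream edit\n' > app.txt
echo "u" > up.txt
git add app.txt up.txt
git commit -q -m "Upstream: edit bottom of app, add up.txt"
git checkout -q main

git merge -q --no-ff --no-edit upstream

# Files the fork changed since the common ancestor.
base=$(git merge-base "HEAD^1" "HEAD^2")
git diff --name-only "$base" "HEAD^1" | sort > "$tmp/mine.txt"

# Files the merge changed relative to the pre-merge tip.
git diff --name-only "HEAD^1" HEAD | sort > "$tmp/merged.txt"

# Files in both lists deserve a manual look.
overlap=$(comm -12 "$tmp/mine.txt" "$tmp/merged.txt")
echo "review these files: ${overlap}"
```

In this demo both edits survive, so the overlap is a prompt to look, not proof of loss. Note that it only covers files you changed since the common ancestor; a feature that was already in the shared history will not appear in the first list.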

Use a dedicated sync branch

Never merge upstream directly into a branch that contains unreviewed work. Create a temporary branch, sync there first, review it, then bring it into your working branch.

git fetch upstream
git checkout -b upstream-sync upstream/main

# Now you have a clean branch representing upstream
# Merge or rebase your feature branch on top of it
git checkout my-feature
git rebase upstream-sync

# Delete the sync branch when done
git branch -d upstream-sync

Recovering Lost Commits

If you have already merged and discovered that commits are missing, the reflog is your recovery mechanism. Git keeps a local record of every position HEAD has been in, including before the merge.

# Show the reflog for HEAD
git reflog

# Find the merge entry. It will look something like:
#   abc1234 HEAD@{3}: merge upstream/main: Merge made by the...
# The entry directly below it (HEAD@{4} in this example) is your branch tip before the merge.

Once you have identified the SHA of your branch before the merge, you can cherry-pick the missing commits back onto your current branch.

# Cherry-pick an inclusive range, from the oldest lost commit up to the pre-merge tip
git cherry-pick <oldest-lost-commit>^..<pre-merge-tip>

Alternatively, you can create a recovery branch from the pre-merge state, verify your commits are there, then selectively apply them.

git checkout -b recovery <pre-merge-tip>
git log --oneline -10

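Reflog entries expire after 90 days by default, and after only 30 days if they are no longer reachable from the branch's current tip (as happens after a rebase), so do not wait too long to recover.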
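
The recovery steps can be rehearsed safely in a throwaway repository. This sketch is illustrative only: the local branch named upstream stands in for upstream/main, the file names are hypothetical, and it relies on the merge being the most recent HEAD movement, so that HEAD@{1} is the pre-merge tip. In a real repository, read the reflog and pick the correct entry by hand.

```shell
#!/bin/sh
# Hypothetical demo: a local branch "upstream" stands in for upstream/main.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.email "demo@example.com"
git config user.name "Demo User"
git config commit.gpgsign false

printf 'core logic\nexport feature\n' > app.txt
git add app.txt
git commit -q -m "Add core logic and export feature"

# Upstream drops the feature line.
git checkout -q -b upstream
printf 'core logic (refactored)\n' > app.txt
git commit -q -am "Refactor app"

# The fork adds unrelated work, then merges upstream.
git checkout -q main
echo "fork notes" > NOTES.txt
git add NOTES.txt
git commit -q -m "Add fork notes"
before=$(git rev-parse HEAD)
git merge -q --no-edit upstream

# The newest entry is the merge; the one below it is the pre-merge tip.
git --no-pager reflog -2

# Pin the pre-merge state to a branch so it cannot expire.
git branch recovery 'HEAD@{1}'
recovered=$(git rev-parse recovery)

# Confirm the lost content is still available on the recovery branch.
if git show recovery:app.txt | grep -q "export feature"; then
  feature_on_recovery=yes
else
  feature_on_recovery=no
fi
echo "feature available on recovery branch: ${feature_on_recovery}"
```

Creating the branch first and inspecting it afterward means nothing on your working branch changes until you decide what to bring back.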

Automating the Safety Check

A pre-merge script that captures the commit list and compares it afterward removes the human memory requirement from this process. Here is a minimal shell script you can drop into your project's tooling.

#!/usr/bin/env bash
# sync-upstream.sh — safe upstream sync with commit audit

set -euo pipefail

UPSTREAM_REMOTE="${1:-upstream}"
UPSTREAM_BRANCH="${2:-main}"
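# Resolve the Git directory so the script also works from a subdirectory or a worktree
SNAPSHOT_FILE="$(git rev-parse --git-dir)/pre-sync-cherry.txt"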

echo "Capturing pre-sync commit list..."
git cherry -v "${UPSTREAM_REMOTE}/${UPSTREAM_BRANCH}" HEAD > "${SNAPSHOT_FILE}"

echo "Fetching upstream..."
git fetch "${UPSTREAM_REMOTE}"

echo "Rebasing onto ${UPSTREAM_REMOTE}/${UPSTREAM_BRANCH}..."
git rebase "${UPSTREAM_REMOTE}/${UPSTREAM_BRANCH}"

echo "Verifying commits after sync..."
git cherry -v "${UPSTREAM_REMOTE}/${UPSTREAM_BRANCH}" HEAD > /tmp/post-sync-cherry.txt

PRE_COUNT=$(wc -l < "${SNAPSHOT_FILE}")
POST_COUNT=$(wc -l < /tmp/post-sync-cherry.txt)

if [ "${POST_COUNT}" -lt "${PRE_COUNT}" ]; then
  echo "WARNING: Commit count dropped from ${PRE_COUNT} to ${POST_COUNT}. Review the diff:"
  diff "${SNAPSHOT_FILE}" /tmp/post-sync-cherry.txt
  exit 1
fi

echo "Sync complete. Commit count unchanged (${POST_COUNT})."

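Run it as bash sync-upstream.sh upstream main. If the commit count drops, the script exits non-zero and prints the diff. Because a rebase rewrites commit SHAs, the hashes will differ on every line of that diff; compare the commit subjects to find the entry that went missing.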

Common Pitfalls

Trusting a clean merge as proof of correctness

A merge with zero conflicts is not a guarantee that your content survived intact. Upstream could have simply overwritten the same section in a way that happens to be a strict superset of your changes — or an outright replacement. Always verify the diff of the merge commit touches what you expect.

Squash-merging your own feature branch before syncing

If you squash your feature branch into a single commit and then rebase onto upstream, git cherry may not recognise the squashed commit as equivalent to the original commits. You can end up in a state where Git cannot tell what is yours and what is upstream's. Prefer to squash only after the sync is confirmed clean.

Forgetting to update the upstream remote URL

When an upstream project moves (organisation rename, domain change), your remote URL goes stale silently. You keep fetching from an old mirror and never see the real upstream commits. Run git remote -v periodically and compare against the canonical repository URL.
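
Checking and correcting the URL takes two commands. The sketch below demonstrates them in a throwaway setup, where two local bare repositories stand in for the old and new upstream locations; both paths are hypothetical placeholders for real URLs.

```shell
#!/bin/sh
# Hypothetical demo: local bare repositories stand in for real upstream URLs.
set -eu

work=$(mktemp -d)
git init -q --bare "$work/old-upstream.git"
git init -q --bare "$work/new-upstream.git"
git init -q "$work/fork"
cd "$work/fork"

# The fork was set up while the project still lived at the old location.
git remote add upstream "$work/old-upstream.git"

# Step 1: see where each remote currently points.
git remote -v

# Step 2: repoint the remote at the canonical location.
git remote set-url upstream "$work/new-upstream.git"

current_url=$(git remote get-url upstream)
echo "upstream now points at: ${current_url}"
```

After changing the URL, run git fetch upstream and confirm that new commits arrive before you rely on the remote again.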

Relying on GitHub's "Sync fork" button without auditing

The platform's one-click sync is convenient, but it merges upstream into your default branch without any pre-merge snapshot. Use it only on branches that contain no original work, or always pull the result locally and run git cherry afterward.

Wrapping Up

Silent data loss in forked repos is almost always preventable with a small amount of process discipline. Here are the concrete actions to take right now:

Run git cherry -v upstream/main HEAD before every upstream sync and save the output. Compare it after the sync completes.
Switch to rebase-based syncing (git rebase upstream/main) for feature branches that are not yet public. It surfaces conflicts where they belong instead of hiding them in a merge commit.
Read the diff of every merge commit that touches files you own before pushing. A merge commit that modifies your files is always worth a second look.
Add the sync script above (or an equivalent) to your project tooling so the audit runs automatically and your team does not rely on memory.
Check your reflog immediately if you suspect a loss. Entries last 90 days by default (30 days once they are no longer reachable from the branch's current tip), but the sooner you act the simpler the cherry-pick will be.
