Understanding Bazaar's Architecture

Revision Graph and Ghost Revisions

Bazaar represents history as a directed acyclic graph (DAG). Each revision references its parent(s) by revision ID. A "ghost revision" is a revision ID that some commit names as a parent but whose data is absent from the current repository. Ancestry traversal stops at a ghost, so merges that need a common ancestor beyond it can fail or behave unexpectedly.
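To make the graph structure concrete, here is a minimal sketch in plain Python. It does not use bzrlib; the parent map, the revision names, and the walk_ancestry helper are all hypothetical, chosen only to show how a walk over parent links runs into a ghost.

```python
from collections import deque

def walk_ancestry(parent_map, tip):
    """Breadth-first walk from tip; return (present, ghosts).

    parent_map maps a revision id to a tuple of parent ids. An id that is
    named as a parent but has no entry of its own is treated as a ghost.
    """
    present, ghosts = [], []
    seen = {tip}
    queue = deque([tip])
    while queue:
        rev = queue.popleft()
        if rev not in parent_map:
            ghosts.append(rev)   # referenced, but no data stored for it
            continue             # the walk cannot go past a ghost
        present.append(rev)
        for parent in parent_map[rev]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return present, ghosts

# Hypothetical history: rev-3 merges rev-2 and a revision that never arrived.
history = {
    "rev-1": (),
    "rev-2": ("rev-1",),
    "rev-3": ("rev-2", "feature-x"),
}
present, ghosts = walk_ancestry(history, "rev-3")
print(present)  # ['rev-3', 'rev-2', 'rev-1']
print(ghosts)   # ['feature-x']
```

Anything that lies behind feature-x is invisible to the walk, which is why a merge that needs a common ancestor on that side of the graph has nothing to work with.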

Distributed Nature and Pull Semantics

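Unlike Git, where fetch and merge are separate steps, bzr pull both fetches revisions and moves the local branch tip to mirror the source, and it refuses diverged branches unless --overwrite is given. Teams that do not fully synchronize branches before pushing can therefore end up with inconsistent history. Missing revisions commonly arise during partial pulls or interrupted synchronizations.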

Root Causes

1. Partial Clone or Incomplete Pulls

If a developer clones or pulls from a shallow or interrupted source, their local branch may include references to parent revisions that never arrived, and those references become ghost entries.

2. Branch Divergence and Offline Commits

Developers making commits in disconnected environments (e.g., laptops without remote access) and later pushing without reconciling ancestry can introduce orphaned history segments.

3. Misuse of Git Interop Tools

Using bzr fast-import or bzr-git bridges without mapping all parents correctly can create revisions that reference Git commits not available in Bazaar's history.

Diagnostics

Using bzr log --show-ids

The --show-ids option prints each revision's ID along with the IDs of its parents. Ghost revisions show up as parent IDs that remain unknown or unresolved anywhere else in the ancestry chain.

bzr log --show-ids --long
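One way to spot unresolved IDs without reading the log by eye is to parse it. The sketch below assumes the long log format prints "revision-id:" and "parent:" lines, which should be confirmed against your bzr version; the helper names and the sample IDs are hypothetical.

```python
import re

# Assumed line shapes from `bzr log --show-ids --long`; verify locally.
_REV = re.compile(r"^\s*revision-id:\s*(\S+)")
_PARENT = re.compile(r"^\s*parent:\s*(\S+)")

def parents_from_log(text):
    """Return {revision_id: [parent_ids]} parsed from log text."""
    parent_map = {}
    current = None
    for line in text.splitlines():
        m = _REV.match(line)
        if m:
            current = m.group(1)
            parent_map[current] = []
            continue
        m = _PARENT.match(line)
        if m and current is not None:
            parent_map[current].append(m.group(1))
    return parent_map

def unresolved_parents(parent_map):
    """Parent ids that never appear as a revision-id in the same log."""
    named = {p for parents in parent_map.values() for p in parents}
    return sorted(named - set(parent_map))

# Hypothetical log excerpt with one parent that is never listed itself.
SAMPLE = """\
------------------------------------------------------------
revno: 2
revision-id: dev@example.com-20100102-bbbb
parent: dev@example.com-20100101-aaaa
parent: dev@example.com-20091231-zzzz
message:
  merge feature
------------------------------------------------------------
revno: 1
revision-id: dev@example.com-20100101-aaaa
message:
  initial import
"""
print(unresolved_parents(parents_from_log(SAMPLE)))
```

An unresolved ID here is only a candidate. If the log was limited to mainline revisions, merged parents that do exist will also look unresolved, so it is safer to feed in a log that includes merged revisions (for example with -n0). Commit messages that happen to contain a "parent:" line can cause false positives too.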

Detecting Ghosts Programmatically

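Use the Python API (bzrlib) or plugins to identify missing ancestors. Graph.get_parent_map silently omits ghosts from its result, so the example below walks the ancestry with iter_ancestry, which yields (revision_id, parents) pairs and reports a ghost with parents set to None:

from bzrlib import branch as _branch
br = _branch.Branch.open(".")
br.lock_read()  # graph queries need a read lock
try:
    graph = br.repository.get_graph()
    ghosts = [rev for rev, parents in graph.iter_ancestry([br.last_revision()]) if parents is None]
finally:
    br.unlock()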
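Knowing which ghosts exist is only half the picture; it also helps to know which commits name them as parents. The following sketch works on plain (revision_id, parents) pairs, with parents set to None for a ghost. That shape is an assumption modeled on what bzrlib's Graph.iter_ancestry is documented to yield; the ghost_referrers helper and the sample data are hypothetical and run without bzrlib.

```python
from collections import defaultdict

def ghost_referrers(ancestry):
    """Map each ghost id to the sorted revisions that list it as a parent.

    ancestry is an iterable of (revision_id, parents) pairs, where parents
    is None for a ghost (the assumed iter_ancestry-style shape).
    """
    pairs = list(ancestry)
    ghosts = {rev for rev, parents in pairs if parents is None}
    referrers = defaultdict(list)
    for rev, parents in pairs:
        for parent in parents or ():
            if parent in ghosts:
                referrers[parent].append(rev)
    return {ghost: sorted(revs) for ghost, revs in referrers.items()}

# Hypothetical ancestry in which rev-3 names a ghost as its second parent.
sample = [
    ("rev-3", ("rev-2", "feature-x")),
    ("rev-2", ("rev-1",)),
    ("feature-x", None),
    ("rev-1", ()),
]
print(ghost_referrers(sample))  # {'feature-x': ['rev-3']}
```

The referring revisions are the natural starting points for recovery, since they identify where the missing history was expected to join the branch.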

Step-by-Step Recovery

1. Identify Affected Branches

Run bzr heads (provided by the bzrtools plugin) and bzr missing across branches to isolate divergence points and missing revisions.

2. Re-pull from Canonical Sources

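Fetch full history from the mainline branch where the missing revisions originated. Pulling fetches any revisions absent from the local repository, which often restores the missing ancestry. Note that --overwrite discards local revisions that are not in the source, so make a backup branch first if you have unmerged local work.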

bzr pull --overwrite ../canonical-branch

3. Manual Rebase or Patch Cherry-Picking

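If ghosts cannot be resolved via pull, export diffs and re-apply patches manually to a clean base. A ghost has no stored tree to diff against, so start the range at the last revision that is actually present (last_good_rev below is a placeholder). Write the patch to the parent directory so it can be found from the clean branch, and run bzr add before committing if the patch creates new files:

bzr diff -r last_good_rev..current_rev > ../patch.diff
cd ../clean-branch
patch -p0 < ../patch.diff
bzr commit -m "Reapplied patch sans ghost ancestry"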

Best Practices to Avoid Ghost Revisions

  • Enforce synchronized pulls before commit or push in shared repositories.
  • Document and enforce usage of canonical mainline branches.
  • Use mirroring scripts to verify consistency across distributed repos.
  • Avoid mixed usage of interop tools unless fully understood.
  • Run CI checks to validate revision graph integrity before merging branches.

Conclusion

Ghost revisions in Bazaar are a product of DVCS flexibility combined with inconsistent workflows. While Bazaar is no longer mainstream, it persists in legacy environments, and its issues can silently break important delivery pipelines. Understanding its internal graph structure, using diagnostic tools effectively, and establishing disciplined branching strategies are key to maintaining long-term repo health. For organizations with critical codebases in Bazaar, investing in migration planning or robust automation layers is recommended.

FAQs

1. What is a ghost revision in Bazaar?

A ghost revision is a revision that a commit references as a parent but that is not present in the local repository. The gap breaks merge logic and revision ancestry.

2. Can ghost revisions cause data loss?

Not directly, but they can prevent merges and obscure commit ancestry, making recovery and collaboration error-prone if not addressed.

3. How do I safely migrate from Bazaar to Git without ghosts?

Use bzr fast-export (from the bzr-fastimport plugin) with full history and verify repository consistency with bzr check before migration. Avoid partial export/imports.

4. Are there tools to auto-fix ghost revisions?

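No tool can recreate a revision whose data is missing. bzr reconcile can correct metadata mismatches left behind by earlier ghost operations, but manual diagnosis and patch application remain the most reliable path unless full ancestry can be restored from source.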

5. Does Launchpad still support Bazaar?

Yes, but with limited updates. Organizations relying on Launchpad + Bazaar should consider long-term migration plans to Git or other actively maintained systems.