How Git Diffs Work

Intermediate
8 minutes · 4.7 · Git

The Hook (The "Byte-Sized" Intro)

Git doesn't store diffs. It stores snapshots. Each commit points to a complete picture of every tracked file. So how does git diff work? Git computes diffs on the fly by comparing two tree objects: it walks them side by side, compares blob SHAs, and produces the output you see. This means any two commits can be diffed, in either order, without replaying the history between them.

📖 How Do Git Diffs Work?

Git computes diffs dynamically by comparing tree and blob objects between two snapshots. It doesn't store diffs — it generates them on demand.
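To see the "snapshots, not diffs" idea in practice, here is a minimal sketch that assumes Git and a POSIX shell are available. The scratch repository, the file name a.txt, and the Demo identity are invented for illustration. It checks that a commit object records a tree, and that diffing two commits gives the same patch as diffing their trees directly.

```bash
set -eu

# Throwaway repository with two commits (all names here are made up).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'one\n' > a.txt
git add a.txt
git commit -q -m "first"

printf 'one\ntwo\n' > a.txt
git add a.txt
git commit -q -m "second"

# A commit object records a tree (the snapshot), not a patch.
commit_header=$(git cat-file -p HEAD | head -n 1)
echo "$commit_header"            # a line of the form: tree <sha>

# Diffing the commits and diffing their trees directly yield the same patch.
by_commit=$(git diff HEAD~1 HEAD)
by_tree=$(git diff 'HEAD~1^{tree}' 'HEAD^{tree}')
[ "$by_commit" = "$by_tree" ] && echo "identical patches"

# The comparison can be requested in either direction.
reverse=$(git diff HEAD HEAD~1)
echo "$reverse"
```

In the reversed diff the line "two" shows up as removed rather than added, because the patch is computed fresh from whichever pair of trees you name.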

Conceptual Clarity

Diff computation:

  1. Compare two tree objects (e.g., commit A's tree vs commit B's tree)
  2. Walk entries side by side
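  3. Same name, same blob SHA? → unchanged (skip; an identical subtree SHA skips the whole directory)
  4. Same name, different blob SHA? → compute line-level diff
  5. Entry present in only one tree? → added or deleted
  6. A deleted path and an added path share a blob SHA? → exact rename detected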
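The walk above can be observed with Git's plumbing commands. This is a rough sketch, assuming Git and a POSIX shell; the scratch repository and file names are hypothetical. It lists both trees with git ls-tree and then asks git diff-tree which paths differ.

```bash
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Commit A: three files.
printf 'keep\n' > same.txt
printf 'v1\n'   > changed.txt
printf 'bye\n'  > removed.txt
git add .
git commit -q -m "A"

# Commit B: one file edited, one deleted, one added, one untouched.
printf 'v2\n' > changed.txt
git rm -q removed.txt
printf 'hi\n' > added.txt
git add .
git commit -q -m "B"

# Each tree entry is: <mode> <type> <sha> <TAB> <name>
git ls-tree HEAD~1
git ls-tree HEAD

# The untouched file has the same blob SHA in both trees.
old_same=$(git rev-parse HEAD~1:same.txt)
new_same=$(git rev-parse HEAD:same.txt)

# One status letter per differing path: A (added), M (modified), D (deleted).
raw=$(git diff-tree -r --name-status HEAD~1 HEAD)
echo "$raw"
```

same.txt never appears in the status output: its blob SHA matches on both sides, so there is nothing to compare at the line level.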

Diff contexts:

| Command | Compares |
| --- | --- |
| git diff | Working tree ↔ Staging area |
| git diff --staged | Staging area ↔ HEAD |
| git diff HEAD | Working tree ↔ HEAD |
| git diff A..B | Commit A ↔ Commit B |
| git diff A...B | Common ancestor ↔ Commit B |
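A quick way to internalize these contexts is to put one file into three different states and run each command. The sketch below assumes Git and a POSIX shell; the repository, the branch names main and topic, and the file names are all invented for the demo.

```bash
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'committed\n' > notes.txt
git add notes.txt
git commit -q -m "base"
git branch -M main               # normalize the branch name for this demo

# Three states of one file: HEAD, staging area, working tree.
printf 'committed\nstaged\n' > notes.txt
git add notes.txt
printf 'committed\nstaged\nunstaged\n' > notes.txt

# Keep only the added lines (the pattern skips the "+++" file header).
unstaged=$(git diff          | grep '^+[^+]' || true)   # working tree vs staging area
staged=$(git diff --staged   | grep '^+[^+]' || true)   # staging area vs HEAD
both=$(git diff HEAD         | grep '^+[^+]' || true)   # working tree vs HEAD
echo "git diff:          $unstaged"
echo "git diff --staged: $staged"
echo "git diff HEAD:     $both"

# Two dots vs three dots, using two branches that have diverged.
git reset -q --hard              # discard the demo edits
git checkout -q -b topic
printf 'topic\n' > topic.txt
git add topic.txt
git commit -q -m "topic work"
git checkout -q main
printf 'main\n' > main.txt
git add main.txt
git commit -q -m "main work"

two_dot=$(git diff --name-status main..topic)     # tip of main vs tip of topic
three_dot=$(git diff --name-status main...topic)  # common ancestor vs tip of topic
echo "$two_dot"
echo "$three_dot"
```

The two-dot form reports main.txt as deleted, because it exists on main but not on topic. The three-dot form ignores what happened on main after the branches split and lists only topic.txt.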

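Rename detection: Git doesn't record renames. It infers them at diff time. If a blob appears in the old tree under name A and in the new tree under name B with the same SHA, Git reports an exact rename. When the content changed as well, Git pairs a deleted path with an added path if their contents are similar enough (50% by default, adjustable with -M).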
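Here is a small sketch of rename detection in action, assuming Git and a POSIX shell. The file names and the scratch repository are hypothetical. It covers a pure rename, a rename combined with a small edit, and the same change viewed with detection switched off.

```bash
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Ten lines, so a one-line edit later keeps the file mostly the same.
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 > old_name.js
git add old_name.js
git commit -q -m "add file"

# Case 1: pure rename, content untouched.
git mv old_name.js new_name.js
git commit -q -m "pure rename"
exact=$(git diff -M --name-status HEAD~1 HEAD)
echo "$exact"                    # R100 <TAB> old_name.js <TAB> new_name.js

# The blob SHA is the same under both names.
old_blob=$(git rev-parse HEAD~1:old_name.js)
new_blob=$(git rev-parse HEAD:new_name.js)

# Case 2: rename plus a small edit, so the blob SHAs now differ.
git mv new_name.js final_name.js
printf 'line 11\n' >> final_name.js
git add final_name.js
git commit -q -m "rename and edit"
inexact=$(git diff -M --name-status HEAD~1 HEAD)
echo "$inexact"                  # status R with a similarity score below 100

# Case 3: with detection off, the same change is a delete plus an add.
plain=$(git diff --no-renames --name-status HEAD~1 HEAD)
echo "$plain"
```

The number after R is the similarity score. A pure rename scores 100 because the blob is identical; the edited rename scores lower but still clears the default threshold.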

Real-Life Analogy

Git stores photos, not movies. A diff is like holding up two photos side by side and circling the differences. You can compare ANY two photos — not just consecutive ones. The comparison is computed on the spot, not pre-recorded.

Visual Architecture

flowchart LR
    TREE_A["📁 Tree A<br/>Commit abc"] --> DIFF["🔍 Diff Engine"]
    TREE_B["📁 Tree B<br/>Commit def"] --> DIFF
    DIFF --> OUTPUT["📝 Diff Output<br/>+added / -removed"]
    style DIFF fill:#0f3460,stroke:#53d8fb,color:#53d8fb
    style OUTPUT fill:#1b2d1b,stroke:#53d8fb,color:#53d8fb

Why It Matters

  • Flexibility: Compare any two commits, not just adjacent ones.
  • Speed: Identical blob SHAs are skipped — only changed files are diffed.
  • Rename tracking: Git detects renames without explicit tracking.
  • Foundation: Understanding diff contexts prevents confusion between diff, diff --staged, and diff HEAD.
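The speed point applies to whole directories as well as single files, since a directory is itself a tree object with its own SHA. A minimal sketch, assuming Git and a POSIX shell, with made-up vendor/ and src/ directories in a scratch repository:

```bash
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

mkdir -p vendor/lib src
printf 'stable\n' > vendor/lib/util.js
printf 'v1\n'     > src/app.js
git add .
git commit -q -m "A"

# Only src/ changes in the second commit.
printf 'v2\n' > src/app.js
git add src/app.js
git commit -q -m "B"

# The untouched directory has the same tree SHA in both commits,
# so a single comparison rules out everything beneath it.
vendor_old=$(git rev-parse HEAD~1:vendor)
vendor_new=$(git rev-parse HEAD:vendor)
echo "vendor: $vendor_old -> $vendor_new"

# The edited directory gets a new tree SHA, so Git descends into it.
src_old=$(git rev-parse HEAD~1:src)
src_new=$(git rev-parse HEAD:src)
echo "src:    $src_old -> $src_new"

changed=$(git diff --name-only HEAD~1 HEAD)
echo "$changed"
```

Only src/app.js is reported; nothing under vendor/ needs to be opened to reach that answer.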

Code

bash
# ─── Diff working tree vs staging ───
git diff                      # Shows unstaged changes

# ─── Diff staging vs HEAD ───
git diff --staged             # Shows what will be committed

# ─── Diff between two commits ───
git diff abc123..def456       # abc123 and def456 are placeholder commit IDs

# ─── Diff with stats (summary) ───
git diff --stat HEAD~5..HEAD

# ─── Detect renames ───
git diff -M HEAD~1..HEAD      # -M requests rename detection explicitly
# Shows: rename from old_name.js
#        rename to new_name.js

# ─── Word-level diff ───
git diff --word-diff

# ─── Diff specific file ───
git diff HEAD -- src/app.js

Key Takeaways

  • Git stores snapshots, not diffs — diffs are computed on demand.
  • Different diff commands compare different pairs of states.
  • Rename detection is inferred at diff time: identical blob SHAs mark exact renames, content similarity covers renamed files that were also edited.
  • --stat gives a quick summary; -p (default) shows line-level changes.
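To compare the summary and patch views side by side, here is a short sketch, assuming Git and a POSIX shell; the scratch repository and a.txt are invented for the demo. It also uses --numstat, a machine-readable relative of --stat.

```bash
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\nbeta\n' > a.txt
git add a.txt
git commit -q -m "first"

printf 'alpha\ngamma\ndelta\n' > a.txt
git add a.txt
git commit -q -m "second"

# Human-readable summary: one row per file plus a totals line.
summary=$(git diff --stat HEAD~1 HEAD)
echo "$summary"

# Machine-readable counts: added <TAB> deleted <TAB> path
counts=$(git diff --numstat HEAD~1 HEAD)
echo "$counts"

# -p is what git diff prints when no output format is requested.
patch=$(git diff -p HEAD~1 HEAD)
default=$(git diff HEAD~1 HEAD)
[ "$patch" = "$default" ] && echo "-p matches the default output"
```

All three views come from the same tree comparison; the flags only change how the result is presented.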

Interview Prep

  • Q: Does Git store diffs or snapshots? A: Snapshots. Each commit points to a tree object representing the complete state of the project. Diffs are computed on the fly by comparing two tree objects. This is why you can diff any two commits directly, not just consecutive ones.

  • Q: How does Git detect file renames? A: Heuristically. Git does not record renames; it infers them when diffing. A deleted path and an added path with the same blob SHA are an exact rename. If the content changed too, Git pairs them when their similarity meets a threshold, which the -M flag sets (default 50%).

  • Q: What is the difference between git diff and git diff --staged? A: git diff compares the working tree against the staging area (shows unstaged changes). git diff --staged compares the staging area against HEAD (shows what will be in the next commit).

Topics Covered

Git Internals, Diffs

Tags

#git #internals #diff #intermediate

Last Updated

2026-02-13