git gc

Intermediate
7 minutes · 4.7 · Git

The Hook (The "Byte-Sized" Intro)

Over time, your .git folder accumulates loose objects, stale reflogs, and unreachable data. git gc is the housekeeping service: it packs loose objects into packfiles, removes unreachable objects past their expiry, and compresses references. Git runs it automatically in the background, but understanding it helps you manage large repos and know when your data becomes permanently unrecoverable.

📖 What is git gc?

git gc (garbage collection) optimizes the repository by packing loose objects, pruning unreachable objects, and compressing references.

Conceptual Clarity

What gc does:

| Action | Effect |
|---|---|
| Pack loose objects | Combines into packfiles with delta compression |
| Prune unreachable objects | Removes objects older than expiry (default: 2 weeks) |
| Expire reflog entries | Removes old reflog entries (default: 90 days reachable, 30 days unreachable) |
| Repack references | Compresses .git/refs/ into packed-refs |
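
To see the packing step in isolation, the sketch below builds a throwaway repository and compares `git count-objects -v` before and after `git gc`. The temporary directory, the file name `notes.txt`, and the commit identity are placeholders chosen for illustration.

```bash
#!/usr/bin/env bash
set -eu

# Throwaway repository so nothing real is touched (path is arbitrary).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

# Three small commits leave several loose objects (blobs, trees, commits).
for i in 1 2 3; do
  echo "revision $i" > notes.txt
  git add notes.txt
  git commit -q -m "revision $i"
done

loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')

# Pack everything reachable; --quiet suppresses progress output.
git gc --quiet

loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packs_after=$(git count-objects -v | awk '/^packs:/ {print $2}')

echo "loose objects: $loose_before -> $loose_after, packs: $packs_after"
```

Because every object in this demo is reachable from the branch, the loose count should drop to zero and at least one packfile should appear.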

Auto-gc triggers: Git runs git gc --auto automatically when:

  • Loose objects exceed ~6,700
  • Packfiles exceed ~50
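
A rough way to see how close a repository is to these thresholds is to compare `git count-objects -v` against the configured limits. This is only a sketch: it falls back to the defaults quoted above when `gc.auto` or `gc.autoPackLimit` is unset, and Git's own check is approximate rather than an exact count. The empty temporary repository is there just to keep the example self-contained; the same commands can be run inside any repository.

```bash
#!/usr/bin/env bash
set -eu

# Empty throwaway repository (path is arbitrary).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Effective thresholds: fall back to the documented defaults when unset.
auto_limit=$(git config --get gc.auto || echo 6700)
pack_limit=$(git config --get gc.autoPackLimit || echo 50)

loose=$(git count-objects -v | awk '/^count:/ {print $2}')
packs=$(git count-objects -v | awk '/^packs:/ {print $2}')

echo "loose objects: $loose (auto-gc threshold: $auto_limit)"
echo "packfiles:     $packs (auto-gc threshold: $pack_limit)"

# gc --auto is a no-op unless one of the thresholds has been crossed.
git gc --auto --quiet
```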

The expiry timeline:

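| Reflog entry | Reflog expiry | Object pruned |
|---|---|---|
| Reachable from the branch tip | 90 days | Never (the object is still reachable) |
| Unreachable from the branch tip | 30 days | By a later gc, once no reflog entry references it and the object is older than gc.pruneExpire (default: 2 weeks) |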
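
The two gates in this timeline (reflog expiry first, then pruning) can be observed in a throwaway repository. The sketch below replaces a commit with `git commit --amend`, then shows that `git gc --prune=now` alone does not remove the old commit while the reflog still references it. Expiring the reflog with `--expire=now` stands in for waiting out the default period; the file name and commit identity are placeholders.

```bash
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "v1" > file.txt
git add file.txt
git commit -q -m "original"
old=$(git rev-parse HEAD)

# Amending replaces the commit; the old one is now referenced only by the reflog.
echo "v2" > file.txt
git add file.txt
git commit -q --amend -m "amended"

# Gate 1: the reflog entry still protects the old commit, even with --prune=now.
git gc --quiet --prune=now
if git cat-file -e "$old" 2>/dev/null; then survived_first_gc=yes; else survived_first_gc=no; fi

# Gate 2: expire the reflog immediately, then prune again.
git reflog expire --expire=now --all
git gc --quiet --prune=now
if git cat-file -e "$old" 2>/dev/null; then survived_second_gc=yes; else survived_second_gc=no; fi

echo "after first gc:  old commit present? $survived_first_gc"
echo "after second gc: old commit present? $survived_second_gc"
```

Do not run the `--expire=now` and `--prune=now` combination on a repository you care about unless you really want every unreachable object gone.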

Real-Life Analogy

git gc is spring cleaning for your repo. Loose papers (objects) get filed into binders (packfiles). Old receipts (unreachable objects) get shredded after about 30 days. The room is cleaner, smaller, and faster to navigate.

Visual Architecture

```mermaid
flowchart LR
    GC["git gc"] --> PACK["📦 Pack Objects"]
    GC --> PRUNE["🗑️ Prune Unreachable"]
    GC --> EXPIRE["⏰ Expire Reflogs"]
    GC --> COMPRESS["🗜️ Compress Refs"]
    style GC fill:#0f3460,stroke:#53d8fb,color:#53d8fb
    style PRUNE fill:#2d1b1b,stroke:#e94560,color:#e94560
```

Why It Matters

  • Performance: Packed repos are faster to read and transfer.
  • Disk space: Removes duplicate and unreachable data.
  • Recovery window: Unreachable objects survive ~30 days before gc removes them.
  • Awareness: Knowing gc timelines tells you how long recovery is possible.
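
While the recovery window is open, a commit lost to a hard reset can usually be brought back through the reflog. The sketch below simulates that in a throwaway repository; the branch name `rescue`, the file name, and the commit identity are hypothetical.

```bash
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "keep" > file.txt
git add file.txt
git commit -q -m "base"

echo "important work" >> file.txt
git commit -q -am "work that will be lost"
lost=$(git rev-parse HEAD)

# Simulate the accident: move the branch back, orphaning the newest commit.
git reset -q --hard HEAD~1

# HEAD@{1} is where HEAD pointed before the reset; a new branch makes it reachable again.
git branch rescue 'HEAD@{1}'

echo "lost commit:   $lost"
echo "rescue branch: $(git rev-parse rescue)"
```

Once a branch points at the commit, it is reachable again and gc will no longer consider it for pruning.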

Code

```bash
# ─── Run garbage collection ───
git gc
# Packs, prunes, compresses

# ─── Aggressive gc (slower, more compression) ───
git gc --aggressive
# Better compression, takes longer

# ─── Check what gc would prune ───
git prune --dry-run --expire=2.weeks.ago
# Lists unreachable loose objects older than the grace period without removing them

# ─── Configure expiry times ───
git config gc.reflogExpire "90 days"
git config gc.reflogExpireUnreachable "30 days"
git config gc.pruneExpire "2.weeks.ago"

# ─── Check repo size ───
git count-objects -v
# count: 0          (loose objects)
# packs: 1          (packfiles)
# size-pack: 5432   (KiB)

# ─── Disable auto-gc temporarily ───
git config gc.auto 0
# Re-enable: git config gc.auto 6700
```

Key Takeaways

  • git gc packs objects, prunes unreachable data, and compresses references.
  • Git runs auto-gc in the background — you rarely need to run it manually.
  • Unreachable objects survive ~30 days before gc permanently removes them.
  • After gc prunes objects, recovery via fsck is no longer possible.
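
Before pruning happens, `git fsck` can still locate a commit that no branch points to. A minimal sketch in a throwaway repository: `--no-reflogs` tells fsck to ignore reflog entries, so the commit replaced by `--amend` is reported as dangling. `LC_ALL=C` is set only so the English "dangling commit" wording can be matched; the file name and commit identity are placeholders.

```bash
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "v1" > file.txt
git add file.txt
git commit -q -m "original"
old=$(git rev-parse HEAD)

# Replace the commit; the original is now reachable only through the reflog.
echo "v2" > file.txt
git add file.txt
git commit -q --amend -m "amended"

# Ignore reflogs so the replaced commit is reported as dangling.
dangling=$(LC_ALL=C git fsck --no-reflogs 2>/dev/null | awk '/^dangling commit/ {print $3}')

echo "replaced commit: $old"
echo "fsck reports:    $dangling"
```

After gc has pruned the object, the same fsck call no longer lists it, which is the point at which recovery stops being possible.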

Interview Prep

  • Q: What does git gc do? A: It packs loose objects into packfiles (with delta compression), prunes unreachable objects older than the configured expiry, expires old reflog entries, and compresses references. This reduces disk usage and improves performance.

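  • Q: How long are unreachable objects recoverable? A: By default, reflog entries for unreachable commits expire after 30 days. Once no reflog entry references an object, git gc can prune it as soon as the object itself is older than gc.pruneExpire (2 weeks by default), so in practice the window is roughly 30 days plus the wait until the next gc run. Objects that never appeared in a reflog (for example, a blob that was staged but never committed) are protected only by the 2-week grace period. After pruning, recovery is impossible.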

  • Q: When would you run git gc --aggressive? A: After importing a large repository, after removing large files from history (e.g., with filter-branch), or when the repository has grown very large and you want maximum compression. It's slower but produces better results.

Topics Covered

Git Internals, Performance

Tags

#git #internals #gc #performance

Last Updated

2026-02-13