Large Repo Performance Tips

Intermediate · 8 minutes · 4.8 · Git

The Hook (The "Byte-Sized" Intro)

A 10GB repo with 500,000 files and 100,000 commits will bring git status to its knees — unless you know the tricks. Shallow clones, partial clones, fsmonitor, commit graphs, and sparse checkout can make a massive repo feel as responsive as a small one. These aren't workarounds — they're how companies like Microsoft manage repos with 3.5 million files.

📖 What are Large Repo Performance Tips?

Techniques and configurations that keep Git fast and responsive when working with repositories that have large file counts, deep history, or significant binary content.

Conceptual Clarity

Performance techniques ranked:

| Technique | Impact | Effort | What It Helps |
|---|---|---|---|
| core.fsmonitor | 🟢 High | Low | git status speed |
| core.untrackedCache | 🟢 High | Low | git status speed |
| Commit graph | 🟢 High | Low | git log, git merge-base |
| Shallow clone | 🟡 Medium | Low | Clone speed, disk space |
| Partial clone | 🟢 High | Low | Clone speed, disk space |
| Sparse checkout | 🟢 High | Medium | Working dir size |
| Git LFS | 🟡 Medium | Medium | Binary file handling |
| feature.manyFiles | 🟢 High | Low | Enables multiple optimizations |

What feature.manyFiles enables:

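  • core.untrackedCache true
  • index.version 4 (smaller, faster index)
  • index.skipHash true (faster index writes; Git 2.40 and later)

feature.manyFiles does not turn on core.fsmonitor; set that option separately.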
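To see these settings in one place, here is a minimal sketch that applies them to a throwaway repository rather than to your global config. The temporary directory is an assumption for illustration. Pinning index.version and core.untrackedCache explicitly is optional, since feature.manyFiles already changes their defaults, but it makes the values visible when you read the config back.

```shell
#!/bin/sh
set -eu

# Work in a throwaway repository so nothing touches your real config.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# The umbrella switch, scoped to this repository only.
git config feature.manyFiles true

# Optional: pin two of the implied values so they can be read back directly.
git config index.version 4
git config core.untrackedCache true

# Print what this repository now has set.
for key in feature.manyFiles index.version core.untrackedCache; do
  printf '%s = %s\n' "$key" "$(git config --get "$key")"
done
```

Dropping the scratch directory and adding --global to each git config call would apply the same values to every repository for your user.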

Real-Life Analogy

Optimizing a large repo is like optimizing a warehouse. You don't carry everything — you have a catalog (commit graph), a fast lookup system (fsmonitor), and only stock what's needed on the floor (sparse checkout).

Visual Architecture

flowchart TD
    LARGE["📦 Large Repo<br/>500K files"] --> FS["⚡ fsmonitor<br/>Fast status"]
    LARGE --> GRAPH["📊 Commit graph<br/>Fast log"]
    LARGE --> SPARSE["🔍 Sparse checkout<br/>Subset of files"]
    LARGE --> LFS["📎 Git LFS<br/>Large files"]
    FS & GRAPH & SPARSE & LFS --> FAST["✅ Fast Operations"]
    style LARGE fill:#2d1b1b,stroke:#e94560,color:#e94560
    style FAST fill:#1b2d1b,stroke:#53d8fb,color:#53d8fb

Why It Matters

  • Developer productivity: Slow Git = wasted time on every operation.
  • CI speed: Shallow/partial clones cut pipeline times significantly.
  • Disk space: Partial clones avoid downloading unused data.
  • Scale: These techniques are how Git scales to enterprise-sized repos.

Code

bash
# ─── Quick wins (set these immediately) ───
git config --global feature.manyFiles true
git config --global core.untrackedCache true
# Built-in fsmonitor: Git 2.37+ on Windows and macOS
git config --global core.fsmonitor true

# ─── Generate commit graph (speeds up log, merge-base) ───
git commit-graph write --reachable
# Git also updates this during gc, but you can trigger it manually

# ─── Shallow clone (CI/CD) ───
git clone --depth 1 https://github.com/team/large-repo.git
# Only 1 commit of history, so the clone is fast

# ─── Partial clone (on-demand blobs) ───
git clone --filter=blob:none https://github.com/team/large-repo.git
# Commits and trees only; blobs are fetched on demand

# ─── Git LFS for large binaries ───
git lfs install
git lfs track "*.psd" "*.zip" "*.mp4"
git add .gitattributes
git commit -m "chore: track large files with LFS"

# ─── Check repo size ───
git count-objects -vH
# size-pack: 2.1 GiB  ← How much data is packed

# ─── Aggressive garbage collection ───
# Slow on large repos; run it rarely
git gc --aggressive --prune=now
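Sparse checkout appears in the ranking above but not in the commands, so here is a rough sketch of how it might look. It builds a throwaway repository; the directory names (services/api, assets) are hypothetical. It uses `git sparse-checkout init --cone` followed by `set`, which should work on Git 2.25 and later. Newer versions also accept `git sparse-checkout set --cone <dir>` in one step.

```shell
#!/bin/sh
set -eu

# Throwaway repository with two top-level directories (names are hypothetical).
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
mkdir -p services/api assets
echo 'api code'  > services/api/main.txt
echo 'big asset' > assets/texture.txt
echo 'readme'    > README.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial layout"

# Restrict the working directory to one subtree.
# Cone mode matches whole directories, which keeps pattern matching cheap.
git sparse-checkout init --cone
git sparse-checkout set services/api

# Root-level files stay; directories outside the cone are removed from disk.
ls -A
```

The history still contains everything; `git sparse-checkout disable` restores the full working directory.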

Key Takeaways

  • feature.manyFiles true is a one-line quick win for large repos: it turns on the untracked cache and index version 4.
  • Commit graph speeds up git log, merge-base, and branch operations.
  • Shallow/partial clones are essential for CI/CD pipelines.
  • Git LFS handles large binary files without bloating the repo.
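The commit-graph takeaway can be tried out on a small scale. The sketch below assumes a scratch repository with three placeholder commits. It writes the commit-graph file, asks Git to verify it, and shows where the file lives on disk. On a repository this small there is no measurable speedup; the point is only to show the commands.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Create a short linear history to index.
for i in 1 2 3; do
  echo "change $i" > file.txt
  git add file.txt
  git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $i"
done

# Write the commit-graph file for every commit reachable from any ref.
git commit-graph write --reachable

# Check the file's integrity, then confirm it exists on disk.
git commit-graph verify
ls -l .git/objects/info/commit-graph
```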

Interview Prep

  • Q: How do you make Git faster in a large repository? A: Enable feature.manyFiles (activates the untracked cache and index v4), turn on core.fsmonitor, generate a commit graph (git commit-graph write), use sparse checkout to limit working directory size, and use Git LFS for binary files.

  • Q: What is a partial clone? A: git clone --filter=blob:none downloads commit and tree objects but fetches file content (blobs) on demand as you check out or access files. This dramatically reduces initial clone time and disk usage.

  • Q: How does core.fsmonitor speed up git status? A: Instead of Git scanning every file for changes, fsmonitor uses OS file system events to track which files changed. Git only checks those files, making git status near-instant even in repos with hundreds of thousands of files.
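To make the shallow versus partial distinction concrete, here is a sketch that clones a local scratch repository both ways over a file:// URL. The repository, its three commits, and the directory names are all assumptions for illustration. The source repository sets uploadpack.allowFilter so that it honors --filter; hosted services usually enable the equivalent on their side. The partial clone uses --no-checkout so that no blobs are fetched at all.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Source repository with three commits, each changing one file.
git init -q src
for i in 1 2 3; do
  echo "version $i" > src/data.txt
  git -C src add data.txt
  git -C src -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $i"
done
# Let this local "server" honor --filter requests.
git -C src config uploadpack.allowFilter true

# Shallow clone: file contents are present, but history is cut to the newest commit.
git clone -q --depth 1 "file://$work/src" shallow
shallow_commits=$(git -C shallow rev-list --count HEAD)

# Partial clone: full commit history, but blobs are left on the server for now.
git clone -q --no-checkout --filter=blob:none "file://$work/src" partial
partial_commits=$(git -C partial rev-list --count HEAD)
# --missing=print lists absent objects with a leading "?" instead of fetching them.
missing_blobs=$(git -C partial rev-list --objects --all --missing=print | grep -c '^?' || true)

echo "shallow: $shallow_commits commit(s)"
echo "partial: $partial_commits commit(s), $missing_blobs blob(s) not yet downloaded"
```

Running `git -C partial checkout` on the default branch afterwards would fetch only the blobs needed for that one revision.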

Topics Covered

Large Repos, Performance

Tags

#git #performance #large-repos #optimization

Last Updated

2026-02-13