Trees and Blobs

Intermediate
8 minutes · 4.7 · Git

The Hook (The "Byte-Sized" Intro)

Git doesn't store files the way you think. Its object database has no "file" entries: file content lives in blobs (raw bytes) and directory listings live in trees. A blob has no name, no path, no permissions. A tree is what connects names to blobs. Rename a file? Same blob, new tree. Two identical files? One blob, referenced twice. Because every object is addressed by a hash of its content, identical content is stored only once, which is a large part of how Git stays fast and space-efficient.

📖 What are Trees and Blobs?

Blobs store file content. Trees store directory structure, mapping filenames and permissions to blob/tree SHAs. Together they represent a snapshot of your project at any point in time.
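To make "a blob is just content" concrete, here is a minimal sketch that hashes the same bytes under two different filenames and then recomputes the ID by hand. It assumes a repository using the default SHA-1 object format and a `sha1sum` binary on the PATH; the filenames are made up for illustration.

```shell
set -eu

# Throwaway repository so nothing touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Same bytes, two different (hypothetical) filenames.
printf 'hello\n' > a.txt
printf 'hello\n' > some-other-name.txt

# hash-object prints the blob ID Git would assign to each file.
id_a=$(git hash-object a.txt)
id_b=$(git hash-object some-other-name.txt)

# Recompute the ID manually: SHA-1 over the header "blob <size>\0"
# followed by the content. The filename is never part of the input.
manual=$(printf 'blob 6\0hello\n' | sha1sum | cut -d' ' -f1)

echo "a.txt:               $id_a"
echo "some-other-name.txt: $id_b"
echo "manual sha1:         $manual"
```

All three IDs should match, which is why two files with identical content can only ever produce one blob.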

Conceptual Clarity

Blob vs Tree:

Feature | Blob | Tree
Stores | Raw file content | Directory entries
Contains | Just bytes (no metadata) | Mode + type + SHA + name
Named? | ❌ No filename | ✅ Lists filenames
Nested? | ❌ No | ✅ Trees can contain trees

Tree entry format:

<mode> <type> <sha> <name>
100644 blob abc123 README.md
100755 blob def456 run.sh
040000 tree 789abc src/
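The entry format above can be tried out by building a tree by hand. The sketch below uses the plumbing commands `git hash-object` and `git mktree` in a throwaway repository; the two filenames are hypothetical and deliberately point at the same blob.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Write one blob into the object database.
blob=$(printf 'hello\n' | git hash-object -w --stdin)

# mktree reads lines shaped like ls-tree output:
#   <mode> SP <type> SP <sha> TAB <name>
# Two names, one blob: the tree is what carries the names.
tree=$(printf '100644 blob %s\tREADME.md\n100644 blob %s\tcopy-of-readme.md\n' \
  "$blob" "$blob" | git mktree)

echo "object type: $(git cat-file -t "$tree")"
git cat-file -p "$tree"
```

The pretty-printed tree lists both names next to the same SHA, while the blob itself still knows nothing about either name.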

File modes:

Mode | Meaning
100644 | Regular file
100755 | Executable file
120000 | Symbolic link
040000 | Subdirectory (tree)
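Since the mode lives in the tree entry rather than in the blob, flipping the executable bit should change the tree but not the blob. A small sketch of that, assuming a throwaway repository and a hypothetical `run.sh`:

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

printf '#!/bin/sh\necho hi\n' > run.sh
git add run.sh

# ls-files -s prints: <mode> <blob sha> <stage> TAB <path>
mode_before=$(git ls-files -s run.sh | cut -d' ' -f1)
blob_before=$(git ls-files -s run.sh | cut -d' ' -f2)
tree_before=$(git write-tree)   # tree object built from the index

# Mark the file executable in the index only.
git update-index --chmod=+x run.sh

mode_after=$(git ls-files -s run.sh | cut -d' ' -f1)
blob_after=$(git ls-files -s run.sh | cut -d' ' -f2)
tree_after=$(git write-tree)

echo "mode: $mode_before -> $mode_after"
echo "blob: $blob_before -> $blob_after"
echo "tree: $tree_before -> $tree_after"
```

The blob SHA stays put while the mode and the tree SHA change, because only the directory entry was rewritten.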

Real-Life Analogy

Blobs are like pages ripped from a book — they have content but no title or page number. Trees are the table of contents — they list "Chapter 1 is page 5, Chapter 2 is page 12." Without the table of contents, you have content but no structure.

Visual Architecture

flowchart TD
    ROOT["📁 Root Tree"] --> README["📄 Blob: README.md<br/>100644"]
    ROOT --> SRC["📁 Tree: src/<br/>040000"]
    ROOT --> SCRIPT["📄 Blob: run.sh<br/>100755"]
    SRC --> APP["📄 Blob: app.js<br/>100644"]
    SRC --> UTIL["📄 Blob: utils.js<br/>100644"]
    style ROOT fill:#1a1a2e,stroke:#ffd700,color:#ffd700
    style SRC fill:#1a1a2e,stroke:#ffd700,color:#ffd700
    style README fill:#1b2d1b,stroke:#53d8fb,color:#53d8fb
    style APP fill:#1b2d1b,stroke:#53d8fb,color:#53d8fb

Why It Matters

  • Deduplication: Identical files share the same blob — no wasted space.
  • Rename detection: Renaming creates a new tree but reuses the blob.
  • Snapshot efficiency: Unchanged files reuse existing blobs across commits.
  • Foundation: Every commit points to a root tree — understanding trees unlocks Git.
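The rename point in the list above can be checked directly. This is a rough sketch in a throwaway repository; the filenames and the inline `commit` helper (which only supplies a dummy identity) are assumptions for illustration.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper: commit with a dummy identity so the demo
# does not depend on global Git configuration.
commit() {
  git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"
}

printf 'some content\n' > old.txt
git add old.txt
commit "add old.txt"

git mv old.txt new.txt
commit "rename old.txt to new.txt"

# Resolve the blob behind each path in each commit.
old_blob=$(git rev-parse 'HEAD~1:old.txt')
new_blob=$(git rev-parse 'HEAD:new.txt')

# Ask diff to look for renames between the two snapshots.
status=$(git diff -M --name-status HEAD~1 HEAD)

echo "old blob: $old_blob"
echo "new blob: $new_blob"
echo "$status"
```

The two blob SHAs are identical, and the diff reports the change as a rename with a similarity score of 100 rather than as a delete plus an add.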

Code

bash
# SHAs below are shortened placeholders; real output shows full hashes.

# ─── View the root tree of the latest commit ───
git cat-file -p 'HEAD^{tree}'
# 100644 blob abc123    README.md
# 100755 blob def456    run.sh
# 040000 tree 789abc    src

# ─── View a subtree ───
git cat-file -p 789abc
# 100644 blob aaa111    app.js
# 100644 blob bbb222    utils.js

# ─── View a blob (file content) ───
git cat-file -p abc123
# # My Project
# Welcome to the README.

# ─── Prove deduplication: identical files = same blob ───
echo "hello" > file1.txt
cp file1.txt file2.txt
git add .
# The files are staged but not yet committed, so inspect the index
# (HEAD's tree would not list them until after a commit).
git ls-files -s file1.txt file2.txt
# Both file1.txt and file2.txt point to the SAME blob SHA!

# ─── Create a blob manually ───
echo "hello" | git hash-object -w --stdin
# Writes the blob and returns its SHA

Key Takeaways

  • Blobs store raw content with no name or path — just bytes.
  • Trees map filenames + permissions to blob/tree SHAs — they ARE the directory structure.
  • Identical files share the same blob — Git deduplicates automatically.
  • Renaming a file changes the tree but reuses the same blob.

Interview Prep

  • Q: How does Git store files internally? A: Git stores file content as blob objects (raw bytes, no filename). Directory structure is stored as tree objects that map filenames and permissions to blob/tree SHAs. A commit points to a root tree, which contains the complete project snapshot.

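  • Q: What happens when you rename a file in Git? A: The blob (file content) stays the same since the content hasn't changed. A new tree object is created with the new filename mapping to the same blob SHA. Git does not record the rename itself; it infers renames when comparing two trees, matching identical blob SHAs for exact renames and falling back to content similarity when the file was also edited.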

  • Q: How does Git achieve space efficiency across commits? A: Unchanged files reuse existing blob objects; only modified files create new blobs. Trees also reuse unchanged subtrees. This means a commit with 1000 files where only 1 changed creates just 1 new blob, one new tree for each directory on the path from that file up to the root, and the commit object itself.
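The last answer can be verified with two commits that differ in a single file. The following is a sketch under the same assumptions as before: a throwaway repository, made-up file contents, and a hypothetical `commit` helper that supplies a dummy identity.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper so the demo does not rely on global config.
commit() {
  git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"
}

mkdir src
printf 'v1\n' > README.md
printf 'console.log("app");\n' > src/app.js
git add .
commit "first snapshot"

# Change only the top-level README; src/ is untouched.
printf 'v2\n' > README.md
git add README.md
commit "update README"

root1=$(git rev-parse 'HEAD~1^{tree}')
root2=$(git rev-parse 'HEAD^{tree}')
src1=$(git rev-parse 'HEAD~1:src')
src2=$(git rev-parse 'HEAD:src')

echo "root tree: $root1 -> $root2"
echo "src tree:  $src1 -> $src2"
```

The root tree gets a new SHA because one of its entries changed, while the `src` tree (and every blob under it) is the same object in both commits.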

Topics Covered

Git Internals, Object Model

Tags

#git #internals #trees #blobs

Last Updated

2026-02-13