The Hook (The "Byte-Sized" Intro)
Alarms fire. Error rates spike. Users are complaining. This is where Git workflows meet real-world pressure. The incident response workflow answers: was this caused by a recent deploy? Should we revert or hotfix? How do we communicate? After the fire is out, the post-mortem uses Git history to trace exactly what happened.
📖 What is the Incident Response Workflow?
A structured process for using Git and deployment tools to detect, diagnose, fix, and learn from production incidents.
Conceptual Clarity
Incident response phases:
| Phase | Git Actions | Goal |
|---|---|---|
| 1. Detect | Check recent deploys | Was this caused by a change? |
| 2. Triage | git log --since, git diff | What changed recently? |
| 3. Mitigate | Revert or feature flag off | Stop the bleeding |
| 4. Fix | Hotfix branch if needed | Proper fix |
| 5. Verify | Deploy fix, monitor | Confirm resolution |
| 6. Post-mortem | git blame, git log | Trace root cause |
Revert vs hotfix decision:
| Situation | Action |
|---|---|
| Root cause is clear + fix is simple | Hotfix |
| Root cause is unclear | Revert the last deploy |
| Multiple recent changes could be the cause | Revert all recent changes |
| Revert would lose important changes | Feature flag off + hotfix |
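The decision table above can be written down as a small shell function, which is handy for a runbook or a chat-ops command. This is a minimal sketch: `choose_action` and its yes/no arguments are hypothetical, and the order in which the rows are checked is an assumption that the table itself does not specify.

```shell
# Hypothetical helper that encodes the revert-vs-hotfix table.
# Each argument is "yes" or "no":
#   $1  root cause is clear
#   $2  fix is simple
#   $3  multiple recent changes could be the cause
#   $4  a revert would lose important changes
# Assumption: the "revert would lose changes" row takes precedence.
choose_action() {
  cause_clear=$1; fix_simple=$2; multiple_suspects=$3; revert_costly=$4
  if [ "$revert_costly" = yes ]; then
    echo "feature flag off + hotfix"
  elif [ "$cause_clear" = yes ] && [ "$fix_simple" = yes ]; then
    echo "hotfix"
  elif [ "$multiple_suspects" = yes ]; then
    echo "revert all recent changes"
  else
    echo "revert the last deploy"
  fi
}

choose_action no no no no     # prints: revert the last deploy
choose_action yes yes no no   # prints: hotfix
```

Anything that is not clearly a simple, well-understood fix falls through to a revert, which matches the "stop the bleeding" goal of the Mitigate phase.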
Real-Life Analogy
Incident response is like a fire department response. First: contain the fire (revert/mitigate). Second: investigate the cause (git log, blame). Third: prevent future fires (post-mortem action items).
Why It Matters
- Speed: Knowing the workflow eliminates decision paralysis during incidents.
- Safety: Revert first, investigate later — minimize user impact.
- Learning: Post-mortems use Git history to trace root causes.
- Accountability: Git log provides an audit trail of all actions taken.
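The "revert first" point can be sketched for the case where several commits landed since the last known-good release. The helper name `revert_since` is hypothetical, and the sketch assumes a linear history in the reverted range; a merge commit in the range would additionally need `git revert -m <parent-number>`.

```shell
# Hypothetical helper: undo every commit after a known-good ref
# with a single revert commit, keeping history intact.
revert_since() {
  good_ref=$1   # last known-good commit or tag, e.g. v1.0.0
  # --no-commit stages the combined inverse of all commits in the range;
  # the explicit commit then records them as one revert.
  git revert --no-commit "$good_ref"..HEAD &&
    git commit -m "revert: roll back changes since $good_ref"
}

# Usage (inside the repository, on the deployed branch):
#   revert_since v1.0.0 && git push origin main
```

Because this adds a new commit instead of rewriting history, the rollback itself shows up in `git log`, which keeps the audit trail complete.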
Code
# ─── Step 1: What changed recently? ───
git log --oneline --since="2 hours ago" main
git diff v1.0.0..v1.0.1 --stat
# ─── Step 2: Who changed the broken file? ───
git blame src/payments/process.js
git log --oneline -5 src/payments/process.js
# ─── Step 3a: Revert (when cause is unclear) ───
git revert --no-edit HEAD # Revert the last commit (a merge commit also needs -m <parent-number>)
git push origin main # Deploy the revert
# (or revert the deploy via CI/CD rollback)
# ─── Step 3b: Hotfix (when cause is clear) ───
git checkout -b hotfix/INCIDENT-42-fix v1.0.1
# Make the fix...
git commit -am "fix(payments): handle null card object" # -a stages modified tracked files
git push -u origin hotfix/INCIDENT-42-fix
# ─── Step 4: Verify ───
# Monitor error rates, check logs, run smoke tests
# ─── Step 5: Post-mortem ───
# Document in a post-mortem template:
# - Timeline of events
# - Root cause (link to commit)
# - Impact (users affected, duration)
# - Action items (preventive measures)
Key Takeaways
- Revert first if the root cause isn't immediately clear.
- Use `git log --since` and `git diff` to triage quickly.
- Hotfix when the cause is known and the fix is small.
- Post-mortems use Git history to build the timeline and trace the root cause.
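As a rough illustration of the last takeaway, the commit side of a post-mortem timeline can be pulled straight from `git log`. The function name `incident_timeline` and the dates in the usage comment are hypothetical; the sketch assumes the incident window is known and that committer dates reflect when changes landed.

```shell
# Hypothetical helper: list commits inside an incident window, oldest first,
# as "<ISO committer date> <short hash> <author> <subject>".
incident_timeline() {
  start=$1; end=$2; branch=${3:-main}
  git log --reverse --since="$start" --until="$end" \
    --format='%cI %h %an %s' "$branch"
}

# Usage (illustrative window):
#   incident_timeline "2024-05-01 09:00:00 +0000" "2024-05-01 13:00:00 +0000" main
```

The output only covers what Git knows about. Detection and mitigation timestamps from monitoring and chat still have to be merged in by hand.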
Interview Prep
- Q: During a production incident, should you revert or hotfix?
  A: If the root cause is clear and the fix is simple, hotfix. If not, revert the last deploy immediately to stop the bleeding, then investigate. The priority is restoring service, not fixing the bug.
- Q: How does Git help during incident response?
  A: `git log --since` shows recent changes, `git diff` shows what changed between versions, `git blame` identifies who last changed specific lines, and `git revert` quickly undoes problematic commits.
- Q: What should a post-mortem include?
  A: Timeline (when detected, mitigated, resolved), root cause (link to specific commit), impact (users/revenue affected), and action items (tests to add, checks to implement, process improvements).