
Incident Response Workflow

Intermediate
8 minutes · 4.7 · Git

The Hook (The "Byte-Sized" Intro)

Alarms fire. Error rates spike. Users are complaining. This is where Git workflows meet real-world pressure. The incident response workflow answers: was this caused by a recent deploy? Should we revert or hotfix? How do we communicate? After the fire is out, the post-mortem uses Git history to trace exactly what happened.

📖 What is the Incident Response Workflow?

A structured process for using Git and deployment tools to detect, diagnose, fix, and learn from production incidents.

Conceptual Clarity

Incident response phases:

| Phase | Git Actions | Goal |
|---|---|---|
| 1. Detect | Check recent deploys | Was this caused by a change? |
| 2. Triage | git log --since, git diff | What changed recently? |
| 3. Mitigate | Revert or feature flag off | Stop the bleeding |
| 4. Fix | Hotfix branch if needed | Proper fix |
| 5. Verify | Deploy fix, monitor | Confirm resolution |
| 6. Post-mortem | git blame, git log | Trace root cause |
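
The Detect and Triage phases can be sketched as one small helper. This is a minimal sketch, assuming every deploy is marked with a Git tag (like the v1.0.0 and v1.0.1 tags used in this lesson's Code section) and that history between tags is linear. The function name `changes_in_latest_release` is hypothetical; the `git describe`, `git log`, and `git diff` calls are standard Git.

```bash
# Print what shipped in the most recent tagged release.
# Assumption: each deploy is tagged and the tags sit on the current branch.
changes_in_latest_release() {
  local current previous

  # Most recent tag reachable from HEAD (lightweight or annotated).
  current=$(git describe --tags --abbrev=0) || return 1

  # Most recent tag reachable from the commit just before that tag.
  previous=$(git describe --tags --abbrev=0 "${current}^") || return 1

  echo "Release ${previous} -> ${current}"
  git log --oneline "${previous}..${current}"   # commits in the release
  git diff --stat "${previous}..${current}"     # files touched by the release
}

# Usage (inside the repository):
#   changes_in_latest_release
```

If the repository has fewer than two tags, the helper returns a non-zero status instead of guessing a range.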

Revert vs hotfix decision:

| Situation | Action |
|---|---|
| Root cause is clear + fix is simple | Hotfix |
| Root cause is unclear | Revert the last deploy |
| Multiple recent changes could be the cause | Revert all recent changes |
| Revert would lose important changes | Feature flag off + hotfix |
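
The decision table above can be encoded as a tiny shell function, which is sometimes handy in a runbook script. This is only a sketch: the function name `choose_action`, its yes/no arguments, and the order in which the rows are checked are assumptions made for illustration, since the table itself does not say which row wins when several apply.

```bash
# choose_action CAUSE_CLEAR FIX_SIMPLE MULTIPLE_SUSPECTS REVERT_LOSES_WORK
# Each argument is "yes" or "no"; prints one suggested mitigation.
# Assumption: rows are checked in the order below.
choose_action() {
  local cause_clear=$1
  local fix_simple=$2
  local multiple_suspects=$3
  local revert_loses_work=$4

  if [ "$cause_clear" = yes ] && [ "$fix_simple" = yes ]; then
    echo "hotfix"
  elif [ "$revert_loses_work" = yes ]; then
    echo "feature-flag-off-then-hotfix"
  elif [ "$multiple_suspects" = yes ]; then
    echo "revert-all-recent-changes"
  else
    echo "revert-last-deploy"
  fi
}

# Usage:
#   choose_action no no yes no
```

Anything that does not match a more specific row falls through to reverting the last deploy, which matches the lesson's default of restoring service first.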

Real-Life Analogy

Incident response is like a fire department response. First: contain the fire (revert/mitigate). Second: investigate the cause (git log, blame). Third: prevent future fires (post-mortem action items).

Visual Architecture

flowchart TD
    ALERT["🚨 Alert"] --> TRIAGE["🔍 Triage<br/>git log, git diff"]
    TRIAGE --> DECIDE{"Clear root cause?"}
    DECIDE -->|"Yes"| HOTFIX["🩹 Hotfix"]
    DECIDE -->|"No"| REVERT["⏪ Revert Deploy"]
    HOTFIX --> VERIFY["✅ Verify"]
    REVERT --> VERIFY
    VERIFY --> POSTMORTEM["📝 Post-mortem"]
    style ALERT fill:#2d1b1b,stroke:#e94560,color:#e94560
    style VERIFY fill:#1b2d1b,stroke:#53d8fb,color:#53d8fb

Why It Matters

  • Speed: Knowing the workflow eliminates decision paralysis during incidents.
  • Safety: When the cause is unclear, revert first and investigate later to minimize user impact.
  • Learning: Post-mortems use Git history to trace root causes.
  • Accountability: Every revert and hotfix is a commit, so git log gives an audit trail of the code changes made during the incident.

Code

bash
# ─── Step 1: What changed recently? ───
git log --oneline --since="2 hours ago" main
git diff v1.0.0..v1.0.1 --stat

# ─── Step 2: Who changed the broken file? ───
git blame src/payments/process.js
git log --oneline -5 src/payments/process.js

# ─── Step 3a: Revert (when cause is unclear) ───
git revert HEAD --no-edit   # Revert last commit (a merge commit also needs -m 1)
git push origin main        # Deploy the revert
# (or revert the deploy via CI/CD rollback)

# ─── Step 3b: Hotfix (when cause is clear) ───
git checkout -b hotfix/INCIDENT-42-fix v1.0.1
# Make the fix...
git add src/payments/process.js   # Stage the fix before committing
git commit -m "fix(payments): handle null card object"
git push -u origin hotfix/INCIDENT-42-fix

# ─── Step 4: Verify ───
# Monitor error rates, check logs, run smoke tests

# ─── Step 5: Post-mortem ───
# Document in a post-mortem template:
# - Timeline of events
# - Root cause (link to commit)
# - Impact (users affected, duration)
# - Action items (preventive measures)
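
When several recent changes are suspects, the "Revert all recent changes" row of the decision table means reverting a range rather than a single commit. The following is a minimal sketch, assuming the range contains no merge commits (those need `git revert -m`) and that the reverts apply without conflicts. The function name `revert_range` and the commit message format are hypothetical.

```bash
# revert_range GOOD BAD
#   GOOD = last known-good ref (excluded from the revert)
#   BAD  = newest suspect ref (included in the revert)
# Undoes every commit in GOOD..BAD and records the result as ONE commit.
revert_range() {
  local good=$1
  local bad=$2

  # --no-commit applies each revert to the index and working tree
  # without creating a separate commit per reverted change.
  git revert --no-commit "${good}..${bad}" || return 1

  # Record the combined rollback as a single, easy-to-find commit.
  git commit -m "revert: roll back ${good}..${bad}"
}

# Usage:
#   revert_range v1.0.0 HEAD
#   git push origin main
```

A single rollback commit keeps the post-mortem timeline readable and can itself be reverted later, once the real fix is ready.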

Key Takeaways

  • Revert first if the root cause isn't immediately clear.
  • Use git log --since and git diff to triage quickly.
  • Hotfix when the cause is known and the fix is small.
  • Post-mortems use Git history to build the timeline and trace the root cause.
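
The timeline part of a post-mortem can be drafted straight from Git history. This is a sketch under the assumption that the incident window is known and that commit times on the deploy branch approximate deploy times; the function name `incident_timeline` and its argument order are hypothetical.

```bash
# incident_timeline START END [BRANCH]
#   START, END = any date strings that git's --since/--until accept
#   BRANCH     = ref to inspect (defaults to main)
# Prints one line per commit, oldest first:
#   <commit date>  <short hash>  <author>  <subject>
incident_timeline() {
  local start=$1
  local end=$2
  local branch=${3:-main}

  git log --reverse \
    --since="$start" --until="$end" \
    --date=iso-strict \
    --format='%cd  %h  %an  %s' \
    "$branch"
}

# Usage:
#   incident_timeline "3 hours ago" "now"
```

The output uses commit dates (`%cd`) because `--since` and `--until` filter on commit date, so the timestamps shown are the same ones used for filtering.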

Interview Prep

  • Q: During a production incident, should you revert or hotfix? A: If the root cause is clear and the fix is simple, hotfix. If not, revert the last deploy immediately to stop the bleeding, then investigate. The priority is restoring service, not fixing the bug.

  • Q: How does Git help during incident response? A: git log --since shows recent changes, git diff shows what changed between versions, git blame identifies who changed specific lines, and git revert quickly undoes problematic commits.

  • Q: What should a post-mortem include? A: Timeline (when detected, mitigated, resolved), root cause (link to specific commit), impact (users/revenue affected), and action items (tests to add, checks to implement, process improvements).

Topics Covered

Workflows, Incident Response

Tags

#git #incident #response #workflow

Last Updated

2026-02-13