The Decision Journal for Technical Choices
What It Is
A decision journal is a structured log where you record important technical decisions before you make them, including your reasoning, alternatives considered, expected outcomes, and confidence level. Then, weeks or months later, you review what actually happened and analyze where your thinking was accurate or flawed.
The practice comes from investment management, where fund managers use decision journals to improve their judgment by creating a feedback loop between predictions and outcomes. Annie Duke popularized the technique in “Thinking in Bets,” and it’s remarkably applicable to software engineering.
Why It Works
The Feedback Problem in Engineering
Most engineers never develop systematic decision-making skills because we don’t get clear feedback on our technical choices. Consider:
You choose Database A over Database B for a project. Three months later, the project is successful. Was your database choice right? You’ll never know—maybe Database B would have been even better.
You decide to refactor a module instead of rewriting it. The refactor takes longer than expected. Was this a bad decision? Perhaps—but maybe the rewrite would have taken even longer.
You advocate for microservices architecture. Two years later, the system is complex and hard to debug. Was microservices the wrong choice? Or was it right but poorly implemented?
Without structured reflection, these experiences don’t improve your judgment. You might remember the outcomes, but you forget the context, alternatives, and reasoning that led to each decision.
The Hindsight Bias Trap
Human memory is terrible at preserving your original reasoning. Research shows we unconsciously rewrite our memories to make past decisions seem more obvious than they were:
- When choices work out, we remember being more confident than we actually were
- When they fail, we convince ourselves we “knew all along” it was risky
- We forget the alternatives we seriously considered
- We misremember the information available at decision time
A decision journal fixes this by preserving your actual thinking at the moment of decision, creating an objective record for later analysis.
How to Implement It
1. Choose Which Decisions to Journal
You can’t journal every decision—you’d spend all day writing. Focus on:
High-stakes technical choices:
- Architecture decisions (monolith vs. microservices, database selection, API design patterns)
- Technology adoption (new frameworks, languages, infrastructure)
- Performance vs. complexity tradeoffs
- Build vs. buy decisions
- Technical debt remediation priorities
Medium-stakes recurring decisions:
- Code organization approaches
- Testing strategies for specific features
- Deployment sequencing
- API versioning approaches
Rule of thumb: If the decision will affect the codebase for more than 3 months or require more than 2 weeks of work to reverse, journal it.
2. The Decision Entry Template
When making a decision, create an entry with these sections:
A. Context (2-3 sentences)
What problem are you solving? What constraints exist?
Example: “Our API response times have degraded to 400ms P95 as data volume increased 10x. We need to improve performance without a complete rewrite. Timeline: 4 weeks. Budget: 1 engineer.”
B. Alternatives Considered (3-5 options)
What approaches did you evaluate? Include the obvious choice you’re rejecting.
Example:
1. Add read replicas and load balance queries
2. Implement caching layer (Redis)
3. Optimize slow queries (identified 3 main culprits)
4. Move to NoSQL database
5. Vertical scaling (larger database instance)
C. Decision Made
What did you choose?
Example: “Implementing Redis caching layer + optimizing the top 3 slow queries (combined approach of #2 and #3)”
D. Reasoning (Most important section)
Why this choice over alternatives? What assumptions are you making?
Example:
- Caching alone (#2) gives ~60% of needed improvement based on query patterns, but not enough
- Query optimization (#3) alone is fragile—will break again with next 2x growth
- Combined approach gets us to target and buys time for larger architectural changes
- Read replicas (#1) add operational complexity we don’t have resources to manage well
- NoSQL migration (#4) is too risky for our 4-week timeline
- Vertical scaling (#5) is expensive and doesn’t address inefficient queries
Key assumptions:
- Current query patterns remain stable (75% of requests hit the same 100 records)
- Cache invalidation strategy will be straightforward for our use case
- Team has Redis experience (we use it elsewhere)
E. Expected Outcomes (Measurable)
What will success look like? Be specific.
Example:
- P95 response time: <150ms (currently 400ms)
- Cache hit rate: >70%
- Implementation time: 2-3 weeks
- No increase in error rates
- Operational overhead: <2 hours/week
F. Confidence Level (0-100%)
How confident are you this is the right choice?
Example: “75% confidence this is the right approach. Main uncertainty: cache invalidation might be more complex than expected.”
G. Review Date
When will you evaluate this decision?
Example: “Review in 6 weeks (December 20, 2025)”
3. The Review Process
This is where learning happens. At your review date, add:
H. Actual Outcomes
What really happened? Use metrics where possible.
Example:
- P95 response time: 125ms (better than target!)
- Cache hit rate: 82% (better than expected)
- Implementation time: 3.5 weeks (slightly over the estimate)
- Error rate increased by 0.08% (cache failures during deployment)
- Operational overhead: ~1 hour/week
I. Analysis
Where was your thinking accurate? Where was it wrong?
Example:
Got right:
- Combined approach was correct—needed both caching and optimization
- Cache hit rate was indeed high due to concentrated query patterns
- Performance improvement met targets
Got wrong:
- Underestimated implementation time by 25%: cache invalidation WAS more complex than expected (my key assumption failed validation)
- Didn’t anticipate deployment risk—should have planned more gradual rollout
Luck vs. skill:
- Somewhat lucky that query patterns remained stable—we hadn’t validated this assumption deeply
- Cache hit rate was better than expected partly due to a feature that launched during implementation, increasing query repetition
J. Lessons Learned
What will you do differently next time?
Example:
- When assuming “X will be straightforward,” explicitly test that assumption earlier
- For infrastructure changes, always plan gradual rollout even if timeline is tight
- 25% time buffer on estimates when integration points are uncertain
- Validate data pattern assumptions with deeper analysis (we got lucky here)
Real-World Examples
Example 1: The Microservices Decision
Context: Monolithic application, team growing from 8 to 20 engineers, deployment bottlenecks.
Decision: Split into 5 service domains
Reasoning: Team size hitting coordination limits, domain boundaries clear, deployment independence worth operational cost.
Expected Outcome: 50% reduction in deployment conflicts, 2x deployment frequency
6-Month Review:
- Deployment frequency increased 3x (better than expected)
- But: Debugging distributed issues took 40% longer than anticipated
- Team velocity initially decreased 20% during transition (hadn’t predicted this)
Lesson: Should have planned for a 3-month velocity dip and communicated it to stakeholders. Otherwise, the decision was correct.
Example 2: The TypeScript Migration
Context: Large JavaScript codebase, increasing bug rate from type errors
Decision: Incremental migration to TypeScript over 6 months
Reasoning: All-at-once rewrite too risky, team familiar with TS, tooling support good
Expected Outcome: 30% reduction in production type errors, 20% slower development initially
9-Month Review:
- Production type errors down 45% (better than expected)
- Development initially 30% slower (worse than expected), but back to baseline by month 5
- Unexpected benefit: Onboarding new engineers 40% faster due to types serving as documentation
Lesson: Underestimated the learning curve, but underestimated the benefits too. The decision was correct, but more time should have been allocated to team training upfront.
Common Pitfalls
1. Outcome Bias in Reviews
Pitfall: Judging decision quality by outcome rather than reasoning.
A good decision with a bad outcome is still a good decision (you played the odds correctly but got unlucky). A bad decision with a good outcome is still a bad decision (you got lucky).
Fix: In your review, explicitly separate “Was my reasoning sound given what I knew?” from “Did it work out?”
2. Vague Expected Outcomes
Pitfall: “This will improve performance”
Fix: “This will reduce P95 latency from 400ms to <150ms”
Measurable expectations enable real learning.
3. Not Recording Alternatives
Pitfall: Only writing what you decided
Fix: Document what you didn’t choose and why. This prevents hindsight bias (“I should have done X” when you never seriously considered X).
4. Skipping the Review
Pitfall: Writing the journal but never reviewing it
Fix: Set calendar reminders for review dates. If the decision’s impact isn’t clear yet, set a new review date—don’t just abandon it.
5. Only Journaling Big Decisions
Pitfall: Waiting for “important enough” decisions
Fix: Journal medium-sized decisions too. You need volume to detect patterns in your thinking. Aim for 2-3 entries per month.
Advanced Techniques
Pattern Recognition
After 20-30 entries, analyze for patterns:
- Which types of decisions do you consistently get right/wrong?
- Do you over/under-estimate implementation time?
- Are you overconfident in certain domains?
- Do specific cognitive biases recur? (e.g., sunk cost fallacy, optimism bias)
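If you keep entries in a machine-readable form alongside the prose, even a tiny script can surface these patterns. The sketch below is illustrative only: the field names (category, estimated_weeks, actual_weeks, confidence, outcome_met) are hypothetical and should match however you actually record entries.

```python
# Hypothetical: decision-journal entries exported as structured records.
# Field names here are illustrative, not a prescribed schema.
from collections import defaultdict

entries = [
    {"category": "infrastructure", "estimated_weeks": 3, "actual_weeks": 3.5,
     "confidence": 0.75, "outcome_met": True},
    {"category": "architecture", "estimated_weeks": 8, "actual_weeks": 12,
     "confidence": 0.80, "outcome_met": True},
    {"category": "infrastructure", "estimated_weeks": 2, "actual_weeks": 2,
     "confidence": 0.60, "outcome_met": False},
]

# How far off are time estimates, per decision category?
ratios = defaultdict(list)
for entry in entries:
    ratios[entry["category"]].append(entry["actual_weeks"] / entry["estimated_weeks"])

for category, values in sorted(ratios.items()):
    average = sum(values) / len(values)
    print(f"{category}: actual/estimated = {average:.2f}x over {len(values)} decisions")
```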
Team Decision Journals
For architecture decisions, create shared journals:
- Multiple team members record their individual predictions
- Review together, discussing divergent expectations
- Builds team calibration and shared mental models
Confidence Calibration
Track your confidence levels against outcomes:
- When you’re 70% confident, are you right 70% of the time?
- Most people are overconfident—learning your actual calibration improves judgment
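One rough way to check, sketched below with made-up (confidence, worked_out) pairs standing in for real entries: bucket decisions by stated confidence and compare each bucket's stated probability with its observed success rate.

```python
# Calibration check: does "70% confident" actually mean right ~70% of the time?
# The (confidence, worked_out) pairs are hypothetical sample data.
from collections import defaultdict

decisions = [(0.9, True), (0.9, True), (0.8, True), (0.8, False),
             (0.7, True), (0.7, True), (0.7, False), (0.6, False)]

buckets = defaultdict(list)
for confidence, worked_out in decisions:
    buckets[round(confidence, 1)].append(worked_out)  # group to the nearest 10%

for confidence in sorted(buckets):
    outcomes = buckets[confidence]
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"Stated {confidence:.0%} -> right {hit_rate:.0%} of the time ({len(outcomes)} decisions)")
```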
Time Investment
Per decision:
- Initial entry: 15-20 minutes
- Review: 10-15 minutes
Monthly total: ~1.5 hours for 3 decisions
ROI: Avoiding one bad architectural decision every 6 months easily justifies this investment. Most Staff Engineers make decisions affecting millions of dollars of engineering time—improving that judgment by even 10% is enormously valuable.
Tools and Formats
Simple approach: Plain text files or markdown in your notes system
Structured approach: Spreadsheet with columns for each section
Advanced approach: Tools like Notion, Obsidian, or custom databases with tagging and searching
The format matters less than consistency. Pick something you’ll actually use.
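If you go the plain-markdown route, a small helper can remove the friction of starting each entry. This is a minimal sketch; the folder name, filename pattern, and section headings are just one possible convention:

```python
# Minimal sketch: scaffold a new decision-journal entry as a dated markdown file.
# The directory, filename pattern, and headings are one possible convention.
from datetime import date, timedelta
from pathlib import Path

TEMPLATE = """# Decision: {title}
Date: {today}    Review on: {review}

## A. Context
## B. Alternatives Considered
## C. Decision Made
## D. Reasoning (incl. key assumptions)
## E. Expected Outcomes (measurable)
## F. Confidence Level (0-100%)
## G. Review Date: {review}

<!-- Fill in at review time -->
## H. Actual Outcomes
## I. Analysis (right / wrong / luck vs. skill)
## J. Lessons Learned
"""

def new_entry(title: str, review_weeks: int = 6, folder: str = "decision-journal") -> Path:
    today = date.today()
    review = today + timedelta(weeks=review_weeks)
    path = Path(folder) / f"{today.isoformat()}-{title.lower().replace(' ', '-')}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(TEMPLATE.format(title=title, today=today, review=review))
    return path

# Example: new_entry("Redis caching layer", review_weeks=6)
```

Pair whatever scaffolding you use with a calendar reminder for the review date, since the file alone won't prompt you to come back.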
Why This Matters for Staff Engineers
As you advance in your career, the leverage of your decisions grows steeply. At senior levels, you might make 5-10 high-stakes technical decisions per year that affect dozens of engineers and millions of dollars in business value.
The difference between making these decisions at 60% accuracy vs. 80% accuracy is massive:
- Fewer failed initiatives
- Better resource allocation
- Faster problem-solving through pattern recognition
- Increased trust from leadership (decisions backed by track record)
A decision journal is compound learning—each entry makes future decisions slightly better, and the effect accumulates over years.
The engineers who reach Principal and Distinguished levels aren’t just smart—they have exceptional judgment developed through deliberate practice. A decision journal is that deliberate practice made systematic.
Getting Started
Start this week:
- Create a simple template (text file, doc, or note)
- Identify one technical decision you’re making this week
- Spend 15 minutes filling out the entry
- Set a review reminder for 4-8 weeks from now
- When the reminder fires, spend 10 minutes on the review
That’s it. After 3-4 entries, the process becomes natural. After 20 entries, you’ll notice your decision-making improving.
The hardest part is starting. The second-hardest part is doing the reviews (our brains resist confronting prediction errors). But the engineers who push through these two obstacles develop judgment that becomes their career superpower.
Your future self will thank you for the feedback loop you’re creating today.