The Decision Journal for Technical Choices
What It Is
A decision journal is a structured log where you record important technical decisions before you make them, including your reasoning, alternatives considered, expected outcomes, and confidence level. Then, weeks or months later, you review what actually happened and analyze where your thinking was accurate or flawed.
The practice comes from investment management, where fund managers use decision journals to improve their judgment by creating a feedback loop between predictions and outcomes. Annie Duke popularized the technique in “Thinking in Bets,” and it’s remarkably applicable to software engineering.
Why It Works
The Feedback Problem in Engineering
Most engineers never develop systematic decision-making skills because we don’t get clear feedback on our technical choices. Consider:
You choose Database A over Database B for a project. Three months later, the project is successful. Was your database choice right? You’ll never know—maybe Database B would have been even better.
You decide to refactor a module instead of rewriting it. The refactor takes longer than expected. Was this a bad decision? Perhaps—but maybe the rewrite would have taken even longer.
You advocate for microservices architecture. Two years later, the system is complex and hard to debug. Was microservices the wrong choice? Or was it right but poorly implemented?
Without structured reflection, these experiences don’t improve your judgment. You might remember the outcomes, but you forget the context, alternatives, and reasoning that led to each decision.
The Hindsight Bias Trap
Human memory is terrible at preserving your original reasoning. Research shows we unconsciously rewrite our memories to make past decisions seem more obvious than they were:
- When choices work out, we remember being more confident than we actually were
- When they fail, we convince ourselves we “knew all along” it was risky
- We forget the alternatives we seriously considered
- We misremember the information available at decision time
A decision journal fixes this by preserving your actual thinking at the moment of decision, creating an objective record for later analysis.
How to Implement It
1. Choose Which Decisions to Journal
You can’t journal every decision—you’d spend all day writing. Focus on:
High-stakes technical choices:
- Architecture decisions (monolith vs. microservices, database selection, API design patterns)
- Technology adoption (new frameworks, languages, infrastructure)
- Performance vs. complexity tradeoffs
- Build vs. buy decisions
- Technical debt remediation priorities
Medium-stakes recurring decisions:
- Code organization approaches
- Testing strategies for specific features
- Deployment sequencing
- API versioning approaches
Rule of thumb: If the decision will affect the codebase for more than 3 months or require more than 2 weeks of work to reverse, journal it.
2. The Decision Entry Template
When making a decision, create an entry with these sections:
A. Context (2-3 sentences)
What problem are you solving? What constraints exist?
Example: “Our API response times have degraded to 400ms P95 as data volume increased 10x. We need to improve performance without a complete rewrite. Timeline: 4 weeks. Budget: 1 engineer.”
B. Alternatives Considered (3-5 options)
What approaches did you evaluate? Include the obvious choice you’re rejecting.
Example:
1. Add read replicas and load balance queries
2. Implement caching layer (Redis)
3. Optimize slow queries (identified 3 main culprits)
4. Move to NoSQL database
5. Vertical scaling (larger database instance)
C. Decision Made
What did you choose?
Example: “Implementing Redis caching layer + optimizing the top 3 slow queries (combined approach of #2 and #3)”
D. Reasoning (Most important section)
Why this choice over alternatives? What assumptions are you making?
Example:
- Caching alone (#2) gives ~60% of needed improvement based on query patterns, but not enough
- Query optimization (#3) alone is fragile—will break again with next 2x growth
- Combined approach gets us to target and buys time for larger architectural changes
- Read replicas (#1) add operational complexity we don’t have resources to manage well
- NoSQL migration (#4) is too risky for our 4-week timeline
- Vertical scaling (#5) is expensive and doesn’t address inefficient queries
Key assumptions:
- Current query patterns remain stable (75% of requests hit the same 100 records)
- Cache invalidation strategy will be straightforward for our use case
- Team has Redis experience (we use it elsewhere)
E. Expected Outcomes (Measurable)
What will success look like? Be specific.
Example:
- P95 response time: <150ms (currently 400ms)
- Cache hit rate: >70%
- Implementation time: 2-3 weeks
- No increase in error rates
- Operational overhead: <2 hours/week
F. Confidence Level (0-100%)
How confident are you this is the right choice?
Example: “75% confidence this is the right approach. Main uncertainty: cache invalidation might be more complex than expected.”
G. Review Date
When will you evaluate this decision?
Example: “Review in 6 weeks (December 20, 2025)”
3. The Review Process
This is where learning happens. At your review date, add:
H. Actual Outcomes
What really happened? Use metrics where possible.
Example:
- P95 response time: 125ms (better than target!)
- Cache hit rate: 82% (better than expected)
- Implementation time: 3.5 weeks (slightly over the estimate)
- Error rate increased by 0.08% (cache failures during deployment)
- Operational overhead: ~1 hour/week
I. Analysis
Where was your thinking accurate? Where was it wrong?
Example:
Got right:
- Combined approach was correct—needed both caching and optimization
- Cache hit rate was indeed high due to concentrated query patterns
- Performance improvement met targets
Got wrong:
- Underestimated implementation time by 25%: cache invalidation WAS more complex than expected (my key assumption failed validation)
- Didn’t anticipate deployment risk—should have planned more gradual rollout
Luck vs. skill:
- Somewhat lucky that query patterns remained stable—we hadn’t validated this assumption deeply
- Cache hit rate was better than expected partly due to a feature that launched during implementation, increasing query repetition
J. Lessons Learned
What will you do differently next time?
Example:
- When assuming “X will be straightforward,” explicitly test that assumption earlier
- For infrastructure changes, always plan gradual rollout even if timeline is tight
- 25% time buffer on estimates when integration points are uncertain
- Validate data pattern assumptions with deeper analysis (we got lucky here)
Real-World Examples
Example 1: The Microservices Decision
Context: Monolithic application, team growing from 8 to 20 engineers, deployment bottlenecks.
Decision: Split into 5 service domains
Reasoning: Team size hitting coordination limits, domain boundaries clear, deployment independence worth operational cost.
Expected Outcome: 50% reduction in deployment conflicts, 2x deployment frequency
6-Month Review:
- Deployment frequency increased 3x (better than expected)
- But: Debugging distributed issues took 40% longer than anticipated
- Team velocity initially decreased 20% during transition (hadn’t predicted this)
Lesson: Should have planned for a 3-month velocity dip and communicated it to stakeholders. Otherwise, the decision was correct.
Example 2: The TypeScript Migration
Context: Large JavaScript codebase, increasing bug rate from type errors
Decision: Incremental migration to TypeScript over 6 months
Reasoning: All-at-once rewrite too risky, team familiar with TS, tooling support good
Expected Outcome: 30% reduction in production type errors, 20% slower development initially
9-Month Review:
- Production type errors down 45% (better than expected)
- Development initially 30% slower (worse than expected), but back to baseline by month 5
- Unexpected benefit: Onboarding new engineers 40% faster due to types serving as documentation
Lesson: Underestimated the learning curve, but underestimated the benefits too. The decision was correct, but more time should have been allocated to team training upfront.
Common Pitfalls
1. Outcome Bias in Reviews
Pitfall: Judging decision quality by outcome rather than reasoning.
A good decision with a bad outcome is still a good decision (you played the odds correctly but got unlucky). A bad decision with a good outcome is still a bad decision (you got lucky).
Fix: In your review, explicitly separate “Was my reasoning sound given what I knew?” from “Did it work out?”
2. Vague Expected Outcomes
Pitfall: “This will improve performance”
Fix: “This will reduce P95 latency from 400ms to <150ms”
Measurable expectations enable real learning.
3. Not Recording Alternatives
Pitfall: Only writing what you decided
Fix: Document what you didn’t choose and why. This prevents hindsight bias (“I should have done X” when you never seriously considered X).
4. Skipping the Review
Pitfall: Writing the journal but never reviewing it
Fix: Set calendar reminders for review dates. If the decision’s impact isn’t clear yet, set a new review date—don’t just abandon it.
5. Only Journaling Big Decisions
Pitfall: Waiting for “important enough” decisions
Fix: Journal medium-sized decisions too. You need volume to detect patterns in your thinking. Aim for 2-3 entries per month.
Advanced Techniques
Pattern Recognition
After 20-30 entries, analyze for patterns:
- Which types of decisions do you consistently get right/wrong?
- Do you over/under-estimate implementation time?
- Are you overconfident in certain domains?
- Do specific cognitive biases recur? (e.g., sunk cost fallacy, optimism bias)
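If you keep entries in a machine-readable form alongside the prose, even a tiny script can surface these patterns. The sketch below is illustrative only: the field names (category, estimated_weeks, actual_weeks, confidence, outcome_met) are hypothetical and should match however you actually record entries.

```python
# Hypothetical: decision-journal entries exported as structured records.
# Field names here are illustrative, not a prescribed schema.
from collections import defaultdict

entries = [
    {"category": "infrastructure", "estimated_weeks": 3, "actual_weeks": 3.5,
     "confidence": 0.75, "outcome_met": True},
    {"category": "architecture", "estimated_weeks": 8, "actual_weeks": 12,
     "confidence": 0.80, "outcome_met": True},
    {"category": "infrastructure", "estimated_weeks": 2, "actual_weeks": 2,
     "confidence": 0.60, "outcome_met": False},
]

# How far off are time estimates, per decision category?
ratios = defaultdict(list)
for entry in entries:
    ratios[entry["category"]].append(entry["actual_weeks"] / entry["estimated_weeks"])

for category, values in sorted(ratios.items()):
    average = sum(values) / len(values)
    print(f"{category}: actual/estimated = {average:.2f}x over {len(values)} decisions")
```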
Team Decision Journals
For architecture decisions, create shared journals:
- Multiple team members record their individual predictions
- Review together, discussing divergent expectations
- Builds team calibration and shared mental models
Confidence Calibration
Track your confidence levels against outcomes:
- When you’re 70% confident, are you right 70% of the time?
- Most people are overconfident—learning your actual calibration improves judgment
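One rough way to check, sketched below with made-up (confidence, worked_out) pairs standing in for real entries: bucket decisions by stated confidence and compare each bucket's stated probability with its observed success rate.

```python
# Calibration check: does "70% confident" actually mean right ~70% of the time?
# The (confidence, worked_out) pairs are hypothetical sample data.
from collections import defaultdict

decisions = [(0.9, True), (0.9, True), (0.8, True), (0.8, False),
             (0.7, True), (0.7, True), (0.7, False), (0.6, False)]

buckets = defaultdict(list)
for confidence, worked_out in decisions:
    buckets[round(confidence, 1)].append(worked_out)  # group to the nearest 10%

for confidence in sorted(buckets):
    outcomes = buckets[confidence]
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"Stated {confidence:.0%} -> right {hit_rate:.0%} of the time ({len(outcomes)} decisions)")
```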
Time Investment
Per decision:
- Initial entry: 15-20 minutes
- Review: 10-15 minutes
Monthly total: ~1.5 hours for 3 decisions
ROI: Avoiding one bad architectural decision every 6 months easily justifies this investment. Most Staff Engineers make decisions affecting millions of dollars of engineering time—improving that judgment by even 10% is enormously valuable.
Tools and Formats
Simple approach: Plain text files or markdown in your notes system
Structured approach: Spreadsheet with columns for each section
Advanced approach: Tools like Notion, Obsidian, or custom databases with tagging and searching
The format matters less than consistency. Pick something you’ll actually use.
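If you go the plain-markdown route, a small helper can remove the friction of starting each entry. This is a minimal sketch; the folder name, filename pattern, and section headings are just one possible convention:

```python
# Minimal sketch: scaffold a new decision-journal entry as a dated markdown file.
# The directory, filename pattern, and headings are one possible convention.
from datetime import date, timedelta
from pathlib import Path

TEMPLATE = """# Decision: {title}
Date: {today}    Review on: {review}

## A. Context
## B. Alternatives Considered
## C. Decision Made
## D. Reasoning (incl. key assumptions)
## E. Expected Outcomes (measurable)
## F. Confidence Level (0-100%)
## G. Review Date: {review}

<!-- Fill in at review time -->
## H. Actual Outcomes
## I. Analysis (right / wrong / luck vs. skill)
## J. Lessons Learned
"""

def new_entry(title: str, review_weeks: int = 6, folder: str = "decision-journal") -> Path:
    today = date.today()
    review = today + timedelta(weeks=review_weeks)
    path = Path(folder) / f"{today.isoformat()}-{title.lower().replace(' ', '-')}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(TEMPLATE.format(title=title, today=today, review=review))
    return path

# Example: new_entry("Redis caching layer", review_weeks=6)
```

Pair whatever scaffolding you use with a calendar reminder for the review date, since the file alone won't prompt you to come back.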
Why This Matters for Staff Engineers
As you advance in your career, the leverage of your decisions grows steeply. At senior levels, you might make 5-10 high-stakes technical decisions per year that affect dozens of engineers and millions of dollars in business value.
The difference between making these decisions at 60% accuracy vs. 80% accuracy is massive:
- Fewer failed initiatives
- Better resource allocation
- Faster problem-solving through pattern recognition
- Increased trust from leadership (decisions backed by track record)
A decision journal is compound learning—each entry makes future decisions slightly better, and the effect accumulates over years.
The engineers who reach Principal and Distinguished levels aren’t just smart—they have exceptional judgment developed through deliberate practice. A decision journal is that deliberate practice made systematic.
Getting Started
Start this week:
- Create a simple template (text file, doc, or note)
- Identify one technical decision you’re making this week
- Spend 15 minutes filling out the entry
- Set a review reminder for 4-8 weeks from now
- When the reminder fires, spend 10 minutes on the review
That’s it. After 3-4 entries, the process becomes natural. After 20 entries, you’ll notice your decision-making improving.
The hardest part is starting. The second-hardest part is doing the reviews (our brains resist confronting prediction errors). But the engineers who push through these two obstacles develop judgment that becomes their career superpower.
Your future self will thank you for the feedback loop you’re creating today.