# Evaluation Matrix
Score options on explicit criteria. Never decide by vibe. The matrix exists so the user can see *which assumption* would have to flip for the recommendation to flip.
## Process
- Pick criteria (defaults below; tailor with the user - ask one at a time).
- Weight criteria 1-5 (5 = decisive). Default: all 3.
- Score each option per criterion 1-5.
- Compute weighted totals. Show the top-2 plus the dark-horse (a computation sketch follows this list).
- Run a sensitivity check: which single weight or score, if flipped, changes the winner?
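A minimal sketch of the totals step, assuming both weights and scores use the 1-5 scales above; the function name, variable names, and the two example options are illustrative only.

```python
# Illustrative sketch: weighted totals for an evaluation matrix.
# Weights and scores both use the 1-5 scales described above.

def weighted_totals(weights: dict[str, int],
                    scores: dict[str, dict[str, int]]) -> dict[str, int]:
    """Return {option: sum(weight * score)} across all criteria."""
    return {
        option: sum(weights[criterion] * value
                    for criterion, value in per_criterion.items())
        for option, per_criterion in scores.items()
    }

weights = {"Outcome value": 5, "Reversibility": 4}            # example weights
scores = {
    "A (no-build)":   {"Outcome value": 1, "Reversibility": 5},
    "C (lean build)": {"Outcome value": 4, "Reversibility": 3},
}
totals = weighted_totals(weights, scores)                     # {'A (no-build)': 25, 'C (lean build)': 32}
ranking = sorted(totals, key=totals.get, reverse=True)        # ['C (lean build)', 'A (no-build)']
```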
## Default criteria - software
| Criterion | What you're scoring |
|---|---|
| Outcome value | How much of the basic function is delivered. |
| Implementation effort | Inverse of cost. 5 = trivial. |
| Reversibility | How quickly we can undo it if we're wrong (5 = a config flip). |
| Risk surface | Inverse of risk; data-loss/permission/migration weigh heaviest. |
| Operational burden | On-call, runbooks, dashboards, manual ops. |
| User comprehension | Will a typical user grok this without training? |
| Testability | Can we prove correctness before we ship? |
| Architecture fit | Conformance with existing patterns; deviation costs more. |
| Time to feedback | How quickly we'll learn whether the bet is right. |
Drop any criterion the user explicitly waives. Add domain-specific ones (compliance, latency budget, infra cost) when relevant.
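If it helps to make that tailoring concrete, here is one possible representation as a simple weight map; the waived criterion and the added domain-specific one are invented for the example.

```python
# Illustrative sketch: tailor the default criteria set for one decision.
default_weights = {
    "Outcome value": 3, "Implementation effort": 3, "Reversibility": 3,
    "Risk surface": 3, "Operational burden": 3, "User comprehension": 3,
    "Testability": 3, "Architecture fit": 3, "Time to feedback": 3,
}

waived = {"User comprehension"}            # user explicitly waived this one
domain_specific = {"Latency budget": 4}    # added because it matters for this decision

weights = {c: w for c, w in default_weights.items() if c not in waived}
weights.update(domain_specific)
```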
## Default criteria - non-software
| Criterion | Notes |
|---|---|
| Outcome value | Movement toward the stated outcome. |
| Cost | Money + opportunity cost. |
| Time to value | Calendar weeks until benefit lands. |
| Stakeholder fit | Alignment with the people whose buy-in matters. |
| Reversibility | Cost to undo. |
| Risk | Severity x likelihood of negative outcomes. |
| Learning value | What we'll learn that we didn't know before. |
| Energy / morale | Will the team actually do it? |
## Template
Options: A=<no-build>, B=<config-only>, C=<lean build>, D=<full build>, E=<dark-horse>
| Criterion | Weight | A | B | C | D | E |
|---|---|---|---|---|---|---|
| Outcome value | 5 | 1 | 2 | 4 | 5 | 3 |
| Implementation effort | 3 | 5 | 5 | 3 | 1 | 4 |
| Reversibility | 4 | 5 | 5 | 3 | 1 | 4 |
| Risk surface | 4 | 5 | 4 | 3 | 1 | 3 |
| Operational burden | 3 | 5 | 4 | 3 | 1 | 4 |
| User comprehension | 2 | 3 | 4 | 4 | 3 | 2 |
| Testability | 3 | 5 | 4 | 3 | 2 | 3 |
| Architecture fit | 3 | 5 | 3 | 4 | 3 | 2 |
| Time to feedback | 4 | 5 | 5 | 3 | 1 | 4 |
Weighted totals: (compute) - recommend the top-2 plus the dark-horse; a worked computation with these placeholder scores follows.
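The worked computation below uses the placeholder numbers from the template as a sketch only; swap in real weights and scores before relying on the totals.

```python
# Placeholder numbers copied from the template table, in criterion order:
# Outcome value, Implementation effort, Reversibility, Risk surface,
# Operational burden, User comprehension, Testability, Architecture fit, Time to feedback.
weights = [5, 3, 4, 4, 3, 2, 3, 3, 4]
scores = {
    "A": [1, 5, 5, 5, 5, 3, 5, 5, 5],
    "B": [2, 5, 5, 4, 4, 4, 4, 3, 5],
    "C": [4, 3, 3, 3, 3, 4, 3, 4, 3],
    "D": [5, 1, 1, 1, 1, 3, 2, 3, 1],
    "E": [3, 4, 4, 3, 4, 2, 3, 2, 4],
}
totals = {o: sum(w * s for w, s in zip(weights, vals)) for o, vals in scores.items()}
# totals == {'A': 131, 'B': 122, 'C': 103, 'D': 64, 'E': 102}
```

With these placeholder scores the top-2 are A (131) and B (122), and the dark-horse E lands at 102; real scores will move the numbers.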
## Sensitivity check
After the table, state (a flip-scan sketch follows this list):
- The winner.
- The single change that would unseat it (e.g. "if Reversibility drops to weight 2, D wins").
- Which assumption the user should challenge first.
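One way to automate the flip scan, assuming the same layout as the worked example above (a weights list plus a per-option score list); every name here is illustrative, and a manual scan over the table works just as well.

```python
# Illustrative sketch: enumerate single weight or score changes that unseat the winner.

def total(option, weights, scores):
    return sum(w * s for w, s in zip(weights, scores[option]))

def winner(weights, scores):
    return max(scores, key=lambda o: total(o, weights, scores))

def single_flips(weights, scores):
    """Yield (change description, new winner) for every one-value change that flips the result."""
    base = winner(weights, scores)
    for i in range(len(weights)):
        for new_w in range(1, 6):                      # try every legal weight for criterion i
            trial = weights[:i] + [new_w] + weights[i + 1:]
            if (w := winner(trial, scores)) != base:
                yield f"weight[{i}] -> {new_w}", w
    for option, vals in scores.items():
        for i in range(len(vals)):
            for new_s in range(1, 6):                  # try every legal score for this option/criterion
                trial_scores = {**scores, option: vals[:i] + [new_s] + vals[i + 1:]}
                if (w := winner(weights, trial_scores)) != base:
                    yield f"{option} score[{i}] -> {new_s}", w
```

Whatever single change the scan surfaces is a natural candidate for the assumption to challenge first.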
## Pitfalls
- Don't pad with criteria that don't differentiate options.
- Don't let the sponsor set the weights after seeing the scores; weight first, then reveal the winner.
- If two options tie, run a sacrifice test (cut the most expensive function in each) before adding more criteria.