# Evaluation Matrix
Score options on explicit criteria. Never decide by vibe. The matrix exists so the user can see *which assumption* would have to flip for the recommendation to flip.
## Process
- Pick criteria (defaults below; tailor with the user - ask one at a time).
- Weight criteria 1-5 (5 = decisive). Default: all 3.
- Score each option per criterion 1-5.
- Compute weighted totals. Show the top-2 plus the dark-horse (a computation sketch follows this list).
- Run a sensitivity check: which single weight or score, if flipped, changes the winner?
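A minimal sketch of the totals step, assuming both weights and scores use the 1-5 scales above; the function name, variable names, and the two example options are illustrative only.

```python
# Illustrative sketch: weighted totals for an evaluation matrix.
# Weights and scores both use the 1-5 scales described above.

def weighted_totals(weights: dict[str, int],
                    scores: dict[str, dict[str, int]]) -> dict[str, int]:
    """Return {option: sum(weight * score)} across all criteria."""
    return {
        option: sum(weights[criterion] * value
                    for criterion, value in per_criterion.items())
        for option, per_criterion in scores.items()
    }

weights = {"Outcome value": 5, "Reversibility": 4}            # example weights
scores = {
    "A (no-build)":   {"Outcome value": 1, "Reversibility": 5},
    "C (lean build)": {"Outcome value": 4, "Reversibility": 3},
}
totals = weighted_totals(weights, scores)                     # {'A (no-build)': 25, 'C (lean build)': 32}
ranking = sorted(totals, key=totals.get, reverse=True)        # ['C (lean build)', 'A (no-build)']
```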
## Default criteria - software
| Criterion | What you're scoring |
|---|---|
| Outcome value | How much of the basic function is delivered. |
| Implementation effort | Inverse of cost. 5 = trivial. |
| Reversibility | How quickly we can undo it if we're wrong (5 = a config flip). |
| Risk surface | Inverse of risk; data-loss/permission/migration weigh heaviest. |
| Operational burden | On-call, runbooks, dashboards, manual ops. |
| User comprehension | Will a typical user grok this without training? |
| Testability | Can we prove correctness before we ship? |
| Architecture fit | Conformance with existing patterns; deviation costs more. |
| Time to feedback | How quickly we'll learn whether the bet is right. |
Drop any criterion the user explicitly waives. Add domain-specific ones (compliance, latency budget, infra cost) when relevant.
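If it helps to make that tailoring concrete, here is one possible representation as a simple weight map; the waived criterion and the added domain-specific one are invented for the example.

```python
# Illustrative sketch: tailor the default criteria set for one decision.
default_weights = {
    "Outcome value": 3, "Implementation effort": 3, "Reversibility": 3,
    "Risk surface": 3, "Operational burden": 3, "User comprehension": 3,
    "Testability": 3, "Architecture fit": 3, "Time to feedback": 3,
}

waived = {"User comprehension"}            # user explicitly waived this one
domain_specific = {"Latency budget": 4}    # added because it matters for this decision

weights = {c: w for c, w in default_weights.items() if c not in waived}
weights.update(domain_specific)
```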
## Default criteria - non-software
| Criterion | Notes |
|---|---|
| Outcome value | Movement toward the stated outcome. |
| Cost | Money + opportunity cost. |
| Time to value | Calendar weeks until benefit lands. |
| Stakeholder fit | Alignment with the people whose buy-in matters. |
| Reversibility | Cost to undo. |
| Risk | Severity x likelihood of negative outcomes. |
| Learning value | What we'll learn that we didn't know before. |
| Energy / morale | Will the team actually do it? |
## Template
Options: A=<no-build>, B=<config-only>, C=<lean build>, D=<full build>, E=<dark-horse>
| Criterion | Weight | A | B | C | D | E |
|---|---|---|---|---|---|---|
| Outcome value | 5 | 1 | 2 | 4 | 5 | 3 |
| Implementation effort | 3 | 5 | 5 | 3 | 1 | 4 |
| Reversibility | 4 | 5 | 5 | 3 | 1 | 4 |
| Risk surface | 4 | 5 | 4 | 3 | 1 | 3 |
| Operational burden | 3 | 5 | 4 | 3 | 1 | 4 |
| User comprehension | 2 | 3 | 4 | 4 | 3 | 2 |
| Testability | 3 | 5 | 4 | 3 | 2 | 3 |
| Architecture fit | 3 | 5 | 3 | 4 | 3 | 2 |
| Time to feedback | 4 | 5 | 5 | 3 | 1 | 4 |
Weighted totals: (compute) - recommend the top-2 plus the dark-horse; a worked computation with these placeholder scores follows.
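The worked computation below uses the placeholder numbers from the template as a sketch only; swap in real weights and scores before relying on the totals.

```python
# Placeholder numbers copied from the template table, in criterion order:
# Outcome value, Implementation effort, Reversibility, Risk surface,
# Operational burden, User comprehension, Testability, Architecture fit, Time to feedback.
weights = [5, 3, 4, 4, 3, 2, 3, 3, 4]
scores = {
    "A": [1, 5, 5, 5, 5, 3, 5, 5, 5],
    "B": [2, 5, 5, 4, 4, 4, 4, 3, 5],
    "C": [4, 3, 3, 3, 3, 4, 3, 4, 3],
    "D": [5, 1, 1, 1, 1, 3, 2, 3, 1],
    "E": [3, 4, 4, 3, 4, 2, 3, 2, 4],
}
totals = {o: sum(w * s for w, s in zip(weights, vals)) for o, vals in scores.items()}
# totals == {'A': 131, 'B': 122, 'C': 103, 'D': 64, 'E': 102}
```

With these placeholder scores the top-2 are A (131) and B (122), and the dark-horse E lands at 102; real scores will move the numbers.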
## Sensitivity check
After the table, state (a flip-scan sketch follows this list):
- The winner.
- The single change that would unseat it (e.g. "if Reversibility drops to weight 2, D wins").
- Which assumption the user should challenge first.
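One way to automate the flip scan, assuming the same layout as the worked example above (a weights list plus a per-option score list); every name here is illustrative, and a manual scan over the table works just as well.

```python
# Illustrative sketch: enumerate single weight or score changes that unseat the winner.

def total(option, weights, scores):
    return sum(w * s for w, s in zip(weights, scores[option]))

def winner(weights, scores):
    return max(scores, key=lambda o: total(o, weights, scores))

def single_flips(weights, scores):
    """Yield (change description, new winner) for every one-value change that flips the result."""
    base = winner(weights, scores)
    for i in range(len(weights)):
        for new_w in range(1, 6):                      # try every legal weight for criterion i
            trial = weights[:i] + [new_w] + weights[i + 1:]
            if (w := winner(trial, scores)) != base:
                yield f"weight[{i}] -> {new_w}", w
    for option, vals in scores.items():
        for i in range(len(vals)):
            for new_s in range(1, 6):                  # try every legal score for this option/criterion
                trial_scores = {**scores, option: vals[:i] + [new_s] + vals[i + 1:]}
                if (w := winner(weights, trial_scores)) != base:
                    yield f"{option} score[{i}] -> {new_s}", w
```

Whatever single change the scan surfaces is a natural candidate for the assumption to challenge first.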
## Pitfalls
- Don't pad with criteria that don't differentiate options.
- Don't let the sponsor set the weights after seeing the scores; weight first, then reveal the winner.
- If two options tie, run a sacrifice test (cut the most expensive function in each) before adding more criteria.