RICE vs Stack Ranking: When Scoring Frameworks Help and When They Hurt
RICE is the prioritization framework you reach for when you want a number. Stack ranking is what you reach for when you want a decision. They sound like the same thing. They're not.
This post is about why RICE looks rigorous and often isn't, why stack ranking looks crude and often isn't, and how to combine them so the rigor actually pays for itself.
The short version
- RICE scores each item on Reach × Impact × Confidence ÷ Effort. The output is a number per item.
- Stack ranking asks people to put items in order. The output is a list.
- RICE feels objective but is built entirely on subjective inputs that get multiplied together — small estimation errors compound into a fake-precise score.
- Stack ranking is honest about its subjectivity and forces tradeoffs that RICE quietly hides.
- Use RICE for individual analysis when you have real data on Reach and Impact. Use stack ranking for group decisions, especially when teams are aligned on goals but disagree on order.
How RICE actually works
RICE was popularized by Intercom. Each item gets four scores:
- Reach — how many users/customers/events per time period
- Impact — how much it matters per affected user (often 0.25, 0.5, 1, 2, 3)
- Confidence — how sure you are in your numbers (50%, 80%, 100%)
- Effort — person-months required
The score is (Reach × Impact × Confidence) / Effort. Higher score = higher priority.
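As a minimal sketch of the arithmetic (the item names and estimates below are invented for illustration, not pulled from any real backlog):

```python
# RICE scoring sketch. Items and estimates are hypothetical.
items = [
    # (name, reach per quarter, impact, confidence, effort in person-months)
    ("SSO support",        400, 2.0, 0.80, 3.0),
    ("Onboarding revamp", 1500, 1.0, 0.50, 2.0),
    ("CSV export",         250, 0.5, 1.00, 0.5),
]

def rice(reach, impact, confidence, effort):
    # (Reach × Impact × Confidence) ÷ Effort
    return reach * impact * confidence / effort

# Print the list sorted by descending RICE score
for name, *est in sorted(items, key=lambda it: rice(*it[1:]), reverse=True):
    print(f"{name:18s} RICE = {rice(*est):6.1f}")
```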
What RICE does well: Forces you to write down assumptions. Makes invisible thinking explicit. Surfaces effort estimates early. Good as a thinking exercise for individual PMs.
What RICE does poorly: The output is a number that looks objective but is built on four guesses multiplied together. If you're 20% off on each input (which is generous), the final score can be off by a factor of two: overestimate Reach, Impact, and Confidence by 20% each, underestimate Effort by 20%, and the score comes out 1.2³ ÷ 0.8 ≈ 2.2× too high. Two items can have nearly identical RICE scores while everyone on the team knows one is obviously more important.
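To put a number on the compounding, here is a small sketch (hypothetical item, made-up estimates) that perturbs each input by ±20% and prints the range of scores that falls out:

```python
from itertools import product

# One hypothetical item: Reach 1000/quarter, Impact 1, Confidence 80%, Effort 2 person-months
base = dict(reach=1000, impact=1.0, confidence=0.8, effort=2.0)

def rice(reach, impact, confidence, effort):
    return reach * impact * confidence / effort

true_score = rice(**base)

# Every combination of being 20% high or 20% low on each of the four inputs
scores = [
    rice(base["reach"] * r, base["impact"] * i, base["confidence"] * c, base["effort"] * e)
    for r, i, c, e in product([0.8, 1.2], repeat=4)
]

print(f"score with 'true' inputs: {true_score:.0f}")                      # 400
print(f"range with ±20% inputs:   {min(scores):.0f} to {max(scores):.0f}")  # 171 to 864
```

The same ±20% uncertainty that feels harmless on any single estimate turns into roughly a 5× spread between the lowest and highest score the item could get.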
The deeper problem: RICE rewards confident estimators. The PM who writes "Impact: 3, Confidence: 100%" beats the PM who writes "Impact: 2, Confidence: 80%" — not because they're right, but because they typed bigger numbers. Over time, this trains the team to inflate. Soon every Impact is 2 or 3 and the framework collapses.
How stack ranking actually works
Stack ranking has one rule: every item gets a position. No ties. No buckets. If you think A and B are equally important, you still have to put one above the other — and being forced to decide which goes first is the value of the exercise.
In a group setting, every team member ranks individually. Then you aggregate the rankings with a pairwise method such as the Schulze method, which builds a final order from how the items fare against each other head-to-head across everyone's ballots.
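A minimal sketch of that aggregation step, assuming every ballot ranks the same set of items and ignoring ties and other edge cases (this is not ForceRank's implementation, just an illustration of the idea):

```python
from itertools import permutations

def schulze_order(ballots):
    """Aggregate individual rankings (best-first lists) into one order
    with the Schulze method. Sketch only: assumes complete ballots, no ties."""
    items = ballots[0]
    positions = [{item: rank for rank, item in enumerate(b)} for b in ballots]

    # d[a][b] = number of voters who rank a above b
    d = {a: {b: 0 for b in items if b != a} for a in items}
    for pos in positions:
        for a, b in permutations(items, 2):
            if pos[a] < pos[b]:
                d[a][b] += 1

    # Strongest-path strengths, Floyd-Warshall style
    s = {a: {b: (d[a][b] if d[a][b] > d[b][a] else 0) for b in items if b != a}
         for a in items}
    for i in items:
        for j in items:
            if j == i:
                continue
            for k in items:
                if k == i or k == j:
                    continue
                s[j][k] = max(s[j][k], min(s[j][i], s[i][k]))

    # Rank items by how many beatpath wins they have
    wins = {a: sum(1 for b in items if b != a and s[a][b] > s[b][a]) for a in items}
    return sorted(items, key=lambda a: wins[a], reverse=True)

# Three hypothetical team members ranking four items, best first
ballots = [
    ["A", "B", "C", "D"],
    ["B", "A", "D", "C"],
    ["A", "C", "B", "D"],
]
print(schulze_order(ballots))  # ['A', 'B', 'C', 'D'] for these ballots
```

Ordering by beatpath wins is a simplification; the full method also defines how to break remaining ties and handle cycles, but the core idea is the same.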
What stack ranking does well: Forces real tradeoffs. Surfaces disagreement explicitly. Robust to estimation error (you don't need precise numbers, just relative judgment). Scales naturally — 100 people can stack rank a list, but 100 people can't agree on what Impact = 2 means.
What stack ranking does poorly: It doesn't capture magnitude. A and B might be the top two, but is A 10× more important than B or 10% more? Stack ranking won't tell you. For very long lists (>25 items), individual rankings get noisy.
The comparison
| Dimension | RICE | Stack Ranking |
|---|---|---|
| Output | A score per item | An ordered list |
| Captures magnitude | Yes (but only if numbers are real) | No (just order) |
| Group exercise | Awkward — whose Impact estimate wins? | Natural — every person ranks |
| Surfaces disagreement | No (one combined score) | Yes (rankings differ) |
| Robust to bad data | No (small errors compound) | Yes (only relative order matters) |
| Time to run | 30+ min per PM, then group debate | 2-5 min per person |
| Defensibility to leadership | "Here's the math" | "Here's what the team ranked" |
| Failure mode | Score inflation, false precision | Doesn't show how big the gaps are |
Why combining them works better than either alone
RICE alone is a single PM's spreadsheet. Stack ranking alone doesn't tell you whether to do 5 things or 15. Used together, each covers the other's weakness:
- PM does individual RICE on the candidate list. This is the right place for RICE — one analyst, working through the items, writing down assumptions. It produces a starting hypothesis and a list of inputs the team can challenge.
- Team does async stack ranking on the same list. No RICE scores shown. Each person ranks based on their own judgment.
- Compare the two. Where the team's stack rank matches the RICE ordering, you have alignment between the analytical view and the team intuition. Ship those.
- Where they diverge is the conversation. If RICE says Item 3 is #1 but the team ranks it #7, either (a) the RICE inputs are wrong (Reach is overestimated, Effort is underestimated) or (b) the team is missing context the analyst saw. Either way, that's a 15-minute discussion, not a 2-hour debate over the whole list.
This pattern — quantitative analysis + group ranking + diff-the-two — is the closest thing to a defensible prioritization process most teams will get.
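The diff step itself is trivial to mechanize. A sketch, with placeholder item names and an arbitrary cutoff of two positions for "worth discussing":

```python
# Compare the PM's RICE ordering against the team's aggregated stack rank.
# Item names are placeholders; the two-position cutoff is a judgment call.
rice_order = ["checkout revamp", "SSO", "mobile perf", "CSV export", "dark mode"]
team_order = ["SSO", "mobile perf", "checkout revamp", "dark mode", "CSV export"]

rice_pos = {item: i for i, item in enumerate(rice_order)}
team_pos = {item: i for i, item in enumerate(team_order)}

agenda = []
for item in rice_order:
    gap = abs(rice_pos[item] - team_pos[item])
    flag = "DISCUSS" if gap >= 2 else "aligned"
    if flag == "DISCUSS":
        agenda.append(item)
    print(f"{item:16s} RICE #{rice_pos[item] + 1}  team #{team_pos[item] + 1}  {flag}")

print("\nDiscussion agenda:", agenda)
```

Everything flagged "aligned" ships without a meeting; the agenda list is the meeting.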
When to use which
Use RICE alone when:
- One PM owns the prioritization decision
- You have real data on Reach (telemetry, customer counts, support ticket volume)
- The downstream consumer of the output wants to see numbers
- You're justifying a decision that's already been made (RICE-as-documentation)
Use stack ranking alone when:
- The decision is owned by a group, not a single person
- You don't have hard data on Reach and Impact (most cases)
- You need to surface team disagreement before locking in a roadmap
- Speed matters more than precision
Use both when:
- The stakes are high (annual planning, big budget allocation)
- You have one PM who can do the analytical work and a team that needs to buy in
- You expect the team and the data to disagree on at least some items — and you want to know which
The trap to avoid
The most common failure mode in RICE-only shops: the framework becomes the decision. The PM produces a RICE-sorted list, the team rubber-stamps it, work begins. Three months later the most-important-by-RICE item is shipped and nobody actually wanted it.
This happens because RICE doesn't reveal disagreement. The number is a number. There's no surface for "I think Impact is 1, not 3" to show up — unless you literally ask each team member to fill in their own RICE row, at which point you've turned a 30-minute analysis into a 4-hour group meeting.
Stack ranking solves this directly: every team member has a ranking, the rankings get compared, and disagreement is visible by construction.
What to do on Monday
If your team uses RICE today and the decisions feel rubber-stamped:
- Take your existing RICE-sorted list.
- Strip the scores. Just the items, in alphabetical order, so the RICE ordering doesn't anchor anyone.
- Have every team member individually rank them. Async. 5 minutes each.
- Aggregate the rankings.
- Lay the team ranking next to the RICE ranking.
- The places they disagree are your real prioritization meeting agenda. Everywhere else, you already had alignment — you just couldn't see it.
A team that does this once usually keeps doing it.
Try stack ranking with your team. ForceRank is built for exactly this — drag-and-drop ranking, async collection, automatic Schulze aggregation, and a side-by-side view that shows where individuals agreed and disagreed. Free for groups up to 20. No signup for participants. The whole exercise typically takes 5 minutes per person.