The one framework every engineer should know
Solve problems, ace interviews, and communicate like an executive using the MECE principle
Hi fellow High Growth Engineer, Jordan here 👋
Today’s article features a special guest, Torsten Walbaum, ex Director at Rippling and manager at Uber and Meta. He’s also the author of The Operator’s Handbook. I’ve been incredibly impressed with Torsten’s writing. He provides insanely in-depth, actionable advice and has a ton of experience and examples to back it up. So strap in.
In this article, you’ll learn a single, hugely applicable framework that takes your communication and problem-solving to a whole new level.
Without further ado, I’ll pass the mic 🎤 to Torsten 👏
I’ve interviewed hundreds of candidates as a hiring manager at Uber, Meta, and Rippling.
The most common reason I rejected candidates, by far, was a lack of structured problem-solving and communication.
If they make a mistake, I can help them correct it. Or if they spend too much time on an unimportant part of the question, I can redirect the conversation.
But if the candidate doesn’t show a structured thought process and I can’t follow their reasoning, that’s a huge problem–even if they get to the right answer.
Why? Because I need to make sure they didn’t just get lucky during the interview and “stumble upon” the right answer. In other words, I want to know they can repeat that success on the job.
Over time, I realized that the best responses all have a similar structure. They sound somewhat like this:
✅ “There are generally four ways we can improve the performance of the system: A, B, C and D. We can eliminate A and B because they are not feasible under the constraints. Between the remaining options, C is the most promising because of [reason], so I’ll focus my answer on this option.”
What makes this a good answer? Instead of diving straight into the technical details, the candidate first gives an overview of possible solutions. As the interviewer, that shows me they analyzed the problem, evaluated tradeoffs, and intentionally chose their approach.
☝ But there’s one key requirement for this to work: The way the candidate structures the problem needs to be MECE (Mutually Exclusive and Collectively Exhaustive). Let’s take a look at what that means:
Let’s say we’re trying to segment our users based on which search engine they use.
We first need to make the user buckets mutually exclusive, i.e. not overlapping:
Since we removed the overlap between the segments, there’s no risk of double counting.
🚫 But there’s another issue: Some people aren’t captured by any of these buckets!
To fix this, we need to make the buckets collectively exhaustive:
👉 Now, our segmentation is crystal clear. These buckets capture everyone, and nobody falls into more than one.
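If you want to sanity-check a segmentation like this in code, here is a minimal sketch in Python. The user names, the bucket labels, and the helper functions `check_mece` and `bucket_for` are all made up for illustration; the idea is simply to count how many buckets each user lands in. Anyone counted more than once breaks "mutually exclusive", and anyone never counted breaks "collectively exhaustive".

```python
from collections import Counter

# Hypothetical sample data: each user and the search engines they use.
USERS = {
    "ana": {"Google"},
    "ben": {"Google", "Bing"},
    "cy": {"DuckDuckGo"},
    "dee": set(),
}


def check_mece(universe, buckets):
    """Return (overlaps, gaps) for buckets given as {name: set_of_members}."""
    counts = Counter(m for members in buckets.values() for m in members)
    overlaps = {m for m, n in counts.items() if n > 1}  # breaks "mutually exclusive"
    gaps = set(universe) - set(counts)                  # breaks "collectively exhaustive"
    return overlaps, gaps


def bucket_for(engines):
    """Assign a user to exactly one bucket (illustrative bucket names)."""
    if engines == {"Google"}:
        return "Google only"
    if engines == {"Bing"}:
        return "Bing only"
    if engines == {"Google", "Bing"}:
        return "Google and Bing"
    return "Everything else"


# First attempt: buckets overlap and miss some users.
naive = {
    "Google users": {u for u, e in USERS.items() if "Google" in e},
    "Bing users": {u for u, e in USERS.items() if "Bing" in e},
}

# MECE attempt: every user is routed to exactly one bucket.
mece = {}
for user, engines in USERS.items():
    mece.setdefault(bucket_for(engines), set()).add(user)

for name, buckets in [("naive", naive), ("mece", mece)]:
    overlaps, gaps = check_mece(USERS, buckets)
    print(name, "overlaps:", sorted(overlaps), "gaps:", sorted(gaps))
```

The catch-all "Everything else" bucket is what makes the second version exhaustive. In practice you would want to keep an eye on how big that bucket gets, since a large catch-all usually means the segmentation is hiding something.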
The MECE framework was developed at the management consulting firm McKinsey, but its central ideas go back to the philosopher Aristotle. It can be applied to almost anything, including engineering use cases, to sharpen your thinking and problem-solving.
⭐ What you’ll learn
In this article, you’ll learn how to apply this principle in 5 real-life use cases with actionable examples:
Acing interviews (we just covered this)
Diagnosing problems
Giving investigation updates
Explaining something to your colleagues
Deciding what to work on
Let’s get into it.
Use case #2: Diagnosing problems
You’re checking some dashboards and notice new signups dropped by 50%. When you flag this, everyone freaks out and an incident is created.
This has been my life for years. It felt like every week I was in a war room dealing with an investigation, and the MECE principle has been a lifesaver during those chaotic periods.
Here’s what usually happens at the beginning of an incident. People throw out some hypotheses for what could be going on, and start investigating those:
❌ “I think I know what’s happening. The signup page might be down again; didn’t that happen last quarter? Or it could be related to that iOS update we shipped. But it’s also December, and it could just be a weak time for signups. Maybe it will go away on its own…”
What’s the problem with this? This set of random hypotheses doesn’t have a structure and is not collectively exhaustive. If you get lucky, you find the problem; but if you’re not lucky, you might spend the entire day chasing down dead ends.
And if you don’t finish the investigation quickly, you’ll likely start going in circles since you lose track of what you already looked into.
🎯 The solution: Break the problem into an issue tree that follows the MECE principle.
How? You start by dividing the possible root causes into high-level groups. Then, you break these groups down further, ensuring every possibility is captured and none of the buckets overlap.
✅ After: “There are two options for what could be happening: Either something is broken, or the data only makes it look like there’s a problem. If something’s broken, it could be isolated to a specific OS, or affect all devices. And the issue could be at any of the three steps of the signup process, so we’ll check them one by one.”
Don’t rely on luck to find the root cause. The MECE framework gives you a structure that ensures you will find it, and makes it easy to…
get everyone on the same page on what’s being investigated, and
divide the work between different people or teams.
One added benefit of using a structure like this: Since you have an overview of all possible things you need to check, you can prioritize what to investigate first.
For example, you can see that checking for a data issue or seasonality first can potentially save you a lengthy investigation.
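To make the prioritization idea concrete, here is a rough sketch of the issue tree from the quote above as a small Python data structure. Everything specific in it is an assumption for illustration: the `Node` class, the `minutes_to_check` estimates, and the generic "Signup step 1/2/3" labels (the three steps aren't named here). It flattens the tree into its leaf hypotheses and orders them so the cheapest checks come first.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    minutes_to_check: int = 0  # hypothetical effort estimate, only meaningful on leaves
    children: list = field(default_factory=list)


def leaves(node, path=()):
    """Yield (path_from_root, minutes_to_check) for every leaf hypothesis."""
    path = path + (node.label,)
    if not node.children:
        yield path, node.minutes_to_check
        return
    for child in node.children:
        yield from leaves(child, path)


def prioritized(tree):
    """Order leaf hypotheses from cheapest to most expensive to check."""
    return sorted(leaves(tree), key=lambda item: item[1])


def signup_steps(minutes):
    # Placeholder labels for the three signup steps, with made-up estimates.
    return [Node(f"Signup step {i}", m) for i, m in enumerate(minutes, start=1)]


tree = Node("New signups dropped by 50%", children=[
    Node("The data only makes it look like a problem", minutes_to_check=10),
    Node("Something is broken", children=[
        Node("Isolated to a specific OS", children=signup_steps([20, 30, 45])),
        Node("Affects all devices", children=signup_steps([15, 25, 40])),
    ]),
])

for path, minutes in prioritized(tree):
    print(f"{minutes:>3} min  " + " > ".join(path[1:]))
```

Sorting purely by effort is the simplest possible rule. You could just as well weigh each branch by how likely it is, but even this crude version shows why a quick data check tends to go first, and each leaf is a self-contained task you can hand to a different person or team.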
Use case #3: Giving investigation updates
Continuing the example from above: Imagine you’re in a Slack channel for the incident you’re investigating, and you need to give regular status updates.
Given the severity of the issue, a lot of people are paying attention to how things are progressing.
This is what a lot of incident updates look like: