Decide what a strong answer covers before the interview.
Interview questions · Problem solving
Field-tested behavioral questions for assessing analytical thinking, judgment under uncertainty, and creativity — plus the evaluation guidance most question banks skip.
How to use these questions
Pick the questions that match what the role actually demands — analytical depth for a data role, calm triage for an on-call engineer, cross-functional influence for a product manager — and ask every candidate the same ones, in the same order. Consistency is what makes answers comparable: if each candidate gets a different interview, you end up comparing impressions, not evidence. Problem solving rewards a tidy narrative more than most competencies, so depth matters even more — two questions pursued through follow-ups beat six asked at the surface.
Each question below includes “what to listen for” — turn those into the criteria on your scorecard.
Memory flattens fast, and the most confident storyteller shouldn't be the tiebreaker.
If you want question variants tuned to a specific role, the free AI interview question generator produces behavioral questions like these for any competency and seniority.
The questions
What to listen for
Follow-ups
Evaluation
The questions get you stories. Evaluation is what turns stories into a hiring decision — and with problem-solving questions, the polished story is the trap.
A strong answer names a real problem, real constraints, and a real outcome, ideally one the candidate can quantify. Weak answers stay generic (“I broke it down and prioritized”) or describe the method in the abstract without ever landing on what the candidate actually did.
Strong candidates show the analysis and the judgment call inside it. Naming a framework — five whys, a 2x2, an MVP — proves nothing on its own; the signal is the decision they made within it and why.
The best problem-solving stories include a wrong turn, a tradeoff, or a constraint they couldn't beat. An answer where everything went perfectly is usually a rehearsed answer, not a real one.
Listen for reflection: what they'd do differently and what the experience taught them. “Why that approach and not another?” separates someone who reasoned their way to a solution from someone who got lucky once and kept the slide deck.
Red flags: a tidy framework with no real decision inside it; all outcome and no method; “we just brainstormed and it worked”; answers that can't survive one level of “why that approach and not another?”
Getting past a rehearsed answer is a matter of going deeper on one story rather than moving to the next question. Our guide to asking interview follow-up questions walks a single answer through seven dimensions — what to probe, and what each layer reveals.
Then put the judgment on a scorecard, not in your memory. Decide the criteria in advance (the “what to listen for” bullets are a starting set), rate each one independently right after the interview, and write down the evidence behind each rating. Scoring this way is what makes two interviewers' ratings comparable and keeps the debrief about evidence rather than vibes. If you're assembling this from scratch, interview scorecard software exists to make that the default rather than a discipline you have to maintain by hand.
From questions to hiring evidence
The reason to systematize it is consistency at scale: the third problem-solving interview this month should be as rigorous as the first. Yardstick is a structured-interview ATS: teams create job-specific interview plans, run consistent interviews, and collect scorecards, so every interview produces usable hiring evidence. Questions like these live in an interview plan with the criteria attached, interviewers score against the same rubric, and AI assembles the evidence into a decision brief for the hiring team. AI assists; the hiring decision stays with people.
You can start free: Yardstick's interview guide builder includes three lifetime interview guides, and the AI question generator is free to use. New to the approach? What is a structured interview explains the method these questions fit into.
Every interview produces usable hiring evidence when the criteria are set before the interview and scored on a scorecard.
FAQ
Behavioral questions ask what someone actually did, which is a far better signal than a brainteaser or a “what would you do” hypothetical. Hypotheticals reward people who think well on their feet in an interview room; they don't tell you whether the candidate has done the messy, constrained, real-world version of the work. “Tell me about a time” forces a concrete example you can probe with follow-ups.
Ask two or three questions, explored deeply with follow-ups, rather than a checklist of ten. Problem solving especially rewards a smooth, well-structured narrative, so depth is your defense: one story pursued through “why that approach?” and “what did the data not tell you?” reveals more than six surface answers. If problem solving is central to the role, give it its own interview in the loop.
Strong signals: a specific problem with real constraints, the actual thinking and decisions (not just a named framework), ownership of what didn't work, and reflection on what they'd change. Red flags: a tidy framework with no real decision inside it, all outcome and no method, “we just brainstormed and it worked,” and answers that fall apart on the first “why that approach and not another?” Score these against criteria you set in advance, rather than reacting to how confident the answer sounded.
Keep the competency, change the scope. Early-career candidates can draw on coursework, projects, or a part-time job — the incomplete-information, ambiguous-problem, and quick-unexpected-problem questions work well. For mid-level roles, weight functional problems of moderate complexity. For senior roles, emphasize the cross-functional, root-cause, systems-improvement, and financial-impact questions, and expect answers with real organizational stakes.
Decide the criteria before the interviews — problem definition, analysis, decision-making, execution, and reflection are a good starting set — and rate every candidate against the same ones, using concrete examples from their answers. Score independently right after each interview and write down the evidence behind each rating, then compare candidates on those notes rather than on who sounded most polished. A scorecard disciplines that decision; it shouldn't automate it — keep human judgment in the loop.
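If you keep scorecards in a spreadsheet or a small script rather than dedicated software, the routine above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular tool works: the 1-4 rating scale, the `Rating` class, and the helper names `validate_scorecard` and `disagreements` are all hypothetical. The criteria are the starting set named above.

```python
from dataclasses import dataclass

# Criteria are fixed before any interview takes place.
CRITERIA = (
    "problem definition",
    "analysis",
    "decision-making",
    "execution",
    "reflection",
)


@dataclass(frozen=True)
class Rating:
    criterion: str
    score: int     # assumed scale: 1 (weak) to 4 (strong)
    evidence: str  # what the candidate actually said or did


def validate_scorecard(ratings):
    """Return a list of problems; an empty list means the card is complete."""
    problems = []
    rated = {r.criterion for r in ratings}
    for criterion in CRITERIA:
        if criterion not in rated:
            problems.append(f"missing rating: {criterion}")
    for r in ratings:
        if r.criterion not in CRITERIA:
            problems.append(f"unknown criterion: {r.criterion}")
        if not 1 <= r.score <= 4:
            problems.append(f"score out of range: {r.criterion}")
        if not r.evidence.strip():
            problems.append(f"no evidence: {r.criterion}")
    return problems


def disagreements(card_a, card_b, gap=2):
    """List criteria where two interviewers differ by `gap` points or more."""
    a = {r.criterion: r.score for r in card_a}
    b = {r.criterion: r.score for r in card_b}
    return [c for c in CRITERIA if c in a and c in b and abs(a[c] - b[c]) >= gap]
```

The sketch deliberately computes no total score. It only checks that every criterion has a rating with written evidence and surfaces the criteria where two interviewers disagree, which gives the debrief something concrete to discuss while leaving the decision to people.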
Generate role-specific behavioral questions for free, or see how Yardstick connects questions, scorecards, and hiring decisions in one workflow.