Core Philosophy: Information That Could Cut Either Way
Evaluate questions ask you to find the piece of information that would be MOST USEFUL in evaluating the argument — meaning the information whose answer could either strengthen OR weaken the argument depending on what the answer turns out to be.
The defining feature of a correct evaluate answer is its two-directional potential: if the answer to the question in the choice is "yes," the argument is strengthened; if "no," it is weakened (or vice versa). This distinguishes evaluate answers from pure strengtheners or weakeners.
Core Insight: The correct evaluate answer must be able to swing the argument in EITHER direction, depending on what the answer turns out to be. If an answer choice can only strengthen or only weaken, it is wrong.
What Makes an Evaluate Answer Correct
The Two-Direction Test Strategy
1. Identify the argument's core assumption
The most useful information is almost always related to the argument's central assumption or causal bridge.
2. For each choice, answer "yes" and "no"
Apply the two-direction test: if yes → what happens to the argument? If no → what happens? The correct choice must affect the argument differently depending on the answer.
3. Eliminate one-directional choices
If yes and no both strengthen (or both weaken) the argument, the choice does not evaluate it; it pushes the argument the same way whatever the answer turns out to be.
4. Select the most pivotal choice
The correct choice addresses the most central question in the argument: the one whose answer has the greatest ability to change your assessment.
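The elimination part of this strategy can be sketched in a few lines of Python. This is only an illustration, not part of any official method: the names `Effect`, `Choice`, and `passes_two_direction_test` are hypothetical, the choices A to D are invented placeholders, and the yes/no effects are judgments you supply yourself after reading the argument.

```python
from dataclasses import dataclass
from enum import Enum


class Effect(Enum):
    """How one possible answer to a choice's question affects the argument."""
    STRENGTHENS = "strengthens"
    WEAKENS = "weakens"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class Choice:
    label: str        # hypothetical answer-choice label
    if_yes: Effect    # your judgment if the answer is "yes"
    if_no: Effect     # your judgment if the answer is "no"


def passes_two_direction_test(choice: Choice) -> bool:
    """True only when one answer strengthens and the other weakens."""
    return {choice.if_yes, choice.if_no} == {Effect.STRENGTHENS, Effect.WEAKENS}


# Invented placeholder judgments for four hypothetical choices.
choices = [
    Choice("A", Effect.WEAKENS, Effect.STRENGTHENS),      # swings both ways
    Choice("B", Effect.STRENGTHENS, Effect.STRENGTHENS),  # one-directional
    Choice("C", Effect.NEUTRAL, Effect.NEUTRAL),          # irrelevant either way
    Choice("D", Effect.WEAKENS, Effect.NEUTRAL),          # can only weaken
]

survivors = [c.label for c in choices if passes_two_direction_test(c)]
print(survivors)  # ['A']
```

The sketch covers only the yes/no test and the elimination of one-directional choices. If more than one choice survives, picking the most pivotal one still depends on how closely each question targets the argument's core assumption.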
Worked Examples
Argument: "After the city opened a new subway line, traffic on the parallel highway decreased by 15%. The subway is responsible for the traffic reduction."
Same argument as above. Consider: "Whether the subway line serves popular destinations."
10 Evaluate Question Traps
1. One-directional choice
A choice that always strengthens (or always weakens) regardless of the answer is not an evaluate choice.
2. Out-of-scope question
A choice asking about an issue tangential to the argument's core assumption.
3. Already-answered question
A choice whose answer is already provided in the stimulus.
4. True-either-way trap
A choice whose answer is obviously yes or obviously no fails to evaluate the argument; the answer to the correct choice must be genuinely uncertain.
5. Overly specific question
A choice that asks about a minor detail rather than the pivotal assumption.
6. Mechanism vs evaluation
Explaining how something works doesn't evaluate whether the argument's conclusion follows.
7. General vs specific relevance
A choice that would evaluate all arguments of this type, not specifically this one, is too general.
8. Counter-example trap
A choice that could only produce a counter-example, not genuine two-directional evaluation.
9. Loaded question trap
A choice that assumes something not in the stimulus as its premise.
10. Statistical precision trap
A choice that asks for exact statistics when the added precision would not change the argument.
Evaluate vs Strengthen/Weaken — The Critical Distinction
| Feature | Evaluate | Strengthen/Weaken |
|---|---|---|
| Answer form | A question whose answer could go either way | A statement that moves in one direction |
| Test to apply | Yes → strengthen AND no → weaken (or vice versa) | Does this make the conclusion more/less likely? |
| Correct answer looks like | A question about the key assumption or alternative cause | New information relevant to the argument's gap |
| Wrong answer pattern | Choices that only go one direction | Choices that are out of scope or neutral |
10 GMAT-Style Practice Questions
A company replaced its existing customer service team with an AI chatbot. Customer satisfaction scores subsequently rose by 22%. Management concluded that the AI chatbot was responsible for the improvement. Which of the following would be most useful in evaluating management's conclusion?
A nutrition researcher argues that daily consumption of green tea reduces the risk of developing type 2 diabetes, citing a study in which regular green tea drinkers had a 28% lower incidence of diabetes over ten years. Which of the following would be most useful in evaluating this argument?
An airline concluded that a new boarding process, which reversed the traditional back-to-front boarding order, would reduce average boarding time by 10 minutes per flight. Which of the following would be most useful in evaluating this conclusion?
A city installed speed cameras on all major roads and subsequently reported a 30% decrease in traffic fatalities. The city council concluded that the speed cameras caused the reduction in fatalities. Which of the following would be most useful in evaluating the council's conclusion?
A company introduced a four-day workweek and found that employee productivity, measured by output per employee per week, remained unchanged. Management concluded that a four-day workweek is just as productive as a five-day workweek. Which of the following would be most useful in evaluating this conclusion?
A school district implemented a new reading curriculum and reported that average reading test scores rose by 15 points over two years. The superintendent concluded that the new curriculum caused the improvement. Which of the following would be most useful in evaluating the superintendent's conclusion?
A tech company launched a mentorship program pairing junior engineers with senior engineers. Six months later, junior engineer retention improved from 70% to 85%. The HR director concluded that the mentorship program caused the retention improvement. Which of the following would be most useful in evaluating the HR director's conclusion?
A hospital introduced mandatory daily team briefings for medical staff and reported a 25% reduction in medical errors over the subsequent year. Administration concluded that the briefings caused the error reduction. Which of the following would be most useful in evaluating the administration's conclusion?
A retail chain began playing classical music in all its stores and found that average transaction value (amount spent per customer visit) increased by 12% over the following quarter. Marketing management concluded that the classical music caused customers to spend more. Which of the following would be most useful in evaluating this conclusion?
An insurance company reduced its claims processing time from 14 days to 6 days after implementing a new software platform. Management concluded that the software caused the improvement. Which of the following would be most useful in evaluating management's conclusion?
Key Takeaways
Apply yes/no to every choice. The correct answer affects the argument differently depending on the response.
The most useful evaluating information addresses the argument's core causal bridge or assumption.
If yes and no both strengthen (or both weaken), the choice cannot evaluate — eliminate it.
Questions asking whether an alternative explanation exists are among the strongest evaluate choices.